View source | View content page | Page history | Printable version   

Projects:Purge Data From Database/Technical Specification

Contents

Purge script overview

This entry will describe the technical aspects of the purge script.

Auxiliary tables

Five tables are part of the purge script. The first two are automatically populated by the purge script, while the other three are configuration tables that must be populate to configure how the purge should behave.

obpurg_temp_purg_data

Stores information about the records that have been purged, whose dependencies has not been removed yet. Let's consider this schema, and suppose that the data that will be removed is this.

If the script starts removing the cash ups, the evolution of the obpurg_temp_purg_data will be like this:

The cashups are removed, and info about the purged is included in obpurg_temp_purg_data:


table_name record_id deleting_dependencies
obpos_app_cashup #1 N
obpos_app_cashup #2 N


After this initial steps the scripts goes in a loop. While the obpurg_temp_purg_data is not empty, it picks any record, and sets the deleting_depencies flag of all the records with the same table_name to 'Y', and will start deleting its dependencies.


table_name record_id deleting_dependencies
obpos_app_cashup #1 Y
obpos_app_cashup #2 Y


The purge script removes the records that reference the purged cashups, and add them to obpurg_temp_purg_data.


table_name record_id deleting_dependencies
obpos_app_cashup #1 Y
obpos_app_cashup #2 Y
c_order #1 N
c_order #2 N


Once all dependencies of obpos_app_cashup has been removed, its records are removed from obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_order #1 N
c_order #2 N


The next table whose dependencies will be removed is c_order:


table_name record_id deleting_dependencies
c_order #1 Y
c_order #2 Y


The purge scripts removes the records that reference the purged orders and add them to obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_order #1 Y
c_order #2 Y
c_orderline #1 N
c_orderline #2 N
c_orderline #3 N
c_orderline #4 N


Once all dependencies of c_order has been removed, its records are removed from obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_orderline #1 N
c_orderline #2 N
c_orderline #3 N
c_orderline #4 N


The next table whose dependencies will be removed is c_orderline


table_name record_id deleting_dependencies
c_orderline #1 Y
c_orderline #2 Y
c_orderline #3 Y
c_orderline #4 Y


The purge scripts removes the records that reference the purged order lines and add them to obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_orderline #1 Y
c_orderline #2 Y
c_orderline #3 Y
c_orderline #4 Y
c_invoiceline #1 N
c_invoiceline #2 N
c_invoiceline #3 N
c_invoiceline #4 N


Once all dependencies of c_orderline has been removed, its records are removed from obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_invoiceline #1 N
c_invoiceline #2 N
c_invoiceline #3 N
c_invoiceline #4 N


The next table whose dependencies will be removed is c_invoiceline:


table_name record_id deleting_dependencies
c_invoiceline #1 Y
c_invoiceline #2 Y
c_invoiceline #3 Y
c_invoiceline #4 Y


Because of a custom configuration, when an invoice line is removed, its invoices are also removed. Info about the removed records is added to obpurg_temp_purg_data:


table_name record_id deleting_dependencies
c_invoiceline #1 Y
c_invoiceline #2 Y
c_invoiceline #3 Y
c_invoiceline #4 Y
c_invoice #1 N
c_invoice #2 N


Once all dependencies of c_invoiceline has been removed, its records are removed from obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_invoice #1 N
c_invoice #2 N


The next table whose dependencies will be removed is c_invoice:

table_name record_id deleting_dependencies
c_invoice #1 Y
c_invoice #2 Y


Once all dependencies of c_invoice has been removed, its records are removed from obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_invoice #1 Y
c_invoice #2 Y
c_file #1 N
c_file #2 N


Once all dependencies of c_invoice has been removed, its records are removed from obpurg_temp_purg_data.


table_name record_id deleting_dependencies
c_file #1 N
c_file #2 N


The next table whose dependencies will be removed is c_file:


table_name record_id deleting_dependencies
c_file #1 Y
c_file #2 Y


There are no records that reference the purged c_files. Its records are removed from obpurg_temp_purg_data.


table_name record_id deleting_dependencies


When obpurg_temp_purg_data is empty, the purge script is finished.

obpurg_purge_summary

This table will keep track of how many records has been purged per table.

At the end of the purge script execution its contents will be exported to the output folder

obpurg_external_foreign_key

This configuration table must be used to let the purge script know of foreign keys that are not defined in the database catalog.

For instance, to define a virtual foreign key from c_order to obpos_app_cashup, the following row must be inserted:

INSERT INTO obpurg_external_foreign_key(source_table_name, source_column_name, target_table_name) 
VALUES ('c_order', 'em_obpos_app_cashup_id', 'obpos_app_cashup');

The table admits more complex virtual foreign keys. For instances, it allows to add a foreign key from c_file to c_invoice like this:

INSERT INTO obpurg_external_foreign_key(source_table_name, source_column_name, target_table_name, where_clause) 
VALUES ('c_file', 'ad_record_id', 'c_invoice', 'ad_table_id = ''318''');



obpurg_table_parent_definition

This configuration table must be used to let the purge script know of foreign keys that should be used to remove data referenced FROM a table that has been purged (by default the purge script automatically that that references a table that has been purged, the other way around).

For instance, to determine that when a c_invoice line is removed its c_invoice should be deleted, the following configuration is needed:

INSERTO INTO obpurg_table_parent_definition
VALUES ('c_invoiceline', 'c_invoice_id', 'c_invoice_id','c_invoice');

Four columns must be filled in:

In most cases the name of the column in that child and parent tables will be the same, but not in all.

obpurg_on_delete_null_refs

When a record is purged, there might be that some of the records that referenced it should not be removed, but the foreign key set to null.

The foreign keys that must be set to null when its referenced record is purged must be configured in the obpurg_on_delete_null_refs, specifying the foreign key by its table name and column name:

INSERT INTO obpurg_on_delete_null_refs VALUES ('c_orderline', 'm_inoutline_id');

obpurg_extra_output_config

When the purge script is executed, it generates files with info about the records that have been removed. By default it includes the table name and the record id. The obpurg_extra_output_config table can be populated to include more information in those files.

Example:

INSERT INTO obpurg_extra_output_config
VALUES ('c_file', 'path, name', 'path || ''/'' || name');

Script functions

The script is comprised of several smaller functions. This section will summarize them.

create_purge_temp_tables

Signature:

create_purge_temp_tables() RETURNS void

Creates the tables used by the purge script:

initialize_purge_summary_tables

Signature:

initialize_purge_summary_tables() RETURNS void

Populates the obpurg_purge_summary table, it creates an entry for each table initializing the n_rows_delete column to 0.

nullify_references_of_table

Signature:

nullify_references_of_table(table_name character varying, dependent_table_name character varying, dependent_column_name character varying) RETURNS void

Function invoked to nullify the references to the purged record of a table (table_name) of a given foreign key, described by its table (dependent_table_name) and column (dependent_column_name).

For instance, if an order that has been returned is purged, its return should not be removed by the purge script, but turned into a blind return. A returned order line references the original order via the c_orderline.m_inoutline_id property.

The following query will be executed to nullify those references:

UPDATE c_orderline SET m_inoutline_id = NULL 
FROM obpurg_temp_purg_data p
WHERE m_inoutline_id = p.record_id
AND p.table_name = 'm_inoutline'
AND p.deleting_dependencies = 'O'

purge_dependencies_of_table

Signature:

purge_dependencies_of_table(table_name character varying, dependent_table_name character varying, dependent_column_name character varying, where_clause character varying, what_to_delete character varying) RETURNS void

This function is invoked after purging data from a table (table_name), to remove its dependencies per foreign key (determined by dependent_table_name, dependent_column_name).

For instance, when entries from c_order are purged, the records from c_orderline that reference them are removed using this query:

WITH deleted AS (DELETE FROM c_orderline 
				USING   obpurg_temp_purg_data p
				WHERE c_order_id = p.record_id
				AND p.table_name = 'c_order' 
				AND p.deleting_dependencies = 'O' RETURNING c_orderline_id),  
inserted AS (INSERT INTO obpurg_temp_purg_data (obpurg_temp_purg_data_id, table_name, record_id)
				SELECT get_uuid(), 'c_orderline', c_orderline_id FROM deleted RETURNING 1);

Note that info about the deleted records is inserted into obpurg_temp_purg_data, so that its dependencies are removed by the purge script in a later step.

If dependent_table_name has been configured so that its parent record must automatically be deleted, the query looks a little bit different to take this into account:

WITH deleted AS (DELETE FROM c_invoiceline 
				USING   obpurg_temp_purg_data p
				WHERE c_orderline_id = p.record_id
				AND p.table_name = 'c_orderline' 
				AND p.deleting_dependencies = 'O' RETURNING c_invoiceline_id, c_invoice_id),  
inserted AS (INSERT INTO obpurg_temp_purg_data (obpurg_temp_purg_data_id, table_name, record_id, parent_table_name, parent_record_id)
				SELECT get_uuid(), 'c_invoiceline', c_invoiceline_id, 'c_invoice', c_invoice_id FROM deleted RETURNING 1);

purge_table

purge_table_and_remove_fks

Signature:

purge_table_and_remove_fks(table_name character varying, where_clause character varying)

Same as the purge_table function, the difference being that purge_table_and_remove_fks removes the foreign keys. It intented to be used just before running update.database, to ensure that the purge script execution has not broken the referencial integrity (otherwise update.database will fail when trying to recreate the foreign key).

Retrieved from "http://wiki.openbravo.com/wiki/Projects:Purge_Data_From_Database/Technical_Specification"

This page has been accessed 627 times. This page was last modified on 30 August 2019, at 11:53. Content is available under Creative Commons Attribution-ShareAlike 2.5 Spain License.