Projects:Import Export Sample Data Using PostgreSQL Copy
Contents |
Introduction
One requirement of the Multiserver project is to be able to, given a central server, be able to export the data relative to a given store. The ExportStoreData [1] class takes care of this, and it does it by extending the ExportSampledata class, which is part of dbsourcemanager.
Given the big size of the exported data, some changes are required in ExportSampledata to make it suitable to be extended to export stores. Mainly there are three points that needed to be modified in order to support the big volumes:
- Being able to decide whether the exported data should be ordered using the key columns. This is not needed in order to export a store, and it takes a lot of time and memory to order the results if the exported table has big volumes.
- Being able to export the data using PostgreSQL's COPY functionality. It is faster and the generated file size is much smaller.
Tracking issue: https://issues.openbravo.com/view.php?id=33875
PostgreSQL's COPY
PostgreSQL offers a way to import and export the contents of a table directly to a file using the COPY command [1].
For instance, this command can be invoked to export the rows of the F&B client of c_uom to a file:
COPY (SELECT * FROM c_uom WHERE ad_client_id = '23C59575B9CF467C9620760EB255B389') TP '/tmp/c_uom.copy' WITH (FORMAT CSV, HEADER true)
To import it a copy file to the database this command can be invoked:
COPY c_uom FROM '/tmp/c_uom.copy' WITH (FORMAT CSV, HEADER true)
[1] https://www.postgresql.org/docs/9.5/static/sql-copy.html
The standard format split the column values with tabs and uses \N to represent null values, while the CSV format uses a comma to split the column values, as does not use any character to replace null values. This results in the files exported with the CSV format being smaller. The size difference is proportional to the number of null values in the exported table.
It is possible to exclude some columns from being exported. This makes necessary to include the HEADER parameter, so that the full list of exported column is present in the exported file, and can be used to import it in the target database.
How to export the sample data using PostgreSQL's COPY
To export the sample data using PostgreSQL's COPY, the exportFormat ant parameter must be included with the value 'copy', for instance:
ant export.sample.data -Dclient="F&B International Group" -Dmodule=org.openbravo -DexportFormat=copy
How to import the sample data using PostgreSQL's COPY
There is no need to do any extra steps to import export that that has been exported in the copy format. The ImportSampledata task will know how to import the data based on the file extension.
If the database where the data is being imported is Oracle, the .copy files will be ignored.
QA
- There is no difference in the output when export.sample.data executed is with the default export format. I have exported the sample data with and without the project changes and in both cases the result is exactly the same. There is also no noticiable difference in the time it takes to export the query.
- If sample data is exported with copy, then imported running install.source and then exported back to xml, the results are the same as when the sample data is exported without the project changes.
- Code review