Projects:DBSM Performance Improvements/Technical Specifications
Contents |
Introduction
This project groups performance improvements in different DBSM tasks.
Improvements
Here are listed the different areas to be improved.
Prevent table recreation
When updating database, there are some cases tables are recreated, this process is slow because requires not only to recreate the table preserving its data, but also to recreate all its indexes, which in case of big data volumes can be time consuming.
In 3.0PR15Q3, most of these recreations were prevented (see Table recreation prevention project). Though there are still some (not that common) cases when it is still required to recreate.
From those cases, these are the most common ones. No recreation will be optimized as possible depending on the actual RDBMS.
Old type | New type | PG | ORA | Comments |
---|---|---|---|---|
char | nchar | yes | yes | |
varchar | nvarchar | yes | yes | |
nchar | char | yes | no | |
nvarchar | varchar | yes | no | |
varchar(10) | varchar(20) | yes | yes | |
char(10) | char(20) | yes | yes | |
varchar | text | yes | no | |
char | text | yes | no | |
numeric(10,0) | numeric(20,0) | yes | yes | |
numeric(10,0) | numeric(10,5) | no | no | |
numeric(10,5) | numeric(10,0) | yes | no | |
numeric | numeric(10,0) | yes | no | PG: As far as all data can be migrated |
varchar(20) | nvarchar(40) | yes | yes | |
varchar(20) | nvarchar(40) + onCreateDefault | yes | yes | |
text | varchar | no | no | PG could support it depending on actual data, ORA doesn't support. Unlikely to happen so won't be implemented |
Parallelization
Some processes can be improved by parallelizing their execution.
For all these cases:
- By default, if no setting is included, the number of threads to use will be the half of available processors in the JVM launching the process. In machines with Hyper-Threading, virtual cores are counted as available, this is the reason to use only a half of detected ones.
- A new setting
dbsm.maxThreads
can be used to override this default, this setting can be included at system level withinOpenbravo.properties
or passed as parameter for individual tasks, ie.ant update.database -Ddbsm.maxThreads=8
.
Adding indexes on existing tables
New indexes added to existing tables can be parallelized. Because index creation is a heavy CPU operation (by DB processes), the improvement obtained by this parallelization can be very relevant in cases where multiple indexes are going to be created, specially if each of them is slow to create.
PostgreSQL: PL standardization
When importing PL code in PostgreSQL, it is gets automatically transformed from its actual definition in xml which is ORA compatible to PG specific PL. This process is heavy in CPU so it can get benefit of parallelization.
The improvement here will not be relevant when performing actual customers updates, but it will be significant while developing where typically many update.database and export.database commands are executed without big amounts of data but with all PL code present.
PostgreSQL: PL standardization check
When exporting PL code in PostgreSQL, it requires to be transformed from PL specific to ORA compatible before saving to xml. Correctness of this exportation is checked in a parallelizable manner.
This optimization, again, targets developers exporting database metadata.