Projects:Include Logs Clean Up Utility In Distribution/Technical Specifications
The Log Clean Up Utility is an existing module, first published 16 months ago (October 2013), that has matured in production environments and proved useful for maintaining log tables. It lets you define a set of tables to be cleaned up periodically by deleting old records.
The reasons for cleaning up log tables are:
- These tables, depending on Openbravo activity, tend to grow quickly and in some cases consume a large portion of database disk space. This requires more storage capacity and increases the time taken to generate backups and recover data.
- The life cycle of log information is usually short. Log entries are useful when they are created, to know about sessions, process executions and so on, but they rapidly become useless because there is normally no interest in checking very old entries.
As this is a common task, the goal of this project is to include this module in the Openbravo 3 distribution, together with some standard pre-configuration that makes it easier to use.
More details on how it works can be found here.
The Log Clean Up Utility module configuration is maintained per instance and consists of two parts:
- Tables to be cleaned up. This is the list of tables that will be periodically cleaned up, including the configuration of which records are deleted and which are kept, based on creation date and, optionally, some extra filters.
- Scheduled process to perform the clean up. The actual clean up can be scheduled as a regular Process Request. This process cleans up the tables defined by the configuration.
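As an illustration of the first part, a rule could be turned into a delete statement roughly as follows. This is a minimal sketch, not the module's code: the `CleanUpRule` class, its field names, the `example_log` table and the `%s` placeholder style are assumptions made for the example. Only the `created` column and the idea of "keep records newer than n days plus optional extra filter" come from this specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class CleanUpRule:
    """Hypothetical representation of one 'table to be cleaned up' entry."""
    table: str                          # table name (illustrative)
    keep_days: int                      # keep records newer than this many days
    extra_filter: Optional[str] = None  # optional additional SQL condition


def build_delete(rule: CleanUpRule, now: datetime) -> Tuple[str, Tuple[datetime]]:
    """Return a parameterized DELETE statement and its parameters."""
    # Records created before the cutoff are the ones to delete.
    cutoff = now - timedelta(days=rule.keep_days)
    sql = f"DELETE FROM {rule.table} WHERE created < %s"
    if rule.extra_filter:
        sql += f" AND ({rule.extra_filter})"
    return sql, (cutoff,)
```

For example, a rule keeping 15 days of records with an extra filter yields a statement with one date parameter, which the scheduled process would then execute for each configured table.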
The idea of adding this module to the distribution is to enable it by default based on some standard configuration, so everyone installing or updating to PR15Q2 will automatically take advantage of it.
|Table||Keep records newer than n days/Filters||Notes|
| ||15 days||Keeps track of process executions. Deletions in this table also cause deletions in |
| ||15 days, only for completed requests||Table used to schedule processes. It typically grows with immediate executions.|
| || ||Keeps information about executions of background scheduled processes.|
| ||40 days||Stores one record per session in Openbravo. Deleting from this table also causes deletions in |
Processes are scheduled to run weekly, on Sundays at 0:00 (server local time).
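To make the default schedule concrete, the next execution time can be computed as sketched below. This is only an illustration of "weekly, Sunday 0:00, server local time". In Openbravo the scheduling is handled by the Process Request, and the function name `next_weekly_run` is hypothetical.

```python
from datetime import datetime, timedelta


def next_weekly_run(now: datetime) -> datetime:
    """Return the next Sunday 0:00 strictly after `now` (naive local time)."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    # datetime.weekday(): Monday is 0 and Sunday is 6.
    days_ahead = (6 - midnight.weekday()) % 7
    candidate = midnight + timedelta(days=days_ahead)
    if candidate <= now:
        # It is already Sunday 0:00 or later, so move to the following week.
        candidate += timedelta(days=7)
    return candidate
```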
Default configuration for retail
|Table||Keep records newer than n days/Filters||Notes|
| || ||Keeps track of client logs generated in WebPOS.|
In addition to existing features, two new capabilities will be implemented within the scope of this project.
When all the data in a table is to be deleted, executing a truncate statement has several advantages over a delete:
- It is much faster and consumes fewer system resources.
- It does not require analyzing statistics afterwards, as the table is known to be completely empty.
This will be implemented as a new optional flag in the table configuration.
Table truncation can only be done in some cases:
- All data in the table is to be removed. Therefore, when this flag is set, any other configuration option (keep newer than x days, filter clause, etc.) is ignored and hidden in the UI.
- The table to be truncated cannot be referenced by any foreign key. As this can change over time when modules are installed or updated, the check is not performed when the table is marked to be truncated but when the process is executed. If there are incoming references at that point, the truncation is marked as failed, though the rest of the process continues.
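The execution-time check described above could look roughly like the following sketch. It is an assumption-laden illustration rather than the planned implementation: the function `plan_truncations`, the `incoming_refs` mapping (table name to the set of tables referencing it) and the table names are all hypothetical, and how the incoming foreign keys are actually discovered is left out.

```python
from typing import Dict, Iterable, Set


def plan_truncations(
    tables: Iterable[str],
    incoming_refs: Dict[str, Set[str]],
) -> Dict[str, str]:
    """Map each table to a TRUNCATE statement or to a failure message."""
    outcome: Dict[str, str] = {}
    for table in tables:
        referencing = incoming_refs.get(table, set())
        if referencing:
            # Incoming foreign keys found at execution time: mark this
            # truncation as failed and continue with the remaining tables.
            outcome[table] = "FAILED: referenced by " + ", ".join(sorted(referencing))
        else:
            outcome[table] = f"TRUNCATE TABLE {table}"
    return outcome
```

A failed entry does not stop the loop, which mirrors the requirement that the rest of the process continues.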
Vacuum when working in PostgreSQL
In PostgreSQL, when a row is deleted it is considered a dead tuple, but it continues to use disk space that cannot be reused. Only after the vacuum command is executed is this space reclaimed so that it can be reused.
Due to performance problems in the vacuum full implementation in 8.x PostgreSQL versions, the selected option is to execute vacuum analyze. It reclaims the space for reuse but does not free it. In instances with stable log creation and periodic clean up, this should keep the amount of disk used for logs stable.
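A minimal sketch of how the follow-up statements might be produced after the deletions, assuming a hypothetical helper `statements_after_cleanup` and an `is_postgresql` flag supplied by the caller (neither is part of the module):

```python
from typing import List


def statements_after_cleanup(tables: List[str], is_postgresql: bool) -> List[str]:
    """Return the maintenance statements to run once the deletions are done."""
    if not is_postgresql:
        # The vacuum step only applies when working in PostgreSQL.
        return []
    # Plain VACUUM ANALYZE (not VACUUM FULL): dead tuples become reusable
    # and the table statistics are refreshed.
    return [f"VACUUM ANALYZE {table}" for table in tables]
```

Note that PostgreSQL does not allow VACUUM inside a transaction block, so whatever executes these statements has to run them outside the transaction used for the deletions.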
The default configuration does not truncate tables; it keeps the log data generated during the last days. To perform these deletions, the log tables need to be queried. We considered adding indexes on the columns (created) used for these queries in order to speed up the process. In the end we decided against it, because the indexes would consume more space for existing records and slow down insertions of new ones. As the clean up processes should run when system usage is low, we understand that the benefit of these new indexes would not pay for the overhead they add.
New module vs merge within an existing one
It has to be decided whether the inclusion will be done by adding the module to the distribution or by merging it into one that is already there.
- Keeping the existent API
- Adds some (small) overhead to the release cycle
Decided: new module
New configuration active by default
Should the configuration be active by default?
- Easier to use: just install/update Openbravo and go
- Behavioral change in case of update: data that was previously kept is removed
- Default execution time (Sunday 0:00 server time) might not be adequate for some users
Decided: active by default
Conflicting new configuration with existent configurations
Adding a default configuration might conflict with instances that already have the module installed, as new deletion rules will be added.
The solution is to mark the new rules as inactive in the existing instances, but this needs to be communicated and done before the first execution of the clean up process.
Decided: document it
Review/approve proposed default configurations