Projects:AD IsOrgIncluded Performance Improvements
About this project
The aim of this project is to develop a full refactor of the AD_IsOrgIncluded database function to improve performance, which should collaterally improve the performance of any process that calls it.
Evolution of the project can be tracked at feature request 35590
AD_IsOrgIncluded is a database function that receives three parameters: ad_org_id, parent_org_id and ad_client_id and returns the level in the organization tree for the given ad_client_id where ad_org_id is with respect to parent_org_id.
If not in the natural tree, it returns -1
This function is used in many places as part of the where clause specially in functional flow related SQL queries.
The current function must browse over the AD_TreeNode records to calculate the level, raising performance issues in the following main scenarios:
- When the number of calls to this function is too high
- When the number of levels in the organization tree is too high
- When the number of organizations is too high and the given parameters return -1
The objective of this project is to refactor this function to reduce the performance impact, thus automatically improving every places where this function is used.
The main idea is to get exactly the same behavior of this function after the refactor, but with a much better performance.
It is out of scope for this project to review the places where this function is used and to refactor them (if possible) to avoid this usage.
These are the ideas discarded for the refactor
Transform current function to STABLE
According to PostgreSQL documentation, a STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement. This category allows the optimizer to optimize multiple calls of the function to a single call.
This standalone change is quite easy to develop, but after some testing there is no performance improvement, so it has been discarded.
Transform current function to IMMUTABLE
According to PostgreSQL documentation, an IMMUTABLE function cannot modify the database and is guaranteed to return the same results given the same arguments forever. This category allows the optimizer to pre-evaluate the function when a query calls it with constant arguments.
This change does not show performance improvements, and moreover it’s not a real solution for Openbravo because the organization tree can be changed, so it has been discarded.
Usage of recursive CTE
Recursive CTEs is a standard way to iterate over a tree structure in SQL.
This idea is to rewrite this function to use recursive CTE and set it as STABLE.
Two CTEs have been developed, one to iterate from child to parent, and another one from parent to child.
Both show improvements when executed directly into a SQL query, however when they are embedded into a db function they both show the same figures as the original function, so the solution has been discarded.
For reference only, this is the CTE that iterates from child to parent (preferred way):
CREATE OR REPLACE FUNCTION ad_isorgincluded_cte( p_orgid character varying, p_parentorgid character varying, p_clientid character varying) RETURNS numeric AS $BODY$ DECLARE v_Level NUMERIC:=0; v_treeID VARCHAR(32) ; --OBTG:varchar2-- v_Parent VARCHAR(32) ; v_Node VARCHAR(32) ; BEGIN SELECT MAX(AD_TREE_ORG_ID) INTO v_treeID FROM AD_CLIENTINFO WHERE AD_CLIENT_ID=p_clientID; v_Parent:=p_OrgID; WITH RECURSIVE organizationtree(levelno, orgId, parentOrgId) AS ( SELECT 2, node_id, PARENT_ID FROM AD_TREENODE WHERE AD_TREE_ID=v_treeID AND node_id = p_orgid union ALL SELECT levelno+1, tn.NODE_ID, tn.PARENT_ID FROM organizationtree ot JOIN ad_treenode tn ON (ot.parentOrgId = tn.NODE_ID AND tn.AD_TREE_ID=v_treeID AND tn.PARENT_ID NOT IN ('0', v_Parent)) ) SELECT ot.levelno INTO v_Level FROM organizationtree ot WHERE ot.parentOrgId = p_parentorgid; RETURN v_Level; END ; $BODY$ LANGUAGE plpgsql STABLE COST 100; ALTER FUNCTION ad_isorgincluded_cte(character varying, character varying, character varying)
After several testing, the best performance improvements that we get are by using a table to store the plain organization tree. This idea in combination with good defined indexes, shows very important performance improvements.
A query like this:
SELECT *, ad_isorgincluded(ad_org_id, '0', '23C59575B9CF467C9620760EB255B389') FROM ad_org;
Shows the following results in the same environment:
- Before the refactor (original code): 59 ms
'Seq Scan on ad_org (cost=0.00..89.74 rows=299 width=420) (actual time=0.129..59.064 rows=299 loops=1)' 'Total runtime: 59.103 ms'
- After the refactor: 7ms [>8 times faster]
'Seq Scan on ad_org (cost=0.00..89.74 rows=299 width=420) (actual time=0.270..7.001 rows=299 loops=1)' 'Total runtime: 7.051 ms'
The table called AD_ORG_TREE will have these important columns:
- AD_ORG_ID [Mandatory, VARCHAR(32)] which represents a child organization
- AD_PARENT_ORG_ID [Mandatory, VARCHAR(32)] which represents a parent of this organization (father, grandfather…)
- LEVELNO [MANDATORY, DECIMAL(10,0)] which represents the level in the tree for AD_ORG_ID compared to AD_PARENT_ORG_ID
A check constraint will be defined to allow only records with LEVELNO > 0
A unique constraint [AD_ORG_ID, AD_PARENT_ORG_ID, AD_CLIENT_ID, LEVELNO] will be added to inform the database no duplicate records can be found. This constraint is used by the database as an index, and it’s the key point to get the performance improvement, as the database has no need to visit the table because all the information commonly used can be found directly into the index.
It is important to declare the unique columns in this order to get the best performance.
The index size is not a problem regardless the number of columns. In an environment with 312 organizations, the AD_ORG_TREE table size is '9160 kB' [45785 records] and the index size is '5952 kB'.
The code inside this function will be fully refactor to get rid of the AD_TreeNode code, and to use the new AD_ORG_TREE table instead.
It’s very important to ensure we get the same behavior before and after the refactor, so a full test suite of JUnit tests must be developed to cover all the scenarios, specially the corner cases (null parameters, org_ids that don’t exist, org_ids that belong to different clients, etc.)
This is the basic idea for this new code:
CREATE OR REPLACE FUNCTION ad_isorgincluded( p_orgid character varying, p_parentorgid character varying, p_clientid character varying) RETURNS numeric AS $BODY$ DECLARE V_RETURN AD_ORG_TREE.LEVELNO%TYPE:=-1 ; BEGIN -- Special case for * org IF (p_orgid='0' AND p_parentorgid='0' AND p_clientid IS NOT NULL) THEN SELECT 1 INTO V_RETURN FROM AD_CLIENT WHERE AD_CLIENT_ID = p_clientid; RETURN COALESCE(V_RETURN, -1); END IF; -- Take advantage of AD_ORG_TREE.AD_ORG_TREE_ORGPARENTCLIENTLEV unique constraint SELECT LEVELNO INTO V_RETURN FROM AD_ORG_TREE WHERE AD_ORG_ID = p_orgid AND AD_PARENT_ORG_ID = p_parentorgid AND AD_CLIENT_ID = p_clientid; RETURN COALESCE(V_RETURN, -1); EXCEPTION WHEN OTHERS THEN RETURN -1; END ; $BODY$ LANGUAGE plpgsql VOLATILE COST 100;
- The db function is called AD_IsOrgIncluded, exactly the same as the original function. We want all the places that use the function to automatically get benefit of the improvement
- It is declared as VOLATILE. Actually STABLE has more sense here, however it has been tested and it doesn’t represent a significant performance improvement; besides DBSM doesn’t currently support it, so it has been discarded.
- When both orgs are * we check the client_id really exists and returns 1 if so.
- For the normal scenario, we get benefit of the Unique Constraint explained before.
- The function should cover the corner scenarios.
Populating AD_Org_Tree table
A module script must be developed to populate the new table. Important notes for the implementation:
- The module script must use the information at AD_TreeNode table to properly populate the table. This can be done by renaming the old AD_IsOrgIncluded function and call it, or by implementing a recursive CTE in the module script.
It’s preferred to avoid creating a new function with the old code, but it could be difficult (or Platform dependent) to work with recursive CTE, so a new AD_IsOrgIncluded_TreeNode function as a copy of the current AD_IsOrgIncluded function will be created.
- This module script can take long in environments with many organizations. In an environment with 700 organization it takes around 10 minutes
- Only organizations set as ready must be populated
- It should be run just one time
- It should be run at install.source to control when sampledata modules are available
- It must not duplicate records in AD_Org_Tree table (verify whether exists a similar record before inserting it)
New instances or new organizations
The best moment to populate this table is when the organization is set as ready. In that moment we ensure its parent organization tree won’t change anymore in the future. The child tree could change, but this is not a problem because this table only stores parent relationships.
Developing it in this process has many advantages:
- We avoid mutating table issues in triggers
- We calculate the information when we are sure it won’t change in the future, so we avoid the maintenance
However this way has an important problem: the new AD_IsOrgIncluded code won’t work for organizations NOT set yet as ready.
This will require to review the flows where the AD_IsOrgIncluded is used when the organization is not set as ready yet, and transform them to use the AD_IsOrgIncluded_TreeNode function.
After a quick review, only AD_ORG_READY and AD_ORG_CHK_CALENDAR functions (and the functions called inside) seem to be really affected, so not too much impact.
As part of the delivery a set of automatic tests have been created. They can be found at src-test/src/org/openbravo/test/IsOrgIncluded/ADOrgTreeTest.java
Note that they are not included in TestLink because they are technical tests which can’t be tested using functional approaches.
This is the list of tests:
- testSingleNewOrganization: Creates a new organization just below * and set it as ready with cascade = ‘N’. Verifies the AD_Org_Tree table has been properly populated.
- testSingleNewOrganizationCascade: Creates a new organization just below * and set it as ready with cascade = ‘Y’. Verifies the AD_Org_Tree table has been properly populated.
- testTwoNewOrgsIndividually: Test creation of two new organizations (one below the other) and set as ready individually. Verifies the AD_Org_Tree table has been properly populated.
- testTwoNewOrgsCascade: Test creation of two new organizations (one below the other) and set as ready in cascade. Verifies the AD_Org_Tree table has been properly populated.
- testChildOrgValidation: Creates parent and child organization and tries to set as ready the child organization (cascade and not cascade). Verify it fails because parent org is not set as ready yet.
- testCreateExtendedOrgTreeFewLevels: Creates an organization subtree with 7 new orgs in 2 levels. Verifies the AD_Org_Tree table has been properly populated.
- testCreateExtendedOrgTreeManyLevels: Creates an organization subtree with 23 new orgs in 6 levels. Verifies the AD_Org_Tree table has been properly populated.
- testCallAllCombinations: Creates a cartesian product of several organization ids and client ids and checks AD_IsOrgIncluded and AD_IsOrgIncluded_TreeNode return exactly the same values. The org and client list include also corner cases like nulls or ids that don’t exist. A total of 1014 combinations are tested.
- testPerformanceI: Runs a sequencial scan over FACT_ACCT table filtering by AD_IsOrgIncluded and AD_IsOrgIncluded_TreeNode function. Verifies we get the same number of records and that the performance is better for the AD_IsOrgIncluded function. [In local testing: 5,20x in PostgreSQL and 2,70x in Oracle].
- testPerformanceII: Calls individually the AD_IsOrgIncluded and AD_IsOrgIncluded_TreeNode functions and verify the performance in both cases is similar [In local testing: -20ms in PostgreSQL and -5ms in Oracle].
Recursive CTE works fine in both Oracle and PostgreSQL, however it is not supported yet by the DBSM.
In PostgreSQL the syntaxis is: WITH RECURSIVE …
In Oracle the syntaxis is: WITH …
If we want to use it, the feature request #35497 must be fixed before. Otherwise the only possibility is to keep a legacy copy of current AD_IsOrgIncluded function and use it in those places.
We will use AD_IsOrgIncluded_TreeNode instead of working with Recursive CTE
AD_IsOrgIncluded content before running module script
What is the code inside the AD_IsOrgIncluded function when executing the module script?
It’s the new code (based on AD_Org_Tree table), so the module script should call AD_IsOrgIncluded_TreeNode to populate the new table.