Projects:Support Indexes To Speed Up Contains Search/QA
Automatic Tests
Along the development of this process, the Test Driven Development approach has been used. As result, a set of JUnit test cases were created and included within the DBSM Test Suite.
The tests which cover the contains search index behavior can be found here and with them we are testing what follows:
- An index to speed up a contains search can be imported/exported properly.
- An index to speed up an icontains (case insensitive) search can be imported/exported properly.
- A basic index can be changed to contains search (by recreating it) and vice versa.
- In Oracle contains search indexes are not supported but the table comments created to keep consistency between the XML model and the database must be created/deleted properly.
- If an index is changed as contains search, that index is recreated in PostgreSQL but not in Oracle.
- If a contains search index is changed as basic, that index is recreated in PostgreSQL but not in Oracle.
Performance Tests
In order to measure the performance of a partial index, we have used an Openbravo environment with 19.154.904 business partners on a PostgresSQL database.
We are going to compare different metrics between the usage of a regular index and an indexes intended for improve contains and icontains searching:
<!-- B-TREE INDEX --> <index name="C_BPARTNER_VALUE_BASIC" unique="false"> <index-column name="VALUE"/> </index>
<!-- CONTAINS INDEX --> <index name="C_BPARTNER_VALUE_CONTAINS" unique="false" containsSearch="true"> <index-column name="VALUE"/> </index>
<!-- ICONTAINS INDEX --> <index name="C_BPARTNER_VALUE_ICONTAINS" unique="false" containsSearch="true"> <index-column name="functionBasedColumn" functionExpression="UPPER(VALUE)"/> </index>
And these are the select queries executed for the tests:
-- QUERY 1 (CONTAINS SEARCH, < 3 CHARACTERS) SELECT count(*) FROM C_BPARTNER WHERE value LIKE('%10%');
-- QUERY 2 (CONTAINS SEARCH, > 3 CHARACTERS) SELECT count(*) FROM C_BPARTNER WHERE value LIKE('%100063%');
-- QUERY 3 (ICONTAINS SEARCH, < 3 CHARACTERS) SELECT count(*) FROM C_BPARTNER WHERE UPPER(value) LIKE(UPPER('%an%'));
-- QUERY 4 (ICONTAINS SEARCH, > 3 CHARACTERS) SELECT count(*) FROM C_BPARTNER WHERE UPPER(value) LIKE(UPPER('%ank%'));
-- QUERY 5 (EXACT MATCH) SELECT count(*) FROM C_BPARTNER WHERE value = 'Bank';
The results are summarized in the following table:
INDEX TYPE | INDEX CREATION TIME (ms) | INDEX SIZE (MB) | QUERY 1 TIME (ms) | QUERY 2 TIME (ms) | QUERY 3 TIME (ms) | QUERY 4 TIME (ms) | QUERY 5 TIME (ms) |
B-TREE | 279304.87 | 742 | 3575.67 | 3461.16 | 10878.85 | 10881.71 | 0.28 |
CONTAINS INDEX | 116659.46 | 433 | 5378.16 | 18.08 | 12405.88 | 12428.05 | 1215.81 |
ICONTAINS INDEX | 106216.58 | 433 | 6010.07 | 5818.73 | 17598.99 | 0.52 | 1218.62 |
Note that the contains indexes are speeding up the search when the sub-string we are looking for has three characters or more. In that case, we are reducing the time spent on contains and icontains searching in a 99,47% and a 99,99% respectively.
Finally, it is important to note also that these indexes do not support equality nor simple comparison operators, and in case of exact matching a regular B-tree index provides better results.