Data Access Layer Performance
Contents |
Overview
This document focuses on topics related to Data Access Layer performance.
The example code used in this page can be found in this junit testcase.
Record based querying versus Object based querying
Hibernate follows a different approach to querying and caching compared to JDBC/SQL. Hibernate adds an object-abstraction layer on top of JDBC/SQL.
With Hibernate you query for objects. Hibernate maintains/loads the objects into a larger object graph of connected objects. This is very different from direct JDBC/SQL which reads 'flat records' and does not build an object graph. Hibernate keeps loaded objects in memory, in the first-level cache, which is also called the session cache or PersistenceContext.
Hibernate maintains this in-memory object cache to ensure that an object is only read once from the database, if it is read more than once always the same object instance is returned. This is illustrated in the following code snippet:
final OBQuery<BusinessPartner> bpQuery = OBDal.getInstance().createQuery(BusinessPartner.class, ""); bpQuery.setMaxResult(1); // read one business partner final BusinessPartner bp = bpQuery.list().get(0); final Category category = bp.getBusinessPartnerCategory(); // now load the category directly final OBQuery<Category> categoryQuery = OBDal.getInstance().createQuery(Category.class, "id=:id"); categoryQuery.setNamedParameter("id", category.getId()); final Category category2 = categoryQuery.list().get(0); // category2 and category are the same object: Assert.assertTrue(category == category2);
So the main thing to remember is that Hibernate loads query results in memory and maintains the objects in memory during the lifetime of a Hibernate session. Typically there is one Hibernate session per HTTP request, see here for more information.
This loading/maintaining of objects in memory is the main cause of Out Of Memory exceptions when working with large datasets. The subsequent sections in this page will provide efficient solutions for working with large datasets.
Hibernate Concepts
Proxy concept - getting the id/entityname without loading
Openbravo maps all single-(many-to-one)-associations as lazy. This means that when an object is loaded from the database that its references are Hibernate proxy objects. The Hibernate proxy concept is used to prevent unnecessary queries if data is never accessed.
For example, consider the following code:
final OBQuery<BusinessPartner> bpQuery = OBDal.getInstance().createQuery(BusinessPartner.class, ""); bpQuery.setMaxResult(1); // read one business partner final BusinessPartner bp = bpQuery.list().get(0); final Category category = bp.getBusinessPartnerCategory();
The category object will be a proxy object as it is shown here in the debugger:
Note:
- a proxy object implements both the HibernateProxy interface as the class which is expected by the application (in this case Category).
- touching a proxied object in the debugger will also load it from the database!
The proxied object initially only has an Id and class, if you call any method on the proxied object then a query is fired to read the rest of the state from the database:
// now call a method directly on the object final Object id2 = category.getId();
This will fire this sql query:
SELECT businesspa0_.c_bp_group_id AS C1_129_0_, businesspa0_.ad_client_id AS AD2_129_0_, businesspa0_.ad_org_id AS AD3_129_0_, businesspa0_.isactive AS IsActive129_0_, businesspa0_.created AS Created129_0_, businesspa0_.createdby AS CreatedBy129_0_, businesspa0_.updated AS Updated129_0_, businesspa0_.updatedby AS UpdatedBy129_0_, businesspa0_.value AS Value129_0_, businesspa0_.name AS Name129_0_, businesspa0_.description AS Descrip11_129_0_, businesspa0_.isdefault AS IsDefault129_0_ FROM c_bp_group businesspa0_ WHERE businesspa0_.c_bp_group_id = ?
Openbravo has 2 utility methods which can retrieve the id and entity name of an object without loading it:
// get the id and entityname in a way which does not load the object final Object id = DalUtil.getId(category); final String entityName = DalUtil.getEntityName(category);
![]() | Starting from 3.0PR16Q4, getId method on a proxy does not load the object from database. Therefore, DalUtil.getId is no longer required.
|
And here is some more sample code combining the concepts described above:
final OBQuery<BusinessPartner> bpQuery = OBDal.getInstance().createQuery(BusinessPartner.class, ""); bpQuery.setMaxResult(1); // read one business partner final BusinessPartner bp = bpQuery.list().get(0); final Category category = bp.getBusinessPartnerCategory(); // this category is an uninitialized proxy Assert.assertTrue(category instanceof HibernateProxy); Assert.assertTrue(((HibernateProxy) category).getHibernateLazyInitializer().isUninitialized()); // get the id and entityname in a way which does not load the object final Object id = DalUtil.getId(category); final String entityName = DalUtil.getEntityName(category); // still unloaded Assert.assertTrue(((HibernateProxy) category).getHibernateLazyInitializer().isUninitialized()); // now call a method directly on the object final Object id2 = category.getId(); // now it is loaded! Assert.assertFalse(((HibernateProxy) category).getHibernateLazyInitializer().isUninitialized()); // and the id's are the same ofcourse Assert.assertEquals(id, id2);
Collection Loading
Many Openbravo entities/objects have references to child objects, for example an Order object has OrderLines. When an object is loaded from the database its collections are managed by Hibernate. For this Hibernate uses special implementations of the java.util.List interface. You can see this in the debugger:
These collections are loaded when they are touched, so not earlier. Here is some example code:
final OBQuery<Order> orderQuery = OBDal.getInstance().createQuery(Order.class, ""); orderQuery.setMaxResult(1); Order order = orderQuery.uniqueResult(); final List<OrderLine> orderLineList = order.getOrderLineList(); // is a hibernate special thing Assert.assertTrue(orderLineList instanceof PersistentBag); // this will load the list, all OrderLines are loaded in Memory System.err.println(orderLineList.size());
When the last line is executed then the collection is loaded and the following sql is fired:
SELECT orderlinel0_.c_order_id AS C9_427_1_, orderlinel0_.c_orderline_id AS C1_1_, orderlinel0_.c_orderline_id AS C1_429_0_, orderlinel0_.ad_client_id AS AD2_429_0_, orderlinel0_.ad_org_id AS AD3_429_0_, orderlinel0_.isactive AS IsActive429_0_, orderlinel0_.created AS Created429_0_, orderlinel0_.createdby AS CreatedBy429_0_, orderlinel0_.updated AS Updated429_0_, orderlinel0_.updatedby AS UpdatedBy429_0_, orderlinel0_.c_order_id AS C9_429_0_, orderlinel0_.line AS Line429_0_, orderlinel0_.c_bpartner_id AS C11_429_0_, orderlinel0_.c_bpartner_location_id AS C12_429_0_, orderlinel0_.dateordered AS DateOrd13_429_0_, orderlinel0_.datepromised AS DatePro14_429_0_, orderlinel0_.datedelivered AS DateDel15_429_0_, orderlinel0_.dateinvoiced AS DateInv16_429_0_, orderlinel0_.description AS Descrip17_429_0_, orderlinel0_.m_product_id AS M18_429_0_, orderlinel0_.m_warehouse_id AS M19_429_0_, orderlinel0_.directship AS DirectShip429_0_, orderlinel0_.c_uom_id AS C21_429_0_, orderlinel0_.qtyordered AS QtyOrdered429_0_, orderlinel0_.qtyreserved AS QtyRese23_429_0_, orderlinel0_.qtydelivered AS QtyDeli24_429_0_, orderlinel0_.qtyinvoiced AS QtyInvo25_429_0_, orderlinel0_.m_shipper_id AS M26_429_0_, orderlinel0_.c_currency_id AS C27_429_0_, orderlinel0_.pricelist AS PriceList429_0_, orderlinel0_.priceactual AS PriceAc29_429_0_, orderlinel0_.pricelimit AS PriceLimit429_0_, orderlinel0_.linenetamt AS LineNetAmt429_0_, orderlinel0_.discount AS Discount429_0_, orderlinel0_.freightamt AS FreightAmt429_0_, orderlinel0_.c_charge_id AS C34_429_0_, orderlinel0_.chargeamt AS ChargeAmt429_0_, orderlinel0_.c_tax_id AS C36_429_0_, orderlinel0_.s_resourceassignment_id AS S37_429_0_, orderlinel0_.ref_orderline_id AS Ref38_429_0_, orderlinel0_.m_attributesetinstance_id AS M39_429_0_, orderlinel0_.isdescription AS IsDescr40_429_0_, orderlinel0_.quantityorder AS Quantit41_429_0_, orderlinel0_.m_product_uom_id AS M42_429_0_, orderlinel0_.m_offer_id AS M43_429_0_, orderlinel0_.pricestd AS PriceStd429_0_, orderlinel0_.cancelpricead AS CANCELP45_429_0_, orderlinel0_.c_order_discount_id AS C46_429_0_, orderlinel0_.iseditlinenetamt AS Iseditl47_429_0_, orderlinel0_.taxbaseamt AS Taxbaseamt429_0_, orderlinel0_.m_inoutline_id AS M49_429_0_, orderlinel0_.c_return_reason_id AS C50_429_0_ FROM c_orderline orderlinel0_ WHERE orderlinel0_.c_order_id = ? ORDER BY orderlinel0_.line ASC
DML Operations
Hibernate provides the option to update/insert/delete records in the database using DML style operations. There are some things to take into account when using these functions:
- no Data Access Layer triggers are executed, so when inserting you have to set the client and organization
- no Data Access Layer security checks are done
- the business event layer is by-passed
If these aspects are no problem then you can use the DML operations provided by Hibernate. Let's show an example, adding one character after each Category name:
String hqlVersionedUpdate = "update BusinessPartnerCategory set name = CONCAT(name, 'a') where name <> null"; int updatedEntities = OBDal.getInstance().getSession().createQuery(hqlVersionedUpdate).executeUpdate();
Use cases
Mass read, clearing the session
When reading large volumes of objects from the database it is important to use specific querying/scrolling techniques to prevent out-of-memory exceptions. When querying, normally for smaller data sets the following approach is used:
final OBCriteria<Product> productCriteria = OBDal.getInstance().createCriteria(Product.class); for (Product product : productCriteria.list()) { System.err.println(product.getId()); }
or using OBQuery:
final OBQuery<Product> productQuery = OBDal.getInstance().createQuery(Product.class, ""); for (Product product : productQuery.list()) { System.err.println(product.getId()); }
But this approach will not work if the dataset is large enough! The call to list() will load all objects in memory, resulting in Out Of Memory exceptions and slow responsiveness as the java garbage collector will continuously work.
If you read more than a 1000 objects or so then it makes sense to use a different approach: scroll. The scroll method does not read all data in memory and does not read all data from the database, this is done in separate fetches. Here is some sample code using OBCriteria and OBQuery:
final OBCriteria<Product> productCriteria = OBDal.getInstance().createCriteria(Product.class); // 1000 is normally a good fetch size productCriteria.setFetchSize(1000); final ScrollableResults productScroller1 = productCriteria.scroll(ScrollMode.FORWARD_ONLY); int i = 0; while (productScroller1.next()) { final Product product = (Product) productScroller1.get()[0]; System.err.println(product.getId()); // clear the session every 100 records if ((i % 100) == 0) { OBDal.getInstance().getSession().clear(); } i++; } productScroller1.close();
This code shows a number of things:
- you can set the fetch size to control the volume of the data read in batches from the database to the server, in general 1000 is a good value to choose here
- the session cache should be cleared once in a while, because while scrolling the data (which has been scrolled) is still read and kept in memory
And here is the same example using an OBQuery:
i = 0; final OBQuery<Product> productQuery = OBDal.getInstance().createQuery(Product.class, ""); // 1000 is normally a good fetch size productQuery.setFetchSize(1000); final ScrollableResults productScroller2 = productQuery.scroll(ScrollMode.FORWARD_ONLY); while (productScroller2.next()) { final Product product = (Product) productScroller2.get()[0]; System.err.println(product.getId()); // clear the session every 100 records if ((i % 100) == 0) { OBDal.getInstance().getSession().clear(); } i++; } productScroller2.close();
Remember to close the ScrollableResults objects after using them to prevent memory leaks.
Mass read, with associated/referenced entities
A variant of the mass read, assume now a case where you load an object but you also need the referenced objects for your logic. In a standard approach this will result in many queries:
- one query to load the main objects
- for each main object and each referenced object an additional query
This can lead to an explosion of the number of sql queries being fired. Especially if the main objects don't have common referenced objects (these are cached) and when you clear the session (which is adviced for larger datasets).
However, there is a solution, you can explicitly load the referenced objects in the same query as the main object, using left joining. This reduces the number of queries fired.
Let's start with an example, it uses scroll and session.clear() but still fires many queries:
int i = 0; final OBQuery<BusinessPartner> bpQuery = OBDal.getInstance().createQuery( BusinessPartner.class, ""); bpQuery.setMaxResult(1000); final ScrollableResults scroller = bpQuery.scroll(ScrollMode.FORWARD_ONLY); while (scroller.next()) { final BusinessPartner bp = (BusinessPartner) scroller.get()[0]; // this will load the category object with a separate query System.err.println(bp.getBusinessPartnerCategory().getIdentifier()); // clear the session every 100 records if ((i % 100) == 0) { OBDal.getInstance().getSession().clear(); } i++; }
Now this can be done differently, using a left join, for this we have to use the native Hibernate query object, the organization filtering is done explictly. In the following code snippet Hibernate will generate one large sql query which will also load the referenced object.
int i = 0; // for joining referenced objects we can't use OBQuery, but have to // use a direct Hibernate query object final String queryStr = "from BusinessPartner as bp left join bp.businessPartnerCategory where bp.organization.id " + OBDal.getInstance().getReadableOrganizationsInClause(); final Query qry = OBDal.getInstance().getSession().createQuery(queryStr); qry.setMaxResults(1000); final ScrollableResults scroller = qry.scroll(ScrollMode.FORWARD_ONLY); while (scroller.next()) { final BusinessPartner bp = (BusinessPartner) scroller.get()[0]; // the category is already loaded, so this won't fire a query System.err.println(bp.getBusinessPartnerCategory().getIdentifier()); // clear the session every 100 records if ((i % 100) == 0) { OBDal.getInstance().getSession().clear(); } i++; }
Mass insert/update, using proxy objects
The hibernate manual has an interesting section on batch updates and inserts. The description in this wiki page follows the same approach with one additional concept: using proxy objects.
When doing mass inserts/updates it is important to do the following things:
- flush and clear the session regurarly
- if setting associations in the new/to-update object use a proxy object
Let's start with an example which does none of this:
for (String name : names) { BusinessPartner bp = OBProvider.getInstance().get(BusinessPartner.class); bp.setName(name); bp.setSearchKey(name); final Category category = (Category) OBDal.getInstance().get(Category.ENTITY_NAME, TEST_BP_CATEGORY_ID); bp.setBusinessPartnerCategory(category); OBDal.getInstance().save(bp); } OBDal.getInstance().commitAndClose();
With to many inserts this loop will fail with an Out Of Memory exception. So you need to flush and clear the session every soo many iterations (for example every 100 iterations is a good number). So now the same example with flush and clear:
int i = 0; for (String name : names) { BusinessPartner bp = OBProvider.getInstance().get(BusinessPartner.class); bp.setName(name); bp.setSearchKey(name); final Category category = (Category) OBDal.getInstance().get(Category.ENTITY_NAME, TEST_BP_CATEGORY_ID); bp.setBusinessPartnerCategory(category); OBDal.getInstance().save(bp); if ((i % 100) == 0) { OBDal.getInstance().flush(); OBDal.getInstance().getSession().clear(); } i++; } OBDal.getInstance().commitAndClose();
There is one last optimization possible, the OBDal.getInstance().get(...) call will fire a query to load the category. Normally the read object is cached in the Hibernate cache. But as we clear the session also this object is removed from the cache, this means that every 100 iterations a new query is fired for the Category object.
This behavior can be prevented by calling the getProxy method from OBDal. This method won't actually query the database but create a proxy object (if the object wasn't already loaded in the session cache ofcourse).
Note: as no query is done to the database, the getProxy will always return an object, even if the object does not exist in the database. But this error (illegal foreign key reference) is noticed when flushing to the database.
The OBDal.getProxy method has the same parameters as the get method:
int i = 0; for (String name : names) { BusinessPartner bp = OBProvider.getInstance().get(BusinessPartner.class); bp.setName(name); bp.setSearchKey(name); final Category category = (Category) OBDal.getInstance().getProxy(Category.ENTITY_NAME, TEST_BP_CATEGORY_ID); bp.setBusinessPartnerCategory(category); OBDal.getInstance().save(bp); if ((i % 100) == 0) { OBDal.getInstance().flush(); OBDal.getInstance().getSession().clear(); } i++; } OBDal.getInstance().commitAndClose();
Testing your queries using junit
A lot has been written around unit testing, it makes greate sense and is critical for application robustness. For performance testing and query tuning it really makes sense to place your queries in junit test and run them from there. It will make the analyze-debug-test cycle soo much quicker and faster. See the How_to_create_testcases howto for more information on how to create testcases.
When encountering performance issues...
Performance problems can appear in different ways:
- high memory usage
- high CPU load and slow responsiveness
Note that high CPU load and slow responsiveness can be caused by high memory usage as the java garbage collector will take a lot CPU power if memory limits are hit.
- High memory usage: this can be caused by queries which use the list method instead of scrolling together with session clear
- high CPU load and slow responsiveness: this can be caused by a high user load in combination with queries which are too slow, consider changing object queries to querying for direct values.
The Finding_Perfomance_Issues wiki page explains how to analyze performance problems. It describes what to do in different scenarios and proposes analysis tools to use.
If you want to see the sql which gets generated by Hibernate see this tip.
Slow flush
One of the typical performance issues with DAL occurs when flushing. Flush checks all entities loaded in first level cache looking for modified objects requiring to be persisted in database. If the number of these objects in memory is too big, flush can become slow.
It is possible to obtain debug log of the time each flush takes and which entities where loaded in memory at that time by setting debug level to org.openbravo.dal.service.OBDal
class.
Do's and Don'ts
Don't call qry.list() more than once
With each call to qry.list() Hibernate will fire a query, it is better to place the result of the list() method call in a local variable and use that one:
// DON'T if (obcBsl.list().size() > 0) { List o = obcBsl.list(); } // DO List o = obcBsl.list(); if (o.size() > 0) { ... }
Do only flush when really needed
When changing and saving objects through Hibernate, the object is not directly persisted in the database. Hibernate will queue sql statements and send them in batches to the database. This means that after calling OBDal.getInstance().save(...) the object is not necessarily present in the database. To ensure that all queued actions are send to the database call flush().
But flush should be called with some care as there are performance penaltie. When flushing Hibernate will check all the objects loaded in memory to find if any of them have changed. With 10000-ends of objects this takes 2 seconds or. So only call flush if it is really needed.
Do read the manual
- The following Hibernate manual sections:
- Several blog posts: