Modules:Analytics FactTables
Languages: |
English | Français | Translate this article... |
Contents |
Fact Tables versus Transaction Tables for Cubes
Openbravo makes it very easy to define cubes on any table defined in the Application Dictionary of Openbravo. This includes all the current transactional tables.
There is however a common understanding in the OLAP/Analytics world that defining cubes on the basis of transactional tables is not the best approach. There are several reasons for using separate fact tables:
- Faster performance with fact tables:
- transactional tables are often normalized, meaning that to retrieve derived information, which is often needed for reporting multiple joins are needed (a so-called snowflake schema), for example: to use the BP Category as a dimension for the OrderLine transactional table the following join is needed: from order line to order header to business partner to business partner category
- transactional tables tend to have indexes specifically for the transactional needs but not for reporting needs, adding indexes (for reporting) can degrade the transactional performance
- Pre-compute information: often for reporting purposes it makes sense to pre-compute specific numbers. The most simple example is doing currency conversion to a common reporting currency. This can't be done efficiently on a transactional table using the Mondrian schema. A separate fact tables allows you to pre-compute information in many more ways.
Therefore often it makes sense to define a separate table in the Application Dictionary to cover your analytics requirements. Openbravo provides an infrastructure provides standard mechanisms to update fact tables and let custom code participate in the standard fact-table-update process. This is explained in this wiki document.
Openbravo Fact Table Infrastructure
The Openbravo Fact Table infrastructure consists of 2 main parts:
- the possibility to implement your own fact updater class (extending an existing base class). This class will automatically participate in the Openbravo fact updating process.
- adding new columns/dimensions to existing fact tables and update the fact table.
Defining your own fact table
When defining your fact table you have to take the following steps:
- create the fact table in the database and define it in the application dictionary, this is done in the same way as for any other Openbravo table, see this howto
- implement a OBAnalyticsFactUpdater class which takes care of creating the fact records from the transactional tables.
In this section we will go into this second topic, creating your own OBAnalyticsFactUpdater class. First it should extend the standard OBAnalyticsFactUpdater class. Extending this class has several advantages:
- By extending this class Openbravo will automatically find your class and call it as part of the fact updating process
- The base class provides a standard approach for handling fact records, it takes care of copying values from the transactional record to the fact record, and handles special time dimensions. You only need to override specific methods to get a fully operating fact-update implemented.
We will now go through an example fact updater implementation, nl. the fact updater we use for the Sales Order Cube.
You start by extending the OBAnalyticsFactUpdater class:
public class OBAnalyticsFactOrderUpdater extends OBAnalyticsFactUpdater<OrderLine, AnalyticsFactOrder> {
As you can see you extend a class with generics, it defines both the core transactional table and the fact table as generics parameters. In this example the OrderLine is the core transactional table and the AnalyticsFactOrder is the target fact table. Many methods you need to implement/override make use of the same generics.
As a next step you need to implement several abstract methods inherited from the base class. We go through each one separately. First the getData method, it should return the core transactional table records, in this case OrderLine instances:
protected ScrollableResults getData() { final Date fromDate = getLastUpdateRunDate(); final String qryStr = "select ol from " + OrderLine.ENTITY_NAME + " ol where ol.client.id = :client and ol.active=true and ol." + OrderLine.PROPERTY_SALESORDER + ".active=true and ol." + OrderLine.PROPERTY_SALESORDER + "." + Order.PROPERTY_DOCUMENTSTATUS + "='CO' and (ol.updated >= :updated or ol." + OrderLine.PROPERTY_SALESORDER + ".updated >= :updated)"; final Query qry = OBDal.getInstance().getSession().createQuery(qryStr); qry.setDate("updated", fromDate); qry.setString("client", OBContext.getOBContext().getCurrentClient().getId()); return qry.scroll(ScrollMode.FORWARD_ONLY); }
Note how the getLastUpdateRunDate method is used to query for any changed/new objects since the last time this fact table was updated.
Then also the updater needs to remove existing fact objects which have become stale, the following method does that:
protected void purgeFacts() { // Note: no implicit joins are allowed in the where clause, an inner select is however allowed final String qryStr = "delete " + getFactEntityName() + " olFact where (olFact.updated < (select ol.updated from " + OrderLine.ENTITY_NAME + " as ol where ol.id = olFact." + AnalyticsFactOrder.PROPERTY_SALESORDERLINE + ") or olFact.updated < (select o.updated from " + Order.ENTITY_NAME + " as o where olFact." + AnalyticsFactOrder.PROPERTY_SALESORDERLINE + "." + OrderLine.PROPERTY_SALESORDER + ".id = o.id)) and olFact.client.id=:client"; OBDal.getInstance().getSession().createQuery(qryStr) .setString("client", OBContext.getOBContext().getCurrentClient().getId()).executeUpdate(); }
Note the comparison on the updated column, you can choose the same approach in your factupdater.
The next method is to tell the base class how to copy data from the transactional table to the fact table. Next to direct property names, you can use a property path: salesOrder.businessPartner.businessPartnerCategory also. In the example below the code uses the generated property names from the generated entities, this gives some more compile time safety.
protected Map<String, String> getPropertyMap() { final String orderPath = OrderLine.PROPERTY_SALESORDER; final String bpPath = OrderLine.PROPERTY_SALESORDER + "." + Order.PROPERTY_BUSINESSPARTNER; final String productPath = OrderLine.PROPERTY_PRODUCT; final Map<String, String> result = new HashMap<String, String>(); result.put(OrderLine.PROPERTY_CLIENT, AnalyticsFactOrder.PROPERTY_CLIENT); result.put(OrderLine.PROPERTY_ORGANIZATION, AnalyticsFactOrder.PROPERTY_ORGANIZATION); result.put(bpPath, AnalyticsFactOrder.PROPERTY_BUSINESSPARTNER); ... return result; }
The last method which must be overridden is a simple one, tell the base class the entityname of the fact table:
protected String getFactEntityName() { return AnalyticsFactOrder.ENTITY_NAME; }
There are several other methods you can override in your base class, they all have accompanying javadoc to explain their meaning and purpose.
Extending an existing Fact table
You can also add new dimension columns to an existing fact table. Openbravo itself also does this by adding retail dimensions to the ERP sales cube. To extend an existing fact table you have to take the following steps:
- add the dimension column to the fact table and define it in the Application Dictionary. This can be done in the same way as for any other custom column, see this howto for more information.
- add the new column as a dimension, this is done as for other dimensions (the additional dimension should be in its own module), see here for more information
- Implement a fact updater so that your new dimension fact columns are also filled when the rest of the fact table is updated
In this section we will discuss only the third bullet, the other 2 are covered in other wiki documents (see the links in the bullet points).
You start by creating a class which extends the OBAnalyticsFactObjectUpdater class:
public class OBRetailAnalyticsOrderLineFactObjectUpdater extends OBAnalyticsFactObjectUpdater<InvoiceLine, AnalyticsFactOrder> {
As you can see you override a class with generic parameters. They are the core transaction table (the source for the fact table) and the fact table type. By extending the base class Openbravo will automatically detect your implementation and will call it when updating the fact tables.
Then next you need to implement 2 methods. The first one asks if your class can handle a specific source/fact table. This method is needed because all extending fact updaters are called for all fact tables.
public boolean canHandleFact(OBAnalyticsFactUpdater<?, ?> factUpdater, Object sourceBob, Object factBob) { return sourceBob instanceof InvoiceLine && factBob instanceof AnalyticsFactOrder; }
It should return true if your class is able to process the specific fact object. If you return true then the next method is called by Openbravo. This next method allows you to update the fact table with additional properties:
public void handleFact(OBAnalyticsFactUpdater<InvoiceLine, AnalyticsFactOrder> factUpdater, InvoiceLine invoiceLine, AnalyticsFactOrder orderLineFact) { if (invoiceLine.getSalesOrderLine() == null) { return; } final Order salesOrder = invoiceLine.getSalesOrderLine().getSalesOrder(); if (salesOrder.getObposApplications() != null) { orderLineFact.setOBRETANPOSTerminal(salesOrder.getObposApplications()); orderLineFact .setObretanTerminaltype(salesOrder.getObposApplications().getObposTerminaltype()); } }
Making the Update Process Configurable
It is possible to configure an updater to be executed just for a particular range of dates or schedule it to generate just a particular number of fact records (batches) on every execution. This configuration will depend on what has being set on the Analytics Process Configuration window.
To apply the configuration defined in that window, the query executed in the getData() method, can make use of the following inherited methods:
- getDateRangeQuery: to retrieve the range of data to be calculated. It receives a String parameter with the name of the date property used for sorting the fact records.
- getDateSortDirection: to retrieve the type of sorting to be applied. This is will depend on the type of process being executed (forward or backward).
- setConfigParameters: to set the configuration parameters in the query.
This is an example of the getData() method which adds into the updater process the ability to be configurable by using the methods explained above:
@Override protected ScrollableResults getData() { String strQry = "select o from " + Order.ENTITY_NAME + " o "; // eager initialization for entities that are known to be used later in mapped properties strQry += "join fetch o.businessPartner \n" // + "left join fetch o.deliveryLocation dl\n" // + "left join fetch dl.locationAddress dla\n" // + "left join fetch o.invoiceAddress ia\n" // + "left join fetch ia.locationAddress\n" // + "left join fetch o.partnerAddress pa\n" // + "left join fetch pa.locationAddress"; strQry += " where o.client.id = :client and o.active = true"; strQry += " and o." + Order.PROPERTY_SALESTRANSACTION + " is true"; strQry += " and o." + Order.PROPERTY_DOCUMENTSTATUS + " IN ('CO', 'CL')"; strQry += " and not exists(select fact.id from " + AnalyticsFactSalesOrderHeader.ENTITY_NAME + " fact where fact." + AnalyticsFactSalesOrderHeader.PROPERTY_SALESORDER + ".id=o.id)"; strQry += " and o." + Order.PROPERTY_DOCUMENTTYPE + "." + DocumentType.PROPERTY_REVERSAL + " = false"; strQry += " and exists(select il.id from " + InvoiceLine.ENTITY_NAME + " il where il." + InvoiceLine.PROPERTY_INVOICE + ".active=true and il." + InvoiceLine.PROPERTY_INVOICE + "." + Invoice.PROPERTY_DOCUMENTSTATUS + " IN ('CO', 'CL') and il." + InvoiceLine.PROPERTY_SALESORDERLINE + "." + OrderLine.PROPERTY_SALESORDER + "=o and " + InvoiceLine.PROPERTY_INVOICE + "." + Invoice.PROPERTY_DOCUMENTTYPE + "." + DocumentType.PROPERTY_REVERSAL + " = false)"; strQry += getDateRangeQuery("o." + Order.PROPERTY_ORDERDATE); strQry += " order by o." + Order.PROPERTY_ORDERDATE + " " + getDateSortDirection(); final Query qry = OBDal.getInstance().getSession().createQuery(strQry); qry.setString("client", OBContext.getOBContext().getCurrentClient().getId()); setConfigParameters(qry); qry.setReadOnly(true); qry.setFetchSize(5000); return qry.scroll(ScrollMode.FORWARD_ONLY); }
Date/Time Dimension values in fact tables
Transactional tables practically always have date properties/columns which are relevant as a dimension. For the date dimensions in your fact table you should not use the date reference/type but instead use a foreign key reference to the TimeDimension table. Note for fact tables it is not needed to create a date column, only the reference to the TimeDimension entity is needed.
A foreign key to the TimeDimension table has a performance advantage (no DB functions) and an additional advantage that there are dimension members for all dates, so no 'holes' in your reported data.
See the sales order fact table as an example, it has several 'date' columns which are in fact references to the TimeDimension table.
Openbravo will automatically convert date/datetime values to the correct TimeDimension record when overriding the fact updater. So there is no special custom handling needed.
Automatic Multi-Currency Support
Many/most measures are amounts which are in a specific currency. For reporting purposes (aggregation/summation) the measures all need to be in the same currency.
Openbravo Analytics can automatically solve this, it will automatically detect price and amount columns in the fact table and create separate facts for each of the relevant currencies. The supports currencies are read from the accounting schema.
To enable this support there are different use-cases:
- when extending an existing fact table with numeric/to-be-converted columns make sure that these columns use a Price or Amount reference. Then Openbravo will automatically convert the values in these columns also. Note this assumes that the original fact table already has a currency and conversion date set (see next bullet).
- for a new fact table, Openbravo expects that there is a currency and a conversiondate property. If in your case these properties have different names then you need to override these 2 methods in your fact updater implementation:
- protected String getCurrencyPropertyName()
- protected String getConversionDatePropertyName()