Data Warehouse Design Techniques – Accumulating Snapshot Fact Tables

Data Warehouse Design Techniques – Snapshot Fact Tables
April 5, 2017
Loading Hierarchical Bridge Tables
May 24, 2017

Today we will look at a different type of fact table, the accumulating fact.

Accumulating Fact Tables

Accumulating Fact tables are used to show the activity of progress through a well-defined business process and are most often used to research the time between milestones.  These fact tables are updated as the business process unfolds and each milestone is completed.  The most common examples can be found in order and insurance processing. Here a single fact table may contain the important business processes milestones upon which the organization wishes to measure performance.

A different version of this same accumulating fact table may include the calculated lag time between the milestone dates as shown below.

The lag (number of days) can be calculated from the claim date (total time to complete the milestone) or the lag can be calculated as the number of days since the last milestone was completed (days spent in this stage). The important part is to be consistent in calculating the lag within the snapshot fact. Capturing of this lag data can make analysis on this process easier.

There are other options you could use instead of an accumulating snapshot fact table to capture this information.

  1. You could track this information as a type 2 SCD for claim. In this instance, you would add milestone to the claim dimension and track the information using the active dates of the dimension. To do this you will need to create a complex correlated sub query to do the lag calculation.
  2. You could drill across the different fact tables congaing these milestone facts by using the claim dimension. This avoids the complexity of the correlated subquery but puts pressure on the database and reporting application at the time of report delivery.

Remember that Accumulating snapshot tables are helpful when:

  • You are studying the time between linear and predictable milestones
  • You are looking to avoid the complexity of correlated subqueries or the possible performance issues of drilling across fact tables.

Finally, although you are capturing this information in an accumulating fact table you may want to also capture this information in a dimension as a type 2 SCD.

Leave a Reply

Your email address will not be published. Required fields are marked *