Data Warehouse Design Techniques – Rapidly Changing Dimensions

Data Warehouse Design Techniques – Slowly Changing Dimensions
January 18, 2017
Data Warehouse Design Techniques – Degenerate & Junk Dimensions
February 1, 2017
In my last blog post, I showed how to capture changes for slowly changing dimensions. Today we will see how to handle rapidly changing dimensions.

Rapidly Changing Dimensions

Rapidly changing dimensions are dimensions where the attribute values of the dimension change frequently causing the dimension grow rapidly if you a have designed the dimension to capture the changes as a Type 2 dimension. The rapid growth of this dimension will impact maintenance and performance as the dimension grows. Let’s take a look at solutions to this challenge.

Mini-Dimension (SCD Type 4)

Mini-dimensions contain the rapidly changing attributes of the original dimension and are treated as a stand-alone dimension. This dimension will band the attributes together instead of treating them as individual pieces of data. The fact table can then connect to the mini-dimension to based upon the data provided or this mini-dimension could be connected to the “parent” dimension using the surrogate key as a Type 1 or Type 2 SCD attribute.

An example of this can be seen in a customer table where the client wishes to store information which may change at least once per year and in some cases many times per week. In the table below we see that the client wishes to capture the following rapidly changing attributes: customer’s age, income, the number of lifetime purchases, rating, account status, and credit score.

 

In order to limit the growth of this table, we can remove these rapidly changing attributes and move them to a table called DIM_CUSTOMER_ATTRIBUTE.

Since there are thousands of permutations of these attributes it is recommended that you band the attributes. For example, we may want to band age in groups of 10 years, income in bands of $10,000, and lifetime purchases and credit score in bands of 25 while providing a single value for the other attributes since they are limited in number. Here is a partial example of the data which may be stored in the mini-dimension table.

Junk Dimension

If we only cared about the customer status (Collect, Current, Payment – On-Time, Payment – Late) from the above example a Junk Dimension might be a better solution. A junk dimension is a dimension containing one or more unrelated attributes and is used to avoid having a large number of foreign keys in the fact table.

Conclusion

As you can see there are multiple solutions to handle the challenge of a rapidly changing dimension. The most important thing to remember is to take your time, profile your data. This will help ensure that your model is flexible and scalable for future growth and to meet the changing needs of the business.

Comments are closed.