Thursday, December 26, 2013

SCD Type 4, a Solution for Rapidly Changing Dimension

SCD Type 2, is design to generate new records for every change of a dimension attribute, so that complete historical changes can be tracked correctly. When we have dimension attributes which changes very frequently, the dimension grow very rapidly causing considerable performance and maintenance issues. In this article lets see how we can handle this rapidly changing dimension issue using SCD Type 4.

What is SCD Type 4


SCD Type 4 design technique is used when SCD Type 2 dimension grows rapidly due to the frequently changing dimension attributes. In SCD Type 4,  frequently changing attributes will be removed from the main dimension and added in to a Mini Dimension.
To make the explanation easy, lets consider a customer dimension with the following structure.
SCD Type 4, a Solution for Rapidly Changing Dimension
Customer attributes such as Name, Date Of Birth, Customer State changes very rarely or do not even change, where as the Age Band, Income Band and Purchase Band is expected to change much frequently.

If this Customer dimension is used by an organization with 100 million customer, can expect this dimension to grow to 200 or 300 million records with in an year, assuming that there will be at least two or three changes for a customer per year.

Adding Mini Dimension

We can split the dimension into two dimensions, one with the attributes which are less frequently changing and attributes which are frequently changing as in below model. The frequently changing attributes will be grouped into aMini Dimension. 


The Mini Dimension will contain one row for each possible combination of attributes. In our case all possible combinations of AGE_BAND, INCOME_BAND and PURCHASE_BAND will be available in CUST_DEMO_DIM with the surrogate key CUST_DEMO_KEY.

If we have 20 different Age Bands and four different Income Bands and three Purchase Bands, we will have 20 X 4 X 3 = 2400 distinct possible combinations. These values can be populated into the Mini Dimension table once and for ever with surrogate key ranging from 1 to 2400.

Note : Mini Dimension do not store the historical attributes, but the fact table preserved the history of dimension attribute assignment.

Below is the model for the Customer dimension with a Mini Dimension for the Sales data mart.

Mini Dimension Challenges

When Mini Dimension starts changing rapidly, multiple Mini Dimensions can be introduced to handle such scenarios. If no fact records are to associate main dimension and mini dimension, a fact less fact table can be used associate main dimension and mini dimension.

No comments:

Post a Comment