Thursday, December 26, 2013

SCD Type 1 Implementation using Informatica PowerCenter

Unlike SCD Type 2, Slowly Changing Dimension Type 1 do not preserve any history versions of data. This methodology overwrites old data with new data, and therefore stores only the most current information. In this article lets discuss the step by step implementation of SCD Type 1 using Informatica PowerCenter.

The number of records we store in SCD Type 1 do not increase exponentially as this methodology overwrites old data with new data  Hence we may not need the performance improvement techniques used in the SCD Type 2 Tutorial.

Understand the Staging and Dimension Table.

For our demonstration purpose, lets consider the CUSTOMER Dimension. Below are the detailed structure of both staging and dimension table.
Staging Table
In our staging table, we have all the columns required for the dimension table attributes. So no other tables other than Dimension table will be involved in the mapping. Below is the structure of our staging table.
Informatica Source Definition

Key Points

  1. Staging table will have only one days data. Change Data Capture is not in scope.
  2. Data is uniquely identified using CUST_ID.
  3. All attribute required by Dimension Table is available in the staging table

Dimension Table

Here is the structure of our Dimension table.
SCD Type 1 Implementation using Informatica

Key Points

  1. CUST_KEY is the surrogate key.
  2. CUST_ID is the Natural key, hence the unique record identifier.

Mapping Building and Configuration

Step 1
Lets start the mapping building process. For that pull the CUST_STAGE source definition into the mapping designer.
Slowly Changing Dymention Type 1
Step 2
Now using a LookUp Transformation fetch the existing Customer columns from the dimension table T_DIM_CUST. This lookup will give NULL values if the customer is not already existing in the Dimension tables.
  • LookUp Condition : IN_CUST_ID = CUST_ID
  • Return Columns : CUST_KEY
Slowly Changing Dymention Type 1
Step 3
Use an Expression Transformation to identify the records for Insert and Update using below expression. 
    • INS_UPD :- IIF(ISNULL(CUST_KEY),'INS', 'UPD')    
Additionally create two output ports.
    • CREATE_DT :- SYSDATE
    • UPDATE_DT :- SYSDATE
See the structure of the mapping in below image.
Slowly Changing Dymention Type 1
Step 4
Map the columns from the Expression Transformation to a Router Transformation and create two groups (INSERT, UPDATE) in Router Transformation using the below expression. The mapping will look like shown in the image.
  • INSERT :- IIF(INS_UPD='INS',TRUE,FALSE)
  • UPDATE :- IIF(INS_UPD='UPD',TRUE,FALSE)
Slowly Changing Dymention Type 1

INSERT Group

Step 5
Every records coming through the 'INSERT Group' will be inserted into the Dimension table T_DIM_CUST.

Use a Sequence generator transformation to generate surrogate key CUST_KEY as shown in below image. And map the columns from the Router Transformation to the target as shown below image.
Slowly Changing Dymention Type 1

Note : Update Strategy is not required, if the records are set for Insert.

UPDATE Group

Step 6
Records coming from the 'UPDATE Group' will update the customer Dimension with the latest customer attributes. Add an Update Strategy Transformation before the target instance and set it as DD_UPDATE. Below is the structure of the mapping.
Slowly Changing Dymention Type 1
We are done with the mapping building and below is the structure of the completed mapping.
Slowly Changing Dymention Type 1

Workflow and Session Creation

There is not any specific properties required to be given during the session configuration.
informatica worklet
Below is a sample data set taken from the Dimension table T_DIM_CUST.
Initial Inserted Value for CUSTI_ID 1003
Slowly Changing Dymention Type 1
Updated Value for CUSTI_ID 1003
Slowly Changing Dymention Type 1

No comments:

Post a Comment