Monday, 29 August 2016

Slowly changing dimention - Type III

Slowly changing dimention - Type III


The historical data is tracked/maintained by creating a new column and maintaining the old value in the new column.

Only the last 2 updates can be maintained,








1 . Import the source table in Source Analyzer and target table in the Target Designer.
2. In the Mapping Designer,

  1. Drag and drop the source table and the Source Qualifier is automatically created.
  2. Create a lookup Transformation based on the Target table and put a condition based on the ProductID
  3. Now create a Router and group the condition for Insert/Update.
  • Group A INSERT : IIF(ISNULL(PDT_ID_TRG),true,false)
  • Group B UPDATE: IIF(NOT ISNULL(PDT_ID_TRG) AND PRODUCT_NAME != PDT_NAME_TRG,true,false)


GROUP A : It is used to insert the new records to the target table with the below values,

SURR_KEY  = SeqenceGenerator.NextVal();

Group B, It is used to update the values 

PRODUCT_NAME = SOURCE.PRODUCT_NAME
OLD_PRODUCT_NAME = TARGET.PRODUCT_NAME
UPDATED_DT = SYSDATE

4. Create Update Strategy Transformation for the respective groups from Router and finally map it to the target tables.


Mapping Designer















Output Data




Slowly Changing Dimension - Type II

Slowly Changing Dimension - SCD Type II


The historical data is tracked/maintained by creating multiple rows for the Natural Key in the Dimension Table by

1. Creating separate Surrogate Key
2. Creating different version number

Here instead of creating different version number we will create ,

EFF_START_DT
EFF_END_DT
CHANGED_FLAG














1 . Import the source table in Source Analyzer and target table in the Target Designer.
2. In the Mapping Designer,


  1. Drag and drop the source table and the Source Qualifier is automatically created.
  2. Create a lookup Transformation based on the Target table and put a condition based on the ProductID
  3. Now create a Router and group the condition for Insert/Update.
  • Group A INSERT : IIF(ISNULL(PDT_ID_TRG),true,false)
  • Group B UPDATE: IIF(NOT ISNULL(PDT_ID_TRG) AND PRODUCT_NAME != PDT_NAME_TRG AND MOD_FLAG = 'Y',true,false)


GROUP A : It is used to insert the new records to the target table with the below values,

SURR_KEY  = SeqenceGenerator.NextVal();
EFF_START_DT = SYSDATE
EFF_END_DT = TO_DATE('30-12-2099','DD-MM-YYYY')
CHANGED_FLAG = 'Y'

Group B, There are 2 different pipelines, 

1. Add the modified row as a new row and set the values are below

SURR_KEY  = SeqenceGenerator.NextVal();
EFF_START_DT = SYSDATE
EFF_END_DT = TO_DATE('30-12-2099','DD-MM-YYYY')
CHANGED_FLAG = 'Y'

2. Update the modified row as below

EFF_END_DT = SYSDATE
CHANGED_FLAG = 'N'

4. Create Update Strategy Transformation for the respective groups from Router and finally map it to the target tables.


Mapping Designer Screen Shot











Output Data

Saturday, 20 August 2016

Slowly Changing Dimension - Type I

Using Lookup Transformation - Dynamic Cache


With a dynamic cache, the Integration Service inserts or updates rows in the cache. When you cache the target table as the lookup source, you can look up values in the cache to determine if the values exist in the target. The Lookup transformation marks rows to insert or update the target.


Final Output -  The Second Table gets updated with the recent record



Table 1 - Application1 - No Primary Key

Table2  - Application2 - App_ID is the Primary Key
LookUp Tranformation - Condition Tab


Lookup Transformation - Properties Tab

Lookup Transformation - Ports Tab

Router - Groups Tab

Router - Ports Tab




Mapping Designing

Update Strategy - Property t/ab

Session Edit

Session Edit

Friday, 19 August 2016

Slowly Changing Dimension

Slowly Changing Dimension

Dimentions that change over time rather than changing on a regular schedule.

Types of Slowly Changing Dimensions

  • SCD Type 1
  • SCD Type 2
  • SCD Type 3


SCD Type 1

Keeps only the current data and does not maintain the historical data.

i.e., Overwriting the old/existing value.

Example :

Suppose the employee changes his location. i,e from India to USA and then he gets shifted to Germany.

EMPLOYEE STAGING

EMPLOYEE DIMENSION

Employee Dimension














ADV:  Do not consume a lot of space to maintain data
DISADV:  It does not maintain historical data.

SCD Type 2 or Version Number Mapping

Keeps the current data and historical data as well.

It creates a new column PM_VERSION_NUMBER in the table to track the changes and PM_PRIMARY_KEY in the table to maintain the history.

EMPLOYEE DIMENSION- VERSION MAPPING










SCD Type 2 or Flag Mapping

Keeps the current data and historical data as well.

It creates a new column PM_CURRENT_FLAG in the table to track the changes and PM_PRIMARY_KEY in the table to maintain the history.

EMPLOYEE DIMENSION - FLAG MAPPING














SCD Type 2 or Effective date range Mapping

Keeps the current data and historical data as well.

It creates PM_BEGIN_DATE and PM_END_DATE to maintain the date range in the table to track the changes and
PM_PRIMARY_KEY in the table to maintain the history.


EMPLOYEE DIMENSION - EFFECTIVE DATE RANGE MAPPING

SCD Type 3 - Dimension Mapping

Keeps the current and partial historical data by adding new column PM_PREV_COL_NAME column to the table.