Friday, September 22, 2017

Performance Tuning the Siebel Change Capture Process in DAC

Performance Tuning the Siebel Change Capture Process in DAC

DAC performs the change capture process for Siebel source systems. This process has two components:

1)The change capture process occurs before any task in an ETL process runs.

2)The change capture sync process occurs after all of the tasks in an ETL process have completed successfully.

Supporting Source Tables:-

The source tables that support the change capture process are as follows:

     S_ETL_I_IMG. Used to store the primary key (along with MODIFICATION_NUM and LAST_UPD) of the records that were either created or modified since the time of the last ETL.

     S_ETL_R_IMG. Used to store the primary key (along with MODIFICATION_NUM and LAST_UPD) of the records that were loaded into the data warehouse for the prune time period.

     S_ETL_D_IMG. Used to store the primary key records that are deleted on the source transactional system.

Full and Incremental Change Capture Processes:-

The full change capture process (for a first full load) does the following:

    Inserts records into the S_ETL_R_IMG table, which has been created or modified for the prune time period.

   Creates a view on the base table. For example, a view V_CONTACT would be created for the base table S_CONTACT.

The incremental change capture process (for subsequent incremental loads) does the following:

   Queries for all records that have changed in the transactional tables since the last ETL date, filters them against the records from the R_IMG table, and inserts them into the S_ETL_I_IMG table.

   Queries for all records that have been deleted from the S_ETL_D_IMG table and inserts them into the S_ETL_I_IMG table.

Removes the duplicates in the S_ETL_I_IMG table. This is essential for all the databases where "dirty reads" (queries returning uncommitted data from all transactions) are allowed.

Creates a view that joins the base table with the corresponding S_ETL_I_IMG table.


Performance Tips for Siebel Sources:-

Performance Tip: Reduce Prune Time Period:-

Reducing the prune time period (in the Connectivity Parameters subtab of the Execution Plans tab) can improve performance, because with a lower prune time period, the S_ETL_R_IMG table will contain a fewer

number of rows. The default prune time period is 2 days. You can reduce it to a minimum of 1 day.

Note: If your organization has mobile users, when setting the prune time period, you must consider the lag time that may exist between the timestamp of the transactional system and the mobile users' local timestamp. You

should interview your business users to determine the potential lag time, and then set the prune time period accordingly.


Performance Tip: Eliminate S_ETL_R_IMG From the Change Capture Process

If your Siebel implementation does not have any mobile users (which can cause inaccuracies in the values of the "LAST_UPD" attribute), you can simplify the change capture process by doing the following:

Removing the S_ETL_R_IMG table.

Using the LAST_REFRESH_DATE rather than PRUNED_LAST_REFRESH_DATE.

To override the default DAC behavior, add the following SQL to the customsql.xml file before the last line in the file, which reads as
. The customsql.xml file is located in the dac\CustomSQLs directory.


TRUNCATE TABLE S_ETL_I_IMG_%SUFFIX
;
TRUNCATE TABLE S_ETL_D_IMG_%SUFFIX
;



TRUNCATE TABLE S_ETL_I_IMG_%SUFFIX
;
INSERT %APPEND INTO S_ETL_I_IMG_%SUFFIX
     (ROW_ID, MODIFICATION_NUM, OPERATION, LAST_UPD)
     SELECT
          ROW_ID
          ,MODIFICATION_NUM
          ,'I'
          ,LAST_UPD
     FROM
          %SRC_TABLE
     WHERE
          %SRC_TABLE.LAST_UPD > %LAST_REFRESH_TIME
          %FILTER
INSERT %APPEND INTO S_ETL_I_IMG_%SUFFIX
     (ROW_ID, MODIFICATION_NUM, OPERATION, LAST_UPD)
     SELECT
          ROW_ID
          ,MODIFICATION_NUM
          ,'D'
          ,LAST_UPD
     FROM
          S_ETL_D_IMG_%SUFFIX
     WHERE NOT EXISTS
     (
          SELECT
                 'X'
          FROM
                 S_ETL_I_IMG_%SUFFIX
          WHERE
                 S_ETL_I_IMG_%SUFFIX.ROW_ID = S_ETL_D_IMG_%SUFFIX.ROW_ID
          )
;



DELETE
FROM S_ETL_D_IMG_%SUFFIX
WHERE
     EXISTS
     (SELECT
                 'X'
     FROM
                 S_ETL_I_IMG_%SUFFIX
     WHERE
                 S_ETL_D_IMG_%SUFFIX .ROW_ID = S_ETL_I_IMG_%SUFFIX.ROW_ID
                 AND S_ETL_I_IMG_%SUFFIX.OPERATION = 'D'
     )
;




Performance Tip: Omit the Process to Eliminate Duplicate Records :-

When the Siebel change capture process runs on live transactional systems, it can run into deadlock issues when DAC queries for the records that changed since the last ETL process. To alleviate this problem, you need to

enable "dirty reads" on the machine where the ETL is run. If the transactional system is on a database that requires "dirty reads" for change capture, such as MSSQL, DB2, or DB2-390, it is possible that the record

identifiers columns (ROW_WID) inserted in the S_ETL_I_IMG table may have duplicates. Before starting the ETL process, DAC eliminates such duplicate records so that only the record with the smallest

MODIFICATION_NUM is kept. The SQL used by DAC is as follows:


SELECT
     ROW_ID, LAST_UPD, MODIFICATION_NUM
FROM
     S_ETL_I_IMG_%SUFFIX A
WHERE EXISTS
     (
     SELECT B.ROW_ID, COUNT(*)  FROM S_ETL_I_IMG_%SUFFIX B
               WHERE B.ROW_ID = A.ROW_ID
                 AND B.OPERATION = 'I'
                 AND A.OPERATION = 'I'
     GROUP BY
         B.ROW_ID
     HAVING COUNT(*) > 1
     )
AND A.OPERATION = 'I'
ORDER BY 1,2

However, for situations where deadlocks and "dirty reads" are not an issue, you can omit the process that detects the duplicate records by using the following SQL block. Copy the SQL block into the customsql.xml file

before the last line in the file, which reads as
. The customsql.xml file is located in the dac\CustomSQLs directory.

SELECT
     ROW_ID, LAST_UPD, MODIFICATION_NUM
FROM
     S_ETL_I_IMG_%SUFFIX A
WHERE 1=2

Performance Tip: Omit the Process to Eliminate Duplicate Records
When the Siebel change capture process runs on live transactional systems, it can run into deadlock issues when DAC queries for the records that changed since the last ETL process. To alleviate this problem, you need to

enable "dirty reads" on the machine where the ETL is run. If the transactional system is on a database that requires "dirty reads" for change capture, such as MSSQL, DB2, or DB2-390, it is possible that the record

identifiers columns (ROW_WID) inserted in the S_ETL_I_IMG table may have duplicates. Before starting the ETL process, DAC eliminates such duplicate records so that only the record with the smallest

MODIFICATION_NUM is kept. The SQL used by DAC is as follows:


SELECT
     ROW_ID, LAST_UPD, MODIFICATION_NUM
FROM
     S_ETL_I_IMG_%SUFFIX A
WHERE EXISTS
     (
     SELECT B.ROW_ID, COUNT(*)  FROM S_ETL_I_IMG_%SUFFIX B
               WHERE B.ROW_ID = A.ROW_ID
                 AND B.OPERATION = 'I'
                 AND A.OPERATION = 'I'
     GROUP BY
         B.ROW_ID
     HAVING COUNT(*) > 1
     )
AND A.OPERATION = 'I'
ORDER BY 1,2

However, for situations where deadlocks and "dirty reads" are not an issue, you can omit the process that detects the duplicate records by using the following SQL block. Copy the SQL block into the customsql.xml file

before the last line in the file, which reads as
. The customsql.xml file is located in the dac\CustomSQLs directory.

SELECT
     ROW_ID, LAST_UPD, MODIFICATION_NUM
FROM
     S_ETL_I_IMG_%SUFFIX A
WHERE 1=2


Performance Tip: Manage Change Capture Views

DAC drops and creates the incremental views for every ETL process. This is done because DAC anticipates that the transactional system may add new columns on tables to track new attributes in the data warehouse. If you

do not anticipate such changes in the production environment, you can set the DAC system property "Drop and Create Change Capture Views Always" to "false" so that DAC will not drop and create incremental views. On

DB2 and DB2-390 databases, dropping and creating views can cause deadlock issues on the system catalog tables. Therefore, if your transactional database type is DB2 or DB2-390, you may want to consider setting the

DAC system property "Drop and Create Change Capture Views Always" to "false." For other database types, this action may not enhance performance.

Note: If new columns are added to the transactional system and the ETL process is modified to extract data from those columns, and if views are not dropped and created, you will not see the new column definitions in the

view, and the ETL process will fail.

Performance Tip: Determine Whether Informatica Filters on Additional Attributes

DAC populates the S_ETL_I_IMG tables by querying only for data that changed since the last ETL process. This may cause all of the records that were created or updated since the last refresh time to be extracted.

However, the extract processes in Informatica may be filtering on additional attributes. Therefore, for long-running change capture tasks, you should inspect the Informatica mapping to see if it has additional WHERE

clauses not present in the DAC change capture process. You can modify the DAC change capture process by adding a filter clause for a . combination in the ChangeCaptureFilter.xml file, which 


is located in the dac\CustomSQLs directory.

SQL for Change Capture and Change Capture Sync Processes

The SQL blocks used for the change capture and change capture sync processes are as follows:


TRUNCATE TABLE S_ETL_I_IMG_%SUFFIX
;
TRUNCATE TABLE S_ETL_R_IMG_%SUFFIX
;
TRUNCATE TABLE S_ETL_D_IMG_%SUFFIX
;
INSERT %APPEND INTO S_ETL_R_IMG_%SUFFIX
     (ROW_ID, MODIFICATION_NUM, LAST_UPD)
     SELECT
          ROW_ID
          ,MODIFICATION_NUM
          ,LAST_UPD
     FROM
          %SRC_TABLE
     WHERE
          LAST_UPD > %PRUNED_ETL_START_TIME
          %FILTER
;



TRUNCATE TABLE S_ETL_I_IMG_%SUFFIX
;
INSERT %APPEND INTO S_ETL_I_IMG_%SUFFIX
     (ROW_ID, MODIFICATION_NUM, OPERATION, LAST_UPD)
     SELECT
          ROW_ID
          ,MODIFICATION_NUM
          ,'I'
          ,LAST_UPD
     FROM
          %SRC_TABLE
     WHERE
          %SRC_TABLE.LAST_UPD > %PRUNED_LAST_REFRESH_TIME
          %FILTER
          AND NOT EXISTS
          (
          SELECT
                  ROW_ID
                  ,MODIFICATION_NUM
                  ,'I'
                  ,LAST_UPD
          FROM
                  S_ETL_R_IMG_%SUFFIX
          WHERE
                  S_ETL_R_IMG_%SUFFIX.ROW_ID = %SRC_TABLE.ROW_ID
                  AND S_ETL_R_IMG_%SUFFIX.MODIFICATION_NUM = %
SRC_TABLE.MODIFICATION_NUM
                  AND S_ETL_R_IMG_%SUFFIX.LAST_UPD = %
SRC_TABLE.LAST_UPD
                )
;
INSERT %APPEND INTO S_ETL_I_IMG_%SUFFIX
          (ROW_ID, MODIFICATION_NUM, OPERATION, LAST_UPD)
          SELECT
                  ROW_ID
                  ,MODIFICATION_NUM
                  ,'D'
                  ,LAST_UPD
          FROM
                  S_ETL_D_IMG_%SUFFIX
          WHERE NOT EXISTS
          (
                  SELECT
                           'X'
                  FROM
                           S_ETL_I_IMG_%SUFFIX
                  WHERE
                           S_ETL_I_IMG_%SUFFIX.ROW_ID = S_ETL_D_IMG_%
SUFFIX.ROW_ID
          )
;



DELETE
FROM S_ETL_D_IMG_%SUFFIX
WHERE
          EXISTS
          (
          SELECT
                    'X'
          FROM
                    S_ETL_I_IMG_%SUFFIX
          WHERE
                    S_ETL_D_IMG_%SUFFIX.ROW_ID = S_ETL_I_IMG_%SUFFIX.ROW_ID
                    AND S_ETL_I_IMG_%SUFFIX.OPERATION = 'D'
          )
;
DELETE
FROM S_ETL_I_IMG_%SUFFIX
WHERE LAST_UPD < %PRUNED_ETL_START_TIME
;
DELETE
FROM S_ETL_I_IMG_%SUFFIX
WHERE LAST_UPD > %ETL_START_TIME
;
DELETE
FROM S_ETL_R_IMG_%SUFFIX
WHERE
          EXISTS
          (
           SELECT
                    'X'
          FROM
                    S_ETL_I_IMG_%SUFFIX
          WHERE
                    S_ETL_R_IMG_%SUFFIX.ROW_ID = S_ETL_I_IMG_%SUFFIX.ROW_ID
          )
;
INSERT %APPEND INTO S_ETL_R_IMG_%SUFFIX
          (ROW_ID, MODIFICATION_NUM, LAST_UPD)
          SELECT
                  ROW_ID
                  ,MODIFICATION_NUM
                  ,LAST_UPD
          FROM
                  S_ETL_I_IMG_%SUFFIX
;
DELETE FROM S_ETL_R_IMG_%SUFFIX WHERE LAST_UPD < %PRUNED_ETL_START_TIME
;





3 comments:

  1. Did you know that that you can earn money by locking special pages of your blog / site?
    All you need to do is open an account on Ad Work Media and implement their Content Locking tool.

    ReplyDelete
  2. Strange "water hack" burns 2lbs overnight

    Over 160,000 women and men are losing weight with a easy and SECRET "liquids hack" to lose 1-2lbs every night as they sleep.

    It's effective and works on anybody.

    Here are the easy steps for this hack:

    1) Grab a clear glass and fill it with water half full

    2) And now learn this amazing HACK

    and become 1-2lbs skinnier the next day!

    ReplyDelete
  3. Thank you so much for such a detailed blog post about Informatica and some other useful needed aspects casually.

    Informatica Read Soap API

    ReplyDelete

Data engineering Interview Questions

1)  What all challenges you have faced and how did you overcome from it? Ans:- Challenges Faced and Overcome As a hypothetical Spark develop...