Tuesday, April 15, 2014

SOFT-DELETES & HARD-DELETE:

SOFT-DELETES:- DELETE  THE RECORDS FROM THE TARGET TABLE (MEANING, MARKING A RECORD AS 'DELETED' INSTEAD
OF PHYSICALLY DELETING THEM).

HARD-DELETE:-DELETE THE RECORD PERMANENTLY FROM DATABASE.


ETL Perspective on Deletion:-
When we have ‘Soft DELETE’ implemented at the source side, it becomes very easy to track the invalid
transactions and we can tag those transactions in DW accordingly. We just need to filter the
records from source using that STATUS field and issue an UPDATE in DW for the corresponding records.
Few things to be kept in mind in this case.

If only ACTIVE records are supposed to be used in ETL processing, we need to add specific filters
while fetching source data.

Sometimes INACTIVE records are pulled into the DW and moved till the ETL Data Warehouse level.
While pushing the data into Exploration Data Warehouse, only the ACTIVE records are sent for reporting purpose.

For ‘Hard DELETE’, if Audit Table is maintained at source systems for what are transactions were deleted,
we can source the same, i.e. join the Audit table and the Source table based on NK and
logically delete them in DW too. But it becomes quite cumbersome and costly when no account is kept of what
was deleted at all. In these cases, we need to use different ways to track them and update the
corresponding records in DW.

No comments:

Post a Comment

Data engineering Interview Questions

1)  What all challenges you have faced and how did you overcome from it? Ans:- Challenges Faced and Overcome As a hypothetical Spark develop...