DELTALAKE AZURE CLOUD DATABRICS -PYSPARK SNOWFLAKE: Separating duplicate and non-duplicate rows to separate tables

Thursday, April 3, 2014

Separating duplicate and non-duplicate rows to separate tables

Step 1: Drag the source into the mapping and connect it to an aggregator transformation.

Step 2: In the aggregator transformation, group by the key column and add a new port, called count_rec, that counts the key column.

Step 3: Connect a router to the aggregator from the previous step. In the router, make two groups, one named "original" and the other named "duplicate".
In original, write count_rec=1, and in duplicate, write count_rec>1.

The picture below shows the group names and the filter conditions.

Step 4: Connect the two groups to their corresponding target tables.
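
The same logic can be sketched outside the mapping designer. The snippet below is only an illustration in plain Python, not part of the mapping itself: the function name split_by_key_count, the sample rows and the column names id and name are assumptions made up for the example. It counts rows per key (the count_rec port), keeps one row per key the way a grouped aggregator emits one row per group (here the last row seen, which is assumed to match the aggregator's behaviour for non-aggregated ports), and then routes on count_rec=1 and count_rec>1.

```python
from collections import Counter


def split_by_key_count(rows, key):
    """Split rows into (original, duplicate) lists based on key frequency.

    Hypothetical helper that mimics the aggregator + router steps.
    """
    # Aggregator: group by the key column and count rows per key (count_rec).
    count_rec = Counter(row[key] for row in rows)

    # Aggregator output: one row per key. Later rows overwrite earlier ones,
    # so the last row of each group is kept (assumed aggregator behaviour).
    one_row_per_key = {row[key]: row for row in rows}

    original, duplicate = [], []
    for key_value, row in one_row_per_key.items():
        out = {**row, "count_rec": count_rec[key_value]}
        # Router: group "original" is count_rec = 1, "duplicate" is count_rec > 1.
        if out["count_rec"] == 1:
            original.append(out)
        else:
            duplicate.append(out)
    return original, duplicate


# Made-up sample source rows; id is the key column.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 2, "name": "c"},
    {"id": 3, "name": "d"},
]

original, duplicate = split_by_key_count(rows, "id")
print("original :", original)
print("duplicate:", duplicate)
```

Note that, as in the mapping, the duplicate target receives one row per duplicated key rather than every repeated source row, because the aggregator collapses each group to a single row.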
