Thursday, April 3, 2014

Separating duplicate and non-duplicate rows into separate tables

Step 1: Drag the source into the mapping and connect it to an aggregator transformation.
[Image: scenario 3, source to aggregator]
Step 2: In the aggregator transformation, group by the key column and add a new port named count_rec that counts the key column.
Step 3: Connect a router to the aggregator from the previous step. In the router, create two groups, one named "original" and the other named "duplicate".
For "original", set the filter condition to count_rec=1; for "duplicate", set it to count_rec>1.
[Image: scenario 3, aggregator to router]
The picture below shows the group names and the filter conditions.
[Image: scenario 3, router grouping]
Step 4: Connect the two groups to their corresponding target tables.
[Image: scenario 3, router to target]
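
For readers who think in code, the logic of the four steps above can be sketched in Python. This is only an illustration, not Informatica syntax. The function name split_by_key_count, the sample rows, and the id column are hypothetical, and the sketch assumes the aggregator passes through the last row of each group for ports that are not aggregated. Because the aggregator emits one row per key, the "duplicate" output here holds one row per repeated key, not every repeated row.

```python
from collections import Counter


def split_by_key_count(rows, key):
    """Mimic aggregator + router: return (original, duplicate) row lists."""
    # Aggregator: group by the key column and count rows per key (count_rec).
    counts = Counter(row[key] for row in rows)
    # Assumption: for non-aggregated ports, keep the last row seen per key.
    last_row = {row[key]: row for row in rows}

    original, duplicate = [], []
    for key_value, row in last_row.items():
        out = dict(row, count_rec=counts[key_value])
        # Router: count_rec = 1 goes to "original", count_rec > 1 to "duplicate".
        if out["count_rec"] == 1:
            original.append(out)
        else:
            duplicate.append(out)
    return original, duplicate


# Hypothetical sample data: id 1 appears twice, id 2 once.
sample_rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
]
original_rows, duplicate_rows = split_by_key_count(sample_rows, "id")
```

Each returned list corresponds to one target table in Step 4.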

