
duplicate records

Former Member
0 Kudos

Hi

My scenario is File-to-File, and my requirement is to identify the duplicate records in the input (sender) file.

Thanks,

Sagar

Accepted Solutions (0)

Answers (3)


Former Member
0 Kudos

A simple way to handle duplicate records is to write a Java UDF using an ArrayList. The code is as follows; you can modify it as per your requirement:

ArrayList values = new ArrayList();

for (int i = 0; i < value.length; i++)
{
    // if the list is empty, add the incoming value
    if (values.isEmpty())
    {
        values.add(value[i]);
        result.addValue(value[i]);
    }
    // if the list already contains the incoming value, add a suppress
    // to your result -- this will eliminate the duplicate keys
    else if (values.contains(value[i]))
    {
        result.addSuppress();
    }
    // else keep on adding new values
    else
    {
        values.add(value[i]);
        result.addValue(value[i]);
    }
}
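For reference, the same first-occurrence-wins logic can be tried outside PI as a plain Java sketch. The PI-specific result.addSuppress() call is replaced here by simply skipping the duplicate, and the class and method names (DedupeDemo, dedupe) are illustrative, not part of any PI API:

```java
import java.util.ArrayList;
import java.util.List;

public class DedupeDemo {
    // Keep the first occurrence of each key, drop repeats
    // (same effect as the UDF's addValue / addSuppress pattern).
    static List<String> dedupe(String[] value) {
        List<String> seen = new ArrayList<>();
        List<String> result = new ArrayList<>();
        for (int i = 0; i < value.length; i++) {
            if (!seen.contains(value[i])) {  // first time we see this key
                seen.add(value[i]);
                result.add(value[i]);
            }
            // duplicates are skipped (the UDF calls result.addSuppress() here)
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(dedupe(new String[]{"A", "B", "A", "C", "B"}));
        // prints [A, B, C]
    }
}
```

Note that contains() on an ArrayList is a linear scan, so for large files a HashSet would be the faster choice.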

Former Member
0 Kudos

Hi Sagar,

To understand the actual requirement, I think your question is about identifying duplicate records in a data file and not duplicate files.

Considering duplicate records, the following approaches can be useful:

1. If there is a key field on the file message structure that identifies a record uniquely: use the standard graphical mapping functions SORT and SPLITBYVALUE(valuechange). This will give you the unique entries in the queue, and then you can use SEARCHBYKEY for the rest of the nodes.

2. If the unique identifier for a record is a combination of multiple field values, then concat the key fields and use the same process.

Using SortByKey would make your mapping a bit heavy on the processing side, though. To improve processing, you may choose to use an XSLT mapping to sort all the records of the initial file input based on the key field(s), and then use SPLITBYVALUE(valuechange) in the next-step graphical mapping.
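The SORT plus SPLITBYVALUE(valuechange) idea can be sketched in plain Java: after sorting, equal keys sit next to each other, so emitting a key only when it differs from its predecessor yields the unique entries. The class and method names (SortSplitDemo, uniqueSorted) are illustrative only:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SortSplitDemo {
    // Mimics SORT followed by SPLITBYVALUE(valuechange):
    // after sorting, a key differs from its predecessor exactly
    // when a new group starts, so emitting only on change
    // produces the unique keys.
    static List<String> uniqueSorted(String[] keys) {
        String[] sorted = keys.clone();
        Arrays.sort(sorted);
        List<String> unique = new ArrayList<>();
        String prev = null;
        for (String k : sorted) {
            if (!k.equals(prev)) {  // value changed -> new group
                unique.add(k);
                prev = k;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        System.out.println(uniqueSorted(new String[]{"B", "A", "C", "A", "B"}));
        // prints [A, B, C]
    }
}
```

For the multi-field key case from point 2, you would concatenate the key fields into one string per record before applying the same sort-and-compare step.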

Hope it makes some sense.

Regards,

Suddhasatta

Former Member
0 Kudos

Hi Suddhasatta

You're absolutely correct. I want to remove the duplicate records at the receiver side, so do I need to develop a UDF, or can it be achieved with the standard functions? Please let me know.

Thanks,

Satish.

Former Member
0 Kudos

Hi Sagar,

Check the following link:

/people/jeyakumar.muthu2/blog/2005/12/19/data-mining-using-apriori-algorithm-in-xi-150-part-ii

Regards,

Kinshuk

RKothari
Contributor
0 Kudos

Hello,

Please check the below wiki:

http://wiki.sdn.sap.com/wiki/display/XI/DifferentwaystokeepyourInterfacefromprocessingduplicate+files

You can try the concept as per your requirement.

-Rahul