Group records based on combination of fields

praveen_vanga3
Participant
0 Kudos

Hi,

I would like to group records based on three fields from source structure.

My Source structure:

MT_Value

     ----Value (0..unbounded)

          ---- ID

          ---- Dest

          ---- Source

The target has the same structure, but I would like to remove duplicate records based on a combination of fields.

The values of ID, Dest and Source repeat in some of the records. If the same combination of all three appears in more than one Value node, that node should be considered a duplicate. How can we achieve this? I have already done this for the case where a single field identifies duplicate records:

ID ----> RemoveContext ----> SplitByValue(ValueChange) ----> CollapseContext ----> (Target Node)

In my case I want to remove duplicate records based on ID, Dest and Source.

Thanks

Praveen

Accepted Solutions (1)

Former Member
0 Kudos

Refer to the blog.

praveen_vanga3
Participant
0 Kudos

Hi Ram,

I have already checked that. It uses multiple fields in an if condition, and I have 84 fields in my structure.

Thanks

Praveen

former_member182412
Active Contributor
0 Kudos

Hi Praveen,

Use the mapping below.

UDF:


public void removeDuplicates(String[] id, String[] dest, String[] source, ResultList result, Container container)
    throws StreamTransformationException {
  // Keys (ID + Dest + Source) that have already been seen in the input queue.
  List<String> list = new ArrayList<String>();
  StringBuilder sb = new StringBuilder();
  String key = "";
  for (int i = 0; i < id.length; i++) {
    // Skip context-change markers in the queue.
    if (!id[i].equals(ResultList.CC)) {
      sb.setLength(0);
      key = sb.append(id[i]).append(dest[i]).append(source[i]).toString();
      if (list.contains(key)) {
        // Repeated combination: suppress the target node.
        result.addSuppress();
      } else {
        // First occurrence: remember the key and create the target node.
        list.add(key);
        result.addValue("");
      }
    }
  }
}

Testing:

Regards,

Praveen.

former_member190293
Active Contributor
0 Kudos

Hi Praveen! It's a nice shot!

A couple of thoughts in addition:

1. Adding a specific separator between the fields of the key helps to avoid collisions such as:

ID=20 Dest=BB Source=YY against ID=20B Dest=B Source=YY (both concatenate to 20BBYY).

2. Moving the key-building logic out of the UDF and into the graphical mapping would let this function be reused with any key and any number of fields: build the key from the source values with the Concat function and pass the result to the UDF.
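The collision in point 1 can be reproduced outside the mapping runtime. The following is a minimal sketch in plain Java, not SAP PI code: the class name `CompositeKeyDemo`, the method names, and the `|` separator are all assumptions made for illustration, and the separator only helps if it cannot occur inside the field values.

```java
// Hypothetical demo class; it is not part of the SAP PI mapping API.
final class CompositeKeyDemo {

    // Assumed separator; pick a character that cannot occur in the field values.
    static final String SEP = "|";

    // Plain concatenation, as in the UDF above: field boundaries are lost.
    static String plainKey(String id, String dest, String source) {
        return id + dest + source;
    }

    // Concatenation with a separator keeps the field boundaries visible in the key.
    static String separatedKey(String id, String dest, String source) {
        return id + SEP + dest + SEP + source;
    }

    public static void main(String[] args) {
        // Both records collapse to "20BBYY" without a separator.
        System.out.println(plainKey("20", "BB", "YY").equals(plainKey("20B", "B", "YY")));          // prints true
        // With a separator the keys are "20|BB|YY" and "20B|B|YY".
        System.out.println(separatedKey("20", "BB", "YY").equals(separatedKey("20B", "B", "YY"))); // prints false
    }
}
```

In the UDF, the equivalent change would be an extra `append` of the separator between the fields when the key is built.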

Regards, Evgeniy.

former_member182412
Active Contributor
0 Kudos

Hi Evgeniy,

Thanks for the comment.

  1. I considered inserting a '/' symbol between the fields when building the key, but I don't know what the data looks like. I totally agree with you, though: to be safe we need to add some character between the fields.
  2. Yes, it would be nice to keep the key-building logic outside the UDF, but from a performance point of view it is better to write it inside. If you build the key outside the UDF, you end up looping three times (two Concat functions to build the key, plus the actual UDF), whereas inside the UDF it is a single loop. With thousands of records, keeping the logic inside will give better performance.
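Wherever the key is built, the duplicate check itself can stay a single pass. The sketch below is plain Java rather than SAP PI code, and it swaps the `ArrayList` lookup from the UDF for a `HashSet`, which is a substitution and not the code posted in this thread. The class name `KeyDedupSketch` and the method `keepFlags` are hypothetical, and a list of booleans stands in for the `addValue` / `addSuppress` calls on the result queue.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; it does not use the SAP PI mapping API.
final class KeyDedupSketch {

    // Returns one flag per key: true for a first occurrence (keep the node),
    // false for a repeat (suppress the node).
    static List<Boolean> keepFlags(String[] keys) {
        Set<String> seen = new HashSet<>();
        List<Boolean> flags = new ArrayList<>(keys.length);
        for (String key : keys) {
            // Set.add returns false when the key is already present.
            flags.add(seen.add(key));
        }
        return flags;
    }

    public static void main(String[] args) {
        String[] keys = {"20|BB|YY", "20B|B|YY", "20|BB|YY"};
        System.out.println(keepFlags(keys)); // prints [true, true, false]
    }
}
```

A hash-based set avoids rescanning all earlier keys for every record, which may matter more than the number of mapping functions once the input reaches thousands of records.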

Regards,

Praveen.

praveen_vanga3
Participant
0 Kudos

Hi Praveen,

It's really helpful, simply awesome. I tried both a UDF and graphical mapping, using Concat for the root node. I was able to generate unique values for the root node, but the values under the root node were mismatched. Anyhow, your great code resolved my issue. I really appreciate your effort.

Thank you very much...

Thanks

Praveen

Answers (0)