cancel
Showing results for 
Search instead for 
Did you mean: 

How to eliminate duplicate records

Former Member
0 Kudos

Hi frzz,

After performing the join am getting multiple records for the same ID. But i need the first record of each ID. Can some one help me how to achieve this.

Best Regards,

Krishna.

Accepted Solutions (1)

Accepted Solutions (1)

Former Member

HI All,

Thanks a lot for your valuable inputs.

Actual scenario is when i join two tables am getting records as below.


OpportunityID      Description    Status

1234                     XYZ                 Inprocess

1234                     XYZ                 Booked

1234                     XYZ                 Won

But i want very first record ( 1234     XYZ      Inprocess ). Like above example my output contains many opportunityIDs with multiple entries. In all the cases i have to take very first record of perticulat OpportunityID.

Can you please suggest how can i achieve this in Analytic View?

Best Regards,

Krishna.

former_member187673
Active Participant
0 Kudos

Well I guess you need to determine what the first record is based on - i.e. how you are sorting....is it based on a date-time field for instance? If you are using SQL you could partition by opp ID sort by date time and then filter on the row number e.g.

select * from (

     SELECT *, ROW_NUMBER() over (partition by opportunityID order by opp_date desc, opp_time desc) as row_num

)

where row_num = 1;

See here for details - SAP HANA Reference: Window Functions

You really should include the code that you are using to produce the above for us to help further.

Peter

former_member182302
Active Contributor
0 Kudos

As peter mentions, you can use either row_number or rank for filtering.

To achieve the same using Graphical Calculation View, you may want to have a look on this blog:

Regards,

Krishna Tangudu

Former Member
0 Kudos

Hi Peter & Krishna,

Thanks a lot for your inputs. Its really helped me.

Best Regards,

Krishna.


Former Member
0 Kudos

Just FYI, I'd go with 's approach as it's 1) easier to maintain graphically and 2) window functions execute in the row store rather than column store, which will likely degrade performance.

- cool approach!

former_member182302
Active Contributor
0 Kudos

Thanks

Regards,

Krishna Tangudu

Answers (3)

Answers (3)

Former Member
0 Kudos

Hi frzz,

Thanks a lot for your response.  I will check and let you know.

Best Regards,

Krishna.

shyam_uthaman
Participant
0 Kudos

Using a Distinct should work out well but you should reconsider looking at your join condition too.. If you could post the snippet here, we could take a better look at what you are doing.

former_member187673
Active Participant
0 Kudos

Hi Krishna, you could do a distinct or a group by on all the columns to get unique rows:

e.g. select a, b, c from test_table group by a, b, c

if using CE functions or Graphical Calc then you could use an aggregation

You may also want to look at your join to see why you are getting the dups if they are not expected.

Peter

former_member182302
Active Contributor
0 Kudos

Adding to Peter's comments,

I was wondering if you are join condition is correct and still you are getting multiple records ( incase you are doing a cross join? )

Apart from that as peter suggested you can try both the options to eliminate the duplicate records.

Suggested approach is through "GROUP BY" but you can check with "DISTINCT" ( wonder since HANA is columnar will distinct work better than GROUP BY? )

You can check the performance through Viz plans and let us know

Regards,

Krishna Tangudu

former_member187673
Active Participant
0 Kudos

Hi Krishna, an interesting thing I found from previous testing is that it seems GROUP BY like I've done above has the exact same explain plan as doing a DISTINCT. So either should be fine. Try it out.