Unicode collation

Former Member · ‎09-17-2010

I read that Unicode preserves the collating sequence of the 128 ASCII characters. Yet when I listed function groups in our Unicode DEV system and our non-Unicode PRD system for comparison, I noticed that the underscore "_" sorts lower than alphabetic characters in Unicode and higher in non-Unicode. In ASCII the underscore is coded as 95 (97 is lowercase "a"). I haven't yet found a Unicode table showing the numeric assignment of Unicode characters, though it would be quite large. This may have unexpected ramifications that I will be watching for. Comments encouraged!
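For anyone who wants to check the numeric codes without hunting for a table, here is a minimal Python sketch (not ABAP). It prints the code points of the characters in question and sorts a few names by raw code point. The function group names are made up for illustration.

```python
# Code points of the characters in question; the first 128 Unicode
# code points are identical to ASCII.
code_points = {ch: ord(ch) for ch in ("A", "Z", "_", "a", "z")}
for ch, cp in code_points.items():
    print(f"{ch!r}: {cp}")

# Hypothetical function group names, sorted by raw code point.
# Python compares strings code point by code point, so this is a
# purely "binary" ordering with no language-aware collation.
names = ["Zabc", "Z_TEST", "ZABC"]
binary_sorted = sorted(names)
print(binary_sorted)
```

In this raw ordering the underscore falls between the uppercase and lowercase letters, so whether it appears before or after "alphabetics" depends on the case of the names being compared.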

nils_buerckel · ‎09-20-2010

Hi Jeffrey,

Unicode systems use the ICU (International Components for Unicode) library for sorting.

Please have a look at

http://userguide.icu-project.org/collation

for further info.

If you want to check the sorting independent of ABAP, please test it with

http://minaret.info/test/sort.msp

or

http://demo.icu-project.org/icu-bin/locexp?_=root&d_=de&x=col

There you will see that the underscore is sorted in a higher sequence than "normal" characters (which is compatible with non-Unicode).

In ABAP, you can also copy the program RSCP0102 into the customer namespace and adapt the words provided by this report.

Regarding underscore, it will give you the same result as in the links mentioned.

So the sort order you saw in your DEV system may have been produced in binary mode, in which case the sorting is different.
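To illustrate how a binary sort and a collation-based sort can disagree on the same data, here is a rough Python sketch. It does not call ICU: `toy_collation_key` is a hypothetical stand-in that only imitates one trait of ICU's default ordering (punctuation ahead of digits, digits ahead of letters, case ignored at the first level), and the names are invented.

```python
def binary_key(s: str) -> bytes:
    # UTF-8 byte order matches code point order, so this is a binary sort.
    return s.encode("utf-8")


def toy_collation_key(s: str):
    # Assumption: a crude stand-in for a collation key, NOT real ICU.
    # Group 0 = punctuation/other, 1 = digits, 2 = letters.
    def weight(ch: str):
        if ch.isalpha():
            group = 2
        elif ch.isdigit():
            group = 1
        else:
            group = 0
        return (group, ch.casefold())

    # The original string is a tie-breaker for names that differ only in case.
    return ([weight(ch) for ch in s], s)


names = ["ZTEST", "Z_TEST", "ZABC", "Z1"]

binary_order = sorted(names, key=binary_key)
collated_order = sorted(names, key=toy_collation_key)

print("binary:  ", binary_order)
print("collated:", collated_order)
```

With these uppercase names the binary sort puts `Z_TEST` last, while the toy collation key puts it first. For real results, use the ICU demo links above or the RSCP0102 report rather than this simplification.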

Please also have a look at SAP notes 50337 and 952625.

Best regards,

Nils Buerckel

SAP AG

P.S.

This link gives you a good description of how sorting works in non-Unicode systems:

/people/hannes.kuehnemund/blog/2008/08/15/sort-varietes-between-operating-systems

Edited by: Nils Buerckel on Sep 22, 2010 11:16 AM