Strange Character Issue in CSV file

Former Member
0 Kudos

Hello Experts,

My scenario is Proxy to File. The receiver file should be a pipe-delimited CSV file. We send customer information from ECC to PI to a file, and the data covers customers from different countries (for example Russian and Turkish customers).

To handle the special characters we initially used UTF-8, but it failed to convert the Russian and Turkish characters. We then switched to UTF-16, which handled all the special characters, but it adds special characters in front of the first field value in the CSV file (þÿ1200101). The receiver application accepts only UTF-8 and UTF-16. I tried UTF-16LE, but the application does not support it.

Could you please help me remove these special characters? Is there a way to handle this by writing a UDF, without a Java mapping? Kindly do the needful.

Thank you.

Regards,

Sanjay.

Accepted Solutions (0)

Answers (2)

stefan_grube
Active Contributor
0 Kudos

This is a byte order mark (BOM), which is a standard part of UTF-16.

Byte order mark - Wikipedia, the free encyclopedia

If the receiver does not support UTF-16LE, you could try UTF-16BE instead.
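To make the byte order mark concrete, here is a minimal Python sketch (Python is used only for illustration, it is not part of any PI configuration). The first field value is taken from the question; the rest of the record is made up, and `as_latin1` is a hypothetical helper that mimics a viewer which does not understand UTF-16.

```python
import codecs

# First field from the question; the other fields are invented sample data.
record = "1200101|Иванов|İstanbul"

# Big-endian UTF-16 without a BOM.
payload_be = record.encode("utf-16-be")

# The same bytes with the big-endian BOM (0xFE 0xFF) in front,
# which is what a BOM-prefixed UTF-16 file starts with.
file_with_bom = codecs.BOM_UTF16_BE + payload_be


def as_latin1(data: bytes) -> str:
    """Decode bytes the way a viewer assuming Latin-1 would show them."""
    return data.decode("latin-1")


# Escaped output of the first two characters as a Latin-1 viewer sees them
# (U+00FE and U+00FF, i.e. the two characters in front of 1200101).
print(ascii(as_latin1(file_with_bom)[:2]))

# Raw bytes: BOM first, then '1' as 00 31.
print(file_with_bom[:4].hex())

# A UTF-16 aware reader consumes the BOM and returns the original text.
print(file_with_bom.decode("utf-16") == record)
```

The payload itself is identical in both cases; only the two leading bytes differ. Whether a receiver configured for "UTF-16" insists on a BOM or rejects one is something to check against that application.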

I wonder why Russian or Turkish letters fail with UTF-8, since all of those characters can be encoded in UTF-8.

Are you checking the file with an editor that is able to display UTF-8 characters correctly?
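A quick way to separate "the encoding failed" from "the viewer is wrong" is to round-trip a few characters yourself. The following is a small Python sketch with made-up sample strings; `round_trips` and `viewed_as` are hypothetical helper names, and Latin-1 stands in for whatever legacy code page a viewer might assume.

```python
def round_trips(text: str, encoding: str) -> bool:
    """Return True if text survives encode + decode in the given encoding."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # The encoding has no representation for at least one character.
        return False


def viewed_as(text: str, actual: str, assumed: str) -> str:
    """Show what a viewer displays when it assumes the wrong encoding."""
    return text.encode(actual).decode(assumed, errors="replace")


# Invented Russian and Turkish sample values.
samples = ["Москва", "İstanbul", "Şişli"]

for sample in samples:
    # UTF-8 can represent every sample without loss.
    print(round_trips(sample, "utf-8"))
    # A single-byte Western code page cannot.
    print(round_trips(sample, "latin-1"))
    # Correct UTF-8 bytes still look broken in a viewer that assumes Latin-1.
    print(viewed_as(sample, "utf-8", "latin-1") == sample)
```

If the file content round-trips like this but still looks wrong on screen, the problem is most likely on the viewing side rather than in the UTF-8 conversion.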

Do not use Microsoft Notepad for this purpose!

RaghuVamseedhar
Active Contributor
0 Kudos

Sanjay,

All characters (Russian, Turkish, ...) can be encoded in UTF-8. UTF-8 is the de facto standard. Please use UTF-8 in SAP PI, and set UTF-8 encoding for the SAP proxy (in the SM59 RFC destination).

If the þÿ characters are removed in the middleware, that is data loss.

FYI. "Thy" meaning:- archaic or dialect form of "your". Unicode.

RaghuVamseedhar
Active Contributor
0 Kudos

Sanjay,

I agree with .

A BOM is added at the start of a text stream or file to give the reader (for example a text editor) a heads-up about the encoding.

UTF-8 - BOM - 0xEF,0xBB,0xBF (but if the text editor does not support UTF-8, it is displayed as ï»¿).

UTF-16 - BOM - U+FEFF (but if the text editor does not support UTF-16, it is displayed as þÿ or ÿþ, depending on endianness).
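The BOM values above can be checked against the raw bytes of a file. Here is a minimal Python sketch; `split_bom` is a hypothetical helper, and as a simplification it only knows the UTF-8 and UTF-16 marks (UTF-32 is ignored).

```python
import codecs

# Known byte order marks, paired with the encoding each one signals.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def split_bom(data: bytes):
    """Return (encoding, payload); encoding is None when no BOM is found."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return None, data


# A UTF-16 big-endian file that starts with the first field from the question.
raw = codecs.BOM_UTF16_BE + "1200101".encode("utf-16-be")

encoding, payload = split_bom(raw)
print(encoding)
print(payload.decode(encoding))

# The same mark shown the way a Latin-1 viewer would render it (escaped).
print(ascii(codecs.BOM_UTF16_BE.decode("latin-1")))
```

Reading the first two or three bytes this way shows which mark, if any, the file actually carries, independent of what an editor chooses to display.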

The Unicode standard neither requires nor recommends a BOM for UTF-8.

Please use UTF-8. There is no Unicode character which cannot be represented in UTF-8.

Please use the foxe editor as your default editor (do not use Microsoft Notepad, as it adds a BOM at the start of files).