Solved: Issue with Read dataset - Character set conversion is not possible

Former Member · ‎05-26-2010

Hi Gurus,

We are converting our system from 4.7C to ECC 6.0.

I am facing an issue with an interface file read where one of the records contains the '£' symbol.

The file can be on either the presentation server or the application server.

This is the ST22 dump message:

"At the conversion of a text from codepage '4110' to codepage '4103'"

I tried to rectify this by either

1> Opening the dataset in legacy text mode

or

2> Using encoding non-unicode

We have around 600 interfaces and a multitude of OPEN DATASET statements. Is it wise to update all the OPEN DATASET statements to use legacy text mode?

I am a bit sceptical about using the encoding default option.

Can anyone give me the pros and cons of using encoding default vs. legacy text mode, with examples?

Thanks in advance.

---Amit Jain

nils_buerckel · ‎05-26-2010

Hi Amit,

if you have Non-Unicode files (I assume this) to upload, you should use 'encoding Non-Unicode'.

'Encoding default' is in most cases not a good option, since this results in Unicode on the Unicode system and in Non-Unicode on the Non-Unicode system. Therefore on the Unicode system, it is the same as 'encoding UTF-8'. So please do not use default, if you do not have very special requirements (actually I do not have a customer example where this would be necessary ...).
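
As an illustration of the recommendation above, here is a minimal sketch of reading a Non-Unicode text file on a Unicode system. The report name, file path and variable names are hypothetical, and the code page actually applied by ENCODING NON-UNICODE depends on the system and logon settings, so treat this as a starting point rather than a definitive implementation.

```abap
REPORT zdemo_read_non_unicode.

* Hypothetical path and names, for illustration only.
DATA: lv_filename TYPE string VALUE '/tmp/interface_in.txt',
      lv_msg      TYPE c LENGTH 100,
      lv_line     TYPE string,
      lv_error    TYPE string,
      lx_conv     TYPE REF TO cx_sy_conversion_codepage.

START-OF-SELECTION.

  " ENCODING NON-UNICODE: file bytes are interpreted with a
  " Non-Unicode code page. On a Unicode system, ENCODING DEFAULT
  " would instead treat the file as UTF-8.
  OPEN DATASET lv_filename
    FOR INPUT
    IN TEXT MODE
    ENCODING NON-UNICODE
    MESSAGE lv_msg.

  IF sy-subrc <> 0.
    WRITE: / 'Could not open file:', lv_msg.
    RETURN.
  ENDIF.

  TRY.
      DO.
        READ DATASET lv_filename INTO lv_line.
        IF sy-subrc <> 0.
          EXIT.                       " end of file reached
        ENDIF.
        WRITE: / lv_line.
      ENDDO.
    CATCH cx_sy_conversion_codepage INTO lx_conv.
      " Raised when a byte sequence cannot be converted
      lv_error = lx_conv->get_text( ).
      WRITE: / 'Conversion error:', lv_error.
  ENDTRY.

  CLOSE DATASET lv_filename.
```

Catching cx_sy_conversion_codepage turns the ST22 dump into a handled error, which can help when checking many interfaces one after another.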

Please also have a look at:

1) http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/e0928b44-b811-2a10-7599-cc4bb6585c46

2) http://www.sdn.sap.com/irj/sdn/go/portal/prtroot/docs/library/uuid/b02d3594-ae48-2a10-83a7-89d369b708e5 --> Page 30

3) /people/ulrich.brink/blog/2005/08/18/unicode-file-handling-in-abap

Best regards,

Nils Buerckel

SAP AG

Former Member · ‎05-27-2010

Hi Nils,

Things are not yet clear to me. I saved the file using Notepad as type ANSI and uploaded it to the app server.

My system details are:

Running FM: SCP_GET_CODEPAGE_NUMBER

Output:

START_APPL_CODEPAGE   4103
APPL_CODEPAGE         4103
GUI_CODEPAGE          4110
DATABASE_CODEPAGE     4103
DATABASE_NONUNIQ
APPL_FOR_DISPLAY      4103
APPL_FOR_PROPOSE      4103
APPL_FOR_INPUT        4103
USER_LOGIN_CODEPAGE   4103
USER_EMODE_CODEPAGE   4103

I also searched for the character '£' in code pages 4110 and 4103 using tcode SCP, and it is present in both code pages.

Does saving the file as type ANSI mean saving it as Non-Unicode? Please confirm.

Why then does my program dump when reading a record that contains this symbol?

Code snippet is:

OPEN DATASET lv_filename
  FOR INPUT
  IN TEXT MODE
  ENCODING DEFAULT
  MESSAGE lv_msg.

nils_buerckel · ‎05-27-2010

Hi Amit,

Windows apps that do not use Unicode save text files using one of the Windows code pages, often called "ANSI" code pages.

Therefore if you use ANSI, then the file is saved as Non-Unicode based on the default code page of your Windows setup.

Encoding of Pound sign ('£'):

Non-Unicode (Windows 1252): Hex A3

Unicode (UTF-8 / 4110): Hex C2 A3

Unicode (UTF-16LE / 4103): Hex A3 00

You can also use a hex code editor to check your file.

You are using the 'encoding default' option in your code. As outlined before, this is not a good choice.

In this case the system assumes that the file is encoded in UTF-8. But as you saved it in ANSI, it is Non-Unicode.

Now the system reads A3 and the successor byte (which is probably a SPACE, Hex 20). This combination of bytes is not valid in UTF-8, since A3 can only appear as a continuation byte and never at the start of a character. Hence you get the error.
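
The effect described above can be reproduced without a file. The sketch below is an assumption-laden illustration: it feeds the two bytes A3 20 to cl_abap_conv_in_ce, once with SAP code page 1160 (assumed here to be the Windows-1252 equivalent) and once as UTF-8. The report name is hypothetical, and the exact exception text depends on the release.

```abap
REPORT zdemo_decode_pound.

DATA: lv_bytes TYPE xstring,
      lv_text  TYPE string,
      lv_error TYPE string,
      lo_conv  TYPE REF TO cl_abap_conv_in_ce,
      lx_conv  TYPE REF TO cx_sy_conversion_codepage.

START-OF-SELECTION.

  " Pound sign as saved by an ANSI (Windows-1252) editor, then a space
  lv_bytes = 'A320'.

  " 1) Interpret the bytes with a Non-Unicode code page
  "    (1160 is assumed to correspond to Windows-1252)
  lo_conv = cl_abap_conv_in_ce=>create( encoding = '1160'
                                        input    = lv_bytes ).
  lo_conv->read( IMPORTING data = lv_text ).
  WRITE: / 'Read as 1160 :', lv_text.

  " 2) Interpret the same bytes as UTF-8, as ENCODING DEFAULT
  "    would do on a Unicode system
  TRY.
      lo_conv = cl_abap_conv_in_ce=>create( encoding = 'UTF-8'
                                            input    = lv_bytes ).
      lo_conv->read( IMPORTING data = lv_text ).
      WRITE: / 'Read as UTF-8:', lv_text.
    CATCH cx_sy_conversion_codepage INTO lx_conv.
      " Expected path: A3 is not a valid start byte in UTF-8
      lv_error = lx_conv->get_text( ).
      WRITE: / 'UTF-8 conversion failed:', lv_error.
  ENDTRY.
```

With the converter's default settings, the second conversion should end in the CATCH branch, which is the same class of error that shows up as the ST22 dump when the file is read with 'encoding default'.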

Please retry with NON-UNICODE option.

Regards,

Nils Buerckel

SAP AG

Former Member · ‎05-27-2010

Hi Nils,

This is a convincing answer, and I will use encoding Non-unicode for all the interfaces, as we expect most of the files to be ANSI.

Full points awarded.

Thanks,

Amit Jain
