OPEN DATASET... in a Unicode system

Former Member · ‎04-28-2010

I'm discussing the correct use of OPEN DATASET in a Unicode system with a developer @ SAP. I'm not sure my opinion is correct, so maybe someone could shed some light on it.

The source of discussion is the (SAP standard) program RSQUEU01. This program is used to download data from the TemSE; in our case we use it to produce a file that is to be sent to auditors (component FI-AIS).

Our system is a little-endian Unicode system (code page 4103). If we download the created data using that program, we get a dump with "CONVT_CODEPAGE" because a character could not be converted from 4103 to 1100.

The proposed correction by the developer is changing

OPEN DATASET EXP_FILENAME FOR OUTPUT IN LEGACY BINARY MODE.

to

OPEN DATASET exp_filename                               
 FOR OUTPUT IN LEGACY BINARY MODE                      
 IGNORING CONVERSION ERRORS.

I think that correction is wrong. Since we have 12 languages in the system, including some Asian ones, the output might get corrupted.

When I asked him about that, I was told to use an application server with a "correct codepage". I'm not sure what that means, since I can't connect an ASCII application server to a Unicode system.

I guess the statement should be

OPEN DATASET EXP_FILENAME FOR OUTPUT IN BINARY MODE ENCODING DEFAULT.

This makes sure that no data is cut off (double-byte characters, for example) and that the appropriate code page (LE/BE) is used.

Are my assumptions right?

Markus

(OSS 323320/2010)

Former Member · ‎04-28-2010

Hi Markus,

The dump "CONVT_CODEPAGE" occurs when the system tries to convert characters that exist in code page 4103 but not in code page 1100.

IGNORING CONVERSION ERRORS only suppresses errors that are caused by the implicit code page conversion.

If you want the output in code page 1100 only, you can put the statement in a TRY block and catch the exception; otherwise these characters would get corrupted. (I assume that you won't use any double-byte characters.)
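A minimal sketch of that TRY/CATCH idea, not taken from RSQUEU01. The report name, file name and sample line are hypothetical, and it assumes the conversion error surfaces at TRANSFER as the catchable exception CX_SY_CONVERSION_CODEPAGE:

```abap
REPORT z_demo_catch_conversion.

* Hypothetical file name and sample data, for illustration only.
DATA: lv_file TYPE string VALUE '/tmp/audit_export.dat',
      lv_line TYPE c LENGTH 40 VALUE 'Sample line',
      lx_conv TYPE REF TO cx_sy_conversion_codepage,
      lv_msg  TYPE string.

* Legacy mode: character data is converted to a non-Unicode code page
* when it is written.
OPEN DATASET lv_file FOR OUTPUT IN LEGACY BINARY MODE.
IF sy-subrc <> 0.
  WRITE: / 'File could not be opened.'.
  RETURN.
ENDIF.

TRY.
    TRANSFER lv_line TO lv_file.
  CATCH cx_sy_conversion_codepage INTO lx_conv.
*   A character had no counterpart in the target code page.
    lv_msg = lx_conv->get_text( ).
    WRITE: / 'Conversion failed:', lv_msg.
ENDTRY.

CLOSE DATASET lv_file.
```

Catching the exception only avoids the dump; the line that failed still has to be handled somehow (skipped, logged, or written in another encoding).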

Yes you are right if you want to have all double byte characters too then you need to use ENCODING DEFAULT which would use 4103 in Unicode system.

In Unicode systems, legacy mode can cut off field content when the texts are written in East Asian languages.

I think your ABAPer did a bad fix. You can go back and ask them to correct it.

Good Luck

Cheers

Ajay Prakash

Former Member · ‎04-29-2010

Hi Markus,

Let's first clarify the different ways of writing files:

BINARY MODE: We essentially dump a sequence of bytes that isn't necessarily related to any code page (or to characters at all). For example, if I saved an executable program, the individual bytes would have no meaning when interpreted as characters (unless we look at strings stored in the program). Note that legacy binary mode does allow you to specify a code page, but the general recommendation is not to use the legacy option.

TEXT MODE: Here we have text information that has to be interpreted using a specific code page; thus the additional parameter ENCODING should usually be given, which specifies which code page is used.
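To make the difference between the two modes concrete, here is a small sketch. The report name, file names and sample values are made up for illustration:

```abap
REPORT z_demo_binary_vs_text.

* Hypothetical file names and sample values.
DATA: lv_bin_file TYPE string VALUE '/tmp/demo.bin',
      lv_txt_file TYPE string VALUE '/tmp/demo.txt',
      lv_bytes    TYPE xstring,
      lv_text     TYPE string.

lv_bytes = '48656C6C6F'.   " five raw bytes, no code page involved
lv_text  = 'Hello'.

* Binary mode: the bytes are written exactly as they are.
OPEN DATASET lv_bin_file FOR OUTPUT IN BINARY MODE.
IF sy-subrc = 0.
  TRANSFER lv_bytes TO lv_bin_file.
  CLOSE DATASET lv_bin_file.
ENDIF.

* Text mode: the characters are encoded (here as UTF-8) and an
* end-of-line marker is appended after each TRANSFER.
OPEN DATASET lv_txt_file FOR OUTPUT IN TEXT MODE ENCODING UTF-8.
IF sy-subrc = 0.
  TRANSFER lv_text TO lv_txt_file.
  CLOSE DATASET lv_txt_file.
ENDIF.
```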

Now, let's clear up a small typo in Ajay's response:


Yes you are right if you want to have all double byte characters too then you need to use ENCODING DEFAULT which would use 4103 in Unicode system.

That is incorrect. In a Unicode system, ENCODING DEFAULT (http://help.sap.com/abapdocu_70/en/ABAPOPEN_DATASET_ENCODING.htm) corresponds to UTF-8, not UTF-16.

Back to your problem. Your suggestion doesn't work, because you cannot specify ENCODING DEFAULT for binary output (legacy binary mode allows you to specify a code page, but that's misleading and I wouldn't use any legacy mode). So if you try to use the syntax you proposed, you'll get a syntax error.

Generally the recommendation is for Unicode enabled applications to use UTF-8 files with byte order mark, i.e. something like


OPEN DATASET exp_file FOR OUTPUT IN TEXT MODE ENCODING UTF-8 WITH BYTE-ORDER MARK.
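A slightly fuller sketch of the UTF-8 recommendation: it writes one line with a byte-order mark and reads it back. The report name, file name and sample text are hypothetical, and nothing here is taken from RSQUEU01:

```abap
REPORT z_demo_utf8_bom.

* Hypothetical file name and sample data.
DATA: lv_file TYPE string VALUE '/tmp/audit_export.txt',
      lv_out  TYPE string,
      lv_in   TYPE string.

lv_out = 'Sample line'.

* Write UTF-8 with a leading byte-order mark.
OPEN DATASET lv_file FOR OUTPUT IN TEXT MODE
  ENCODING UTF-8 WITH BYTE-ORDER MARK.
IF sy-subrc = 0.
  TRANSFER lv_out TO lv_file.
  CLOSE DATASET lv_file.
ENDIF.

* Read the file back; the byte-order mark is skipped on input.
OPEN DATASET lv_file FOR INPUT IN TEXT MODE
  ENCODING UTF-8 SKIPPING BYTE-ORDER MARK.
IF sy-subrc = 0.
  READ DATASET lv_file INTO lv_in.
  CLOSE DATASET lv_file.
  WRITE: / lv_in.
ENDIF.
```

The byte-order mark lets a Unicode-aware consumer recognise the file as UTF-8, but a consumer that is not Unicode-enabled would see it as stray leading bytes.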

However, the real question is what your external audit application expects and it sounds as if it's not Unicode enabled...

Enough blabber, here's what I'd do: Since you're having issues with an audit-related standard SAP program, I'd post the question in the relevant forum; other people must have run into that problem. Also, I checked OSS, but couldn't make much sense of the few notes I found (and nothing seemed relevant). Check what the expected input format is in the external audit system; possibly post a message to OSS.

Cheers, harald

Former Member · ‎04-29-2010

Hi Harald,

Thanks for correcting the typo. It is supposed to be 4110 for UTF-8 instead of 4103.

You are also right that an OSS message needs to be raised in this case, as this is a standard program.

We also need to check whether the target system is Unicode-enabled; only then would the proposed solution work.

If the auditor's target system doesn't support Unicode, it probably needs to be made Unicode-compatible; otherwise sending double-byte characters will be a challenge.

Cheers

Ajay

markus_doehr2 · ‎04-29-2010

Thank you all very much for the explanations.

That developer was actually someone @ SAP (see the OSS number under my name in the initial post), so that discussion took place through OSS.

The corresponding note (1463122 - RSQUEU01: runtime error CONVT_CODEPAGE) was corrected and is now working correctly.

Thank you again for all your feedback, I highly appreciate it.

Markus