
UTF encoding issues on file adapters and mappings

Former Member

Hi,

We ran some tests on UTF-8 and UTF-16 encoding with the file adapters. Our conclusions so far (on Windows) are:

1. Inbound adapter can handle UTF-8 and UTF-16 correctly, but do not specify the encoding!

2. XI mappings will set the XML encoding to UTF-8 correctly when sending a UTF-16 file to XI.

3. Outbound adapter can only handle UTF-8 (and US-ASCII and ISO-8859-1) correctly.

The exact test results are:

>>Outbound file adapter bug.

If no encoding is specified in the outbound file adapter, UTF-8 and UTF-16 are handled correctly. However if the encoding is set to UTF-16, XI mapping will fail with the error:

During the application mapping com/sap/xi/tf/_CHRIS_OUTBOUND_TO_INBOUND_ a com.sap.aii.utilxi.misc.api.BaseRuntimeException was thrown: Fatal Error: com.sap.engine.lib.xml.parser.Parser~

Part of the trace:

com.sap.aii.ibrun.server.mapping.MappingRuntimeException: Runtime exception occurred during execution of application mapping program com/sap/xi/tf/_CHRIS_OUTBOUND_TO_INBOUND_: com.sap.aii.utilxi.misc.api.BaseRuntimeException; Fatal Error: com.sap.engine.lib.xml.parser.ParserException: XMLParser: No data allowed here: (hex) a0d, a0d, 6e3c(:main:, row:3, col:2)
    at com.sap.aii.ibrun.server.mapping.JavaMapping.executeStep(JavaMapping.java:72)
    at com.sap.aii.ibrun.server.mapping.Mapping.execute(Mapping.java:91)
    at com.sap.aii.ibrun.server.mapping.MappingHandler.run(MappingHandler.java:78)
    at com.sap.aii.ibrun.sbeans.mapping.MappingRequestHandler.handleMappingRequest
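One possible reading of the hex values in that error (an assumption, the trace itself does not confirm it): they are exactly what you get when single-byte text, two CRLF line breaks followed by "<n", is decoded as UTF-16LE. A small Python sketch, independent of XI, that reproduces the same three values:

```python
# Single-byte payload fragment: two CRLF line breaks, then the start of a tag.
# The fragment is a guess made to match the trace, not taken from the test file.
single_byte = b"\r\n\r\n<n"

# Decode the bytes as if they were UTF-16LE: every pair of bytes becomes one character.
misread = single_byte.decode("utf-16-le")

# Hex value of each resulting character, formatted the way the parser prints them.
hex_units = [format(ord(ch), "x") for ch in misread]
print(hex_units)  # ['a0d', 'a0d', '6e3c']
```

If that reading is right, the parser was told the document is UTF-16 while the bytes it received were still single-byte encoded.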

>>Inbound file adapter bug.

If the encoding of an inbound file adapter is set to UTF-16 everything works ok (except the XML encoding is not set correctly, but this may be a mapping issue and not an adapter issue). However the default UTF-16 encoding seems to be UTF-16BE, where I would expect UTF-16LE since this is the most commonly used encoding.

If the encoding is set to UTF-16LE or UTF-16BE, the character set used in the message is correct, except for the BOM of the file. The BOM is missing, which suggests a UTF-8 encoded file. Since the file is UTF-16BE or UTF-16LE encoded, this is wrong, and the adapter should add the correct BOM.

Encodings like US-ASCII and ISO-8859-1 are handled correctly.

>>Mapping bug

When we send in a message encoded in UTF-8 and want to send it out as a UTF-16 encoded message, we need to set the XML encoding to UTF-16. Normally this is done by an XSLT mapping using the <xsl:output encoding="UTF-16"/> instruction.

The UTF-8 message is processed by the XSLT and any special character is converted to its UTF-16 value. However, the output message is not UTF-16 encoded (1 byte per character instead of 2 bytes).

When this 1-byte message is sent to the inbound adapter (encoding set to UTF-16), the message is translated from 1 byte to 2 bytes (UTF-8 to UTF-16). The characters that were already converted from UTF-8 to UTF-16 are read as single-byte characters and converted again. This results in an incorrect message with illegal characters.

So basically characters are converted to UTF-16 twice, which is incorrect.
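To show what a double conversion of this kind does to the data, here is a minimal Python sketch. It is not XI code; it assumes the second step treats every byte of the multi-byte sequence as a separate ISO-8859-1 character before re-encoding, which is only one way such a double conversion can happen.

```python
original = "é"

# First step: the character is stored as a multi-byte sequence (UTF-8 here).
utf8_bytes = original.encode("utf-8")        # two bytes: C3 A9

# Assumed second step: each byte is read as its own single-byte character...
misread = utf8_bytes.decode("iso-8859-1")    # two characters instead of one

# ...and then converted to UTF-16 again.
utf16_out = misread.encode("utf-16-be")

# Decoding the result correctly no longer gives back the original text.
roundtrip = utf16_out.decode("utf-16-be")
print(repr(roundtrip), roundtrip == original)
```

The output file is valid UTF-16, but it contains two wrong characters where the original had one.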

Maybe someone can confirm this on another XI system (maybe different OS). If you need test files or mapping, please let me know.

Kind regards,

Christiaan Schaake.

Accepted Solutions (0)

Answers (2)


Former Member

Update after carefully reading all the UTF related documents on the internet.

For UTF-16 the BOM is required and the adapter is handling this correctly. (encoding=UTF-16 will create the BOM).

For UTF-16LE and UTF-16BE the BOM must not be set, because the byte order is already given by the encoding name. The receiving application should be able to handle the conversion. So the adapter is working correctly here as well.
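The same convention can be seen outside XI. As an illustration only (Python's standard codecs, not the adapter's implementation): the generic "utf-16" encoder writes a BOM, while the explicit "utf-16-le" and "utf-16-be" encoders do not.

```python
import codecs

text = "A"

with_bom = text.encode("utf-16")   # BOM followed by the text, in the platform's byte order
le = text.encode("utf-16-le")      # no BOM, little-endian
be = text.encode("utf-16-be")      # no BOM, big-endian

print(with_bom.startswith(codecs.BOM_UTF16))  # True
print(le)  # b'A\x00'
print(be)  # b'\x00A'
```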

If the adapter is set to binary mode instead of text mode, the file will always be read correctly.
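Reading in binary mode leaves the bytes untouched, so the consumer can decide on the encoding itself, for example by looking at the BOM. A minimal sketch in Python; the function names sniff_encoding and read_payload are made up for this example, and only UTF-8 and UTF-16 BOMs are checked:

```python
import codecs

def sniff_encoding(data: bytes, default: str = "utf-8") -> str:
    """Pick a codec name from the BOM, falling back to `default` if there is none."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # decodes UTF-8 and drops the BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"      # uses the BOM to choose the byte order, then drops it
    return default

def read_payload(raw: bytes) -> str:
    """Decode raw file content that was read in binary mode."""
    return raw.decode(sniff_encoding(raw))

# Example: a big-endian UTF-16 payload with a BOM.
raw = codecs.BOM_UTF16_BE + "<a>é</a>".encode("utf-16-be")
print(read_payload(raw))  # <a>é</a>
```

A file without a BOM still needs the encoding to be agreed on separately, which is why the sketch takes a default.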

About the mapping issue, I'm still experimenting with this one.

Kind regards,

Christiaan Schaake.

Former Member

Hi Christiaan,

I'm wondering whether you would be able to help me, as it appears that you have some knowledge of UTF-16...

I am trying to import an XML file encoded in UTF-16, with BOM and BO elements into XI. I then want to map it to an IDoc and send it through to SAP R/3. I am experiencing a problem in importing the file. XI keeps giving me a parsing error. I think that it may have something to do with the way that my data type is defined. Do you know whether my data type should contain the BOM and BO elements?

Thanks for your help,

Miguel

Former Member

haha