
How to read files with codepage UTF16LE with "Sender File Adapter"


Hi everybody,

I am using an XI file sender adapter to get a (UTF-16LE encoded) file and process it in an XI mapping.

This is my File-Content:

Cost Centre,Cost Code,Page Count (B&W),Page Count (Colour),Job Count

Unknown,Lexmark,"37,480",334,"11,968"

Unknown,Unknown,312,0,177

110000,Lexmark,128,228,43

The HEX-representation of this content is:

FF FE 43 00 6F 00 73 00 74 00 20 00 43 00 65 00

...

(The leading two bytes FF FE are the byte order mark indicating UTF-16LE.)
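As a quick check outside XI, decoding these bytes in plain Java shows why the BOM causes trouble later: the "UTF-16LE" charset does not consume the BOM, so FF FE survives decoding as the character U+FEFF. (A minimal sketch; the byte values are taken from the hex dump above.)

```java
import java.io.UnsupportedEncodingException;

public class BomDemo {
    public static void main(String[] args) throws UnsupportedEncodingException {
        // First ten bytes of the file as shown in the hex dump: FF FE 43 00 6F 00 73 00 74 00
        byte[] head = {(byte) 0xFF, (byte) 0xFE, 0x43, 0x00, 0x6F, 0x00, 0x73, 0x00, 0x74, 0x00};
        // The "UTF-16LE" charset does NOT strip a BOM; FF FE decodes to the character U+FEFF
        String s = new String(head, "UTF-16LE");
        System.out.println(Integer.toHexString(s.charAt(0))); // feff
        System.out.println(s.substring(1));                   // Cost
    }
}
```

Note that the plain "UTF-16" charset (without LE/BE) would consume the BOM during decoding; the explicit "UTF-16LE" charset keeps it.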

And this is the payload I get for mapping:

<?xml version="1.0" encoding="utf-8" ?>
 
 <ns:MT_POM_KOSTEN xmlns:ns="http://aua.com/pom">
 <POM_REC>
  <COSTCENTER>uFEFFCost Centre</COSTCENTER>
 
  <COSTCODE>Cost Code</COSTCODE>
 
  <PAGECOUNT_BW>Page Count (B&amp;W)</PAGECOUNT_BW>
 
  <PAGECOUNT_COL>Page Count (Colour)</PAGECOUNT_COL>
 
  <JOBCOUNT>Job Count</JOBCOUNT>
 
  </POM_REC>
 <POM_REC>
  <COSTCENTER>Unknown</COSTCENTER>
 
  <COSTCODE>Lexmark</COSTCODE>
 
  <PAGECOUNT_BW>37,480</PAGECOUNT_BW>
 
  <PAGECOUNT_COL>334</PAGECOUNT_COL>
 
  <JOBCOUNT>11,968</JOBCOUNT>
 
  </POM_REC>
 <POM_REC>
  <COSTCENTER>Unknown</COSTCENTER>
 
  <COSTCODE>Unknown</COSTCODE>
 
  <PAGECOUNT_BW>312</PAGECOUNT_BW>
 
  <PAGECOUNT_COL>0</PAGECOUNT_COL>
 
  <JOBCOUNT>177</JOBCOUNT>
 
  </POM_REC>
 <POM_REC>
  <COSTCENTER>110000</COSTCENTER>
 
  <COSTCODE>Lexmark</COSTCODE>
 
  <PAGECOUNT_BW>128</PAGECOUNT_BW>
 
  <PAGECOUNT_COL>228</PAGECOUNT_COL>
 
  <JOBCOUNT>43</JOBCOUNT>
 
  </POM_REC>
 </ns:MT_POM_KOSTEN>

I can see the correct strings (for example "Cost Centre") in the payload, but the string comparison in the user-defined function does not recognize the strings as equal:

for (int i = 0; i < a.length; i++) {
    if (a[i].equals("Cost Centre"))
        result.addSuppress();
    else
        result.addValue("");
}
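For illustration (plain Java, outside the XI mapping API): if the first field still carries the BOM, equals() fails even though the visible text matches, while indexOf() still finds the substring. The "fromPayload" value below is a hypothetical reconstruction of what the mapping receives.

```java
public class BomEquals {
    public static void main(String[] args) {
        // Hypothetical value as it arrives in the mapping: BOM still glued to the first field
        String fromPayload = "\uFEFF" + "Cost Centre";
        System.out.println(fromPayload.equals("Cost Centre"));  // false: lengths differ by one char
        System.out.println(fromPayload.indexOf("Cost Centre")); // 1: the text starts after the BOM
        // Stripping the BOM restores equality (String.replaceAll exists since JDK 1.4)
        System.out.println(fromPayload.replaceAll("\uFEFF", "").equals("Cost Centre")); // true
    }
}
```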

Currently I am using UTF-8 as the codepage in the file adapter (and Text as the file type).

When I try to use UTF-16 (or UTF-16LE) as the codepage, I get unreadable characters.

I also tried Binary, UTF-16BE, ...

The only workaround so far is to convert the file to ANSI before passing it to XI; then my function works correctly.

Does anybody have an idea how I can read a UTF-16LE file and process it correctly in XI?

I am using XI 7.00 0023 and JSDK 1.4.2-34

Thanks a lot

Armin

Accepted Solutions (0)

Answers (2)


Former Member

Hi Armin,

Does PI require the little-endian files to be in hex format to be read by the file sender channel?

Thanks,

Diptee

markushardank
Product and Topic Expert

Hi Armin,

You could try writing the strings all in capital letters and removing the spaces in between. My guess is that a difference in spelling is causing the problem here.

You can also print the string to the default trace in PI and compare.

best regards,

Markus


Hi Markus,

Thank you for your posting. I tried your idea:

I used info traces to see which values arrive in my function. The trace shows that upper/lower case letters and spaces are coming through correctly to my mapping:

<Trace level="1" type="T">*** START APPLICATION TRACE ***</Trace>

<Trace level="2" type="T">Cost Centre</Trace>

<Trace level="2" type="T">Unknown</Trace>

<Trace level="2" type="T">Unknown</Trace>

<Trace level="2" type="T">110000</Trace>

I replaced

a[i].equals

by

a[i].indexOf

in my function for the first field:

AbstractTrace at = container.getTrace();

for (int i = 0; i < a.length; i++) {
    at.addInfo(a[i]);
    if (
        (a[i].indexOf("Cost Centre") != -1) ||
        (a[i].equals("Default")) ||
        (a[i].equals("--TOTAL--"))
       )

Now my function recognizes the string "Cost Centre" in the queue.

I believe that (invisible) Unicode characters are still included in the very first field ("Cost Centre"), even though I cannot see them in my trace.

I am looking for a way to get rid of these characters when using the file sender adapter to create a UTF-8 XML from my UTF-16 CSV file.
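One pragmatic option in the meantime (a sketch with a hypothetical helper method, not part of the XI mapping API) is to strip a leading U+FEFF from each value before comparing, so equals() can be kept:

```java
public class BomStrip {
    // Hypothetical helper: removes a single leading U+FEFF if present, else returns the input
    static String stripBom(String s) {
        return (s.length() > 0 && s.charAt(0) == '\uFEFF') ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        System.out.println(stripBom("\uFEFFCost Centre").equals("Cost Centre")); // true
        System.out.println(stripBom("Unknown")); // Unknown (unchanged)
    }
}
```

In the UDF the comparison would then read stripBom(a[i]).equals("Cost Centre"), which works whether or not the BOM is present.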

thanks and best regards

Armin

anupam_ghosh2
Active Contributor

Hello Armin,

I have gone through some SAP notes and blogs to find a solution to your problem; here is what I found:

1. SAP NOTE 821267

____________________

q) How do I correctly configure the File Encoding used by the File Adapter?

- Flat Files with File Content Conversion

For a File Sender channel, configure the encoding of the source file. The file will be interpreted according to the configured encoding and converted to XML with an UTF-8 encoding. For a File Receiver channel, configure the encoding to match the encoding you would like to be written to the target flat file.

- Flat Files without File Content Conversion

Whether to configure an encoding in this case depends on if you want to pass through the file "as is", e.g. within a File Sender to File Receiver scenario, or if you want to convert the file's encoding on its way through the Integration Server. For "as is" processing, configure both the sender and the receiver using the File Type setting "Binary". To apply an encoding conversion, configure the respective source and target encoding in both the sender and receiver channel.

Important: Configuring an encoding in the receiver channel will only lead to the expected results if the payload sent to the receiver channel is in UTF-8 format (e.g., by having specified an encoding conversion in the Sender channel).

So according to this note, if you configure the encoding of the sender communication channel as UTF-16LE, the adapter should convert the file to UTF-8 by default. But you have posted that this encoding setting leads to unreadable characters.

2) SAP note 880173

This describes the module XMLAnonymizerBean, which can be applied to an XML payload to change its encoding.

3) How to guide on encoding : http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79f...

4) SAP note:960663

http://help.sap.com/saphelp_nw04/helpdata/en/45/da2deb47812e98e10000000a155369/content.htm

This documents the TextCodePageConversion bean, which might solve your problem.

5) Finally, if nothing above works, you will need a Java mapping to produce the target XML structure without any file content conversion. The mapping would convert the received file to the proper target XML in UTF-8 encoding. Please let us know if you need help with the code for this final option.
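The core of such a Java mapping could look like the following sketch (plain java.io only; in a real XI mapping this logic would sit inside the StreamTransformation execute() method, and the class and method names here are illustrative):

```java
import java.io.*;

public class Utf16ToUtf8 {
    // Re-encode a UTF-16LE stream as UTF-8, dropping a leading BOM if present
    public static void convert(InputStream in, OutputStream out) throws IOException {
        Reader r = new InputStreamReader(in, "UTF-16LE");
        Writer w = new OutputStreamWriter(out, "UTF-8");
        int c = r.read();
        if (c == 0xFEFF) {
            c = r.read(); // skip the byte order mark
        }
        while (c != -1) {
            w.write(c);
            c = r.read();
        }
        w.flush();
    }

    public static void main(String[] args) throws IOException {
        // Simulated input: a UTF-16LE file starting with the BOM bytes FF FE
        byte[] utf16 = "\uFEFFCost Centre".getBytes("UTF-16LE");
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        convert(new ByteArrayInputStream(utf16), bos);
        System.out.println(bos.toString("UTF-8")); // Cost Centre
    }
}
```

Reading character by character is fine for small files; for large payloads a char[] buffer around the same reader/writer pair would be the idiomatic variant.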

Regards

Anupam


Thank you for all postings here.

I also consulted SAP in service marketplace.

These bytes at the beginning of the file are the BOM (byte order mark), which indicates the Unicode encoding form (UTF-8 or UTF-16, ...) and the byte order (little endian or big endian).

http://de.wikipedia.org/wiki/Byte_Order_Mark

The file sender does not remove these bytes, so we need to use one of several options to remove them in our mapping (a Java mapping, using indexOf() instead of equals() to check the content, ...).

Armin