on 01-03-2012 12:16 PM
Hi everybody,
I am using an XI File Sender adapter to read a (UTF-16LE encoded) file and process it in an XI mapping.
This is my File-Content:
Cost Centre,Cost Code,Page Count (B&W),Page Count (Colour),Job Count
Unknown,Lexmark,"37,480",334,"11,968"
Unknown,Unknown,312,0,177
110000,Lexmark,128,228,43
The HEX-representation of this content is:
FF FE 43 00 6F 00 73 00 74 00 20 00 43 00 65 00
...
(The leading two bytes FF FE are the UTF-16LE byte order mark.)
And this is the payload I get for mapping:
<?xml version="1.0" encoding="utf-8" ?>
<ns:MT_POM_KOSTEN xmlns:ns="http://aua.com/pom">
<POM_REC>
<COSTCENTER>\uFEFFCost Centre</COSTCENTER>
<COSTCODE>Cost Code</COSTCODE>
<PAGECOUNT_BW>Page Count (B&amp;W)</PAGECOUNT_BW>
<PAGECOUNT_COL>Page Count (Colour)</PAGECOUNT_COL>
<JOBCOUNT>Job Count</JOBCOUNT>
</POM_REC>
<POM_REC>
<COSTCENTER>Unknown</COSTCENTER>
<COSTCODE>Lexmark</COSTCODE>
<PAGECOUNT_BW>37,480</PAGECOUNT_BW>
<PAGECOUNT_COL>334</PAGECOUNT_COL>
<JOBCOUNT>11,968</JOBCOUNT>
</POM_REC>
<POM_REC>
<COSTCENTER>Unknown</COSTCENTER>
<COSTCODE>Unknown</COSTCODE>
<PAGECOUNT_BW>312</PAGECOUNT_BW>
<PAGECOUNT_COL>0</PAGECOUNT_COL>
<JOBCOUNT>177</JOBCOUNT>
</POM_REC>
<POM_REC>
<COSTCENTER>110000</COSTCENTER>
<COSTCODE>Lexmark</COSTCODE>
<PAGECOUNT_BW>128</PAGECOUNT_BW>
<PAGECOUNT_COL>228</PAGECOUNT_COL>
<JOBCOUNT>43</JOBCOUNT>
</POM_REC>
</ns:MT_POM_KOSTEN>
I can see the correct strings (for example "Cost Centre") in the payload, but the string comparison in the user-defined function does not recognize the strings as equal:
for (int i = 0; i < a.length; i++) {
    if (a[i].equals("Cost Centre"))
        result.addSuppress();
    else
        result.addValue("");
}
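The effect can be reproduced outside XI. A minimal standalone sketch (not XI code, the payload string is just a stand-in for what arrives in the mapping queue): the first field carries an invisible BOM character (U+FEFF), so equals() fails even though the trace output looks identical.

```java
public class BomDemo {
    public static void main(String[] args) {
        // The first CSV field arrives with a leading BOM character (U+FEFF)
        String fromPayload = "\uFEFFCost Centre";

        System.out.println(fromPayload.equals("Cost Centre"));        // false
        System.out.println(fromPayload.indexOf("Cost Centre") != -1); // true
        System.out.println(fromPayload.length());                     // 12, not 11
    }
}
```

The BOM does not print in most traces, which is why the values look equal while equals() says they are not.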
Currently I am using UTF-8 as the codepage in the file adapter (and Text as the file type).
When I try UTF-16 (or UTF-16LE) as the codepage, I get unreadable characters.
I also tried Binary, UTF-16BE, ...
The only workaround so far is to convert the file to ANSI before passing it to XI; then my function works correctly.
Does anybody have an idea how I can read a UTF-16LE file and process it correctly in XI?
I am using XI 7.00 0023 and JSDK 1.4.2-34
Thanks a lot
Armin
Hi Armin,
Does PI require little-endian files to be read in hex format by the file sender channel?
Thanks,
Diptee
Hi Armin,
you should probably write the strings all in capital letters and remove the spaces in between. I guess the different spelling is causing the problem here.
You can also print the string to the default trace in PI and compare.
best regards,
Markus
Hi Markus,
Thank you for your posting. I tried your idea:
I used info traces to see which values arrive in my function. The trace shows that upper/lower case and spaces come through correctly to my mapping:
<Trace level="1" type="T">*** START APPLICATION TRACE ***</Trace>
<Trace level="2" type="T">Cost Centre</Trace>
<Trace level="2" type="T">Unknown</Trace>
<Trace level="2" type="T">Unknown</Trace>
<Trace level="2" type="T">110000</Trace>
I replaced
a[i].equals
with
a[i].indexOf
in my function for the first field:
AbstractTrace at = container.getTrace();
for (int i = 0; i < a.length; i++) {
    at.addInfo(a[i]);
    if (
        (a[i].indexOf("Cost Centre") != -1) ||
        (a[i].equals("Default")) ||
        (a[i].equals("--TOTAL--"))
    )
Now my function recognizes the string "Cost Centre" in the queue.
I believe that (invisible) Unicode characters are still included in the very first field ("Cost Centre"), even though I cannot see them in my trace.
I am looking for a way to get rid of these characters when using the file sender adapter to create a UTF-8 XML from my UTF-16 CSV file.
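One option inside the UDF itself: strip a leading U+FEFF before comparing. This is only a sketch; the helper name stripBom is made up here, and the array contents just mirror the queue values from the trace above.

```java
public class StripBom {
    // Removes a leading byte order mark (U+FEFF) if present
    static String stripBom(String s) {
        return (s.length() > 0 && s.charAt(0) == '\uFEFF') ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        // Stand-in for the mapping queue; only the first value carries the BOM
        String[] a = { "\uFEFFCost Centre", "Unknown", "110000" };
        for (int i = 0; i < a.length; i++) {
            String v = stripBom(a[i]);
            System.out.println(v.equals("Cost Centre"));
        }
    }
}
```

With the BOM stripped, the original equals() comparison works again, without the looser indexOf() workaround.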
thanks and best regards
Armin
Hello Armin,
I have gone through some SAP notes and blogs to find a solution to your problem; here is what I found:
1. SAP Note 821267
Q: How do I correctly configure the file encoding used by the File Adapter?
- Flat files with File Content Conversion:
For a File Sender channel, configure the encoding of the source file. The file will be interpreted according to the configured encoding and converted to XML with UTF-8 encoding.
For a File Receiver channel, configure the encoding to match the encoding you would like to be written to the target flat file.
- Flat files without File Content Conversion:
Whether to configure an encoding in this case depends on whether you want to pass the file through "as is", e.g. in a File Sender to File Receiver scenario, or convert the file's encoding on its way through the Integration Server. For "as is" processing, configure both the sender and the receiver with the File Type setting "Binary". To apply an encoding conversion, configure the respective source and target encoding in both the sender and receiver channel.
Important: configuring an encoding in the receiver channel will only lead to the expected results if the payload sent to the receiver channel is in UTF-8 format (e.g. by having specified an encoding conversion in the sender channel).
So according to this note, if you configure the encoding scheme of the sender communication channel to UTF-16LE, the adapter should convert it to UTF-8 by default. But you have posted that this encoding scheme leads to unreadable characters.
2. SAP Note 880173
This describes the module XMLAnonymizerBean, which can be applied to the XML payload to change its encoding.
3. How-to guide on encoding: http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/502991a2-45d9-2910-d99f-8aba5d79f...
4. SAP Note 960663
http://help.sap.com/saphelp_nw04/helpdata/en/45/da2deb47812e98e10000000a155369/content.htm
This details the TextCodepageConversionBean module, which might solve your problem.
5. Finally, if nothing above works, you need a Java mapping to convert to the target XML structure without any file content conversion. The mapping would convert the received file to the proper target XML in UTF-8 encoding. Please let us know if you need help with the code for this final option.
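The core of such a Java mapping would be the charset conversion itself. A hedged sketch (class and method names invented here, not an XI mapping API): decoding with the "UTF-16" charset, rather than "UTF-16LE", lets Java read the BOM, pick the byte order itself, and drop the BOM from the character stream. Charset names are passed as strings so it also compiles on JDK 1.4.

```java
import java.io.*;

// Sketch: decode a UTF-16 input stream (BOM-aware) and re-encode as UTF-8.
public class Utf16ToUtf8 {
    public static void convert(InputStream in, OutputStream out) throws IOException {
        // "UTF-16" (without LE/BE suffix) consumes the BOM and detects byte order
        Reader r = new BufferedReader(new InputStreamReader(in, "UTF-16"));
        Writer w = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"));
        char[] buf = new char[4096];
        int n;
        while ((n = r.read(buf)) != -1) {
            w.write(buf, 0, n); // BOM already removed by the UTF-16 decoder
        }
        w.flush();
    }

    public static void main(String[] args) throws IOException {
        // FF FE 41 00 = UTF-16LE BOM followed by "A"
        byte[] src = { (byte) 0xFF, (byte) 0xFE, 0x41, 0x00 };
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        convert(new ByteArrayInputStream(src), bos);
        System.out.println(new String(bos.toByteArray(), "UTF-8")); // prints "A"
    }
}
```

Had the decoder been configured as "UTF-16LE", the BOM would survive as a U+FEFF character at the start of the text, which is exactly what Armin is seeing in his first field.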
Regards
Anupam
Thank you for all postings here.
I also consulted SAP in service marketplace.
These bytes at the beginning of the file are the BOM (byte order mark), indicating the Unicode encoding (UTF-8, UTF-16, ...) and the byte order (little endian or big endian):
http://de.wikipedia.org/wiki/Byte_Order_Mark
The file sender does not remove these bytes, so we need one of several options to remove them in our mapping (a Java mapping, using indexOf() instead of equals() to check the content, ...).
Armin