Byte Order Mark (BOM) not found in UTF-8 file download from XI

Former Member
0 Kudos

Hi Guys,

I am having trouble writing a file from XI in UTF-8 format with a byte order mark (BOM).

The receiver file adapter is configured to write the file in UTF-8, but the byte order mark is missing. The same setup works for UTF-16: with UTF-16BE (Unicode big endian) I can see the byte order mark "FEFF" at the beginning of the file.
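One way to confirm what the adapter actually wrote is to look at the first bytes of the file. The following is a minimal sketch, not part of XI: the class name `BomSniffer` and its method are made up for illustration, and it only knows the UTF-8 and UTF-16 signatures.

```java
final class BomSniffer {
    // Signature bytes for U+FEFF in each encoding
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
    private static final byte[] UTF16BE_BOM = {(byte) 0xFE, (byte) 0xFF};
    private static final byte[] UTF16LE_BOM = {(byte) 0xFF, (byte) 0xFE};

    /** Returns the encoding implied by a leading BOM, or "none" if no known BOM is present. */
    static String detect(byte[] head) {
        if (startsWith(head, UTF8_BOM)) return "UTF-8";
        if (startsWith(head, UTF16BE_BOM)) return "UTF-16BE";
        // Note: a UTF-32LE signature (FF FE 00 00) would also match here; this sketch ignores UTF-32
        if (startsWith(head, UTF16LE_BOM)) return "UTF-16LE";
        return "none";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // In-memory samples standing in for the first bytes of a downloaded file
        byte[] withUtf8Bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x41};
        byte[] plainAscii = {0x41, 0x42};
        System.out.println(detect(withUtf8Bom));
        System.out.println(detect(plainAscii));
    }
}
```

To check a real file, read its first few bytes (for example with `java.nio.file.Files.readAllBytes`) and pass them to `detect`.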

According to SAP Help, UTF-8 is supposed to be the default encoding for the file type Text.

See "Configuring the Receiver File/FTP Adapter" in SAP Help:

http://help.sap.com/saphelp_nw04/helpdata/en/d2/bab440c97f3716e10000000a155106/frameset.htm

Could you please advise how to get a BOM into the UTF-8 file? It is essential, because the outbound file cannot be loaded into our vendor's system without it.

Thanks.

Best Regards

Thiru

Accepted Solutions (0)

Answers (3)

Former Member
0 Kudos

Hi!

I had the same problem. In our case we create a CSV file, which must have the BOM; otherwise it is not recognized as UTF-8.

So I did the following:

I created a simple destination structure that represents the CSV and did the mapping with the graphical mapper. The destination structure looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<ONLYLINES>
	<LINE>
		<ENTRY>Hello I'm line 1</ENTRY>
	</LINE>
	<LINE>
		<ENTRY>and I'm line 2</ENTRY>
	</LINE>
</ONLYLINES>

As you can see, the ENTRY element holds the data.

Then I created the following Java mapping and added it to the interface mapping as a second step after the graphical mapping:


---cut---
package sfs.biz.xi.global;

import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

import com.sap.aii.mapping.api.StreamTransformation;
import com.sap.aii.mapping.api.StreamTransformationException;

public class OnlyLineConvertAddingBOM implements StreamTransformation {

	public void execute(InputStream in, OutputStream out) throws StreamTransformationException {
		try {
			// UTF-8 signature: the three bytes that encode U+FEFF
			byte[] bom = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };
			StringBuffer lines = new StringBuffer();

			DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
			DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
			Document doc = docBuilder.parse(in);
			doc.getDocumentElement().normalize();

			// Each ENTRY element becomes one output line terminated by CR/LF
			NodeList entries = doc.getElementsByTagName("ENTRY");
			for (int i = 0; i < entries.getLength(); i++) {
				Element entry = (Element) entries.item(i);
				Node text = entry.getFirstChild();
				// An empty ENTRY has no child node; write an empty line instead of failing
				String value = (text == null || text.getNodeValue() == null) ? "" : text.getNodeValue().trim();
				lines.append(value).append("\r\n");
			}

			// Write the BOM as raw bytes first, then the UTF-8 encoded lines
			out.write(bom);
			out.write(lines.toString().getBytes("UTF-8"));

		} catch (Exception e) {
			throw new StreamTransformationException(e.toString());
		}
	}

	public void setParameter(Map arg0) {
		// No parameters are needed for this mapping
	}

/*
	// Local test; needs imports for java.io.File, java.io.FileInputStream and java.io.FileOutputStream
	public static void main(String[] args) {
		File testfile = new File("c:\\instance.xml");
		File testout = new File("C:\\testout.txt");
		FileInputStream fis = null;
		FileOutputStream fos = null;
		OnlyLineConvertAddingBOM myFI = new OnlyLineConvertAddingBOM();
		try {
			fis = new FileInputStream(testfile);
			fos = new FileOutputStream(testout);
			myFI.setParameter(null);
			myFI.execute(fis, fos);
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
*/

}
---cut---

This mapping finds all ENTRY elements in the XML structure and produces output that starts with the UTF-8 BOM, followed by the content of each ENTRY element, each terminated by CR/LF.

We use this as the payload for a mail adapter (sending via SMTP), but it should also work with the file adapter.

Hope it helps.

Rene


Besides: could someone tell SAP that this editor is the WORSEST editor I've ever seen. Maybe this guys should copy somethink from wikipedia :-((

Edited by: Rene Pilz on Oct 8, 2009 5:06 PM

stefan_grube
Active Contributor
0 Kudos

> Besides: could someone tell SAP that this editor is the WORSEST editor I've ever seen. Maybe this guys should copy somethink from wikipedia :-((

Why don't you tell them?

stefan_grube
Active Contributor
0 Kudos

UTF-8 does not need a byte order mark: its code units are single bytes, so there is no byte order to mark.

Windows tools write a Unicode signature (the bytes EF BB BF) at the start of a UTF-8 text file. The Unicode Standard permits this, but neither requires nor recommends it.
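To make the difference concrete, the sketch below encodes the character U+FEFF in three encodings and prints the resulting bytes. The class name `BomBytes` and its helpers are illustrative only. The UTF-16 result depends on the byte order, while UTF-8 always yields the same three bytes.

```java
import java.nio.charset.Charset;

final class BomBytes {
    /** Formats bytes as upper-case hex without separators. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Encodes U+FEFF in the named charset and returns the bytes as hex. */
    static String bomHex(String charsetName) {
        return hex("\uFEFF".getBytes(Charset.forName(charsetName)));
    }

    public static void main(String[] args) {
        System.out.println(bomHex("UTF-8"));    // EFBBBF
        System.out.println(bomHex("UTF-16BE")); // FEFF
        System.out.println(bomHex("UTF-16LE")); // FFFE
    }
}
```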

Regards

Stefan

Former Member
0 Kudos

Hi Stefan,

Thanks for the info on UTF-8. I can produce the byte order mark when I write the file using a dataset, but not when the file is written via XI. The communication channel is configured as described in my first message.

BOM is the mandatory requirement for our vendor to identify the file type and to process the file accordingly.

Best Regards

Thiru

stefan_grube
Active Contributor
0 Kudos

> BOM is the mandatory requirement for our vendor to identify the file type and to process the file accordingly.

In that case you have to provide the missing characters with a Java mapping or an adapter module.
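The core of either approach is the same: copy the payload and put the three signature bytes in front of it. Here is a minimal sketch using plain JDK streams only. The class `Utf8BomPrepender` and its methods are hypothetical names, and wiring the logic into a Java mapping or adapter module is left out because it depends on the XI/PI API version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

final class Utf8BomPrepender {
    private static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Copies in to out, adding a UTF-8 BOM unless the payload already starts with one. */
    static void copyWithBom(InputStream in, OutputStream out) throws IOException {
        // Read up to three bytes to see whether a BOM is already there
        byte[] head = new byte[BOM.length];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r == -1) break;
            n += r;
        }
        boolean hasBom = n == BOM.length && Arrays.equals(head, BOM);
        if (!hasBom) {
            out.write(BOM);
        }
        out.write(head, 0, n);

        // Copy the rest of the payload unchanged
        byte[] buf = new byte[8192];
        int r;
        while ((r = in.read(buf)) != -1) {
            out.write(buf, 0, r);
        }
    }

    /** Convenience wrapper for in-memory payloads. */
    static byte[] withBom(byte[] payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyWithBom(new ByteArrayInputStream(payload), out);
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] result = withBom("a;b\r\n".getBytes("UTF-8"));
        System.out.println(result.length);
    }
}
```

The sketch assumes the payload is already UTF-8 encoded; it does not convert from another encoding.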

A "BOM" for UTF-8 is rarely used outside the Windows world.

http://en.wikipedia.org/wiki/Byte-order_mark

Regards

Stefan

former_member732072
Active Participant
0 Kudos

Hi,

Please have a look at the following link and see if it helps

http://unicode.org/faq/utf_bom.html

Best Regards

Former Member
0 Kudos

Hi Prakash,

Thanks for the quick response. I am aware of how UTF-8 encoding works. I am using XI to write the file through the receiver file adapter. According to the SAP documentation, the file type must be set to Text in the communication channel, and UTF-8 is then the default encoding. But the file arrives in ANSI format.

I am just checking whether anything else could contribute to this unexpected behaviour.
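One possible explanation, which nothing in this thread confirms, is that the file really is UTF-8 but the tool used to inspect it cannot tell. Without a BOM, a file containing only ASCII characters is byte-for-byte identical in UTF-8 and in a single-byte "ANSI" code page, so an editor may simply report ANSI. The sketch below illustrates this, using ISO-8859-1 as a stand-in for the ANSI code page; the class name `AnsiVsUtf8` and the sample strings are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AnsiVsUtf8 {
    /** True if the text encodes to the same bytes in UTF-8 and ISO-8859-1. */
    static boolean sameBytes(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        // ISO-8859-1 is an assumption standing in for the Windows "ANSI" code page
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(utf8, latin1);
    }

    public static void main(String[] args) {
        System.out.println(sameBytes("ID;NAME;CITY")); // true: ASCII only
        System.out.println(sameBytes("Z\u00FCrich"));  // false: the umlaut takes two bytes in UTF-8
    }
}
```

If the test data contains at least one non-ASCII character, a hex view of the file shows directly which encoding was written.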

Thanks.

Best Regards

Thiru