Code Page Question

Former Member
0 Kudos

I am implementing a Java mapping to remove xml tags from a message payload. The problem is that when I implement this mapping in the interface mapping I am losing the special characters (the payload is a pdf). However when I test this using eclipse the output is correct with the correct characters. Anyone have any ideas why the output in PI is different when it seems that the mapping is doing the right thing when tested in a standard IDE?

Thanks

Accepted Solutions (0)

Answers (9)

Former Member
0 Kudos

Hi Stefan,

The test is run from the test tab in the operation mapping, although I also saw the same behaviour at runtime yesterday. I used your blog as inspiration and utilized the tcpgateway.

Yes, the XMLs are identical; I cut and paste from one to the other and vice versa. I view the output in Notepad, but I also rename the output file to .pdf to see if it opens in Adobe.

Initially I was converting to string, hence thinking it could be an issue with character set. Sorry I can't rename the thread now.

Thank you for your help by the way it is much much appreciated!!!!

stefan_grube
Active Contributor
0 Kudos

You should set up a file-to-file scenario.

Then check the target file with a hex editor to see if it is valid.

Could you post the final version of your code?

Former Member
0 Kudos

Hi Stefan,

I'll look into that.

Here is the final code. Any other ideas?

import java.io.InputStream;

import com.sap.aii.mapping.api.AbstractTransformation;
import com.sap.aii.mapping.api.StreamTransformationException;
import com.sap.aii.mapping.api.TransformationInput;
import com.sap.aii.mapping.api.TransformationOutput;

import sun.misc.BASE64Decoder;

public class RemoveXML extends AbstractTransformation {

	public void transform(TransformationInput arg0, TransformationOutput arg1)
			throws StreamTransformationException {

		String inputPayload = convertInputStreamToString(arg0.getInputPayload()
				.getInputStream());

		String ImageStartTag = "<Image>";
		String ImageEndTag = "</Image>";

		// Cut out only the base64 text between the start and end tags.
		String base64String = inputPayload.substring(
				inputPayload.indexOf(ImageStartTag) + ImageStartTag.length(),
				inputPayload.lastIndexOf(ImageEndTag));
		try {
			BASE64Decoder decoder = new BASE64Decoder();
			byte b[] = decoder.decodeBuffer(base64String);
			// Write the decoded bytes directly, without converting to String.
			arg1.getOutputPayload().getOutputStream().write(b);
		} catch (Exception e) {
			System.out.println(e);
		}
	}

	public String convertInputStreamToString(InputStream in) {
		StringBuffer sb = new StringBuffer();
		try {
			// Read the raw bytes one at a time. The base64 text is plain
			// ASCII, so casting each byte to char is safe here.
			int ch;
			while ((ch = in.read()) > -1) {
				sb.append((char) ch);
			}
			in.close();
		} catch (Exception exception) {
			// Reading failed; whatever was read so far is returned.
		}
		return sb.toString();
	}
}

stefan_grube
Active Contributor
0 Kudos

This code seems correct.

So check the result of the mapping with a hex editor. A text editor will not help you.
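
The hex-editor check can also be sketched in code. The snippet below is only an illustration: the class `PdfHeaderCheck` and its methods are hypothetical names, not part of the mapping API. It prints the first bytes as hex and tests whether the data begins with the ASCII signature `%PDF` (hex 25 50 44 46), which is how a PDF file normally starts. The garbled sample posted in this thread has extra bytes in front of `%PDF-1.2`, which this check would flag.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the SAP mapping API.
final class PdfHeaderCheck {

    // Render the first `limit` bytes as space-separated hex, like one hex editor row.
    static String toHex(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, data.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    // A PDF file normally begins with the ASCII signature "%PDF".
    static boolean startsWithPdfSignature(byte[] data) {
        byte[] sig = "%PDF".getBytes(StandardCharsets.US_ASCII);
        if (data.length < sig.length) {
            return false;
        }
        for (int i = 0; i < sig.length; i++) {
            if (data[i] != sig[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.2\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toHex(sample, 8));
        System.out.println(startsWithPdfSignature(sample));
    }
}
```

In a real test, `sample` would be the bytes of the target file written by the file-to-file scenario.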

Former Member
0 Kudos

Hi Stefan,

The output is from the operation mapping in the test tab, although I also saw the same thing yesterday at runtime from the HTTP post, using the tcpgateway as described in your blog:

/people/stefan.grube/blog/2007/03/29/troubleshooting-soap-http-and-mail-adapter-scenarios-with-tcpgateway

Yes, the source XML is identical. I literally cut and paste from an XML file into the text tab of the operation mapping. Then I test the actual file using the NWDS.

This is a nightmare

In any case thanks for your help thus far!

Regarding the code page issue... maybe the thread is named incorrectly... sorry about this, when I first posted I was converting back to strings etc...

Edited by: SeekingAnswers on Jun 8, 2010 4:46 PM

Edited by: SeekingAnswers on Jun 8, 2010 4:48 PM

Former Member
0 Kudos

Hi Stefan,

Thanks again for your input. Unfortunately it again makes no difference to the output, although the code is a lot more streamlined.

I still get output like the sample below. I now believe this is an issue with the JVM settings and not the code: I guess you have tested it too, as I have in NWDS, and there the output is perfect. It only falls over when it runs in PI.

If you have any other ideas please let me know; otherwise thank you for your help, and I will keep you updated on the progress with the JVM.

uFFFDuFFFDuFFFD  uFFFDuFFFDA  uFFFDu0138u02004(uFFFDuFFFDuFFFD?L4(uFFFD4(uFFFDuFFFDuFFFDuFFFDuFFFDuFFFD4(uFFFDuFFFD4(uFFFD1uFFFDuFFFDuFFFDu0460uFFFDuFFFDuFFFDuFFFDuFFFDuFFFDH4(uFFFD uFFFDuFFFDu0455uFFFDuFFFD uFFFDfuFFFD FTFV6uFFFDFR

stefan_grube
Active Contributor
0 Kudos

> uFFFDuFFFDuFFFD  uFFFDuFFFDA  uFFFDu0138u02004(uFFFDuFFFDuFFFD?L4(uFFFD4(uFFFDuFFFDuFFFDuFFFDuFFFDuFFFD4(uFFFDuFFFD4(uFFFD1uFFFDuFFFDuFFFDu0460uFFFDuFFFDuFFFDuFFFDuFFFDuFFFDH4(uFFFD uFFFDuFFFDu0455uFFFDuFFFD uFFFDfuFFFD FTFV6uFFFDFR

Where do you get this from?

stefan_grube
Active Contributor
0 Kudos

Make sure that the string that you use for base64 decoding is accurate. If the string is just one character too long or too short, the result of the decoding is totally different.
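
To illustrate how sensitive base64 is to a single missing character, here is a small sketch. It is not the mapping code from this thread: it uses `java.util.Base64` (an assumption, available from Java 8) instead of `sun.misc.BASE64Decoder`, and the class name `Base64OffByOne` is made up for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Illustration only; uses java.util.Base64 (Java 8+), not sun.misc.BASE64Decoder.
final class Base64OffByOne {

    static byte[] decode(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    public static void main(String[] args) {
        byte[] original = "%PDF-1.2\n".getBytes(StandardCharsets.US_ASCII);
        String exact = Base64.getEncoder().encodeToString(original);

        // Exact string: decoding gives back the original bytes.
        System.out.println(Arrays.equals(original, decode(exact)));

        // One character missing at the front: every following 6-bit group
        // shifts, so the decoded bytes no longer match the original.
        String shifted = exact.substring(1);
        System.out.println(Arrays.equals(original, decode(shifted)));
    }
}
```

The second comparison is expected to fail, because none of the decoded bytes line up with the original any more. A stray character that is not part of the base64 alphabet, such as a leftover `>` from a tag, makes this particular decoder throw an `IllegalArgumentException` instead.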

Former Member
0 Kudos

Hi Stefan,

I get the output from the operation mapping in PI. I know that the base64 string is correct because when I execute the mapping in the NWDS for test purposes I get the desired output.

We tried changing the file.encoding parameter but it had little effect... back to square one....

stefan_grube
Active Contributor
0 Kudos

> I get the output from the operation mapping in PI.

From test mode? runtime? from file?

Could you simply explain what you do?

> I know that the base64 string is correct because when I execute the mapping in the NWDS for test purposes I get the desired output.

Are you sure that your source XML is identical?

How do you view the output?

> We tried changing the file.encoding parameter but it had little effect... back to square one....

Of course not. Let me tell you again: you cannot have a codepage issue for a binary.

Edited by: Stefan Grube on Jun 8, 2010 4:17 PM

Former Member
0 Kudos

One more thing: I found someone else having the same issue a while back on this forum, and they mentioned doing the following:

Under certain operating system platforms, such as Solaris, the APIs used by the Java Runtime (JRE) are not Unicode-aware. Consequently, the JRE needs to be configured to correctly interpret the character set it receives from the operating system.

This is configured through the "file.encoding" system property as well as the "LANG" environment variable.

Make sure you set "file.encoding" to a character set (such as ISO-8859-1) that supports the special characters you would like to process. This system property can be configured by appending "-Dfile.encoding=<encoding>" to the Java VM parameters section of the SAP J2EE Config Tool.

Additionally, you need to set the "LANG" environment variable to a locale that supports more than 7 bits, such as "de.ISO8859-1". The encoding you specify in the LANG environment variable needs to match the encoding set via "file.encoding".

You can persistently configure the environment variable by setting it in the profile $HOME/.sapenv_$HOSTNAME.csh of the <sid>adm user: setenv LANG de.ISO8859-1.

What do you think?

stefan_grube
Active Contributor
0 Kudos

> What do you think?

PI works with UTF-8 which is Java standard.

Maybe you could change your Eclipse environment to UTF-8 as well?

Former Member
0 Kudos

Hi Stefan,

Thanks again for your help. It is really appreciated!!!!

I implemented the changes as can be seen below. However I am still getting the issue whereby in the interface mapping the output is completely not as expected.

I get the following error when trying to display the tree view which is expected

Unable to display tree view; Error when loading XML document (Invalid byte 1 of 1-byte UTF-8 sequence.)

and then when I go to the source view I can see that all of the characters are not as expected.

	public void transform(TransformationInput arg0, TransformationOutput arg1)
			throws StreamTransformationException {

		String inputPayload = convertInputStreamToString(arg0.getInputPayload()
				.getInputStream());

		String RootStartTag = "<ns2:MT_Source xmlns:ns2=\"urn:com:any:item\">";
		String RootEndTag = "</ns2:MT_Source>";
		String ImageStartTag = "<Item>";
		String ImageEndTag = "</Item>";
		String XmlHead = "<\\?xml version=\"1.0\" encoding=\"utf-8\" \\?>";

		inputPayload = inputPayload.replaceAll(RootStartTag, "");
		inputPayload = inputPayload.replaceAll(RootEndTag, "");
		inputPayload = inputPayload.replaceAll(ImageStartTag, "");
		inputPayload = inputPayload.replaceAll(ImageEndTag, "");
		inputPayload = inputPayload.replaceAll(XmlHead, "");

		try {
			BASE64Decoder decoder = new BASE64Decoder();

			byte b[] = decoder.decodeBuffer(inputPayload);
			arg1.getOutputPayload().getOutputStream().write(b);
		} catch (Exception e) {
			System.out.println(e);
		}
	}

	public String convertInputStreamToString(InputStream in) {
		StringBuffer sb = new StringBuffer();
		try {
			InputStreamReader isr = new InputStreamReader(in);
			Reader reader = new BufferedReader(isr);
			int ch;
			while ((ch = in.read()) > -1) {
				sb.append((char) ch);
			}
			reader.close();
		} catch (Exception exception) {
		}
		return sb.toString();
	}
}

Edited by: SeekingAnswers on Jun 8, 2010 10:38 AM

stefan_grube
Active Contributor
0 Kudos

> I get the following error when trying to display the tree view which is expected

> Unable to display tree view; Error when loading XML document (Invalid byte 1 of 1-byte UTF-8 sequence.)

Of course you get this. You have a binary; you cannot view it as XML.

> and then when I go to the source view I can see that all of the characters are not as expected.

Do you use Notepad? It does not make sense to view a binary with Notepad.

Save the file with suffix .pdf and open it with PDF reader. Is this valid or not?

stefan_grube
Active Contributor
0 Kudos

String inputPayload = convertInputStreamToString(arg0.getInputPayload()
		.getInputStream());

String ImageStartTag = "<Item>";
String ImageEndTag = "</Item>";

String base64String = inputPayload.substring(
		inputPayload.indexOf(ImageStartTag) + ImageStartTag.length(),
		inputPayload.lastIndexOf(ImageEndTag));
try {
	BASE64Decoder decoder = new BASE64Decoder();

	byte b[] = decoder.decodeBuffer(base64String);
	arg1.getOutputPayload().getOutputStream().write(b);
} catch (Exception e) {
	System.out.println(e);
}

Former Member
0 Kudos

Hi Stefan,

Thanks for your response.

I have implemented the decode in the java mapping but still the same result.

The characters in the output look like the following:

��f��ޮȨ����O�zw(v)����?��������� 0?%PDF-1.2

%����

9 0 obj

<<

/Length 10 0 R

/Filter /FlateDecode

>>

stream

H�͐�J�0 ��

When they should be more like

%PDF-1.2

%âãÏÓ

9 0 obj

<<

/Length 10 0 R

/Filter /FlateDecode

>>

stream

H?Í?ÑJÃ0 ?? ïð{§²fç$M?

Here you can see a sample of the code

public void transform(TransformationInput arg0, TransformationOutput arg1)
			throws StreamTransformationException {

		String inputPayload = convertInputStreamToString(arg0.getInputPayload()
				.getInputStream());
		String outputPayload = "";

		String RootStartTag = "<ns2:MT_Source xmlns:ns2=\"urn:com:any:item\">";
		String RootEndTag = "</ns2:MT_Source>";
		String ImageStartTag = "<item>";
		String ImageEndTag = "</item>";
		String XmlHead = "<\\?xml version=\"1.0\" encoding=\"utf-8\" \\?>";
		
		inputPayload = inputPayload.replaceAll(RootStartTag, "");
		inputPayload = inputPayload.replaceAll(RootEndTag, "");
		inputPayload = inputPayload.replaceAll(ImageStartTag, "");
		inputPayload = inputPayload.replaceAll(ImageEndTag, "");
		inputPayload = inputPayload.replaceAll(XmlHead, "");
		
		String decodedValue = "";

		try{

		BASE64Decoder decoder = new BASE64Decoder();

		byte b[] = decoder.decodeBuffer(inputPayload);
		decodedValue = new String(b);

		}
		catch(Exception e) {
			System.out.println(e);
		}
		

		outputPayload = decodedValue;
		

		try {

			/*
			 * Output payload is returned using the TransformationOutput class
			 * arg1.getOutputPayload().getOutputStream()
			 */

			arg1.getOutputPayload().getOutputStream().write(
					outputPayload.getBytes("iso-8859-1"));
		} catch (Exception exception1) {
		}
	}

	public String convertInputStreamToString(InputStream in) {
		StringBuffer sb = new StringBuffer();
		try {
			InputStreamReader isr = new InputStreamReader(in);
			Reader reader = new BufferedReader(isr);
			int ch;
			while ((ch = in.read()) > -1) {
				sb.append((char) ch);
			}
			reader.close();
		} catch (Exception exception) {
		}
		return sb.toString();
	}
}

Edited by: SeekingAnswers on Jun 8, 2010 9:54 AM

stefan_grube
Active Contributor
0 Kudos

This is the issue:

decodedValue = new String(b);

outputPayload = decodedValue;

outputPayload.getBytes("iso-8859-1")

Here you have a conversion:

byte[] -> UTF-8 -> String -> iso-8859-1 -> byte[]

As I said before, use only byte[], not String:

arg1.getOutputPayload().getOutputStream().write(b);
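
A small sketch of why that conversion chain damages a binary. It assumes the platform default charset is UTF-8, as in the chain above, and makes that explicit with `StandardCharsets.UTF_8`. The class name `StringRoundTrip` is hypothetical. The sample bytes are the `%âãÏÓ` line of the expected PDF output (E2 E3 CF D3 in ISO-8859-1).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only; not part of the mapping code.
final class StringRoundTrip {

    // Mimics: byte[] -> String (decoded as UTF-8) -> byte[] (encoded as ISO-8859-1).
    static byte[] viaString(byte[] binary) {
        String text = new String(binary, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // '%' followed by the bytes E2 E3 CF D3, which are not valid UTF-8.
        byte[] binary = { '%', (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3 };
        byte[] damaged = viaString(binary);

        // The invalid sequences were replaced while decoding, so the
        // round trip does not give back the original bytes.
        System.out.println(Arrays.equals(binary, damaged));
    }
}
```

Pure ASCII data survives this round trip, which is why the `%PDF-1.2` and `obj` lines in the output look fine while the binary parts do not. On a JVM whose default charset maps every byte to a character, the same code can appear to work, which would be consistent with the different results in Eclipse and in PI.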

Edited by: Stefan Grube on Jun 8, 2010 10:03 AM

stefan_grube
Active Contributor
0 Kudos

The characters on the out put are like the following:

��f��ޮȨ����O�zw(v)����?��������� 0?%PDF-1.2

%����

When they should be more like

%PDF-1.2

%âãÏÓ

It seems your replace statements do not delete everything in the XML, so the base64 decoder gets more input than necessary.

You should work with substring to get your base64: search for ImageStartTag and ImageEndTag.
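
A minimal sketch of the substring approach. The class `Base64Extractor`, the method `between` and the sample payload are made up for illustration, and the `trim()` call is an extra assumption to drop line breaks around the base64 text.

```java
// Illustration only; names and sample payload are hypothetical.
final class Base64Extractor {

    // Returns the text between the first startTag and the last endTag.
    static String between(String payload, String startTag, String endTag) {
        int start = payload.indexOf(startTag);
        int end = payload.lastIndexOf(endTag);
        if (start < 0 || end < 0 || end < start + startTag.length()) {
            throw new IllegalArgumentException("tags not found in payload");
        }
        return payload.substring(start + startTag.length(), end).trim();
    }

    public static void main(String[] args) {
        String payload = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<ns2:MT_Source xmlns:ns2=\"urn:com:any:item\">"
                + "<Item>JVBERi0xLjIK</Item></ns2:MT_Source>";
        System.out.println(between(payload, "<Item>", "</Item>"));
    }
}
```

The XML header in this sample is spelled slightly differently from the `replaceAll` pattern used earlier (upper-case `UTF-8`, no space before `?>`), so that pattern would leave it in the string. The substring approach ignores everything outside the two tags, whatever it looks like.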

Former Member
0 Kudos

Please also bear in mind that the output file gives the correct output, as shown above, when tested in Eclipse, and even the UDF gave the correct output characters. It seems to be an issue with the output of the Java mapping when executed from within the Operation Mapping.

Edited by: SeekingAnswers on Jun 8, 2010 9:53 AM

Former Member
0 Kudos

Hi, sorry, I will be a little clearer.

I have a PDF message which I have decoded from a Base64 string using a UDF in a graphical mapping. That is fine; all characters seem to be there. The problem is that after this I am using a Java mapping to remove the XML tags from the message so that I can send it as plain text. In the output of the Java mapping all of my characters are transformed into something different. For example, umlauts or similar characters appear as question marks or boxes.

So basically I am trying to work out how to ensure that the output of my Java mapping keeps the same character set as it has going in, which at the moment it doesn't.

Further to this when I test the java mapping using eclipse the output is as expected, which leads me to believe that it is not the code but more the way PI handles message mappings.

Does that make a little more sense?

stefan_grube
Active Contributor
0 Kudos

> I have a pdf message which I have decoded from a Base64 string using a udf in a graphical mapping.

A PDF is a binary, which should not be decoded in a UDF.

You could do the following:

Use a Java mapping to decode the base64 PDF. The output is plain PDF binary.

>So basically I am trying to workout how to ensure that the output of my java mapping keeps the same character set as it has going into it, which at the moment it isn't.

When you work in your Java mapping only with byte[] and not with String, then the output is the same as the input.

But when you transform the InputStream to String, then you really cannot know what happens, when you have a binary as input.
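
As a rough sketch of what "only byte[]" means in practice, the helper below copies a stream without ever creating a String or a Reader. `ByteCopy` is a hypothetical name. In a mapping, the two streams would presumably be `arg0.getInputPayload().getInputStream()` and `arg1.getOutputPayload().getOutputStream()`, the same calls used in the code posted in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Illustration only; not part of the SAP mapping API.
final class ByteCopy {

    // Copies the stream in chunks; no charset is involved at any point.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        // Bytes that are not valid UTF-8 pass through unchanged.
        byte[] input = { '%', (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3 };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copy(new ByteArrayInputStream(input), out);
        System.out.println(Arrays.equals(input, out.toByteArray()));
    }
}
```

Because no charset is applied, the JVM's `file.encoding` setting has no influence on the result.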

stefan_grube
Active Contributor
0 Kudos

I really have no idea how you remove XML tags from a PDF.

And what special characters do you mean?

Could you explain this in a little more detail?