
UTF-8 Invalid characters on PI


Hi all,

Does anyone have a list of invalid characters in UTF-8?

The reason is that, for example, & and # are both invalid characters in XML terms, so I want to make sure that either the payload does not contain such special characters or the mapping in PI is capable of handling them.

I want to get all the invalid characters.

Please help.

Accepted Solutions (1)


Former Member

"<", ">", "&", "'" and '"' are the most common characters that need escaping.

It might not always be necessary to avoid them, as PI will perform escaping and unescaping when converting between text and XML.
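To illustrate what escaping and unescaping means here, below is a minimal sketch using Python's standard library. It is not how PI does it internally; the helper names `to_xml_text` and `from_xml_text` and the sample string are made up for this example.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, < and > by default; quotes are passed as extra entities.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {entity: char for char, entity in QUOTES.items()}


def to_xml_text(raw: str) -> str:
    """Escape the five XML special characters in plain text (hypothetical helper)."""
    return escape(raw, QUOTES)


def from_xml_text(escaped: str) -> str:
    """Reverse to_xml_text (hypothetical helper)."""
    return unescape(escaped, UNQUOTES)


sample = 'Tom & "Jerry" <tj@example.com>'
escaped = to_xml_text(sample)
print(escaped)
print(from_xml_text(escaped) == sample)
```

The point is that the payload survives a round trip unchanged, so the special characters only cause trouble if one side skips the escaping or unescaping step.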

Are you sending/receiving an XML file from PI to a sender/receiver system which cannot handle escape sequences?

Answers (4)



The XML specification does not use the term "character entity" or "character entity reference". The XML specification defines five "predefined entities" representing special characters, and requires that all XML processors honor them. The entities can be explicitly declared in a DTD, as well, but if this is done, the replacement text must be the same as the built-in definitions. XML also allows other named entities of any size to be defined on a per-document basis.

The table below lists the five XML predefined entities. The "Name" column mentions the entity's name. The "Character" column shows the character, if it is renderable. In order to render the character, the format &name; is used; for example, &amp; renders as &. The "Unicode code point" column cites the character via standard UCS/Unicode "U+" notation, which shows the character's code point in hexadecimal. The decimal equivalent of the code point is then shown in parentheses. The "Standard" column indicates the first version of XML that includes the entity. The "Description" column cites the character via its canonical UCS/Unicode name, in English.

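| Name | Character | Unicode code point (decimal) | Standard | Description |
|------|-----------|------------------------------|----------|-------------|
| quot | " | U+0022 (34) | XML 1.0 | (double) quotation mark |
| amp | & | U+0026 (38) | XML 1.0 | ampersand |
| apos | ' | U+0027 (39) | XML 1.0 | apostrophe (= apostrophe-quote) |
| lt | < | U+003C (60) | XML 1.0 | less-than sign |
| gt | > | U+003E (62) | XML 1.0 | greater-than sign |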
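As a quick way to check the five predefined entities, here is a small sketch that asks Python's standard XML parser to resolve each name and compares the result with the code point listed. The names `PREDEFINED`, `resolve` and `is_predefined` are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# The five predefined entities and their code points, as listed above.
PREDEFINED = {"quot": 0x22, "amp": 0x26, "apos": 0x27, "lt": 0x3C, "gt": 0x3E}


def resolve(name: str) -> str:
    """Let the parser expand &name; inside a tiny document (hypothetical helper)."""
    return ET.fromstring(f"<x>&{name};</x>").text


def is_predefined(name: str) -> bool:
    """True if the parser accepts &name; without any DTD declaration."""
    try:
        resolve(name)
        return True
    except ET.ParseError:
        return False


for name, code_point in PREDEFINED.items():
    print(name, repr(resolve(name)), resolve(name) == chr(code_point))

# An HTML-only entity such as nbsp is not predefined in XML.
print(is_predefined("nbsp"))
```

Any other named entity, such as the HTML ones, has to be declared in a DTD before an XML parser will accept it.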

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references

I will use this... thanks for your help.

Edited by: Israel Toledo on Aug 4, 2011 7:21 PM


The scenario is HTTP -> PI -> IDoc.

I have already seen those links, but I'm looking for something like this:

'&' replace with &amp;

'"' replace with &quot;

''' replace with &#039;

'<' replace with &lt;

'>' replace with &gt;

...

but the complete list.
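A replacement list like the one above could be applied as in the following sketch. The table `REPLACEMENTS` and the function `escape_markup` are hypothetical names, and the table covers only the five characters listed in this thread, not the "complete list" asked about. Using a single-pass translation avoids the classic mistake of replacing & after the other characters, which would escape the ampersands of the entities just inserted.

```python
# Single-pass replacement table for the five characters listed above.
REPLACEMENTS = str.maketrans({
    "&": "&amp;",
    '"': "&quot;",
    "'": "&#039;",
    "<": "&lt;",
    ">": "&gt;",
})


def escape_markup(text: str) -> str:
    """Replace each special character with its entity or character reference."""
    return text.translate(REPLACEMENTS)


print(escape_markup("a<b & 'c'"))
# Input that is already escaped gets escaped again, so apply this only once.
print(escape_markup("&amp;"))
```

Since input that is already escaped would be escaped a second time, the replacement should happen at exactly one point in the HTTP -> PI -> IDoc chain.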

Edited by: Israel Toledo on Aug 4, 2011 6:23 PM

Edited by: Israel Toledo on Aug 4, 2011 6:24 PM

stefan_grube
Active Contributor

Here you will find everything that you need to know:

http://www.w3.org/TR/2006/REC-xml11-20060816/#charsets (allowed Unicode characters)

and

http://www.w3.org/TR/2006/REC-xml11-20060816/#syntax (allowed characters as data)
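To make the "allowed characters" idea concrete, here is a rough sketch that scans a payload for code points outside the `Char` production. Note the assumption: it uses the ranges from XML 1.0 (tab, line feed, carriage return, U+0020 to U+D7FF, U+E000 to U+FFFD, U+10000 to U+10FFFF), which is what most payloads declare; the XML 1.1 document linked above defines its ranges somewhat differently. The function names are made up for this example.

```python
def is_xml10_char(ch: str) -> bool:
    """True if ch matches the XML 1.0 Char production (assumed version)."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_disallowed(text: str) -> list:
    """Return (index, code point) pairs for characters XML 1.0 does not allow."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if not is_xml10_char(ch)
    ]


payload = "ok\x00a\x0bb\tc"
print(find_disallowed(payload))
# & and < are allowed characters; they only need escaping in text content.
print(is_xml10_char("&"), is_xml10_char("<"))
```

This separates two different problems: characters such as & and < are valid but must be escaped, while characters such as U+0000 cannot appear in an XML 1.0 document at all and have to be removed or replaced before mapping.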

baskar_gopalakrishnan2
Active Contributor