Rational Developer for System z
COBOL for Windows, Version 7.5, Programming Guide


Understanding the encoding of XML documents

The XML parser determines how to process a document from several sources of information about the document's encoding. These sources must be consistent with one another; if the parser finds a conflict, it signals an XML exception event.

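The sources of encoding information are the basic document encoding, the type of the data item that contains the document, the document encoding declaration, and the external code pages.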

The basic document encoding is the encoding category that the parser determines by examining the first few bytes of the XML document. The categories discussed in this topic are ASCII, EBCDIC, and UTF-16.
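The idea of classifying a document from its first few bytes can be sketched as follows. This is only an illustration in Python, not the COBOL parser's actual algorithm: the function name basic_encoding is hypothetical, and the byte patterns are assumptions based on the common convention that an XML document starts with a byte order mark or with the "<" character (X'3C' in ASCII, X'4C' in many EBCDIC code pages such as 037).

```python
def basic_encoding(doc: bytes) -> str:
    """Guess the basic encoding category from the leading bytes (illustrative only)."""
    head = doc[:2]
    # UTF-16 with a byte order mark, little-endian or big-endian.
    if head in (b"\xff\xfe", b"\xfe\xff"):
        return "UTF-16"
    # UTF-16 without a byte order mark: "<" padded with a zero byte.
    if head in (b"<\x00", b"\x00<"):
        return "UTF-16"
    # A single-byte "<" in ASCII is X'3C'.
    if head[:1] == b"\x3c":
        return "ASCII"
    # Assumption: "<" is X'4C' in the EBCDIC code page in use.
    if head[:1] == b"\x4c":
        return "EBCDIC"
    return "unrecognized"
```

For example, the same XML declaration encoded with Python's cp037 codec is classified as EBCDIC, while its plain ASCII form is classified as ASCII.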

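The type of the data item that contains the document is also relevant, because the parser supports only certain combinations of data-item type and document encoding. A UTF-16 document in a national data item is one such combination.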

The discovery that the encoding is UTF-16 also provides the code-page information, CCSID 1202 UTF-16LE (little-endian), since Unicode is effectively a single large code page. Thus, if the parser finds a UTF-16 document in a national data item, it ignores external code-page information.

If the basic document encoding is ASCII or EBCDIC, however, the parser needs specific code-page information to parse the document correctly. It acquires this additional information from the document encoding declaration or from the external code pages.
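The decision described above can be sketched in Python as a small function. This is a simplified model under stated assumptions, not the parser's documented behavior: the name resolve_code_page is hypothetical, the rule that a declared encoding takes precedence over the external code page is an assumption, and the returned flag merely marks a disagreement that a real parser might report as an XML exception event.

```python
from typing import Optional, Tuple


def resolve_code_page(basic: str, declared: Optional[str],
                      external: str) -> Tuple[str, bool]:
    """Return (code page, conflict flag) for a document (illustrative only).

    basic    -- basic document encoding: "UTF-16", "ASCII", or "EBCDIC"
    declared -- encoding name from the XML declaration, or None if absent
    external -- external code page for the document's basic encoding
    """
    if basic == "UTF-16":
        # UTF-16 implies CCSID 1202; external information is ignored.
        return "1202", False
    if declared is None:
        # No encoding declaration: fall back to the external code page.
        return external, False
    # Assumption: the declaration wins, and a mismatch is flagged.
    return declared, declared.lower() != external.lower()
```

A caller could treat a True flag as the point at which the sources of encoding information are inconsistent.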

The document encoding declaration is an optional part of the XML declaration at the beginning of the document. (See the related task about specifying the code page for details.)
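To illustrate where the encoding declaration sits, the following Python sketch extracts it from the start of a document that has already been decoded to text. The function name declared_encoding and the regular expression are illustrative assumptions that follow the general shape of an XML declaration (a version, then an optional encoding); they are not taken from the parser.

```python
import re
from typing import Optional

# Matches '<?xml version="..."' followed by an optional 'encoding="..."'.
_XML_DECL = re.compile(
    r"""^<\?xml\s+version\s*=\s*(["'])[^"']*\1"""
    r"""(?:\s+encoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._\-]*)\2)?"""
)


def declared_encoding(text: str) -> Optional[str]:
    """Return the declared encoding name, or None if none is declared."""
    match = _XML_DECL.match(text)
    return match.group(3) if match else None
```

Because the declaration is optional, the function returns None both when the XML declaration omits the encoding and when the document has no XML declaration at all.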

The external code page for ASCII XML documents (the external ASCII code page) is the code page indicated by the current runtime locale. The external code page for EBCDIC XML documents (the external EBCDIC code page) is either:

Finally, the encoding must be one of the supported code pages. (See the related reference below about coded character sets for XML documents for details.)

related tasks
Specifying the code page

related references
Locales and code pages that are supported
Coded character sets for XML documents
XML PARSE exceptions that allow continuation
XML PARSE exceptions that do not allow continuation



Copyright IBM Corporation 1996, 2008.