The parser decides how to process your document by using certain sources of information about the document encoding. These sources of encoding information must be consistent with one another. The parser signals an XML exception event if it finds a conflict.
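The sources of encoding information are the basic document encoding, the type of the data item that contains the document, the document encoding declaration (if any), and the external code page.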
The basic document encoding is one of the following encoding categories that the parser determines by examining the first few bytes of the XML document:
The type of the data item that contains the document is also relevant. The parser supports the following combinations:
The discovery that the encoding is UTF-16 also provides the code-page information, CCSID 1200 (UTF-16BE, big-endian), since Unicode is effectively a single large code page. Thus, if the parser finds a UTF-16 document in a national data item, it ignores the external code-page information.
If the basic document encoding is ASCII or EBCDIC, however, the parser needs specific code-page information to parse the document correctly. It acquires this additional code-page information from the document encoding declaration or from the external code page.
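The decision described above can be sketched in Python. This is only an illustration, not the parser's actual logic: the byte patterns follow the autodetection heuristic in Appendix F of the XML 1.0 specification, the function names `classify_basic_encoding` and `resolve_code_page` are hypothetical, and comparing code-page names as plain strings is a simplifying assumption.

```python
from typing import Optional


def classify_basic_encoding(data: bytes) -> str:
    """Guess the basic encoding category from the first few bytes.

    Heuristic based on XML 1.0 Appendix F; a real parser may differ.
    """
    # UTF-16 big-endian: byte order mark FE FF, or "<?" as 00 3C 00 3F.
    if data.startswith(b"\xfe\xff") or data.startswith(b"\x00<\x00?"):
        return "UTF-16BE"
    # UTF-16 little-endian: byte order mark FF FE, or "<?" as 3C 00 3F 00.
    if data.startswith(b"\xff\xfe") or data.startswith(b"<\x00?\x00"):
        return "UTF-16LE"
    # ASCII family: "<" is 0x3C.
    if data.startswith(b"<"):
        return "ASCII"
    # EBCDIC family: "<" is 0x4C ("<?xm" is 4C 6F A7 94).
    if data.startswith(b"\x4c"):
        return "EBCDIC"
    return "UNKNOWN"


def resolve_code_page(basic: str, declared: Optional[str], external: str) -> str:
    """Pick the code-page information, rejecting inconsistent sources."""
    if basic.startswith("UTF-16"):
        # UTF-16 already implies the code page; external value is ignored.
        return "UTF-16"
    if basic not in ("ASCII", "EBCDIC"):
        raise ValueError("unsupported or unrecognized basic encoding")
    if declared is None:
        return external
    # Assumption: names are compared case-insensitively as plain strings.
    if declared.lower() != external.lower():
        raise ValueError("encoding declaration conflicts with external code page")
    return external
```

For example, `classify_basic_encoding("<?xml version='1.0'?>".encode("cp037"))` classifies the document as EBCDIC, after which `resolve_code_page` falls back to the external code page because no encoding declaration was supplied.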
The document encoding declaration is an optional part of the XML declaration at the beginning of the document. (See the related task about specifying the code page for details.)
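As a rough illustration of where the encoding declaration sits, the following Python sketch pulls the encoding name out of an XML declaration that has already been decoded to text. The helper name `declared_encoding` is hypothetical, the pattern is a simplification of the XML 1.0 grammar, and the encoding name `ibm-1140` in the usage note is only an example value.

```python
import re
from typing import Optional

# Matches an XML declaration at the very start of the text and captures the
# quoted encoding name, which must begin with a letter (XML 1.0 EncName).
_ENCODING_DECL = re.compile(
    r"""^<\?xml\s[^>]*?\sencoding\s*=\s*(["'])([A-Za-z][A-Za-z0-9._-]*)\1"""
)


def declared_encoding(text: str) -> Optional[str]:
    """Return the declared encoding name, or None if none is declared."""
    match = _ENCODING_DECL.match(text)
    return match.group(2) if match else None
```

Calling `declared_encoding('<?xml version="1.0" encoding="ibm-1140"?><a/>')` returns `"ibm-1140"`, while a document with no XML declaration, or a declaration without an encoding, yields `None`.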
The external code page is the value in effect for the CODEPAGE compiler option.
Finally, the encoding must be one of the supported code pages. (See the related reference below about coded character sets for XML documents for details.)
related tasks
Specifying the code page