To parse an XML document with the XML PARSE statement, the document must be
encoded using a supported encoding. The supported encodings for a particular parse
depend on:
- The category of the data item that contains the XML document
- The setting of the XMLPARSE compiler option
- The optional phrases that are specified on the XML PARSE statement.
For XML documents that are contained in a national data item, the supported code page is
Unicode UTF-16BE (big-endian), CCSID 1200.
For XML documents that are contained in an alphanumeric data item, the supported code pages
when the XMLPARSE(XMLSS) compiler option is in effect are:
- If the RETURNING NATIONAL phrase is specified on the XML PARSE statement: Unicode UTF-8
or any EBCDIC or ASCII code page that is supported by z/OS Unicode Services for
conversion to Unicode UTF-16.
- If the RETURNING NATIONAL phrase is not specified: Unicode UTF-8 or any of
the single-byte EBCDIC code pages listed in the related reference about
Coded character sets for XML documents.
For XML documents that are contained in an alphanumeric data item, the supported code pages
when the XMLPARSE(COMPAT) compiler option is in effect are specified in the related reference
about Coded character sets for XML documents.
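The rules above amount to a small decision table. The following Python sketch restates them for quick reference; it is not part of the compiler or runtime, and the function name, parameter names, and returned descriptions are hypothetical labels chosen for illustration.

```python
def supported_encodings(category: str, xmlparse: str,
                        returning_national: bool = False) -> str:
    """Summarize the encodings that the rules above allow for one XML PARSE.

    category: "national" or "alphanumeric" (category of the data item).
    xmlparse: "XMLSS" or "COMPAT" (setting of the XMLPARSE compiler option).
    returning_national: whether RETURNING NATIONAL is specified.
    """
    if category == "national":
        # National data items: UTF-16BE only, as stated above.
        return "UTF-16BE (CCSID 1200)"
    if category != "alphanumeric":
        raise ValueError(f"unexpected category: {category!r}")
    if xmlparse == "COMPAT":
        return "code pages listed in 'Coded character sets for XML documents'"
    if xmlparse != "XMLSS":
        raise ValueError(f"unexpected XMLPARSE setting: {xmlparse!r}")
    if returning_national:
        return ("UTF-8, or any EBCDIC or ASCII code page that z/OS Unicode "
                "Services can convert to UTF-16")
    return ("UTF-8, or a single-byte EBCDIC code page listed in "
            "'Coded character sets for XML documents'")
```

For example, `supported_encodings("alphanumeric", "XMLSS", returning_national=True)` returns the broader set that includes ASCII code pages.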
Determining the encoding of an input XML document
The parser must know the encoding for an XML document in order to process it correctly.
If the specified encoding is not one of the supported coded character sets, the parser signals
an XML exception event before beginning the parse operation. If the actual document encoding
does not match the specified encoding, the parser signals an appropriate XML exception after
beginning the parse operation.
Several sources of encoding information are used in determining the encoding of an XML document:
- When the XMLPARSE(XMLSS) option is in effect:
  - The data type of the data item that contains the XML document
  - The optional ENCODING phrase of the XML PARSE statement
  - The CCSID specified by the CODEPAGE compiler option
- When the XMLPARSE(COMPAT) option is in effect:
  - The data type of the data item that contains the XML document
  - The encoding declaration specified within the XML document
  - The CCSID specified by the CODEPAGE compiler option
  - The actual encoding of the XML document, determined by examining the first few bytes of
    the document
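To illustrate what examining the first few bytes of a document can reveal, the following Python sketch applies the byte patterns described in the autodetection appendix of the XML 1.0 specification. It is an illustration only: the text above does not say which checks the COBOL parser performs, and the function name and the returned labels are hypothetical.

```python
def sniff_encoding_family(document: bytes) -> str:
    """Guess the encoding family of an XML document from its leading bytes."""
    # Byte order mark, or "<?" encoded as 16-bit big-endian code units.
    if document.startswith(b"\xfe\xff") or document.startswith(b"\x00<\x00?"):
        return "UTF-16BE"
    # Byte order mark, or "<?" encoded as 16-bit little-endian code units.
    # (UTF-32 is not distinguished in this simplified sketch.)
    if document.startswith(b"\xff\xfe") or document.startswith(b"<\x00?\x00"):
        return "UTF-16LE"
    # UTF-8 byte order mark.
    if document.startswith(b"\xef\xbb\xbf"):
        return "UTF-8"
    # "<?xm" in EBCDIC: X'4C6FA794'.
    if document.startswith(b"\x4c\x6f\xa7\x94"):
        return "EBCDIC"
    # "<?xm" in ASCII, UTF-8 without a byte order mark, or a similar encoding.
    if document.startswith(b"<?xm"):
        return "ASCII-compatible"
    return "unknown"
```

A result such as "EBCDIC" narrows the encoding only to a family. Identifying the exact code page still depends on the other sources listed above, such as the encoding declaration or the CODEPAGE compiler option.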
When the XMLPARSE(XMLSS) option is in effect:
- Any encoding declaration specified within the XML document is ignored.
- For XML documents that are contained in a national data item, the
ENCODING phrase of the XML PARSE statement must be omitted or must specify
CCSID 1200. The CCSID specified by the CODEPAGE compiler option is ignored. The
parser signals an XML exception event if the actual document encoding is not Unicode UTF-16BE.
- For XML documents that are contained in an alphanumeric data item, the CCSID
specified in the ENCODING phrase of the XML PARSE statement overrides the
CODEPAGE compiler option.
- If the XML PARSE statement includes an ENCODING phrase, the specified CCSID
overrides the CCSID specified in the CODEPAGE compiler option. The parser raises an
XML exception event at the beginning of the parse if the actual document encoding is
not consistent with the specified CCSID.
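The XMLPARSE(XMLSS) precedence rules above can be sketched as a short function. This Python model is a reading of the rules, not the compiler's implementation: the names are hypothetical, the fallback to the CODEPAGE CCSID when no ENCODING phrase is present is inferred from the word "overrides", and the `ValueError` only stands in for whatever diagnostic the compiler or parser actually produces.

```python
from typing import Optional

UTF16_CCSID = 1200  # Unicode UTF-16BE


def effective_ccsid(category: str, codepage_ccsid: int,
                    encoding_phrase: Optional[int] = None) -> int:
    """Return the CCSID assumed for the document under XMLPARSE(XMLSS).

    category: "national" or "alphanumeric" (category of the data item).
    codepage_ccsid: CCSID from the CODEPAGE compiler option.
    encoding_phrase: CCSID from the ENCODING phrase, or None if omitted.
    """
    if category == "national":
        # ENCODING must be omitted or specify 1200; CODEPAGE is ignored.
        if encoding_phrase not in (None, UTF16_CCSID):
            raise ValueError("ENCODING must be omitted or specify CCSID 1200")
        return UTF16_CCSID
    if category == "alphanumeric":
        # The ENCODING phrase, when present, overrides CODEPAGE.
        if encoding_phrase is not None:
            return encoding_phrase
        return codepage_ccsid
    raise ValueError(f"unexpected category: {category!r}")
```

With CODEPAGE(1140) in effect, `effective_ccsid("alphanumeric", 1140, 1208)` yields 1208, while `effective_ccsid("national", 1140)` yields 1200. Any encoding declaration inside the document plays no part, which matches the first rule in the list.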