To parse XML documents, use the XML PARSE statement, specifying the XML document that is to be parsed and the procedure for handling XML events that occur during parsing. You can optionally specify what action is to be taken after parsing finishes.
    XML PARSE XMLDOCUMENT
        PROCESSING PROCEDURE XMLEVENT-HANDLER
      ON EXCEPTION
        DISPLAY 'XML document error ' XML-CODE
        STOP RUN
      NOT ON EXCEPTION
        DISPLAY 'XML document was successfully parsed.'
    END-XML
In the XML PARSE statement you first identify the data item (XMLDOCUMENT in the example above) that contains the XML document character stream. In the DATA DIVISION, you can declare the identifier as national (either a national group item or an elementary item of category national) or as alphanumeric (either an alphanumeric group item or an elementary item of category alphanumeric). If it is national, its content must be encoded in Unicode UTF-16LE, CCSID 1202. If it is alphanumeric, its content must be encoded with one of the supported single-byte EBCDIC or ASCII character sets. See the related reference below about coded character sets for more information.
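The two kinds of declaration described above might look like the following minimal sketch. The program name XMLDECL, the item name XMLDOCUMENT-N, and the 1024-character sizes are assumptions chosen for illustration; they are not taken from the product documentation.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLDECL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Elementary item of category alphanumeric; its content would
      * use a supported single-byte EBCDIC or ASCII character set.
      * The size 1024 is an arbitrary illustrative choice.
       01  XMLDOCUMENT      PIC X(1024).
      * Elementary item of category national; its content would be
      * encoded in UTF-16LE (CCSID 1202). The name is hypothetical.
       01  XMLDOCUMENT-N    PIC N(1024) USAGE NATIONAL.
       PROCEDURE DIVISION.
      * Show the declared size of each item in character positions.
           DISPLAY 'Alphanumeric character positions: '
               FUNCTION LENGTH(XMLDOCUMENT)
           DISPLAY 'National character positions: '
               FUNCTION LENGTH(XMLDOCUMENT-N)
           GOBACK.
```

Either item could then be named as the identifier in an XML PARSE statement, provided that its content uses the encoding required for its category.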
If the CHAR(EBCDIC) compiler option is in effect, do not specify the NATIVE keyword on the data description entry for the identifier if the entry describes an alphanumeric data item. If the CHAR(EBCDIC) option is in effect and the identifier is alphanumeric, the content of the identifier must be encoded in EBCDIC.
ASCII XML documents that do not contain an encoding declaration are parsed with the code page indicated by the current runtime locale. EBCDIC XML documents that do not contain an encoding declaration are parsed with the code page specified in the EBCDIC_CODEPAGE environment variable. If the EBCDIC_CODEPAGE environment variable is not set, they are parsed with the default EBCDIC code page selected for the current runtime locale.
XML declaration: If the document that you are parsing contains an XML declaration, the declaration must begin in the first byte of the document. If the string <?xml starts after the first byte of the document, the parser generates an exception code. The attribute names that are coded in the XML declaration must all be in lowercase characters.
Next you specify the name of the procedure (XMLEVENT-HANDLER in the example) that is to handle the XML events from the document.
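In addition, you can specify either or both of the following phrases to receive control after parsing finishes:
ON EXCEPTION, to receive control when an unhandled exception occurs during parsing
NOT ON EXCEPTION, to receive control otherwise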
You can end the XML PARSE statement with the explicit scope terminator END-XML. Use END-XML to nest an XML PARSE statement that uses the ON EXCEPTION or NOT ON EXCEPTION phrase in a conditional statement (for example, in another XML PARSE statement or in an XML GENERATE statement).
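As a sketch of why the explicit scope terminator matters, the following program places an XML PARSE statement that has both phrases inside an IF statement. The program name XMLNEST, the sample document, and the PARSE-SWITCH item are hypothetical, and the processing procedure is deliberately empty.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document, sized to the 8-byte literal.
       01  XMLDOCUMENT     PIC X(8) VALUE '<a>b</a>'.
      * Hypothetical switch that controls whether parsing is done.
       01  PARSE-SWITCH    PIC X VALUE 'Y'.
           88  PARSE-WANTED      VALUE 'Y'.
       PROCEDURE DIVISION.
       MAINLINE.
           IF PARSE-WANTED
               XML PARSE XMLDOCUMENT
                   PROCESSING PROCEDURE XMLEVENT-HANDLER
                 ON EXCEPTION
                   DISPLAY 'XML document error ' XML-CODE
                 NOT ON EXCEPTION
                   DISPLAY 'XML document was successfully parsed.'
               END-XML
      * END-XML closes the XML PARSE statement, so this DISPLAY
      * belongs to the IF, not to the NOT ON EXCEPTION phrase.
               DISPLAY 'Parse attempt finished.'
           END-IF
           GOBACK.
      * Processing procedure that ignores every event.
       XMLEVENT-HANDLER.
           CONTINUE.
```

Without END-XML, the final DISPLAY would be treated as part of the NOT ON EXCEPTION phrase and would run only when parsing succeeded.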
The parser passes control to the processing procedure for each XML event. Control returns to the parser when the end of the processing procedure is reached. This exchange of control between the XML parser and the processing procedure continues until one of the following events occurs:
Special registers: Use the XML-EVENT special register to determine which event the parser passes to your processing procedure. XML-EVENT contains an event name such as 'START-OF-ELEMENT'. The parser passes the content for the event in special register XML-TEXT or XML-NTEXT, depending on the type of the XML identifier specified in the XML PARSE statement.
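A processing procedure that inspects these special registers might be sketched as follows. The program name XMLDEMO and the sample document are assumptions for illustration. Of the event names tested, only 'START-OF-ELEMENT' appears in this topic; 'CONTENT-CHARACTERS' and 'END-OF-ELEMENT' are taken to be among the values listed in the related reference about the content of XML-EVENT. Because the document is alphanumeric, the sketch reads XML-TEXT rather than XML-NTEXT.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document, sized to the 44-byte literal.
      * The XML declaration begins in the first byte.
       01  XMLDOCUMENT     PIC X(44) VALUE
           '<?xml version="1.0"?><msg id="1">Hello</msg>'.
       PROCEDURE DIVISION.
       MAINLINE.
           XML PARSE XMLDOCUMENT
               PROCESSING PROCEDURE XMLEVENT-HANDLER
             ON EXCEPTION
               DISPLAY 'XML document error ' XML-CODE
             NOT ON EXCEPTION
               DISPLAY 'XML document was successfully parsed.'
           END-XML
           GOBACK.
      * The parser passes control here once for each XML event and
      * regains control when the end of the paragraph is reached.
       XMLEVENT-HANDLER.
           EVALUATE XML-EVENT
             WHEN 'START-OF-ELEMENT'
               DISPLAY 'Start of element: ' XML-TEXT
             WHEN 'CONTENT-CHARACTERS'
               DISPLAY 'Content:          ' XML-TEXT
             WHEN 'END-OF-ELEMENT'
               DISPLAY 'End of element:   ' XML-TEXT
             WHEN OTHER
      * Events not handled above are listed by name only.
               DISPLAY 'Other event:      ' XML-EVENT
           END-EVALUATE.
```

The WHEN OTHER branch is a design convenience: it shows every event that the parser signals, which can help when deciding which events a real processing procedure needs to handle.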
related tasks
Understanding the encoding of XML documents
Writing procedures to process XML
Specifying the code page with a locale
related references
Locales and code pages that are supported
Coded character sets for XML documents
The content of XML-EVENT
XML PARSE statement (COBOL for Windows Language Reference)