The XML PARSE statement is the COBOL language interface to the high-speed XML parser that is part of the COBOL run time. The XML PARSE statement parses an XML document into its individual pieces and passes each piece, one at a time, to a user-written processing procedure.
XML PARSE statements must not be specified in declarative procedures.
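As an illustration of how the parser passes each piece of a document to a processing procedure, the following minimal sketch traces every event. The program name XMLTRACE, the data item XML-DOC, the paragraph TRACE-EVENT, and the sample document are hypothetical and chosen only for this example; fixed-format source is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLTRACE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document; all names are illustrative.
       01  XML-DOC             PIC X(44) VALUE
           '<?xml version="1.0"?><msg><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The parser passes control to TRACE-EVENT for each event.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE TRACE-EVENT
           END-XML
           GOBACK.
       TRACE-EVENT.
      * XML-EVENT names the event; XML-TEXT holds the matching
      * piece of the document.
           DISPLAY XML-EVENT ' [' XML-TEXT ']'.
```

Because XML-DOC is alphanumeric, the events deliver their text in XML-TEXT; a document held in a national data item would deliver it in XML-NTEXT instead.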
If identifier-1 is a national group item, identifier-1 is processed as an elementary data item of category national.
When the CHAR(EBCDIC) compiler option is in effect and identifier-1 is an elementary item of USAGE DISPLAY, the NATIVE keyword must not be specified on the data description entry for identifier-1.
When the CHAR(EBCDIC) compiler option is in effect and identifier-1 is either an alphanumeric group item or an elementary alphanumeric data item, the content of identifier-1 must be encoded in EBCDIC. Other encodings, such as ASCII or packed decimal, might cause errors at run time.
If identifier-1 is alphanumeric and either its data description entry contains the NATIVE phrase or the CHAR(EBCDIC) compiler option is not in effect, the content of identifier-1 must be encoded using one of the ASCII character sets listed in Coded character sets for XML documents in the COBOL for Windows Programming Guide. If the XML document in such a data item does not specify an encoding declaration, the XML document is parsed with the code page indicated by the current runtime locale.
If identifier-1 is alphanumeric and its data description entry does not contain the NATIVE phrase and the CHAR(EBCDIC) compiler option is in effect, the content of identifier-1 must be encoded using one of the single-byte EBCDIC character sets listed in Coded character sets for XML documents in the COBOL for Windows Programming Guide. If the XML document in such a data item does not specify an encoding declaration, the XML document is parsed with the code page specified by the EBCDIC_CODEPAGE environment variable, or if the EBCDIC_CODEPAGE environment variable is not set, the default EBCDIC code page selected for the current runtime locale as described in the COBOL for Windows Programming Guide.
See the COBOL for Windows Programming Guide for more information on setting and using runtime locales and code pages. The single-byte ASCII and EBCDIC code pages are those for which the column labeled Language group (the rightmost column) of the table Locales and code pages supported does not specify “Ideographic languages.”
If identifier-1 is of category national, its content must be encoded using CCSID 1202 (Unicode UTF-16LE). It must not contain any character entities that are represented using multiple encoding units; use a character reference to represent any such character.
The only necessary relationship between procedure-name-1 and procedure-name-2 is that they define a consecutive sequence of operations to execute, beginning at the procedure named by procedure-name-1 and ending with the execution of the procedure named by procedure-name-2.
If there are two or more logical paths to the return point, then procedure-name-2 can name a paragraph that consists of only an EXIT statement; all the paths to the return point must then lead to this paragraph.
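A sketch of the common-exit convention follows, assuming a processing procedure with two logical paths. The names XMLTHRU, XML-DOC, EVENT-START, EVENT-CONTENT, and EVENT-EXIT are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLTHRU.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document.
       01  XML-DOC             PIC X(23) VALUE
           '<msg><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE EVENT-START THRU EVENT-EXIT
           END-XML
           GOBACK.
       EVENT-START.
      * First path: events without character content skip ahead.
           IF XML-EVENT NOT = 'CONTENT-CHARACTERS'
               GO TO EVENT-EXIT
           END-IF.
       EVENT-CONTENT.
      * Second path: report the character content.
           DISPLAY 'Content: ' XML-TEXT.
       EVENT-EXIT.
      * Common return point reached by every path.
           EXIT.
```

Both paths end at EVENT-EXIT, so control returns to the parser from a single point regardless of which event was handled.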
The processing procedure consists of all the statements at which XML events are handled. The range of the processing procedure includes all statements executed by CALL, EXIT, GO TO, GOBACK, INVOKE, MERGE, PERFORM, and SORT statements that are in the range of the processing procedure, as well as all statements in declarative procedures that are executed as a result of the execution of statements in the range of the processing procedure.
The range of the processing procedure must not cause the execution of any GOBACK or EXIT PROGRAM statement, except to return control from a method or program to which control was passed by an INVOKE or CALL statement, respectively, that is executed in the range of the processing procedure.
The range of the processing procedure must not cause the execution of an XML PARSE statement, unless the XML PARSE statement is executed in a method or outermost program to which control was passed by an INVOKE or CALL statement that is executed in the range of the processing procedure.
A program executing on multiple threads can execute the same XML statement or different XML statements simultaneously.
The processing procedure can terminate the run unit with a STOP RUN statement.
For more details about the processing procedure, see Control flow.
An exception condition exists when the XML parser detects an error in processing the XML document. The parser first signals an XML exception by passing control to the processing procedure with special register XML-EVENT containing 'EXCEPTION'. The parser also provides a numeric error code in special register XML-CODE, as detailed in the COBOL for Windows Programming Guide.
An exception condition also exists if the parsing is deliberately terminated by the processing procedure setting XML-CODE to -1 before returning to the parser for any normal XML event. In this case, the parser does not signal an XML exception event.
If an exception condition exists and the ON EXCEPTION phrase is specified, the parser then transfers control to imperative-statement-1. If the ON EXCEPTION phrase is not specified, the NOT ON EXCEPTION phrase, if any, is ignored, and control is transferred to the end of the XML PARSE statement.
In either case, after execution of the XML PARSE statement, special register XML-CODE contains the numeric error code for the XML exception, or -1 if the processing procedure deliberately terminated the parse.
If the processing procedure handles the XML exception event and sets XML-CODE to zero before returning control to the parser, the exception condition no longer exists. If no other unhandled exceptions occur prior to the termination of the parser, control is transferred to imperative-statement-2 of the NOT ON EXCEPTION phrase, if specified.
If an exception condition does not exist at termination of XML PARSE processing, control is transferred to imperative-statement-2 of the NOT ON EXCEPTION phrase, if specified. If the NOT ON EXCEPTION phrase is not specified, control is transferred to the end of the XML PARSE statement. The ON EXCEPTION phrase, if specified, is ignored.
When no exception condition exists, special register XML-CODE contains zero after execution of the XML PARSE statement.
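The interaction of the two phrases can be sketched as follows, assuming a deliberately malformed document so that the parser signals an exception. The names XMLEXC, BAD-DOC, and HANDLE-EVENT are hypothetical, and the specific XML-CODE value reported depends on the parser, so none is shown.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLEXC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Deliberately malformed: the <to> element is never closed.
       01  BAD-DOC             PIC X(18) VALUE
           '<msg><to>Ann</msg>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE BAD-DOC
               PROCESSING PROCEDURE HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'Parse failed, XML-CODE = ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'Parse succeeded, XML-CODE = ' XML-CODE
           END-XML
           GOBACK.
       HANDLE-EVENT.
      * Report the EXCEPTION event but leave XML-CODE unchanged,
      * so the exception condition still exists on return.
           IF XML-EVENT = 'EXCEPTION'
               DISPLAY 'Exception event, XML-CODE = ' XML-CODE
           END-IF.
```

Because HANDLE-EVENT does not reset XML-CODE to zero, the exception remains unhandled and control passes to the ON EXCEPTION phrase rather than the NOT ON EXCEPTION phrase.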
The scope of a conditional XML GENERATE or XML PARSE statement can be terminated by an END-XML phrase at the same level of nesting or by a separator period.
END-XML can also be used with an XML GENERATE or XML PARSE statement that does not specify either the ON EXCEPTION or NOT ON EXCEPTION phrase.
For more information on explicit scope terminators, see Delimited scope statements.
When a given XML GENERATE or XML PARSE statement appears as imperative-statement-1 or imperative-statement-2, or as part of imperative-statement-1 or imperative-statement-2 of another XML GENERATE or XML PARSE statement, that given XML GENERATE or XML PARSE statement is a nested XML GENERATE or XML PARSE statement.
Nested XML GENERATE or XML PARSE statements are considered to be matched XML GENERATE and END-XML, or XML PARSE and END-XML combinations proceeding from left to right. Thus, any END-XML phrase that is encountered is matched with the nearest preceding XML GENERATE or XML PARSE statement that has not been implicitly or explicitly terminated.
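The matching rule can be sketched with one XML PARSE statement nested in the NOT ON EXCEPTION phrase of another. The names XMLNEST, HEADER-DOC, BODY-DOC, and HANDLE-EVENT are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Two hypothetical sample documents.
       01  HEADER-DOC          PIC X(13) VALUE '<hdr>v1</hdr>'.
       01  BODY-DOC            PIC X(17) VALUE '<body>text</body>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE HEADER-DOC
               PROCESSING PROCEDURE HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'Header failed: ' XML-CODE
               NOT ON EXCEPTION
      * Nested statement: runs only if the header parsed cleanly.
                   XML PARSE BODY-DOC
                       PROCESSING PROCEDURE HANDLE-EVENT
                       ON EXCEPTION
                           DISPLAY 'Body failed: ' XML-CODE
      * This END-XML closes the nearest open XML PARSE (BODY-DOC).
                   END-XML
      * This END-XML closes the outer XML PARSE (HEADER-DOC).
           END-XML
           GOBACK.
       HANDLE-EVENT.
           IF XML-EVENT = 'CONTENT-CHARACTERS'
               DISPLAY 'Content: ' XML-TEXT
           END-IF.
```

The nested statement here is part of imperative-statement-2, not part of the processing procedure, so it runs only after the outer parse has completed.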
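When the XML parser receives control from an XML PARSE statement, the parser analyzes the XML document and transfers control to procedure-name-1 each time it has an XML event to pass to the processing procedure: for each piece of the document that it identifies, and when it detects an error in the document.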
Control returns to the XML parser when the end of the processing procedure is reached.
The exchange of control between the parser and the processing procedure continues until either the entire XML document has been parsed or an exception condition occurs.
The parser then terminates and returns control to the XML PARSE statement with the XML-CODE special register containing the most recent value set by the parser or the processing procedure.
For each XML event passed to the processing procedure, the XML-CODE, XML-EVENT, and XML-TEXT or XML-NTEXT special registers contain information about the particular event. The content of the XML-CODE special register is defined during and after execution of an XML PARSE statement. The contents of all other XML special registers are undefined outside the range of the processing procedure.
For normal XML events, special register XML-CODE contains zero when the processing procedure receives control. For XML exception events, XML-CODE contains one of the XML exception codes specified in the COBOL for Windows Programming Guide. Special register XML-EVENT is set to the event name, such as 'START-OF-DOCUMENT'. Either XML-TEXT or XML-NTEXT contains the piece of the document corresponding with the event, as described in XML-EVENT.
For more information about the XML special registers, see Special registers.
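One way a processing procedure might use these special registers is to dispatch on XML-EVENT, as in the following sketch. The names XMLEVAL, XML-DOC, and HANDLE-EVENT are hypothetical, and only a few of the possible event names are handled.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLEVAL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document.
       01  XML-DOC             PIC X(23) VALUE
           '<msg><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE HANDLE-EVENT
           END-XML
           GOBACK.
       HANDLE-EVENT.
      * Select an action from the event name in XML-EVENT.
           EVALUATE XML-EVENT
               WHEN 'START-OF-ELEMENT'
                   DISPLAY 'Start element: ' XML-TEXT
               WHEN 'CONTENT-CHARACTERS'
                   DISPLAY 'Content:       ' XML-TEXT
               WHEN 'END-OF-ELEMENT'
                   DISPLAY 'End element:   ' XML-TEXT
               WHEN 'EXCEPTION'
                   DISPLAY 'Exception, XML-CODE = ' XML-CODE
      * All other events are ignored in this sketch.
               WHEN OTHER
                   CONTINUE
           END-EVALUATE.
```

The registers are read only inside HANDLE-EVENT, which is the range in which XML-EVENT and XML-TEXT are defined.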
For all kinds of XML events, if XML-CODE is not zero when the processing procedure returns control to the parser, the parser terminates without a further EXCEPTION event. Setting XML-CODE to -1 before returning to the parser from the processing procedure for an event other than EXCEPTION forces the parser to terminate with a user-initiated exception condition. For some EXCEPTION events, the processing procedure can handle the event, then set XML-CODE to zero to force the parser to continue, although subsequent results are unpredictable. When XML-CODE is zero, parsing continues until the entire XML document has been parsed or an exception condition occurs.
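A user-initiated termination can be sketched as follows, assuming the program wants to stop parsing as soon as it sees a particular element. The names XMLSTOP, XML-DOC, and HANDLE-EVENT, and the element name secret, are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLSTOP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document.
       01  XML-DOC             PIC X(41) VALUE
           '<msg><secret>x</secret><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'Parse stopped, XML-CODE = ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'Parse completed'
           END-XML
           GOBACK.
       HANDLE-EVENT.
      * Setting XML-CODE to -1 on a normal event ends the parse
      * with a user-initiated exception condition.
           IF XML-EVENT = 'START-OF-ELEMENT'
                   AND XML-TEXT = 'secret'
               MOVE -1 TO XML-CODE
           END-IF.
```

In this sketch the parser terminates without signaling an EXCEPTION event, and control then passes to the ON EXCEPTION phrase with XML-CODE still set to -1.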
For more information about the EXCEPTION event and exception processing, see the COBOL for Windows Programming Guide.