Rational Developer for System z
Enterprise COBOL for z/OS, Version 4.1, Language Reference


XML PARSE statement

The XML PARSE statement is the COBOL language interface to either of two high-speed XML parsers. The parser that is used depends on the setting of the XMLPARSE compiler option: XMLPARSE(XMLSS) or XMLPARSE(COMPAT).

The XML PARSE statement parses an XML document into its individual pieces and passes each piece, one at a time, to a user-written processing procedure.

Format

>>-XML PARSE--identifier-1--+------------------------------+---->
                            '-+------+--ENCODING--codepage-'   
                              '-WITH-'                         

>--+--------------------+--------------------------------------->
   '-RETURNING NATIONAL-'   

>--PROCESSING PROCEDURE--+----+--procedure-name-1--+-------------------------------+-->
                         '-IS-'                    '-+-THROUGH-+--procedure-name-2-'   
                                                     '-THRU----'                       

>--+-------------------------------------------+---------------->
   '-+----+--EXCEPTION--imperative-statement-1-'   
     '-ON-'                                        

>--+------------------------------------------------+----------->
   '-NOT--+----+--EXCEPTION--imperative-statement-2-'   
          '-ON-'                                        

>--+---------+-------------------------------------------------><
   '-END-XML-'   

identifier-1
Must be an alphanumeric group item, a national group item, an elementary data item of category alphanumeric, or an elementary data item of category national that contains the XML document character stream. identifier-1 cannot be a function-identifier.

If identifier-1 is a national group item, identifier-1 is processed as an elementary data item of category national.

If identifier-1 is of category national, its content must be encoded using CCSID 1200 (Unicode UTF-16BE). If the XMLPARSE(COMPAT) compiler option is in effect, identifier-1 must not contain any character entities that are represented using multiple encoding units. Use a character reference, for example:
  • "&#67603;" or
  • "&#x10813;"
to represent any such characters.

If identifier-1 is of category alphanumeric, its content must be encoded using one of the character sets listed in Coded character sets for XML documents in the Enterprise COBOL Programming Guide. If the XMLPARSE(COMPAT) compiler option is in effect, and identifier-1 is alphanumeric and contains an XML document that does not specify an encoding declaration, the XML document is parsed with the code page specified by the CODEPAGE compiler option.

If the XMLPARSE(XMLSS) compiler option is in effect, the XML document is parsed with the code page specified in the ENCODING phrase; if the ENCODING phrase is not used, the document is parsed with the code page specified by the CODEPAGE compiler option. Any encoding declaration in the XML document is ignored.

RETURNING NATIONAL phrase
The RETURNING NATIONAL phrase can be specified only when the XMLPARSE(XMLSS) compiler option is in effect.

When identifier-1 references a data item of category alphanumeric and the RETURNING NATIONAL phrase is specified, XML document fragments are automatically converted to Unicode UTF-16 representation and returned to the processing procedure in the national special registers XML-NTEXT, XML-NNAMESPACE, and XML-NNAMESPACE-PREFIX.

When the RETURNING NATIONAL phrase is not specified and identifier-1 references a data item of category alphanumeric, the XML document fragments are returned to the processing procedure in the alphanumeric special registers XML-TEXT, XML-NAMESPACE, and XML-NAMESPACE-PREFIX, with one exception: when XMLPARSE(COMPAT) is in effect, text for the ATTRIBUTE-NATIONAL-CHARACTER and CONTENT-NATIONAL-CHARACTER XML events is always returned in special register XML-NTEXT.

When identifier-1 references a national data item, XML document fragments are always returned in Unicode UTF-16 representation in the national special registers XML-NTEXT, XML-NNAMESPACE, and XML-NNAMESPACE-PREFIX.

ENCODING phrase
The ENCODING phrase can be specified only when the XMLPARSE(XMLSS) compiler option is in effect.

The ENCODING phrase specifies an encoding that is assumed for the source XML document in identifier-1. codepage must be an unsigned integer data item or an unsigned integer literal that represents a valid coded character set identifier (CCSID). The ENCODING phrase specification overrides the encoding specified by the CODEPAGE compiler option. The encoding specified in any XML declaration is always ignored.

If identifier-1 references a data item of category national, codepage must specify CCSID 1200, for Unicode UTF-16.

If identifier-1 references a data item of category alphanumeric, codepage must specify CCSID 1208 for UTF-8 or a CCSID for a supported EBCDIC or ASCII codepage. See Coded character sets for XML documents in the Enterprise COBOL Programming Guide for details.
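The ENCODING and RETURNING NATIONAL phrases can be combined in one statement, as in the following minimal sketch. The program name, data names, and sample document are hypothetical. The sketch assumes that the source file, and therefore the literal, is encoded in EBCDIC CCSID 1140, and that the XMLPARSE(XMLSS) compiler option is set through a CBL statement.

```cobol
       CBL XMLPARSE(XMLSS)
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLENC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document held in an alphanumeric item.
      * Assumption: the literal is encoded in EBCDIC CCSID 1140.
       01  XML-DOC         PIC X(23)
           VALUE '<msg><to>Ann</to></msg>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * ENCODING overrides the CODEPAGE compiler option; RETURNING
      * NATIONAL delivers fragments in XML-NTEXT instead of XML-TEXT.
           XML PARSE XML-DOC
               WITH ENCODING 1140
               RETURNING NATIONAL
               PROCESSING PROCEDURE IS SHOW-EVENT
           END-XML
           GOBACK.
      * Hypothetical processing procedure: entered once per event.
       SHOW-EVENT.
           DISPLAY XML-EVENT ' ' XML-NTEXT.
```

Because RETURNING NATIONAL is specified, the processing procedure reads the fragment text from XML-NTEXT even though XML-DOC is alphanumeric.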

PROCESSING PROCEDURE phrase
Specifies the name of a procedure to handle the various events that the XML parser generates.
procedure-name-1, procedure-name-2
Must name a section or paragraph in the procedure division. When both procedure-name-1 and procedure-name-2 are specified, if either is a procedure name in a declarative procedure, both must be procedure names in the same declarative procedure.
procedure-name-1
Specifies the first (or only) section or paragraph in the processing procedure.
procedure-name-2
Specifies the last section or paragraph in the processing procedure.
For each XML event, the parser transfers control to the first statement of the procedure named procedure-name-1. Control is always returned from the processing procedure to the XML parser. The point from which control is returned is determined as follows:
  • If procedure-name-1 is a paragraph name and procedure-name-2 is not specified, the return is made after the execution of the last statement of the procedure-name-1 paragraph.
  • If procedure-name-1 is a section name and procedure-name-2 is not specified, the return is made after the execution of the last statement of the last paragraph in the procedure-name-1 section.
  • If procedure-name-2 is specified and it is a paragraph name, the return is made after the execution of the last statement of the procedure-name-2 paragraph.
  • If procedure-name-2 is specified and it is a section name, the return is made after the execution of the last statement of the last paragraph in the procedure-name-2 section.

The only necessary relationship between procedure-name-1 and procedure-name-2 is that they define a consecutive sequence of operations to execute, beginning at the procedure named by procedure-name-1 and ending with the execution of the procedure named by procedure-name-2.

If there are two or more logical paths to the return point, then procedure-name-2 can name a paragraph that consists of only an EXIT statement; all the paths to the return point must then lead to this paragraph.

The processing procedure consists of all the statements at which XML events are handled. The range of the processing procedure includes all statements executed by CALL, EXIT, GO TO, GOBACK, INVOKE, MERGE, PERFORM, and SORT statements that are in the range of the processing procedure, as well as all statements in declarative procedures that are executed as a result of the execution of statements in the range of the processing procedure.

The range of the processing procedure must not cause the execution of any GOBACK or EXIT PROGRAM statement, except to return control from a method or program to which control was passed by an INVOKE or CALL statement, respectively, that is executed in the range of the processing procedure.

The range of the processing procedure must not cause the execution of an XML PARSE statement, unless the XML PARSE statement is executed in a method or outermost program to which control was passed by an INVOKE or CALL statement that is executed in the range of the processing procedure.

A program executing on multiple threads can execute the same XML statement or different XML statements simultaneously.

The processing procedure can terminate the run unit with a STOP RUN statement.

For more details about the processing procedure, see Control flow.
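The THROUGH/THRU form with a common exit paragraph might look like the following minimal sketch. The program name, data names, and sample document are hypothetical. The event name 'START-OF-ELEMENT' does not appear in this section; it is assumed from the XML event list in the Enterprise COBOL Programming Guide.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLTHRU.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical sample document and counter.
       01  XML-DOC         PIC X(29)
           VALUE '<list><i>a</i><i>b</i></list>'.
       01  ELEMENT-COUNT   PIC 9(4) VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE HANDLE-EVENT THRU HANDLE-EXIT
           END-XML
           DISPLAY 'Elements seen: ' ELEMENT-COUNT
           GOBACK.
      * First paragraph of the range: entered once per XML event.
       HANDLE-EVENT.
           IF XML-EVENT NOT = 'START-OF-ELEMENT'
               GO TO HANDLE-EXIT
           END-IF
           ADD 1 TO ELEMENT-COUNT
           DISPLAY 'Element: ' XML-TEXT.
      * Last paragraph of the range: both paths end here, and
      * control returns to the parser after it executes.
       HANDLE-EXIT.
           EXIT.
```

Both logical paths through HANDLE-EVENT reach HANDLE-EXIT, so the parser regains control from a single return point.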

ON EXCEPTION
The ON EXCEPTION phrase specifies imperative statements that are executed when the XML PARSE statement raises an exception condition.

An exception condition exists when the XML parser detects an error in processing the XML document. The parser first signals an XML exception by passing control to the processing procedure with special register XML-EVENT containing 'EXCEPTION'. The parser also provides a numeric error code in special register XML-CODE, as detailed in the Enterprise COBOL Programming Guide.

An exception condition also exists if the processing procedure sets XML-CODE to -1 before returning to the parser for any normal XML event. In this case, the parser does not signal an EXCEPTION XML event and parsing is terminated.

If the ON EXCEPTION phrase is specified, the parser transfers control to imperative-statement-1. If the ON EXCEPTION phrase is not specified, the NOT ON EXCEPTION phrase, if any, is ignored and control is transferred to the end of the XML PARSE statement.

Special register XML-CODE contains the numeric error code for the XML exception or -1 after execution of the XML PARSE statement.

If the processing procedure handles the XML exception event and sets XML-CODE to zero before returning control to the parser, the exception condition no longer exists. If no other unhandled exceptions occur before termination of the parser, control is transferred to imperative-statement-2 of the NOT ON EXCEPTION phrase, if specified.

NOT ON EXCEPTION
The NOT ON EXCEPTION phrase specifies imperative statements that are executed when no exception condition exists at the termination of XML PARSE processing.

If an exception condition does not exist at termination of XML PARSE processing, control is transferred to imperative-statement-2 of the NOT ON EXCEPTION phrase, if specified. If the NOT ON EXCEPTION phrase is not specified, control is transferred to the end of the XML PARSE statement. The ON EXCEPTION phrase, if specified, is ignored.

Special register XML-CODE contains zero after execution of the XML PARSE statement.
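The interaction of the ON EXCEPTION and NOT ON EXCEPTION phrases with the EXCEPTION event can be illustrated by a minimal sketch such as the following. The program name, data names, and the deliberately malformed document are hypothetical, and the actual XML-CODE value depends on the parser in use.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLEXC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document that is not well formed: the end
      * tag does not match the start tag.
       01  XML-DOC         PIC X(11) VALUE '<a>text</b>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'Parse failed, XML-CODE = ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'Parse succeeded, XML-CODE = ' XML-CODE
           END-XML
           GOBACK.
       HANDLE-EVENT.
      * The parser signals the error as an EXCEPTION event first.
      * XML-CODE is left unchanged here, so the exception condition
      * persists and the ON EXCEPTION phrase receives control.
           IF XML-EVENT = 'EXCEPTION'
               DISPLAY 'Exception event, XML-CODE = ' XML-CODE
           END-IF.
```

In this sketch the processing procedure only reports the EXCEPTION event. Because it does not reset XML-CODE, imperative-statement-1 is executed when the parser terminates.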

END-XML phrase
This explicit scope terminator delimits the scope of XML GENERATE or XML PARSE statements. END-XML permits a conditional XML GENERATE or XML PARSE statement (that is, an XML GENERATE or XML PARSE statement that specifies the ON EXCEPTION or NOT ON EXCEPTION phrase) to be nested in another conditional statement.

The scope of a conditional XML GENERATE or XML PARSE statement can be terminated by:

  • An END-XML phrase at the same level of nesting
  • A separator period

END-XML can also be used with an XML GENERATE or XML PARSE statement that does not specify either the ON EXCEPTION or NOT ON EXCEPTION phrase.

For more information on explicit scope terminators, see Delimited scope statements.
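A conditional XML PARSE statement nested in an IF statement might be delimited as in the following minimal sketch. The program name, data names, switch, and sample document are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document and switch.
       01  XML-DOC         PIC X(10) VALUE '<ok>1</ok>'.
       01  DOC-SWITCH      PIC X     VALUE 'Y'.
           88  DOC-PRESENT           VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           IF DOC-PRESENT
               XML PARSE XML-DOC
                   PROCESSING PROCEDURE HANDLE-EVENT
                   ON EXCEPTION
                       DISPLAY 'Parse failed: ' XML-CODE
               END-XML
      * END-XML closed the XML PARSE scope, so this DISPLAY belongs
      * to the IF and runs whether or not an exception occurred.
               DISPLAY 'Parse attempted'
           ELSE
               DISPLAY 'No document'
           END-IF
           GOBACK.
       HANDLE-EVENT.
           DISPLAY XML-EVENT.
```

Without END-XML, the second DISPLAY statement would become part of imperative-statement-1 and would run only when an exception condition exists.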

Nested XML GENERATE or XML PARSE statements

When a given XML GENERATE or XML PARSE statement appears as imperative-statement-1 or imperative-statement-2, or as part of imperative-statement-1 or imperative-statement-2 of another XML GENERATE or XML PARSE statement, that given XML GENERATE or XML PARSE statement is a nested XML GENERATE or XML PARSE statement.

Nested XML GENERATE or XML PARSE statements are considered to be matched XML GENERATE and END-XML, or XML PARSE and END-XML combinations proceeding from left to right. Thus, any END-XML phrase that is encountered is matched with the nearest preceding XML GENERATE or XML PARSE statement that has not been implicitly or explicitly terminated.

Control flow

When the XML parser receives control from an XML PARSE statement, the parser analyzes the XML document and transfers control at the following points in the process:

  • The start of the parsing process
  • When a document fragment is found
  • When the parser detects an error in parsing the XML document
  • The end of processing the XML document

Control returns to the XML parser when the end of the processing procedure is reached.

The exchange of control between the parser and the processing procedure continues until one of the following occurs:

  • The entire XML document has been parsed, ending with the END-OF-DOCUMENT event.
  • The processing procedure terminates parsing deliberately by setting XML-CODE to -1 before returning to the parser.
  • When the XMLPARSE(XMLSS) compiler option is in effect: The parser detects an exception of any kind.
  • When the XMLPARSE(COMPAT) compiler option is in effect: The parser detects an exception (other than an encoding conflict) and the processing procedure does not reset special register XML-CODE to zero before returning to the parser.
  • When the XMLPARSE(COMPAT) compiler option is in effect: The parser detects an encoding conflict exception and the processing procedure does not reset special register XML-CODE to zero or to the CCSID of the document encoding.

In each case, the processing procedure returns control to the parser. Then, the parser terminates and returns control to the XML PARSE statement with the XML-CODE special register containing the most recent value set by the parser or -1 (which might have been set by the parser or by the processing procedure).
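Deliberate termination by the processing procedure can be sketched as follows. The program name, data names, element names, and sample document are hypothetical. The event name 'START-OF-ELEMENT' is assumed from the XML event list in the Enterprise COBOL Programming Guide.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLSTOP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document; parsing is abandoned at <stop/>.
       01  XML-DOC         PIC X(28)
           VALUE '<r><stop/><rest>x</rest></r>'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           XML PARSE XML-DOC
               PROCESSING PROCEDURE HANDLE-EVENT
               ON EXCEPTION
                   DISPLAY 'Parse ended early, XML-CODE = ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'Parsed whole document'
           END-XML
           GOBACK.
       HANDLE-EVENT.
           DISPLAY XML-EVENT
      * Setting XML-CODE to -1 on a normal event ends parsing when
      * control returns to the parser; no EXCEPTION event follows.
           IF XML-EVENT = 'START-OF-ELEMENT' AND XML-TEXT = 'stop'
               MOVE -1 TO XML-CODE
           END-IF.
```

Events for the rest element are never delivered, and the ON EXCEPTION phrase receives control with XML-CODE containing -1.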

For each XML event passed to the processing procedure, the XML-CODE and XML-EVENT special registers contain information about the particular event. Special register XML-EVENT is set to the event name, such as 'START-OF-DOCUMENT'. For most events, the XML-TEXT or XML-NTEXT special register contains document text. Additionally, when the XMLPARSE(XMLSS) compiler option is in effect, the XML-NAMESPACE and XML-NAMESPACE-PREFIX or the XML-NNAMESPACE and XML-NNAMESPACE-PREFIX special registers contain a namespace identifier and namespace prefix when applicable. See XML-EVENT for details.

The content of the XML-CODE special register is defined during and after execution of an XML PARSE statement. The contents of all other XML special registers are undefined outside the range of the processing procedure.

For normal XML events, special register XML-CODE contains zero when the processing procedure receives control. For XML exception events, XML-CODE contains an XML exception code as described in the Enterprise COBOL Programming Guide.

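For more information about the XML special registers (XML-CODE, XML-EVENT, XML-TEXT, XML-NTEXT, XML-NAMESPACE, XML-NAMESPACE-PREFIX, XML-NNAMESPACE, and XML-NNAMESPACE-PREFIX) and for an introduction to special registers, see Special registers.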

For more information about the EXCEPTION event and exception processing, see the Enterprise COBOL Programming Guide.

