Rational Developer for System z
Enterprise COBOL for z/OS, Version 4.1, Language Reference


XML GENERATE statement

The XML GENERATE statement converts data to XML format.

Format

>>-XML GENERATE--identifier-1--FROM--identifier-2--------------->

>--+-----------------------------+------------------------------>
   '-COUNT--+----+--identifier-3-'   
            '-IN-'                   

>--+------------------------------+----------------------------->
   '-+------+--ENCODING--codepage-'   
     '-WITH-'                         

>--+---------------------------+--+----------------------+------>
   '-+------+--XML-DECLARATION-'  '-+------+--ATTRIBUTES-'   
     '-WITH-'                       '-WITH-'                 

>--+-------------------------------------------------------------------------------------+-->
   '-NAMESPACE--+----+--+-identifier-4-+--+--------------------------------------------+-'   
                '-IS-'  '-literal-4----'  '-NAMESPACE-PREFIX--+----+--+-identifier-5-+-'     
                                                              '-IS-'  '-literal-5----'       

>--+-------------------------------------------+---------------->
   '-+----+--EXCEPTION--imperative-statement-1-'   
     '-ON-'                                        

>--+------------------------------------------------+----------->
   '-NOT--+----+--EXCEPTION--imperative-statement-2-'   
          '-ON-'                                        

>--+---------+-------------------------------------------------><
   '-END-XML-'   

identifier-1
The receiving area for a generated XML document. identifier-1 must reference one of the following:
  • An elementary data item of category alphanumeric
  • An alphanumeric group item
  • An elementary data item of category national
  • A national group item

When identifier-1 references a national group item, identifier-1 is processed as an elementary data item of category national. When identifier-1 references an alphanumeric group item, identifier-1 is treated as though it were an elementary data item of category alphanumeric.

identifier-1 must not be described with the JUSTIFIED clause, and cannot be a function identifier. identifier-1 can be subscripted or reference modified.

identifier-1 must not overlap identifier-2, identifier-3, codepage (if an identifier), identifier-4, or identifier-5.

The generated XML output is encoded as described in the documentation of the ENCODING phrase below.

identifier-1 must reference a data item of category national, or the ENCODING phrase must specify 1208, if any of the following statements is true:

  • The CODEPAGE compiler option specifies an EBCDIC DBCS code page.
  • identifier-4 or identifier-5 references a data item of category national.
  • literal-4 or literal-5 is of category national.
  • The generated XML includes data from identifier-2 for:
    • Any data item of class national or class DBCS
    • Any data item with a DBCS name (that is, a data item whose name consists of DBCS characters)
    • Any data item of class alphanumeric that contains DBCS characters

identifier-1 must be large enough to contain the generated XML document. Typically, it should be from 5 to 10 times the size of identifier-2, depending on the length of the data-name or data-names within identifier-2. If identifier-1 is not large enough, an error condition exists at the end of the XML GENERATE statement.

identifier-2
The group or elementary data item to be converted to XML format.

If identifier-2 references a national group item, identifier-2 is processed as a group item. When identifier-2 includes a subordinate national group item, that subordinate item is processed as a group item.

identifier-2 cannot be a function identifier or be reference modified, but it can be subscripted.

identifier-2 must not overlap identifier-1 or identifier-3.

identifier-2 must not specify the RENAMES clause.

The following data items specified by identifier-2 are ignored by the XML GENERATE statement:

  • Any unnamed elementary data items or elementary FILLER data items
  • Any slack bytes inserted for SYNCHRONIZED items
  • Any data item subordinate to identifier-2 that is described with the REDEFINES clause or that is subordinate to such a redefining item
  • Any data item subordinate to identifier-2 that is described with the RENAMES clause
  • Any group data item all of whose subordinate data items are ignored

All data items specified by identifier-2 that are not ignored according to the rules above must satisfy the following conditions:

  • Each elementary data item must either have class alphabetic, alphanumeric, numeric, or national, or be an index data item. (That is, no elementary data item can be described with the USAGE POINTER, USAGE FUNCTION-POINTER, USAGE PROCEDURE-POINTER, or USAGE OBJECT REFERENCE phrase.)
  • There must be at least one such elementary data item.
  • Each non-FILLER data-name must be unique within any immediately superordinate group data item.
  • Any DBCS data-names, when converted to Unicode, must be legal as names in the XML specification, version 1.0.
  • The data items must not specify the DATE FORMAT clause, or the DATEPROC compiler option must not be in effect.

For example, consider the following data declaration:

01 STRUCT.
  02 STAT PIC X(4).
  02 IN-AREA PIC X(100).
  02 OK-AREA REDEFINES IN-AREA.
    03 FLAGS PIC X.
    03 PIC X(3).
    03 COUNTER USAGE COMP-5 PIC S9(9).
    03 ASFNPTR REDEFINES COUNTER USAGE FUNCTION-POINTER.
    03 UNREFERENCED PIC X(92).
  02 NG-AREA1 REDEFINES IN-AREA.
    03 FLAGS PIC X.
    03 PIC X(3).
    03 PTR USAGE POINTER.
    03 ASNUM REDEFINES PTR USAGE COMP-5 PIC S9(9).
    03 PIC X(92).
  02 NG-AREA2 REDEFINES IN-AREA.
    03 FN-CODE PIC X.
    03 UNREFERENCED PIC X(3).
    03 QTYONHAND USAGE BINARY PIC 9(5).
    03 DESC USAGE NATIONAL PIC N(40).
    03 UNREFERENCED PIC X(12).

The following data items from the example above can be specified as identifier-2:

  • STRUCT, of which subordinate data items STAT and IN-AREA would be converted to XML format. (OK-AREA, NG-AREA1, and NG-AREA2 are ignored because they specify the REDEFINES clause.)
  • OK-AREA, of which subordinate data items FLAGS, COUNTER, and UNREFERENCED would be converted. (The item whose data description entry specifies 03 PIC X(3) is ignored because it is an elementary FILLER data item. ASFNPTR is ignored because it specifies the REDEFINES clause.)
  • Any of the elementary data items that are subordinate to STRUCT except:
    • ASFNPTR or PTR (disallowed usage)
    • UNREFERENCED OF NG-AREA2 (nonunique names for data items that are otherwise eligible)
    • Any FILLER data items

The following data items cannot be specified as identifier-2:

  • NG-AREA1, because subordinate data item PTR specifies USAGE POINTER but does not specify the REDEFINES clause. (PTR would be ignored if it specified the REDEFINES clause.)
  • NG-AREA2, because subordinate elementary data items have the nonunique name UNREFERENCED.

COUNT IN phrase
If the COUNT IN phrase is specified, identifier-3 contains (after execution of the XML GENERATE statement) the count of generated XML character encoding units. If identifier-1 (the receiver) has category national, the count is in UTF-16 character encoding units. For all other encodings (including UTF-8), the count is in bytes.

identifier-3
The data count field. identifier-3 must be an integer data item defined without the symbol P in its picture string.

identifier-3 must not overlap identifier-1, identifier-2, codepage (if an identifier), identifier-4, or identifier-5.

ENCODING phrase
The ENCODING phrase, if specified, determines the encoding of the generated XML document.

codepage
codepage must be an unsigned integer data item or unsigned integer literal that represents a valid coded character set identifier (CCSID). It must identify one of the code pages supported for COBOL XML processing as described in Coded character sets for XML documents (Enterprise COBOL Programming Guide).

If an identifier, codepage must not overlap identifier-1 or identifier-3.

If identifier-1 references a data item of category national, codepage must specify 1200, the CCSID for Unicode UTF-16.

If identifier-1 references a data item of category alphanumeric, codepage must specify 1208 or the CCSID of a supported EBCDIC code page as listed in Coded character sets for XML documents (Enterprise COBOL Programming Guide).

If the ENCODING phrase is omitted and identifier-1 is of category national, the document encoding is Unicode UTF-16, CCSID 1200.

A byte order mark is not generated for XML documents that have Unicode encoding.

If the ENCODING phrase is omitted and identifier-1 is of category alphanumeric, the XML document is encoded using the code page specified by the CODEPAGE compiler option in effect when the source code was compiled.
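
The encoding rules above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the program name GENENC and the data-names Greeting, Msg, Doc-Utf8, Doc-Utf16, and Doc-Len are hypothetical, and the receiver sizes are chosen arbitrarily.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENENC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure (identifier-2).
       01  Greeting.
           05  Msg         PIC X(20) VALUE 'Hello, world!'.
      * Alphanumeric receiver, used here with CCSID 1208 (UTF-8).
       01  Doc-Utf8        PIC X(200).
      * National receiver: the document is UTF-16 (CCSID 1200).
       01  Doc-Utf16       PIC N(200) USAGE NATIONAL.
       01  Doc-Len         PIC 9(4) COMP-5.
       PROCEDURE DIVISION.
      * UTF-8 output: Doc-Len is a byte count.
           XML GENERATE Doc-Utf8 FROM Greeting
               COUNT IN Doc-Len
               WITH ENCODING 1208
           END-XML
           DISPLAY 'UTF-8 bytes:  ' Doc-Len
      * UTF-16 output: Doc-Len counts UTF-16 encoding units.
           XML GENERATE Doc-Utf16 FROM Greeting
               COUNT IN Doc-Len
           END-XML
           DISPLAY 'UTF-16 units: ' Doc-Len
           GOBACK.
```

If the ENCODING phrase were omitted from the first statement, the document would instead be encoded in the code page given by the CODEPAGE compiler option.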

XML-DECLARATION phrase
If the XML-DECLARATION phrase is specified, the generated XML document starts with an XML declaration that includes the XML version information and an encoding declaration.

If identifier-1 is of category national, the encoding declaration has the value UTF-16 (encoding="UTF-16").

If identifier-1 is of category alphanumeric, the encoding declaration is derived from the ENCODING phrase, if specified, or from the CODEPAGE compiler option in effect for the program if the ENCODING phrase is not specified.

For an example of the effect of coding the XML-DECLARATION phrase, see Generating XML output (Enterprise COBOL Programming Guide).

If the XML-DECLARATION phrase is omitted, the generated XML document does not include an XML declaration.

ATTRIBUTES phrase
If the ATTRIBUTES phrase is specified, each eligible item included in the generated XML document is expressed as an attribute of the XML element that corresponds to the data item immediately superordinate to that eligible item, rather than as a child element of the XML element. To be eligible, a data item must be elementary, must have a name other than FILLER, and must not specify an OCCURS clause in its data description entry.

For an example of the effect of the ATTRIBUTES phrase, see Generating XML output (Enterprise COBOL Programming Guide).
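
As a rough sketch of how the XML-DECLARATION and ATTRIBUTES phrases might be combined, consider the following program. The program name GENATTR and all data-names are hypothetical, chosen only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENATTR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure (identifier-2).
       01  Order-Rec.
           05  Order-No        PIC 9(6)  VALUE 1234.
           05  Customer.
               10  Cust-Name   PIC X(20) VALUE 'Ann Lee'.
      * OCCURS items are not eligible to become attributes.
           05  Line-Qty        PIC 9(3)  OCCURS 2 TIMES.
       01  Doc-Area            PIC X(500).
       01  Doc-Len             PIC 9(4) COMP-5.
       PROCEDURE DIVISION.
           MOVE 5 TO Line-Qty(1)
           MOVE 7 TO Line-Qty(2)
           XML GENERATE Doc-Area FROM Order-Rec
               COUNT IN Doc-Len
               WITH XML-DECLARATION
               WITH ATTRIBUTES
           END-XML
           IF XML-CODE = 0
               DISPLAY Doc-Area(1:Doc-Len)
           END-IF
           GOBACK.
```

Under the eligibility rule above, Order-No should be expressed as an attribute of the Order-Rec element and Cust-Name as an attribute of the Customer element, while each occurrence of Line-Qty remains a child element because it specifies an OCCURS clause.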

NAMESPACE and NAMESPACE-PREFIX phrases
Use the NAMESPACE phrase to identify a namespace for the generated XML document. If the NAMESPACE phrase is not specified, or if identifier-4 has length zero or contains all spaces, the element names of XML documents produced by the XML GENERATE statement are not in any namespace.

Use the NAMESPACE-PREFIX phrase to qualify the start and end tag of each element in the generated XML document with a prefix.

If the NAMESPACE-PREFIX phrase is not specified, or if identifier-5 is of length zero or contains all spaces, the namespace specified by the NAMESPACE phrase specifies the default namespace for the document. In this case, the namespace declared on the root element applies by default to each element name in the document, including that of the root element. (Default namespace declarations do not apply directly to attribute names.)

If the NAMESPACE-PREFIX phrase is specified, and identifier-5 is not of length zero and does not contain all spaces, then the start and end tag of each element in the generated document is qualified with the specified prefix. The prefix should therefore preferably be short. When the XML GENERATE statement is executed, the prefix must be a valid XML name, but without the colon (:), as defined in Namespaces in XML 1.0. The prefix can have trailing spaces, which are removed before use.

identifier-4, literal-4; identifier-5, literal-5
identifier-4, literal-4: The namespace identifier, which must be a valid Uniform Resource Identifier (URI) as defined in Uniform Resource Identifier (URI): Generic Syntax.

identifier-5, literal-5: The namespace prefix, which serves as an alias for the namespace identifier.

identifier-4 and identifier-5 must reference data items of category alphanumeric or national.

identifier-4 and identifier-5 must not overlap identifier-1 or identifier-3.

literal-4 and literal-5 must be of category alphanumeric or national, and must not be figurative constants.

For full details about namespaces, see Namespaces in XML 1.0.

For examples that show the use of the NAMESPACE and NAMESPACE-PREFIX phrases, see Generating XML output (Enterprise COBOL Programming Guide).
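
The following sketch shows one way the two phrases might be coded together. The namespace URI, the prefix value, the program name GENNS, and all data-names are assumptions made for illustration only.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENNS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure (identifier-2).
       01  Greeting.
           05  Msg         PIC X(20) VALUE 'Hello, world!'.
      * Prefix with trailing spaces, which are removed before use.
       01  Ns-Prefix       PIC X(8)  VALUE 'gr'.
       01  Doc-Area        PIC X(500).
       01  Doc-Len         PIC 9(4) COMP-5.
       PROCEDURE DIVISION.
           XML GENERATE Doc-Area FROM Greeting
               COUNT IN Doc-Len
               NAMESPACE IS 'http://example.com/greeting'
               NAMESPACE-PREFIX IS Ns-Prefix
           END-XML
           IF XML-CODE = 0
               DISPLAY Doc-Area(1:Doc-Len)
           END-IF
           GOBACK.
```

With the prefix supplied, the start and end tags of both the Greeting and Msg elements should be qualified with gr. If Ns-Prefix contained only spaces, the namespace would instead become the default namespace of the document.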

ON EXCEPTION phrase
An exception condition exists when an error occurs during generation of the XML document, for example, if identifier-1 is not large enough to contain the generated XML document. In this case, XML generation stops and the content of the receiver, identifier-1, is undefined. If the COUNT IN phrase is specified, identifier-3 contains the number of character positions that were generated, which can range from 0 to the length of identifier-1.

If the ON EXCEPTION phrase is specified, control is transferred to imperative-statement-1. If the ON EXCEPTION phrase is not specified, the NOT ON EXCEPTION phrase, if any, is ignored, and control is transferred to the end of the XML GENERATE statement. Special register XML-CODE contains an exception code, as detailed in Handling errors in generating XML documents (Enterprise COBOL Programming Guide).

NOT ON EXCEPTION phrase
If an exception condition does not occur during generation of the XML document, control is passed to imperative-statement-2, if specified, otherwise to the end of the XML GENERATE statement. The ON EXCEPTION phrase, if specified, is ignored. Special register XML-CODE contains zero after execution of the XML GENERATE statement.
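
A minimal sketch of both exception phrases follows. The receiver is deliberately declared too small so that the exception path is taken. The program name GENEXC and the data-names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENEXC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure (identifier-2).
       01  Greeting.
           05  Msg         PIC X(20) VALUE 'Hello, world!'.
      * Deliberately too small, to raise an exception condition.
       01  Tiny-Doc        PIC X(10).
       01  Doc-Len         PIC 9(4) COMP-5.
       PROCEDURE DIVISION.
           XML GENERATE Tiny-Doc FROM Greeting
               COUNT IN Doc-Len
               ON EXCEPTION
                   DISPLAY 'Generation failed, XML-CODE = ' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY Tiny-Doc(1:Doc-Len)
           END-XML
           GOBACK.
```

Because the content of the receiver is undefined after an exception, the sketch refers to Tiny-Doc only on the NOT ON EXCEPTION path.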

END-XML phrase
This explicit scope terminator delimits the scope of XML GENERATE or XML PARSE statements. END-XML permits a conditional XML GENERATE or XML PARSE statement (that is, an XML GENERATE or XML PARSE statement that specifies the ON EXCEPTION or NOT ON EXCEPTION phrase) to be nested in another conditional statement.

The scope of a conditional XML GENERATE or XML PARSE statement can be terminated by:

  • An END-XML phrase at the same level of nesting
  • A separator period

END-XML can also be used with an XML GENERATE or XML PARSE statement that does not specify either the ON EXCEPTION or the NOT ON EXCEPTION phrase.

For more information on explicit scope terminators, see Delimited scope statements.

Nested XML GENERATE or XML PARSE statements

When a given XML GENERATE or XML PARSE statement appears as imperative-statement-1 or imperative-statement-2, or as part of imperative-statement-1 or imperative-statement-2 of another XML GENERATE or XML PARSE statement, that given XML GENERATE or XML PARSE statement is a nested XML GENERATE or XML PARSE statement.

Nested XML GENERATE or XML PARSE statements are considered to be matched XML GENERATE and END-XML combinations, or XML PARSE and END-XML combinations, proceeding from left to right. Thus, any END-XML phrase that is encountered is matched with the nearest preceding XML GENERATE or XML PARSE statement that has not been implicitly or explicitly terminated.

Operation of XML GENERATE

The content of each eligible elementary data item within identifier-2 is converted to character format as described under Format conversion of elementary data and Trimming of generated XML data. Only the first definition of each storage area is processed. Redefinitions of data items are not included. Data items that are effectively defined by the RENAMES clause are also not included.

The converted content is then inserted as element character content, or, if the ATTRIBUTES phrase is specified and the data item is eligible to be expressed as an attribute, as the value of the attribute, in the generated XML document.

The XML element names and attribute names are derived from the data-names within identifier-2 as described under XML element name and attribute name formation. The names of group items that contain the selected elementary items are retained as parent elements. If the NAMESPACE-PREFIX phrase is specified, the prefix value, minus any trailing spaces, is used to qualify the start and end tag of each element.

No extra white space (new lines, indentation, and so forth) is inserted to make the generated XML more readable. An XML declaration is generated only if the XML-DECLARATION phrase is specified.

If the receiving area specified by identifier-1 is not large enough to contain the resulting XML document, an error condition exists. See the description of the ON EXCEPTION phrase above for details.

If identifier-1 is longer than the generated XML document, only that part of identifier-1 in which XML is generated is changed. The rest of identifier-1 contains the data that was present before this execution of the XML GENERATE statement. To avoid referring to that data, either initialize identifier-1 to spaces before the XML GENERATE statement or specify the COUNT IN phrase.

If the COUNT IN phrase is specified, identifier-3 contains (after execution of the XML GENERATE statement) the total number of character positions (UTF-16 encoding units or bytes) that were generated. You can use identifier-3 as a reference modification length field to refer to the part of identifier-1 that contains the generated XML document.
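
The use of the count field as a reference modification length can be sketched as follows. The program name GENCOUNT and the data-names Greeting, Msg, Doc-Area, and Doc-Len are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENCOUNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure (identifier-2).
       01  Greeting.
           05  Msg         PIC X(20) VALUE 'Hello, world!'.
      * Receiving area (identifier-1), sized generously.
       01  Doc-Area        PIC X(200).
      * Count field (identifier-3): integer, no P in the picture.
       01  Doc-Len         PIC 9(4) COMP-5.
       PROCEDURE DIVISION.
           XML GENERATE Doc-Area FROM Greeting
               COUNT IN Doc-Len
           END-XML
      * Refer only to the generated part of the receiver.
           IF XML-CODE = 0
               DISPLAY Doc-Area(1:Doc-Len)
           END-IF
           GOBACK.
```

Following the conversion and trimming rules in this section, the displayed document should be <Greeting><Msg>Hello, world!</Msg></Greeting>, with the trailing spaces of Msg removed. The remainder of Doc-Area beyond position Doc-Len is left unchanged.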

After execution of the XML GENERATE statement, special register XML-CODE contains either zero, which indicates successful completion, or a nonzero exception code. See Handling errors in generating XML documents (Enterprise COBOL Programming Guide) for details.

The XML PARSE statement also uses special register XML-CODE. Therefore, if you code an XML GENERATE statement in the processing procedure of an XML PARSE statement, save the value of XML-CODE before that XML GENERATE statement executes and restore the saved value after the XML GENERATE statement terminates.
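
One possible shape for this save-and-restore pattern is sketched below. The program name SAVECODE, the paragraph name Handle-Event, the input document, and all data-names are hypothetical, and the choice to generate on the START-OF-DOCUMENT event is arbitrary.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SAVECODE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical document to parse.
       01  In-Doc          PIC X(40)
                           VALUE '<a><b>text</b></a>'.
      * Hypothetical source structure for XML GENERATE.
       01  Greeting.
           05  Msg         PIC X(20) VALUE 'Hello, world!'.
       01  Out-Doc         PIC X(200).
       01  Out-Len         PIC 9(4) COMP-5.
      * Holds the parser's XML-CODE across the XML GENERATE.
       01  Saved-Code      PIC S9(9) COMP-5.
       PROCEDURE DIVISION.
           XML PARSE In-Doc
               PROCESSING PROCEDURE Handle-Event
           END-XML
           GOBACK.
       Handle-Event.
           IF XML-EVENT = 'START-OF-DOCUMENT'
      * Save the value that belongs to the XML PARSE statement.
               MOVE XML-CODE TO Saved-Code
               XML GENERATE Out-Doc FROM Greeting
                   COUNT IN Out-Len
               END-XML
      * Restore it before control returns to the parser.
               MOVE Saved-Code TO XML-CODE
           END-IF.
```

If the outcome of the XML GENERATE statement matters, XML-CODE would need to be inspected or copied before the saved value is restored.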

Format conversion of elementary data

Elementary data items are converted to character format depending on the type of the data item:

  • Data items of category alphabetic, alphanumeric, alphanumeric-edited, DBCS, external floating-point, national, national-edited, and numeric-edited are not converted.
  • Fixed-point numeric data items other than COMPUTATIONAL-5 (COMP-5) binary data items or binary data items compiled with the TRUNC(BIN) compiler option are converted as if they were moved to a numeric-edited item that has:
    • As many integer positions as the numeric item has, but with at least one integer position
    • An explicit decimal point, if the numeric item has at least one decimal position
    • The same number of decimal positions as the numeric item has
    • A leading '-' picture symbol if the data item is signed (has an S in its PICTURE clause)
  • COMPUTATIONAL-5 (COMP-5) binary data items or binary data items compiled with the TRUNC(BIN) compiler option are converted in the same way as the other fixed-point numeric items, except for the number of integer positions. The number of integer positions is computed depending on the number of '9' symbols in the picture character string as follows:
    • 5 minus the number of decimal places, if the data item has 1 to 4 '9' picture symbols
    • 10 minus the number of decimal places, if the data item has 5 to 9 '9' picture symbols
    • 20 minus the number of decimal places, if the data item has 10 to 18 '9' picture symbols
  • Internal floating-point data items are converted as if they were moved to a data item as follows:
    • For COMP-1: an external floating-point data item with PICTURE -9.9(8)E+99
    • For COMP-2: an external floating-point data item with PICTURE -9.9(17)E+99 (a picture that would be illegal in source code because of the number of digit positions)
  • Index data items are converted as if they were declared USAGE COMP-5 PICTURE S9(9).

After any conversion to character format, leading and trailing spaces and leading zeroes are eliminated, as described under Trimming of generated XML data.

If a data item after any conversion contains any characters that are illegal in XML content, as specified in the relevant XML specification, an exception is generated. See Handling errors in generating XML documents (Enterprise COBOL Programming Guide) for details.

Any remaining instances of the five characters & (ampersand), ' (apostrophe), > (greater-than sign), < (less-than sign), and " (quotation mark) are converted into the equivalent XML references '&amp;', '&apos;', '&gt;', '&lt;', and '&quot;', respectively.

Then, if identifier-1 is a data item of category national, any nonnational values are converted to national format.

Any remaining Unicode character represented by two UTF-16 encoding units (a surrogate pair) is replaced by an XML character reference. For example, the surrogate pair (NX'D802', NX'DC13') is replaced by the reference '&#x10813;'.

Trimming of generated XML data

Trimming is performed on data values after their conversion to character format. (Conversion is described under Format conversion of elementary data.)

For values converted from signed numeric values, the leading space is removed if the value is positive.

For values converted from numeric items, leading zeroes (after any initial minus sign) up to but not including the digit immediately before the actual or implied decimal point are eliminated. Trailing zeroes after a decimal point are retained. For example:

  • -012.340 becomes -12.340.
  • 0000.45 becomes 0.45.
  • 0013 becomes 13.
  • 0000 becomes 0.

Character values from data items of class alphabetic, alphanumeric, DBCS, and national have either trailing or leading spaces removed, depending on whether the corresponding data items have left (default) or right justification, respectively. That is, trailing spaces are removed from values whose corresponding data items do not specify the JUSTIFIED clause. Leading spaces are removed from values whose data items do specify the JUSTIFIED clause. If a character value consists solely of spaces, one space remains as the value after trimming is finished.
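
The conversion and trimming rules can be illustrated with a small sketch. The program name GENTRIM and the data-names Order-Rec, Amt, Qty, Memo, Doc-Area, and Doc-Len are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENTRIM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure (identifier-2).
       01  Order-Rec.
      * Signed fixed-point: converted as if moved to -999.999.
           05  Amt         PIC S9(3)V9(3) VALUE -12.34.
      * Unsigned integer: leading zeroes are trimmed.
           05  Qty         PIC 9(4)       VALUE 13.
      * Alphanumeric: trailing spaces trimmed, & is escaped.
           05  Memo        PIC X(10)      VALUE 'A & B'.
       01  Doc-Area        PIC X(200).
       01  Doc-Len         PIC 9(4) COMP-5.
       PROCEDURE DIVISION.
           XML GENERATE Doc-Area FROM Order-Rec
               COUNT IN Doc-Len
           END-XML
           IF XML-CODE = 0
               DISPLAY Doc-Area(1:Doc-Len)
           END-IF
           GOBACK.
```

Applying the rules above by hand, Amt should yield -12.340 (the trailing zero after the decimal point is retained), Qty should yield 13, and Memo should yield A &amp;amp; B, giving <Order-Rec><Amt>-12.340</Amt><Qty>13</Qty><Memo>A &amp;amp; B</Memo></Order-Rec>.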

XML element name and attribute name formation

In the XML documents that are generated from identifier-2, the XML element names and attribute names are derived from the name of the data item specified by identifier-2 and from any eligible data-names that are subordinate to identifier-2 as follows:

  • The exact mixed-case spelling of data-names from the data description entry is retained. The spellings from any references to data items (for example, in an OCCURS DEPENDING ON clause) are not used.
  • Data-names that start with a digit are prefixed by an underscore. For example, the data-name '3D' becomes XML tag or attribute name '_3D'.
  • Data-names that start with the characters 'xml', in any combination of uppercase and lowercase, are prefixed by an underscore. For example, the data-name 'Xml' becomes XML tag or attribute name '_Xml'.

DBCS data-names, when translated to Unicode, must be legal as names in the XML specification, version 1.0.
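
The name formation rules can be sketched with data-names that trigger the underscore prefix. The program name GENNAME and the data-names Shape, 3D, Xml-Kind, Doc-Area, and Doc-Len are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GENNAME.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure (identifier-2).
       01  Shape.
      * Starts with a digit: the tag name gains an underscore.
           05  3D          PIC X     VALUE 'Y'.
      * Starts with 'xml' in some case: also gains an underscore.
           05  Xml-Kind    PIC X(4)  VALUE 'cube'.
       01  Doc-Area        PIC X(200).
       01  Doc-Len         PIC 9(4) COMP-5.
       PROCEDURE DIVISION.
           XML GENERATE Doc-Area FROM Shape
               COUNT IN Doc-Len
           END-XML
           IF XML-CODE = 0
               DISPLAY Doc-Area(1:Doc-Len)
           END-IF
           GOBACK.
```

Following the rules above, the generated document should be <Shape><_3D>Y</_3D><_Xml-Kind>cube</_Xml-Kind></Shape>, with the mixed-case spelling of each data-name taken from its data description entry.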

