The XML GENERATE statement converts data to XML format.
Format

>>-XML GENERATE--identifier-1--FROM--identifier-2--------------->

>--+-----------------------------+------------------------------>
   '-COUNT--+----+--identifier-3-'
            '-IN-'

>--+------------------------------+----------------------------->
   '-+------+--ENCODING--codepage-'
     '-WITH-'

>--+---------------------------+--+----------------------+------>
   '-+------+--XML-DECLARATION-'  '-+------+--ATTRIBUTES-'
     '-WITH-'                       '-WITH-'

>--+-------------------------------------------------------------------------------------+-->
   '-NAMESPACE--+----+--+-identifier-4-+--+--------------------------------------------+-'
                '-IS-'  '-literal-4----'  '-NAMESPACE-PREFIX--+----+--+-identifier-5-+-'
                                                              '-IS-'  '-literal-5----'

>--+-------------------------------------------+---------------->
   '-+----+--EXCEPTION--imperative-statement-1-'
     '-ON-'

>--+------------------------------------------------+----------->
   '-NOT--+----+--EXCEPTION--imperative-statement-2-'
          '-ON-'

>--+---------+-------------------------------------------------><
   '-END-XML-'
When identifier-1 references a national group item, identifier-1 is processed as an elementary data item of category national. When identifier-1 references an alphanumeric group item, identifier-1 is treated as though it were an elementary data item of category alphanumeric.
identifier-1 must not be described with the JUSTIFIED clause, and cannot be a function identifier. identifier-1 can be subscripted or reference modified.
identifier-1 must not overlap identifier-2, identifier-3, codepage (if an identifier), identifier-4, or identifier-5.
The generated XML output is encoded as described in the documentation of the ENCODING phrase below.
identifier-1 must reference a data item of category national, or the ENCODING phrase must specify 1208, if any of the following statements is true:
identifier-1 must be large enough to contain the generated XML document. Typically, it should be from five to 10 times the size of identifier-2, depending on the length of the data-name or data-names within identifier-2. If identifier-1 is not large enough, an error condition exists at the end of the XML GENERATE statement.
If identifier-2 references a national group item, identifier-2 is processed as a group item. When identifier-2 includes a subordinate national group item, that subordinate item is processed as a group item.
identifier-2 cannot be a function identifier or be reference modified, but it can be subscripted.
identifier-2 must not overlap identifier-1 or identifier-3.
identifier-2 must not specify the RENAMES clause.
The following data items specified by identifier-2 are ignored by the XML GENERATE statement:
All data items specified by identifier-2 that are not ignored according to the rules above must satisfy the following conditions:
For example, consider the following data declaration:
01 STRUCT.
   02 STAT PIC X(4).
   02 IN-AREA PIC X(100).
   02 OK-AREA REDEFINES IN-AREA.
      03 FLAGS PIC X.
      03 PIC X(3).
      03 COUNTER USAGE COMP-5 PIC S9(9).
      03 ASFNPTR REDEFINES COUNTER USAGE FUNCTION-POINTER.
      03 UNREFERENCED PIC X(92).
   02 NG-AREA1 REDEFINES IN-AREA.
      03 FLAGS PIC X.
      03 PIC X(3).
      03 PTR USAGE POINTER.
      03 ASNUM REDEFINES PTR USAGE COMP-5 PIC S9(9).
      03 PIC X(92).
   02 NG-AREA2 REDEFINES IN-AREA.
      03 FN-CODE PIC X.
      03 UNREFERENCED PIC X(3).
      03 QTYONHAND USAGE BINARY PIC 9(5).
      03 DESC USAGE NATIONAL PIC N(40).
      03 UNREFERENCED PIC X(12).
The following data items from the example above can be specified as identifier-2:
The following data items cannot be specified as identifier-2:
identifier-3 must not overlap identifier-1, identifier-2, codepage (if an identifier), identifier-4, or identifier-5.
If an identifier, codepage must not overlap identifier-1 or identifier-3.
If identifier-1 references a data item of category national, codepage must specify 1200, the CCSID for Unicode UTF-16.
If identifier-1 references a data item of category alphanumeric, codepage must specify 1208 or the CCSID of a supported EBCDIC code page as listed in Coded character sets for XML documents (Enterprise COBOL Programming Guide).
If the ENCODING phrase is omitted and identifier-1 is of category national, the document encoding is Unicode UTF-16, CCSID 1200.
A byte order mark is not generated for XML documents that have Unicode encoding.
If the ENCODING phrase is omitted and identifier-1 is of category alphanumeric, the XML document is encoded using the code page specified by the CODEPAGE compiler option in effect when the source code was compiled.
If identifier-1 is of category national, the encoding declaration has the value UTF-16 (encoding="UTF-16").
If identifier-1 is of category alphanumeric, the encoding declaration is derived from the ENCODING phrase, if specified, or from the CODEPAGE compiler option in effect for the program if the ENCODING phrase is not specified.
For an example of the effect of coding the XML-DECLARATION phrase, see Generating XML output (Enterprise COBOL Programming Guide).
If the XML-DECLARATION phrase is omitted, the generated XML document does not include an XML declaration.
For an example of the effect of the ATTRIBUTES phrase, see Generating XML output (Enterprise COBOL Programming Guide).
Use the NAMESPACE-PREFIX phrase to qualify the start and end tag of each element in the generated XML document with a prefix.
If the NAMESPACE-PREFIX phrase is not specified, or if identifier-5 is of length zero or contains all spaces, the namespace specified by the NAMESPACE phrase specifies the default namespace for the document. In this case, the namespace declared on the root element applies by default to each element name in the document, including that of the root element. (Default namespace declarations do not apply directly to attribute names.)
If the NAMESPACE-PREFIX phrase is specified, and identifier-5 is not of length zero and does not contain all spaces, then the start and end tag of each element in the generated document is qualified with the specified prefix. The prefix should therefore preferably be short. When the XML GENERATE statement is executed, the prefix must be a valid XML name, but without the colon (:), as defined in Namespaces in XML 1.0. The prefix can have trailing spaces, which are removed before use.
identifier-5, literal-5: The namespace prefix, which serves as an alias for the namespace identifier.
identifier-4 and identifier-5 must reference data items of category alphanumeric or national.
identifier-4 and identifier-5 must not overlap identifier-1 or identifier-3.
literal-4 and literal-5 must be of category alphanumeric or national, and must not be figurative constants.
For full details about namespaces, see Namespaces in XML 1.0.
For examples that show the use of the NAMESPACE and NAMESPACE-PREFIX phrases, see Generating XML output (Enterprise COBOL Programming Guide).
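As a rough illustration of the NAMESPACE and NAMESPACE-PREFIX phrases described above, the following sketch generates a document with an XML declaration and a prefixed namespace. The names ITEM-REC, XML-OUT, and XML-LEN, the namespace URI, and the prefix 'it' are hypothetical and chosen only for this example. Fixed-format source is assumed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLNS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * ITEM-REC, XML-OUT, XML-LEN and the URI are hypothetical.
       01  ITEM-REC.
           05  ITEM-NAME    PIC X(30) VALUE 'Widget'.
           05  ITEM-QTY     PIC 9(4)  VALUE 12.
       01  XML-OUT          PIC X(500).
       01  XML-LEN          PIC 9(9) BINARY.
       PROCEDURE DIVISION.
      * Phrases appear in the order given by the syntax diagram.
           XML GENERATE XML-OUT FROM ITEM-REC
               COUNT IN XML-LEN
               WITH XML-DECLARATION
               NAMESPACE IS 'http://example.com/items'
               NAMESPACE-PREFIX IS 'it'
           END-XML
      * Display only the generated part of the receiving area.
           IF XML-CODE = 0
               DISPLAY XML-OUT(1:XML-LEN)
           END-IF
           GOBACK.
```

Under the rules above, every start and end tag in the output should carry the prefix (for instance it:ITEM-NAME). Omitting the NAMESPACE-PREFIX phrase would instead make the URI the default namespace of the document.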
If the ON EXCEPTION phrase is specified, control is transferred to imperative-statement-1. If the ON EXCEPTION phrase is not specified, the NOT ON EXCEPTION phrase, if any, is ignored, and control is transferred to the end of the XML GENERATE statement. Special register XML-CODE contains an exception code, as detailed in Handling errors in generating XML documents (Enterprise COBOL Programming Guide).
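The control flow described above can be sketched as follows. This is a minimal example, not a recommended error-handling design: the receiving area XML-OUT is deliberately declared too small so that the exception path is taken, and the names ITEM-REC and XML-OUT are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLEXC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure.
       01  ITEM-REC.
           05  ITEM-NAME    PIC X(30) VALUE 'Widget'.
           05  ITEM-QTY     PIC 9(4)  VALUE 12.
      * Deliberately too small, to drive the exception path.
       01  XML-OUT          PIC X(10).
       PROCEDURE DIVISION.
           XML GENERATE XML-OUT FROM ITEM-REC
               ON EXCEPTION
      * XML-CODE holds the nonzero exception code here.
                   DISPLAY 'Generate failed, XML-CODE=' XML-CODE
               NOT ON EXCEPTION
                   DISPLAY 'Generate succeeded'
           END-XML
           GOBACK.
```

Enlarging XML-OUT (for example to PIC X(500)) should cause the NOT ON EXCEPTION branch to run instead.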
The scope of a conditional XML GENERATE or XML PARSE statement can be terminated by:
END-XML can also be used with an XML GENERATE or XML PARSE statement that does not specify either the ON EXCEPTION or the NOT ON EXCEPTION phrase.
For more information on explicit scope terminators, see Delimited scope statements.
When a given XML GENERATE or XML PARSE statement appears as imperative-statement-1 or imperative-statement-2, or as part of imperative-statement-1 or imperative-statement-2 of another XML GENERATE or XML PARSE statement, that given XML GENERATE or XML PARSE statement is a nested XML GENERATE or XML PARSE statement.
Nested XML GENERATE or XML PARSE statements are considered to be matched XML GENERATE and END-XML combinations, or XML PARSE and END-XML combinations, proceeding from left to right. Thus, any END-XML phrase that is encountered is matched with the nearest preceding XML GENERATE or XML PARSE statement that has not been implicitly or explicitly terminated.
The content of each eligible elementary data item within identifier-2 is converted to character format as described under Format conversion of elementary data and Trimming of generated XML data. Only the first definition of each storage area is processed. Redefinitions of data items are not included. Data items that are effectively defined by the RENAMES clause are also not included.
The converted content is then inserted as element character content, or, if the ATTRIBUTES phrase is specified and the data item is eligible to be expressed as an attribute, as the value of the attribute, in the generated XML document.
The XML element names and attribute names are derived from the data-names within identifier-2 as described under XML element name and attribute name formation. The names of group items that contain the selected elementary items are retained as parent elements. If the NAMESPACE-PREFIX phrase is specified, the prefix value, minus any trailing spaces, is used to qualify the start and end tag of each element.
No extra white space (new lines, indentation, and so forth) is inserted to make the generated XML more readable. An XML declaration is generated only if the XML-DECLARATION phrase is specified.
If the receiving area specified by identifier-1 is not large enough to contain the resulting XML document, an error condition exists. See the description of the ON EXCEPTION phrase above for details.
If identifier-1 is longer than the generated XML document, only that part of identifier-1 in which XML is generated is changed. The rest of identifier-1 contains the data that was present before this execution of the XML GENERATE statement. To avoid referring to that data, either initialize identifier-1 to spaces before the XML GENERATE statement or specify the COUNT IN phrase.
If the COUNT IN phrase is specified, identifier-3 contains (after execution of the XML GENERATE statement) the total number of character positions (UTF-16 encoding units or bytes) that were generated. You can use identifier-3 as a reference modification length field to refer to the part of identifier-1 that contains the generated XML document.
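A minimal sketch of the COUNT IN technique follows. The names ORDER-REC, XML-OUT, and XML-LEN are hypothetical. The count field is used as the length in a reference modifier on the receiving area, so that any data left over from before the statement executed is never displayed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLCNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical source structure and receiving area.
       01  ORDER-REC.
           05  ORDER-ID     PIC 9(5)  VALUE 42.
           05  CUSTOMER     PIC X(20) VALUE 'Ada'.
       01  XML-OUT          PIC X(500).
       01  XML-LEN          PIC 9(9) BINARY.
       PROCEDURE DIVISION.
           XML GENERATE XML-OUT FROM ORDER-REC
               COUNT IN XML-LEN
           END-XML
      * Only the first XML-LEN positions hold generated XML.
           IF XML-CODE = 0
               DISPLAY XML-OUT(1:XML-LEN)
           END-IF
           GOBACK.
```

The alternative mentioned above, moving SPACES to XML-OUT before the statement, also hides stale data but leaves trailing spaces in whatever is written or displayed.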
After execution of the XML GENERATE statement, special register XML-CODE contains either zero, which indicates successful completion, or a nonzero exception code. See Handling errors in generating XML documents (Enterprise COBOL Programming Guide) for details.
The XML PARSE statement also uses special register XML-CODE. Therefore, if you code an XML GENERATE statement in the processing procedure of an XML PARSE statement, save the value of XML-CODE before that XML GENERATE statement executes and restore the saved value after the XML GENERATE statement terminates.
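The save-and-restore pattern might look like the sketch below. The document text, the paragraph name HANDLE-EVENT, and the data names IN-DOC, REPLY-REC, XML-OUT, XML-LEN, and SAVED-XML-CODE are all hypothetical. The choice of the END-OF-DOCUMENT event is only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLSAVE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical input document, sized to fit exactly.
       01  IN-DOC           PIC X(29)
                            VALUE '<msg><text>hello</text></msg>'.
       01  REPLY-REC.
           05  STATUS-TEXT  PIC X(10) VALUE 'received'.
       01  XML-OUT          PIC X(200).
       01  XML-LEN          PIC 9(9) BINARY.
       01  SAVED-XML-CODE   PIC S9(9) BINARY.
       PROCEDURE DIVISION.
           XML PARSE IN-DOC PROCESSING PROCEDURE HANDLE-EVENT
           END-XML
           GOBACK.
       HANDLE-EVENT.
           IF XML-EVENT = 'END-OF-DOCUMENT'
      * Preserve the parser's XML-CODE across XML GENERATE.
               MOVE XML-CODE TO SAVED-XML-CODE
               XML GENERATE XML-OUT FROM REPLY-REC
                   COUNT IN XML-LEN
               END-XML
      * Hand the original value back to the parser.
               MOVE SAVED-XML-CODE TO XML-CODE
           END-IF.
```

If the outcome of the nested XML GENERATE matters, it would need to be examined before the restoring MOVE, because that MOVE overwrites it.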
Elementary data items are converted to character format depending on the type of the data item:
After any conversion to character format, leading and trailing spaces and leading zeroes are eliminated, as described under Trimming of generated XML data.
If a data item after any conversion contains any characters that are illegal in XML content, as specified in the relevant XML specification, an exception is generated. See Handling errors in generating XML documents (Enterprise COBOL Programming Guide) for details.
Any remaining instances of the five characters & (ampersand), ' (apostrophe), > (greater-than sign), < (less-than sign), and " (quotation mark) are converted into the equivalent XML references '&amp;', '&apos;', '&gt;', '&lt;', and '&quot;', respectively.
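To see the character conversion in practice, the following sketch generates XML from a value that contains an ampersand, a less-than sign, and a greater-than sign. The names NOTE-REC, NOTE-TEXT, XML-OUT, and XML-LEN are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLESC.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical data containing XML markup characters.
       01  NOTE-REC.
           05  NOTE-TEXT    PIC X(20) VALUE 'R&D <beta>'.
       01  XML-OUT          PIC X(200).
       01  XML-LEN          PIC 9(9) BINARY.
       PROCEDURE DIVISION.
           XML GENERATE XML-OUT FROM NOTE-REC
               COUNT IN XML-LEN
           END-XML
           IF XML-CODE = 0
               DISPLAY XML-OUT(1:XML-LEN)
           END-IF
           GOBACK.
```

By the rule above, the element content should appear with &amp;, &lt;, and &gt; in place of the three special characters, which is one reason the receiving area needs to be larger than the source data.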
Then, if identifier-1 is a data item of category national, any nonnational values are converted to national format.
Any remaining Unicode character represented by two UTF-16 encoding units (a surrogate pair) is replaced by an XML character reference. For example, the surrogate pair (NX'D802', NX'DC13') is replaced by the reference '&#x10813;'.
Trimming is performed on data values after their conversion to character format. (Conversion is described under Format conversion of elementary data.)
For values converted from signed numeric values, the leading space is removed if the value is positive.
For values converted from numeric items, leading zeroes (after any initial minus sign) up to but not including the digit immediately before the actual or implied decimal point are eliminated. Trailing zeroes after a decimal point are retained. For example:
Character values from data items of class alphabetic, alphanumeric, DBCS, and national have either trailing or leading spaces removed, depending on whether the corresponding data items have left (default) or right justification, respectively. That is, trailing spaces are removed from values whose corresponding data items do not specify the JUSTIFIED clause. Leading spaces are removed from values whose data items do specify the JUSTIFIED clause. If a character value consists solely of spaces, one space remains as the value after trimming is finished.
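The trimming rules can be explored with a sketch such as the following. All data names are hypothetical. RIGHT-TEXT is filled with a MOVE statement rather than a VALUE clause because the VALUE clause does not apply right justification to the initial value.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. XMLTRIM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical items, one per trimming rule.
       01  TRIM-REC.
           05  LEFT-TEXT    PIC X(10) VALUE 'abc'.
           05  RIGHT-TEXT   PIC X(10) JUSTIFIED RIGHT.
           05  BLANK-TEXT   PIC X(5)  VALUE SPACES.
           05  ITEM-COUNT   PIC 9(5)  VALUE 42.
           05  AMOUNT       PIC S9(3)V99 VALUE -1.5.
       01  XML-OUT          PIC X(500).
       01  XML-LEN          PIC 9(9) BINARY.
       PROCEDURE DIVISION.
      * MOVE honors JUSTIFIED, giving leading spaces to trim.
           MOVE 'abc' TO RIGHT-TEXT
           XML GENERATE XML-OUT FROM TRIM-REC
               COUNT IN XML-LEN
           END-XML
           IF XML-CODE = 0
               DISPLAY XML-OUT(1:XML-LEN)
           END-IF
           GOBACK.
```

Reading the rules above, LEFT-TEXT and RIGHT-TEXT should both yield abc, BLANK-TEXT a single space, ITEM-COUNT the value 42, and AMOUNT the value -1.50 with its trailing zero retained.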
In the XML documents that are generated from identifier-2, the XML element names and attribute names are derived from the names of the data item specified by identifier-2 and from any eligible data-names that are subordinate to identifier-2 as follows:
DBCS data-names, when translated to Unicode, must be legal as names in the XML specification, version 1.0.