ILE COBOL Language Reference

Coded character sets for XML documents

XML PARSE supports XML documents in national data items, in alphanumeric data items, and in IFS files with UCS-2 or single-byte CCSIDs. Documents in national data items must be encoded in Unicode UCS-2, CCSID 13488. Documents in alphanumeric data items must be encoded in one of the explicitly supported single-byte EBCDIC CCSIDs shown in Supported EBCDIC CCSIDs for XML documents (Table 34) or in one of the ASCII CCSIDs shown in Supported ASCII CCSIDs for XML documents (Table 35).


Table 34. Supported EBCDIC CCSIDs for XML documents

CCSID       Description
1140, 37    USA, Canada, etc. Euro Country Extended CCSID (ECECP), Country Extended CCSID (CECP)
1141, 273   Austria, Germany ECECP, CECP
1142, 277   Denmark, Norway ECECP, CECP
1143, 278   Finland, Sweden ECECP, CECP
1144, 280   Italy ECECP, CECP
1145, 284   Spain, Latin America (Spanish) ECECP, CECP
1146, 285   UK ECECP, CECP
1147, 297   France ECECP, CECP
1148, 500   International ECECP, CECP
1149, 871   Iceland ECECP, CECP


Table 35. Supported ASCII CCSIDs for XML documents

CCSID       Description
813         ISO 8859-7 Greek / Latin
819         ISO 8859-1 Latin 1 / Open Systems
920         ISO 8859-9 Latin 5 (ECMA-128, Turkey TS-5881)

When you parse ASCII XML documents, the document fragments passed to the processing procedure in special register XML-TEXT are encoded in ASCII. Because ILE COBOL operations such as move and comparison rely on EBCDIC encoding or on national characters to work correctly, you must convert the document fragments before using them. When the XML document is in a COBOL program, first use the MOVE statement to convert each fragment from the ASCII CCSID of the XML document to national characters. Then, if necessary, use the MOVE statement again to convert the result from national characters to EBCDIC.
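
The reason for the two-step conversion can be illustrated at the byte level. The following Python sketch is not ILE COBOL and does not show how the MOVE statement is coded; it only models why an ASCII fragment cannot be compared directly with EBCDIC data. It assumes that the Python codecs latin_1, cp037, and utf_16_be correspond to CCSID 819, CCSID 37, and UCS-2 CCSID 13488 (utf_16_be matches UCS-2 only for characters in the Basic Multilingual Plane). The function names are hypothetical.

```python
# Illustration only: Python codecs stand in for the CCSID conversions
# that the MOVE statement performs in ILE COBOL.
ASCII_CCSID_819 = "latin_1"      # assumption: ISO 8859-1, CCSID 819
EBCDIC_CCSID_37 = "cp037"        # assumption: EBCDIC, CCSID 37
UCS2_CCSID_13488 = "utf_16_be"   # assumption: stand-in for UCS-2 (BMP only)


def ascii_to_national(fragment: bytes) -> bytes:
    """Convert an ASCII (CCSID 819) fragment to big-endian UCS-2 bytes."""
    return fragment.decode(ASCII_CCSID_819).encode(UCS2_CCSID_13488)


def national_to_ebcdic(fragment: bytes) -> bytes:
    """Convert big-endian UCS-2 bytes to EBCDIC (CCSID 37) bytes."""
    return fragment.decode(UCS2_CCSID_13488).encode(EBCDIC_CCSID_37)


ascii_fragment = "Order".encode(ASCII_CCSID_819)   # as a parser would return it
ebcdic_literal = "Order".encode(EBCDIC_CCSID_37)   # as an EBCDIC program stores it

# The same text has different byte values in the two encodings,
# so a byte-wise comparison of the unconverted fragment fails.
mismatch = ascii_fragment != ebcdic_literal

# After ASCII -> national -> EBCDIC, the bytes agree.
converted = national_to_ebcdic(ascii_to_national(ascii_fragment))
matches_after_conversion = converted == ebcdic_literal
```

In this model the intermediate national form is needed only because it is the common representation between the two single-byte encodings; if the program works with national data items throughout, the second step can be skipped.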

XML documents in a COBOL program that are encoded in other CCSIDs can be parsed after you convert them to national characters using the MOVE statement. The individual pieces of document text passed to the processing procedure in special register XML-NTEXT can then be converted back to the original CCSID, as necessary, using the MOVE statement.
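
As a rough illustration of this round trip, the Python sketch below converts a whole document from a CCSID that is not in the supported tables to national characters, then converts one fragment back. It is not ILE COBOL. It assumes that the Python codec iso8859_2 stands in for CCSID 912 and that utf_16_be stands in for UCS-2 CCSID 13488; the function names and the sample document are hypothetical.

```python
# Illustration only: models converting a document to national characters
# before parsing, and converting a returned fragment back afterwards.
SOURCE_CODEC = "iso8859_2"     # assumption: stands in for CCSID 912
NATIONAL_CODEC = "utf_16_be"   # assumption: stand-in for UCS-2 (BMP only)


def to_national(document: bytes) -> bytes:
    """Convert the whole document from the source CCSID to national characters."""
    return document.decode(SOURCE_CODEC).encode(NATIONAL_CODEC)


def to_source(national_fragment: bytes) -> bytes:
    """Convert a national-character fragment back to the source CCSID."""
    return national_fragment.decode(NATIONAL_CODEC).encode(SOURCE_CODEC)


document = "<city>Łódź</city>".encode(SOURCE_CODEC)
national_document = to_national(document)

# Each national character occupies two bytes, so the four characters of
# element content at character offsets 6 to 9 are bytes 12 to 19.
national_fragment = national_document[12:20]
original_fragment = to_source(national_fragment)
```

The conversion back is lossless here because every character in the fragment exists in the source CCSID; that is always true for text that originated in that CCSID.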

When the XML document is in an IFS file, use the Copy Object (CPY) command to do the CCSID conversion. To make it easier to work with document fragments returned from the parser, it is recommended that you do the following before you use the document in an XML PARSE statement:

  1. Remove any characters that precede the '<' of the tag at the start of each XML record.
  2. Ensure that each line in the IFS file ends with only a CR (carriage return) and not a LF (line feed).
  3. Convert the XML document to UCS-2 (CCSID 13488) or to the CCSID of the job.
  4. Manually change the encoding declaration in the XML document to specify the document's actual CCSID.
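
The intended result of these four steps can be sketched in Python. This is not a replacement for the CPY command, which performs the CCSID conversion on the system; it only models what the prepared document looks like. The function name and its parameters are hypothetical, utf_16_be is assumed to stand in for UCS-2 CCSID 13488, and the sketch reads step 1 as "drop everything before the first '<' in a record", which suits documents in which every record begins with a tag. The encoding name to declare is supplied by the caller, because the names that the parser accepts are described in the ILE COBOL Programmer's Guide, not here.

```python
import re


def prepare_xml_records(raw: bytes, source_codec: str, declared_name: str) -> bytes:
    """Apply the four preparation steps to an XML document held as bytes.

    source_codec  -- Python codec assumed to match the file's current CCSID
    declared_name -- encoding name to place in the XML declaration (caller's choice)
    """
    text = raw.decode(source_codec)

    # Split into records on CR LF, CR, or LF; drop the empty tail that a
    # trailing line ending produces.
    records = re.split(r"\r\n|\r|\n", text)
    if records and records[-1] == "":
        records.pop()

    # Step 1: remove characters that precede the first '<' in each record.
    # Records without a '<' are left unchanged.
    cleaned = [rec[rec.index("<"):] if "<" in rec else rec for rec in records]

    # Step 2: end every record with CR only.
    text = "".join(rec + "\r" for rec in cleaned)

    # Step 4: if an encoding declaration is present, make it name the new
    # encoding. A function is used as the replacement so that the name is
    # inserted literally.
    text = re.sub(
        r'encoding\s*=\s*(["\'])[^"\']*\1',
        lambda match: f'encoding="{declared_name}"',
        text,
        count=1,
    )

    # Step 3: convert to big-endian UCS-2 (modeled here with utf_16_be).
    return text.encode("utf_16_be")
```

Step 4 is applied before step 3 in the sketch only because it is simpler to edit the declaration while the document is still a character string; the order in which the steps are listed above does not imply an order of execution.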

See the ILE COBOL Programmer's Guide for details on specifying the document encoding and how the parser determines encoding.

