Example: processing XML events

Although short, the following sample XML document contains many XML events. These events are listed below the document in the order in which they would occur during parsing, and the exact text for each XML event is delimited by double angle brackets (<< and >>).

In general, the text for an event can be the content of either the XML-TEXT or the XML-NTEXT special register. The sample XML document below does not contain any text that requires XML-NTEXT, however, and thus uses only XML-TEXT.

This example begins with an XML declaration. If an XML declaration occurs in a document that you are parsing, the declaration must begin in the first byte of the document; otherwise, the parser generates an exception code. The attribute names that are coded in the XML declaration must be in all lowercase characters.

<?xml version="1.0" encoding="ibm-1140" standalone="yes" ?>
<!--This document is just an example-->
<sandwich>
  <bread type="baker&apos;s best" />
  <?spread please use real mayonnaise ?>
  <meat>Ham &amp; turkey</meat>
  <filling>Cheese, lettuce, tomato, etc.</filling>
  <![CDATA[We should add a <relish> element in future!]]>
</sandwich>junk
START-OF-DOCUMENT
The text for this sample is 336 characters in length.
VERSION-INFORMATION
<<1.0>>
ENCODING-DECLARATION
<<ibm-1140>>
STANDALONE-DECLARATION
<<yes>>
DOCUMENT-TYPE-DECLARATION
The sample does not have a document type declaration.
COMMENT
<<This document is just an example>>
START-OF-ELEMENT
In the order in which they occur as START-OF-ELEMENT events:
  1. <<sandwich>>
  2. <<bread>>
  3. <<meat>>
  4. <<filling>>
ATTRIBUTE-NAME
<<type>>
ATTRIBUTE-CHARACTERS
In the order in which they occur as ATTRIBUTE-CHARACTERS events:
  1. <<baker>>
  2. <<s best>>

Notice that the value of the type attribute in the sample consists of three fragments: the string 'baker', the single character ' (written in the document as the entity reference &apos;), and the string 's best'. The single-character fragment ' is passed separately as an ATTRIBUTE-CHARACTER event.

ATTRIBUTE-CHARACTER
<<'>>
ATTRIBUTE-NATIONAL-CHARACTER
The sample does not contain a numeric character reference.
END-OF-ELEMENT
In the order in which they occur as END-OF-ELEMENT events:
  1. <<bread>>
  2. <<meat>>
  3. <<filling>>
  4. <<sandwich>>
PROCESSING-INSTRUCTION-TARGET
<<spread>>
PROCESSING-INSTRUCTION-DATA
<<please use real mayonnaise >>
CONTENT-CHARACTERS
In the order in which they occur as CONTENT-CHARACTERS events:
  1. <<Ham >>
  2. << turkey>>
  3. <<Cheese, lettuce, tomato, etc.>>
  4. <<We should add a <relish> element in future!>>

Notice that the content of the meat element in the sample consists of three fragments: the string 'Ham ', the single character & (written in the document as the entity reference &amp;), and the string ' turkey'. The single-character fragment & is passed separately as a CONTENT-CHARACTER event. Also notice the trailing space in 'Ham ' and the leading space in ' turkey'.
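
The way a predefined entity reference splits attribute or content text into separate events can be illustrated outside COBOL. The following Python sketch is only a model of the behavior described above, not the parser's implementation: the function name split_fragments and the PREDEFINED table are hypothetical, and the sketch handles only the five predefined entity references, not numeric character references or unknown references.

```python
import re

# The five predefined XML entity references and the characters they stand for.
PREDEFINED = {"&amp;": "&", "&lt;": "<", "&gt;": ">", "&apos;": "'", "&quot;": '"'}

# Matches any one of the predefined references.
_REF = re.compile("|".join(re.escape(ref) for ref in PREDEFINED))


def split_fragments(raw, kind):
    """Split raw text into (event name, text) pairs.

    kind is "ATTRIBUTE" or "CONTENT". Plain runs of text become
    <kind>-CHARACTERS events; each predefined reference becomes a
    single-character <kind>-CHARACTER event.
    """
    events = []
    pos = 0
    for match in _REF.finditer(raw):
        if match.start() > pos:
            events.append((kind + "-CHARACTERS", raw[pos:match.start()]))
        events.append((kind + "-CHARACTER", PREDEFINED[match.group()]))
        pos = match.end()
    if pos < len(raw):
        events.append((kind + "-CHARACTERS", raw[pos:]))
    return events


if __name__ == "__main__":
    for raw, kind in (("baker&apos;s best", "ATTRIBUTE"), ("Ham &amp; turkey", "CONTENT")):
        for event, text in split_fragments(raw, kind):
            print(f"{event:<22} <<{text}>>")
```

Text that contains no reference, such as the content of the filling element, comes back from this sketch as a single CHARACTERS fragment.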

CONTENT-CHARACTER
<<&>>
CONTENT-NATIONAL-CHARACTER
The sample does not contain a numeric character reference.
START-OF-CDATA-SECTION
<<<![CDATA[>>
END-OF-CDATA-SECTION
<<]]>>>
UNKNOWN-REFERENCE-IN-ATTRIBUTE
The sample does not have any unknown entity references.
UNKNOWN-REFERENCE-IN-CONTENT
The sample does not have any unknown entity references.
END-OF-DOCUMENT
XML text is empty for the END-OF-DOCUMENT event.
EXCEPTION
XML text contains the part of the document that was parsed, up to and including the point where the exception (the superfluous 'junk' after the </sandwich> tag) was detected.
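
Text that follows the end tag of the root element is an error in any conforming XML parser, not only in the COBOL one. As a rough analogy, the Python sketch below feeds a shortened stand-in for the sample to the standard library's xml.etree.ElementTree parser and reports the resulting error. The helper name parse_or_report and the shortened document are assumptions made for illustration; the messages and positions come from Python's parser and do not correspond to the exception codes of the COBOL XML parser.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the sample: well formed up to the root end tag,
# then followed by superfluous text.
DOCUMENT = "<sandwich><meat>Ham &amp; turkey</meat></sandwich>junk"


def parse_or_report(document):
    """Return (root, None) on success or (None, message) on a parse error."""
    try:
        return ET.fromstring(document), None
    except ET.ParseError as err:
        # err.position is a (line, column) pair locating the problem.
        line, column = err.position
        return None, f"{err} (reported at line {line}, column {column})"


if __name__ == "__main__":
    root, problem = parse_or_report(DOCUMENT)
    print(problem if problem else root.tag)
```

Removing the trailing 'junk' from DOCUMENT lets the same call succeed and return the sandwich element.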