During the SAX parse of your XML document, several XML events will
be passed to your XML-SAX handling procedure. To identify the events
within your procedure, use the special names starting with *XML, for
example *XML_START_ELEMENT.
For most events, the handling procedure will be passed a value
associated with the event. For example, for the *XML_START_ELEMENT
event, the value is the name of the XML element.
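As a minimal sketch of how a handling procedure might tell the events apart, the following free-form fragment selects on the event parameter. The names saxHandler, info_t, elementCount, and the parameter names are illustrative assumptions, not part of this description. The sketch also assumes free-form declarations are available, that parsing is done in a single-byte character set, and that returning zero lets the parse continue while a non-zero value ends it.

```rpgle
// Hypothetical communication area shared by the caller and the handler
dcl-ds info_t qualified template;
  elementCount int(10);
end-ds;

dcl-proc saxHandler;
  dcl-pi *n int(10);
    info        likeds(info_t);
    event       int(10) value;
    string      pointer value;
    stringLen   int(20) value;
    exceptionId int(10) value;
  end-pi;

  // Overlay for the event value; only usable for events that pass a string
  dcl-s value char(65535) based(string);
  dcl-s name  varchar(100);

  select;
    when event = *XML_START_DOCUMENT;
      // No string is passed for this event, so "value" is not referenced
      info.elementCount = 0;
    when event = *XML_START_ELEMENT;
      // The value is the element name, stringLen bytes long
      name = %subst(value : 1 : stringLen);
      info.elementCount += 1;
    when event = *XML_EXCEPTION;
      // Assumed convention: a non-zero return value ends the parse
      return exceptionId;
  endsel;

  // Assumed convention: zero lets parsing continue
  return 0;
end-proc;
```

Events that the sketch does not name fall through the select group and are ignored, so the handler only needs branches for the events it cares about.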
This sample XML document is referred to in the descriptions of
the XML events.
Figure 1. Sample XML document referred to in the descriptions of the XML events
<?xml version="1.0" encoding="ibm-1140" standalone="yes" ?>
<!DOCTYPE page [
<!ENTITY abc "ABC Inc">
]>
<!-- This document is just an example -->
<sandwich>
<bread type="baker&apos;s best" supplier="&abc;" />
<?spread please use real mayonnaise ?>
<spices attr="&#x2B;">Salt &amp; pepper</spices>
<filling>Cheese, lettuce,
tomato, &#x3D; &xyz;
</filling>
<![CDATA[We should add a <relish> element in future!]]>
</sandwich>junk
- *XML_START_DOCUMENT
- This event occurs once, at the beginning of parsing the document.
Only the first two parameters are relevant for this event. Accessing
the String parameter will cause a pointer-not-set error to occur.
- *XML_VERSION_INFO
- This event occurs if the XML declaration contains version information.
The value of the string parameter is the version value from the XML
declaration.
- From the example:
- '1.0'
- *XML_ENCODING_DECL
- This event occurs if the XML declaration contains encoding information.
The value of the string parameter is the encoding value from the XML
declaration.
- From the example:
- 'ibm-1140'
- *XML_STANDALONE_DECL
- This event occurs if the XML declaration contains standalone information.
The value of the string parameter is the standalone value from the
XML declaration.
- From the example:
- 'yes'
- *XML_DOCTYPE_DECL
- This event occurs if the XML document contains a DTD (Document
Type Declaration). Document type declarations begin with the character
sequence '<!DOCTYPE' and end with a '>' character. The value of the
string parameter is the entire DOCTYPE value, including the opening
and closing character sequences.
Note: This is the only event where the XML text includes the delimiters.
- From the example:
- '<!DOCTYPE page [LF <!ENTITY abc "ABC Inc">LF]>'
(LF represents the LINE FEED character.)
- *XML_START_ELEMENT
- This event occurs once for each element start tag or empty element tag.
The value of the string parameter is the element name.
- From the example, in the order they appear:
-
- 'sandwich'
- 'bread'
- 'spices'
- 'filling'
- *XML_CHARS
- This event occurs for each fragment of content. Content normally
consists of a single string, even if the text is on multiple lines.
It is split into multiple events if it contains references. The value
of the string parameter is the fragment of the content.
- From the example:
-
- 'Salt '
- ' pepper'
- 'Cheese, lettuce,WWWtomato, ', where WWW represents several "whitespace"
characters. See the Notes section.
- 'We should add a <relish> element in future!'
Notes:
- The content fragment '&amp;' causes a *XML_PREDEF_REF event,
and the fragment '&#x3D;' causes a *XML_UCS2_REF event.
- If the value spans multiple lines of the XML document, it contains
end-of-line characters and may contain unwanted series of blanks.
In the example, "lettuce," and "tomato" are separated by a line-feed
character and several blanks. These characters are called whitespace;
whitespace is ignored if it appears between XML elements, but it is
considered to be data if it appears within an element. If the XML data
may contain unwanted whitespace, the data may need to be trimmed before
use. To trim unwanted leading and trailing whitespace, use coding such
as the following. See example Figure 4.
      * x'15'=newline  x'05'=tab  x'0D'=carriage-return
      * x'25'=linefeed x'40'=blank
     D whitespaceChr    C                   x'15050D2540'
      /free
        temp = %trim(value : whitespaceChr);
      /end-free
- *XML_PREDEF_REF
- This event occurs when content has one of the predefined single-character
references '&amp;', '&apos;', '&gt;', '&lt;', and '&quot;'.
The value of the string parameter is the single-byte character:
| &amp; | & |
| &apos; | ' |
| &gt; | > |
| &lt; | < |
| &quot; | " |
Note: The string is a UCS-2 character if the parsing is being
done in UCS-2.
- From the example:
- '&', from the content for the "spices" element.
- *XML_UCS2_REF
- This event occurs when content has a reference of the form '&#dd..;'
or '&#xhh..;', where 'd' and 'h' represent decimal and hexadecimal digits,
respectively. The value of the string parameter is the UCS-2 value
of the reference.
Note: This parameter is a UCS-2 character (type C) even
if the parsing is being done in a single-byte character set.
- From the example:
- The UCS-2 value '=', appearing as "&#x3D;", from the fragment
at the end of the "filling" element.
- *XML_UNKNOWN_REF
- This event occurs for an entity reference appearing in content,
other than the five predefined entity references as shown for *XML_PREDEF_REF
above. The value of the string parameter is the name of the reference;
the data that appears between the opening '&' and the closing
';'.
- From the example:
- 'xyz'
- *XML_END_ELEMENT
- This event occurs when the parser finds an element end tag or
the closing angle bracket of an empty element. The value of the string
parameter is the element name.
- From the example, in the order they occur:
-
- 'bread'
- 'spices'
- 'filling'
- 'sandwich'
- *XML_ATTR_NAME
- This event occurs once for each attribute in an element tag or
empty element tag, after recognizing a valid name. The value of the
string parameter is the attribute name.
- From the example, in the order they appear:
-
- 'type'
- 'supplier'
- 'attr'
- *XML_ATTR_CHARS
- This event occurs for each fragment of an attribute value. An
attribute value normally consists of a single string, even if the
text is on multiple lines. It is split into multiple events if it
contains references. The value of the string parameter is the fragment
of the attribute value.
- From the example, in the order they appear:
-
- 'baker'
- 's best'
Notes:
- The fragment '&apos;' causes a *XML_ATTR_PREDEF_REF event.
- See the discussion on *XML_CHARS for
recommendations for handling unwanted end-of-line characters and unwanted
blanks.
- *XML_ATTR_PREDEF_REF
- This event occurs when an attribute value has one of the predefined
single-character references '&amp;', '&apos;', '&gt;',
'&lt;', and '&quot;'. The value of the string parameter
is the single-byte character:
| &amp; | & |
| &apos; | ' |
| &gt; | > |
| &lt; | < |
| &quot; | " |
Note: The string is a UCS-2 character if the parsing
is being done in UCS-2.
- From the example, the value for the "type" attribute:
- ' (The apostrophe character, "&apos;")
- *XML_ATTR_UCS2_REF
- This event occurs when an attribute value has a reference of the
form '&#dd..;' or '&#xhh..;', where 'd' and 'h' represent
decimal and hexadecimal digits, respectively. The value of the string
parameter is the UCS-2 value of the reference.
Note: This parameter is a UCS-2 character (type C) even if the parsing
is being done in a single-byte character set.
- From the example, from the value of the "attr" attribute:
- The UCS-2 value '+', appearing as "&#x2B;" in the document.
- *XML_UNKNOWN_ATTR_REF
- This event occurs for an entity reference appearing in an attribute,
other than the five predefined entity references as shown for *XML_ATTR_PREDEF_REF
above. The value of the string parameter is the name of the reference;
the data that appears between the opening '&' and the closing
';'.
- From the example:
- 'abc'
Note: The parser does not parse the DOCTYPE declaration,
so even though entity "abc" is defined in the DOCTYPE declaration,
it is considered undefined by the parser.
- *XML_END_ATTR
- This event occurs when the parser reaches the end of an attribute
value. The string parameter is not relevant for this event. Accessing
the string parameter will cause a pointer-not-set error to occur.
- From the example:
- For the attribute type="baker&apos;s best", the *XML_END_ATTR
event occurs after all three parts of the attribute value ("baker", &apos;
and "s best") have been handled.
- *XML_PI_TARGET
- This event occurs when the parser recognizes the name following
the processing instruction (PI) opening character sequence '<?'.
Processing instructions allow XML documents to contain special instructions
for applications. The value of the string parameter is the processing
instruction name.
- From the example:
- 'spread'
- *XML_PI_DATA
- This event occurs for the data part of a processing instruction,
up to but not including the PI closing character sequence '?>'.
The value of the string parameter is the processing instruction data,
including trailing but not leading white space.
- From the example:
- 'please use real mayonnaise '
Note: See the discussion for *XML_CHARS for recommendations for
handling unwanted end-of-line characters and unwanted blanks.
- *XML_START_CDATA
- This event occurs when a CDATA section begins. CDATA sections
begin with the string '<![CDATA[' and end with the string ']]>'.
Such sections are used to "escape" blocks of text containing characters
that would otherwise be recognized as XML markup. The parser passes
the content of a CDATA section between these delimiters as a single
*XML_CHARS event. The value of the string parameter is always the
opening character sequence '<![CDATA['.
- From the example:
- '<![CDATA['
- *XML_END_CDATA
- This event occurs when a CDATA section ends. The value of the
string parameter is always the closing character sequence ']]>'.
- *XML_COMMENT
- This event occurs for any comments in the XML document. The value
of the string parameter is the data between the opening delimiter
'<!--' and the closing delimiter '-->' , including leading and
trailing white space.
- From the example:
- ' This document is just an example '
- *XML_EXCEPTION
- This event occurs when the parser detects an error. The string
parameter is not relevant for this event. Accessing the string parameter
will cause a pointer-not-set error to occur. The value of the string-length
parameter is the length of the document that was parsed up to and
including the point where the exception occurred. The value of the
Exception-Id parameter is the exception ID as assigned by the parser.
The meaning of these exceptions is documented in the section on XML
return codes in the Rational Development Studio for i: ILE RPG Programmer's Guide.
- From the example:
- An exception event would occur when the parser encountered the
word "junk", which is non-whitespace data appearing after the end
of the XML document. (The XML document ends with the end-element
tag for the "sandwich" element.)
- *XML_END_DOCUMENT
- This event occurs when parsing has completed. Only the first two
parameters are relevant for this event. Accessing the String parameter
will cause a pointer-not-set error to occur.
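Because element content can arrive as several fragments (*XML_CHARS events interleaved with *XML_PREDEF_REF events), a handler usually has to reassemble the text itself. The following is a rough sketch of one way to do that, not a definitive implementation. The names collectText, textInfo_t, and content are hypothetical, and the sketch assumes free-form declarations, single-byte parsing, and non-nested text elements. It deliberately ignores *XML_UCS2_REF and *XML_UNKNOWN_REF events.

```rpgle
// Hypothetical communication area: holds the text collected so far
dcl-ds textInfo_t qualified template;
  content varchar(1000);
end-ds;

dcl-proc collectText;
  dcl-pi *n int(10);
    info        likeds(textInfo_t);
    event       int(10) value;
    string      pointer value;
    stringLen   int(20) value;
    exceptionId int(10) value;
  end-pi;

  // Overlay for the event value; only usable for events that pass a string
  dcl-s value char(65535) based(string);
  // Same EBCDIC whitespace characters as in the *XML_CHARS discussion
  dcl-c WHITESPACE x'15050D2540';

  select;
    when event = *XML_START_ELEMENT;
      // Start collecting afresh for each element
      info.content = '';
    when event = *XML_CHARS or event = *XML_PREDEF_REF;
      // Each fragment arrives in its own call, so append it
      info.content += %subst(value : 1 : stringLen);
    when event = *XML_END_ELEMENT;
      // Remove leading and trailing whitespace from the assembled text
      info.content = %trim(info.content : WHITESPACE);
  endsel;

  // Assumed convention: zero lets parsing continue
  return 0;
end-proc;
```

For the "spices" element of the sample document, the three events 'Salt ', '&', and ' pepper' would be appended in turn, so the assembled text at the end-element event is 'Salt & pepper'.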
Note: To aid in debugging an XML-SAX handling procedure, the Control
specification keyword DEBUG(*XMLSAX) can be specified. For more details
on this keyword, see
DEBUG{(*INPUT | *DUMP | *XMLSAX | *NO | *YES)} and the
Debugging chapter in the
Rational Development Studio for i: ILE RPG Programmer's Guide. For more
information about XML parsing, including limitations of the XML parser
used by RPG, see the XML chapter in the
Rational Development Studio for i: ILE RPG Programmer's Guide.