Input/Output Considerations for Distributed Files
The following considerations apply to input/output operations for distributed
files:
- For input of arrival sequence distributed files and keyed sequence distributed
files whose keyed access paths have been ignored at open time, the records
will be retrieved as follows:
  - All records from the first node, as defined by the node group at file
    creation time, are retrieved in arrival sequence.
  - After all records from the first node have been retrieved, all records
    from the second node are retrieved in arrival sequence.
  - After all records from the second node have been retrieved, all records
    from the third node are retrieved in arrival sequence.
  - This continues until the last node defined by the node group at file
    creation time is reached.
  - After all records from the last node have been retrieved in arrival
    sequence, end-of-file is reached.
Thus, a distributed file that is read in arrival sequence is returned in
arrival sequence within each node, but not in arrival sequence across the
different nodes of the distributed file.
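The node-by-node retrieval order described above can be sketched as a small Python simulation. This is only an illustration of the ordering, not of how the system implements it; the node names, record names, and the function read_arrival_sequence are hypothetical.

```python
from typing import Iterator, List, Tuple


def read_arrival_sequence(
    nodes: List[Tuple[str, List[str]]]
) -> Iterator[Tuple[str, str]]:
    """Yield (node_name, record): all of node 1, then node 2, and so on."""
    for node_name, records in nodes:   # node group order fixed at file creation
        for record in records:         # arrival sequence within this node
            yield node_name, record
    # Falling out of the loops corresponds to end-of-file.


# Hypothetical data: the digit is the overall order in which records were written.
node_group = [
    ("NODE1", ["rec1", "rec4"]),
    ("NODE2", ["rec2", "rec5"]),
    ("NODE3", ["rec3", "rec6"]),
]

retrieved = [record for _, record in read_arrival_sequence(node_group)]
print(retrieved)  # ['rec1', 'rec4', 'rec2', 'rec5', 'rec3', 'rec6']
```

The records come back in arrival order within each node (rec1 before rec4), but not in overall arrival order across nodes.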
- For input of keyed sequence distributed files whose keyed access paths
have not been ignored at open time, the records are retrieved as follows:
  - The first-changed first-out (FCFO), first-in first-out (FIFO), or last-in
    first-out (LIFO) order of records with duplicate key values is valid only
    for records that come from the same node.
  - All records with duplicate key values from the first node, as defined by
    the node group at file creation time, are retrieved in the specified
    access path order.
  - After all records with duplicate key values from the first node have been
    retrieved, all records with duplicate key values from the second node
    are retrieved in the specified access path order.
  - After all records with duplicate key values from the second node have
    been retrieved, all records with duplicate key values from the third
    node are retrieved in the specified access path order.
  - This continues until the last node, as defined by the node group at
    file creation time, is reached.
  - After all records with duplicate key values have been retrieved from the
    last node in the specified access path order, the record with the next
    non-duplicate key value is retrieved.
Therefore, distributed files that have duplicate key values will not
be processed in the specified access path order across the different nodes
of the distributed file.
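As an illustration of the duplicate-key ordering described above, the following Python sketch compares the order a distributed file would return with the order a single-node FIFO access path would return. It assumes FIFO ordering of duplicates; the Record type, its fields, and the sample data are hypothetical.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    key: str
    node: int       # position of the owning node in the node group (1-based)
    inserted: int   # hypothetical counter: overall order of insertion


def distributed_order(records: List[Record]) -> List[Record]:
    """Key order first; duplicates grouped by node, FIFO within each node."""
    return sorted(records, key=lambda r: (r.key, r.node, r.inserted))


def single_node_fifo_order(records: List[Record]) -> List[Record]:
    """What a non-distributed FIFO access path would return."""
    return sorted(records, key=lambda r: (r.key, r.inserted))


records = [
    Record("A", 2, 1),
    Record("A", 1, 2),
    Record("A", 3, 3),
    Record("A", 1, 4),
    Record("B", 2, 5),
]

distributed = [r.inserted for r in distributed_order(records)]
single = [r.inserted for r in single_node_fifo_order(records)]
print(distributed)  # [2, 4, 1, 3, 5]
print(single)       # [1, 2, 3, 4, 5]
```

The four records with key A are FIFO within node 1 (2 before 4), but the overall FIFO order is lost because node 1 is drained before node 2 and node 3.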
- When buffered retrieval (*BUFFERED) or protected buffered retrieval (*PROTECTED)
is being used:
  - Records that are inserted or updated in the distributed file after the
    open might not be seen while retrieving records, even if their key values
    come after the last record returned to your program. This is because each
    node has its own key position based on the last get-by-key request. The
    section "Example of How Records are Retrieved for Insert, Update, and
    Delete" provides an example of how duplicate key records are retrieved
    for insert or update.
  - Records that are deleted from the distributed file after the open might
    still be seen while retrieving records from the file.
  - The only difference between buffered retrieval and protected buffered
    retrieval is that protected buffered retrieval restricts the deleting,
    inserting, and updating of records in the distributed file to your job.
- For output to distributed files, the system will process insert requests
one record at a time. If your distributed file open request is for output-only
and SEQONLY(*YES) processing, it will be changed to SEQONLY(*NO). The single
record output processing will provide feedback on a record-by-record basis
when the records are inserted into the file.
Example of How Records are Retrieved for Insert, Update, and Delete
Figure 129 shows the different record positions for a distributed
file after the first get-by-key request in buffered retrieval. This get-by-key
request has positioned the distributed file at the first record on each node.
Figure 129. First Duplicate Record Key Positions Across Nodes in a Distributed File
In this example, the first get-by-key request has returned Record A to
your program. Because the record positions differ from node to node,
subsequent get-by-key-next requests would not return records that had been
inserted or updated on Node 1 that preceded either Record H on Node 1 or Record
I on Node 3. An inserted or updated record that comes after the last record
returned to your program, but before the current key position for a particular
node, will not be seen by your program unless you change the direction in
which you are reading records.
Records that have been deleted might also be seen by your program if they
have already been positioned to and retrieved from a particular node. For
example, if Record A from Node 2 has been returned to your program, Record
I from Node 3 will be returned to your program even if it was deleted
before the get-by-key-next request that retrieves it was issued.
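The effect of per-node key positions can be sketched with a small Python model. It is a simplification under stated assumptions, not the system's implementation: keys are unique single letters loosely modeled on the record names in the example, each node holds exactly one positioned record, and the Node and BufferedReader classes are hypothetical.

```python
import bisect
from typing import List, Optional


class Node:
    """One node's keyed access path with its own buffered key position."""

    def __init__(self, keys: List[str]) -> None:
        self.keys = sorted(keys)
        self.buffered: Optional[str] = None  # key this node is positioned on
        self.exhausted = False

    def advance(self) -> None:
        """Position on the first key after the currently buffered key."""
        if self.exhausted:
            return
        if self.buffered is None:
            start = 0
        else:
            start = bisect.bisect_right(self.keys, self.buffered)
        if start < len(self.keys):
            self.buffered = self.keys[start]
        else:
            self.buffered, self.exhausted = None, True

    def insert(self, key: str) -> None:
        bisect.insort(self.keys, key)

    def delete(self, key: str) -> None:
        self.keys.remove(key)  # an already buffered copy is left untouched


class BufferedReader:
    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = nodes
        for node in nodes:       # the first get-by-key positions every node
            node.advance()

    def get_next(self) -> Optional[str]:
        live = [n for n in self.nodes if not n.exhausted]
        if not live:
            return None          # end-of-file
        node = min(live, key=lambda n: n.buffered)
        key = node.buffered
        node.advance()           # only the node that supplied the key moves
        return key


node1, node2, node3 = Node(["H", "K"]), Node(["A", "J"]), Node(["I", "L"])
reader = BufferedReader([node1, node2, node3])

first = reader.get_next()        # "A" comes from node 2
node1.insert("C")                # after A, but before node 1's position (H)
node3.delete("I")                # node 3 is already positioned on I
rest = list(iter(reader.get_next, None))
print(first, rest)               # A ['H', 'I', 'J', 'K', 'L']
```

In this model the inserted key C is never returned, because node 1 was already positioned on H, and the deleted key I is still returned, because node 3 had already positioned to it.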
When non-buffered retrieval (*CURRENT) is being used, records that are
inserted or updated in the distributed file after the open are retrieved
in the same way as they would be for a non-distributed database file,
except for duplicate key values that span nodes. Records that are inserted
or updated in a distributed file after it has been opened for non-buffered
retrieval also might not be seen if their key values come before the last
record that has been returned to your program. If you require that keyed
sequence input to your distributed file retrieve the same records that would
be retrieved for a non-distributed database file, except for duplicate key
values that span nodes, then you should override the open of your keyed
distributed file to use non-buffered retrieval (*CURRENT).
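For contrast, non-buffered retrieval can be sketched in Python as a read in which every request looks at the current contents of all nodes. This is a rough model only: it assumes unique keys (so it does not cover duplicate key values that span nodes), and the function get_next_current and the sample data are hypothetical.

```python
from typing import List, Optional


def get_next_current(nodes: List[List[str]], last_key: Optional[str]) -> Optional[str]:
    """Return the smallest key on any node that is greater than last_key."""
    candidates = [
        key
        for keys in nodes
        for key in keys
        if last_key is None or key > last_key
    ]
    return min(candidates) if candidates else None


nodes = [["H", "K"], ["A", "J"], ["I", "L"]]
seen = []
last = get_next_current(nodes, None)
while last is not None:
    seen.append(last)
    if last == "A":
        nodes[0].append("C")   # key after the last record returned: seen
    if last == "H":
        nodes[1].append("B")   # key before the last record returned: not seen
    last = get_next_current(nodes, last)

print(seen)  # ['A', 'C', 'H', 'I', 'J', 'K', 'L']
```

Here the inserted key C is returned, as it would be for a non-distributed file, while B is not, because its key value comes before the last record already returned.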
(C) Copyright IBM Corporation 1992, 2006. All Rights Reserved.