Start recovery when all nodes that were members of the
system are online and are in candidate status. If there are any nodes
that display error code 550 or error code 578, remove their system
data to place them into candidate status. Do not run the recovery
procedure on different nodes in the same system; this restriction
includes remote clustered systems.
About this task
Attention: This service action has serious implications
if not completed properly. If at any time an error is encountered
not covered by this procedure, stop and call IBM® Support.
Any one of the following
categories of messages may be displayed:
T3 successful
The volumes are online. Use
the final checks to make the environment operational; see What to check after running the system recovery.
T3 incomplete
One or more of the volumes are offline because there was fast
write data in the cache. Further actions are required to bring the volumes online; see
Recovering from offline volumes using the CLI for details.
T3 failed
Call IBM Support.
Do not attempt any further action.
Start the recovery procedure from any node in the system;
the node must not have participated in any other system. To best
preserve the I/O group ordering, run the recovery
from a node that was in I/O group 0.
Note: Each individual
stage of the recovery procedure might take significant time to complete,
depending on the specific configuration.
Procedure
- Click the up or down button until the Actions menu option is displayed;
then, click Select.
- Click the up or down button until the Recover
Cluster? option is displayed, and then click Select;
the node displays Confirm Recover?.
- Click Select; the node displays Retrieving.
After a short delay, the second line displays a sequence of progress messages that indicate the
actions that are taking place; for example, Finding qdisks. The backup files
are scanned to find the most recent configuration backup data.
After
the file and quorum data retrieval is complete, the node displays T3
data: on the top line.
- Verify the date and time on the second line of the display. The time stamp that is shown is the
date and time of the last quorum update and must be less than 30 minutes before the failure. The
time stamp format is YYYYMMDD hh:mm, where YYYY is the year,
MM is the month, DD is the day, hh is the
hour, and mm is the minute.
Attention: If the time stamp is 30 minutes or more before the failure, call IBM
Support.
- After you verify that the time stamp is correct, press and hold the UP ARROW and click
Select.
The node displays Backup file on the top line.
- Verify the date and time on the second line of the display. The time stamp that is shown is the
date and time of the last configuration backup and must be less than 24 hours before the failure.
The time stamp format is YYYYMMDD hh:mm, where YYYY is the
year, MM is the month, DD is the day, hh is
the hour, and mm is the minute.
Attention: If the time stamp is 24 hours or more before the failure, call IBM
Support.
Note: Changes that are made after the time of this configuration backup might not be
restored.
- After you verify that the time stamp is correct, press and hold the UP ARROW and click
Select.
The node displays Restoring. After a short delay, the second line
displays a sequence of progress messages that indicate the actions that are taking place; then, the
software on the node restarts.
The node displays Cluster on the top line and a management IP
address on the second line. After a few moments, the node displays T3
Completing.
Note: Any system errors that are logged might temporarily overwrite the display; ignore the message:
Cluster Error: 3025.
After a short delay, the second line displays a
sequence of progress messages that indicate the actions that are taking place.
When each node is added to the system, the display shows Cluster: on
the top line, and the cluster (system) name on the second line.
Attention: After the last node is added to the system, there is a short delay to allow
the system to stabilize. Do not attempt to use the system during this delay; the recovery is still in
progress. After recovery is complete, the node displays T3 Succeeded on the top
line.
- Click Select to return the node
to normal display.
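The two time stamp checks in the procedure (quorum data less than 30 minutes before the failure, configuration backup less than 24 hours before the failure) can be worked through with a short script before you confirm each step. The following Python sketch is illustrative only and is not part of the product. The function name stamp_is_recent, the constant names, and the example dates are assumptions made for this example. It also assumes that you know the approximate failure time, and it treats a time stamp later than the failure as not acceptable, which is a conservative choice that the procedure does not state.

```python
from datetime import datetime, timedelta

# Time stamp format shown on the front panel: YYYYMMDD hh:mm
STAMP_FORMAT = "%Y%m%d %H:%M"

# Limits taken from the procedure text.
QUORUM_LIMIT = timedelta(minutes=30)  # T3 data (last quorum update)
BACKUP_LIMIT = timedelta(hours=24)    # Backup file (last configuration backup)


def stamp_is_recent(stamp: str, failure_time: datetime, limit: timedelta) -> bool:
    """Return True if the displayed stamp is less than `limit` before the failure.

    Assumption for this sketch: a stamp later than the failure time is
    rejected, so that the operator stops and investigates.
    """
    shown = datetime.strptime(stamp, STAMP_FORMAT)
    age = failure_time - shown
    return timedelta(0) <= age < limit


if __name__ == "__main__":
    # Hypothetical failure time and displayed values, for illustration only.
    failure = datetime(2013, 5, 14, 10, 0)
    for label, stamp, limit in [
        ("T3 data", "20130514 09:45", QUORUM_LIMIT),
        ("Backup file", "20130513 10:30", BACKUP_LIMIT),
    ]:
        ok = stamp_is_recent(stamp, failure, limit)
        print(label, stamp, "continue" if ok else "call IBM Support")
```

A stamp that is exactly 30 minutes (or exactly 24 hours) before the failure fails the check, because the procedure requires the stamp to be less than the limit.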