Recover system procedure
The saved configuration data is in the active quorum disk and the latest XML configuration backup file. The recovery might not be able to restore all volume data. This procedure is also known as Tier 3 (T3) recovery.
CAUTION:
If the system encounters a state where:
- No nodes are active
Do not attempt to initiate a node rescue (which the user can initiate either by using the service assistant GUI, or the satask rescuenode service CLI command). STOP and contact IBM® Remote Technical Support. Initiating this T3 recover system procedure while in this specific state can result in loss of the XML configuration backup files.
Attention:
- Run service actions only when directed by the fix procedures. If used inappropriately, service actions can cause loss of access to data or even data loss. Read and understand all of the instructions before you complete any action.
- The recovery procedure can take several hours if the system uses large-capacity devices as quorum devices.
Do not attempt the recover system procedure unless the following conditions are met:
- All of the conditions have been met in When to run the recover system procedure.
- All hardware errors are fixed. See Fix hardware errors.
- All nodes have candidate status. Otherwise, see step 1.
- All nodes must be at the same level of code that the system had before the failure. If any nodes were modified or replaced, use the service assistant to verify the levels of code, and where necessary, to reinstall the level of code so that it matches the level that is running on the other nodes in the system. For more information, see Removing system information for nodes with error code 550 or error code 578 using the service assistant.
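The conditions above can be read as a pre-flight checklist. The following Python sketch is illustrative only and is not part of the product. It assumes that you have already collected, by hand or with your own tooling, a status, a code level, and a list of unresolved hardware errors for each node. The NodeInfo record, its field names, the recovery_blockers function, and the sample code level are all hypothetical. The sketch covers only the last three conditions and does not replace reading When to run the recover system procedure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NodeInfo:
    """Hypothetical per-node record; the field names are illustrative only."""
    panel_name: str                 # identifier of the node
    node_status: str                # for example "Candidate", "Active", "Service"
    code_level: str                 # level of code installed on the node
    hardware_errors: List[str] = field(default_factory=list)  # unresolved errors


def recovery_blockers(nodes: List[NodeInfo], expected_level: str) -> List[str]:
    """Return the reasons why the recover system procedure should not be started.

    An empty list means that the three checked conditions hold:
    no unresolved hardware errors, every node in candidate status,
    and every node at the level of code the system had before the failure.
    """
    blockers: List[str] = []
    if not nodes:
        blockers.append("no nodes are visible; fix hardware errors first")
    for node in nodes:
        if node.hardware_errors:
            blockers.append(
                f"{node.panel_name}: unresolved hardware errors "
                f"{', '.join(node.hardware_errors)}"
            )
        if node.node_status.strip().lower() != "candidate":
            blockers.append(
                f"{node.panel_name}: status is {node.node_status}, not Candidate"
            )
        if node.code_level != expected_level:
            blockers.append(
                f"{node.panel_name}: code level {node.code_level} "
                f"does not match {expected_level}"
            )
    return blockers


if __name__ == "__main__":
    # Sample values are made up for illustration.
    sample = [
        NodeInfo("01-1", "Candidate", "1.2.3.4"),
        NodeInfo("01-2", "Service", "1.2.3.0", ["550"]),
    ]
    for reason in recovery_blockers(sample, expected_level="1.2.3.4"):
        print("BLOCKED:", reason)
```

Returning a list of reasons, rather than a single yes or no, keeps every unmet condition visible so that each one can be resolved before the procedure is attempted.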
The system recovery procedure is one of several tasks that must be completed. The following list is an overview of the tasks and the order in which they must be completed:
- Preparing for system recovery
  - Review the information regarding when to run the recover system procedure.
  - Fix your hardware errors and make sure that all nodes in the system are shown in service assistant or in the output from sainfo lsservicenodes.
  - Remove the system information for nodes with error code 550 or error code 578 by using the service assistant, but only if the recommended user response for these node errors has already been followed.
  - For Virtual Volumes (VVols), shut down the services for any instances of Spectrum Control Base that are connecting to the system. Use the Spectrum Control Base command service ibm_spectrum_control stop.
- Running the system recovery. After you have prepared the system for recovery and met all the preconditions, run the system recovery.
  Note: Run the procedure on one system in a fabric at a time. Do not run the procedure on different nodes in the same system. This restriction also applies to remote systems.
- Completing actions to get your environment operational.
  - Recovering from offline volumes by using the CLI.
  - Checking your system, for example, to ensure that hosts can access all mapped volumes.