Replacing a faulty node in a clustered system
You can use the command-line interface (CLI) and the SAN Volume Controller front panel to replace a faulty node in a clustered system.
About this task
If a node fails, the system continues to operate with degraded performance until the faulty node is repaired. If the repair takes an unacceptable amount of time, you can replace the faulty node with a spare node. However, you must follow the appropriate procedures and take precautions so that you do not interrupt I/O operations or compromise the integrity of your data. The following table describes the node attributes and states whether each one changes when you replace a faulty node.
| Node attributes | Description |
|---|---|
| Front panel ID | This ID is the number that is printed on the front of the node and is used to select the node that is added to a system. |
| Node ID | This ID is assigned to the node. A new node ID is assigned each time a node is added to a system, whereas the node name remains the same following service activity on the system. You can use the node ID or the node name to perform management tasks on the system. However, if you use scripts to perform those tasks, use the node name rather than the node ID. This ID changes during this procedure. |
| Node name | The node name is the name that is assigned to the node. The system automatically re-adds failed nodes: if the system reports a missing-node error (error code 1195) and that node has been repaired and restarted, the system adds the node back automatically. If you choose to assign your own names, you must type the node name on the Adding a node to a cluster panel. You cannot manually assign a name that matches the naming convention that SAN Volume Controller uses for automatically assigned names. If you use scripts that refer to the node name to perform management tasks on the system, you can avoid changing those scripts by assigning the original name of the node to the spare node. This name might change during this procedure. |
| Worldwide node name | This is the WWNN that is assigned to the node. The WWNN is used to uniquely identify the node and the Fibre Channel ports. During this procedure, the WWNN of the spare node is changed to that of the faulty node. You must follow the node replacement procedures exactly to avoid any duplication of WWNNs. Because the spare node takes on the WWNN of the faulty node, the WWNN for this node in the system does not change during this procedure. |
| Worldwide port names | These are the WWPNs that are assigned to the node. WWPNs are derived from the WWNN that is written to the spare node as part of this procedure. For example, if the WWNN for a node is 50050768010000F6, the four WWPNs for this node are derived from that value. These names do not change during this procedure. |
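
The guidance in the table about referring to nodes by name in scripts and about avoiding duplicate WWNNs can be sketched as a short script. The following Python example is a minimal illustration under stated assumptions and is not part of the product. It assumes that you have captured a colon-delimited node listing (for example, from the `lsnode` command run with a delimiter option) into a string. The column names `id`, `name`, and `WWNN`, the sample values, and the helper function names are assumptions made for this example; check them against the actual output of your system.

```python
import csv
import io
from collections import Counter

# Hypothetical, colon-delimited node listing. The columns and values are
# illustrative assumptions, not output captured from a real system.
SAMPLE = """id:name:WWNN:status
1:node1:50050768010000F6:online
2:node2:5005076801000123:online
"""


def parse_nodes(text, delim=":"):
    """Parse a delimited listing with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delim))


def find_by_name(nodes, name):
    """Look up a node by name, which is stable across re-adds (unlike the ID)."""
    matches = [node for node in nodes if node["name"] == name]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one node named {name!r}, "
                          f"found {len(matches)}")
    return matches[0]


def duplicate_wwnns(nodes):
    """Return the WWNNs that appear on more than one node, ignoring case."""
    counts = Counter(node["WWNN"].upper() for node in nodes)
    return sorted(wwnn for wwnn, count in counts.items() if count > 1)


if __name__ == "__main__":
    nodes = parse_nodes(SAMPLE)
    node = find_by_name(nodes, "node1")
    print("node1 currently has ID", node["id"], "and WWNN", node["WWNN"])
    print("duplicate WWNNs:", duplicate_wwnns(nodes))
```

Because the lookup key is the node name, a script written this way keeps working after the spare node is added with the original name, even though the node ID is different. A non-empty result from `duplicate_wwnns` would indicate that the replacement procedure was not followed exactly.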
Go to the procedure Replacing nodes nondisruptively for the specific steps to replace a faulty node in a system.