Removing and replacing a faulty node canister: SAN Volume Controller 2145-SV2 and 2145-SA2
You can use this procedure to remove a faulty node canister and replace it with a new node canister. You can remove the parts from the faulty node canister and reinstall them into the new node canister.
About this task
Notes:
- There are two different node canister types. Ensure the FRU part number (P/N) of the replacement part matches that of the failed node canister, or is an approved substitute. The FRU P/N is identified on the label of the canister and on the FRU packaging.
- No tools are required to complete this task. Do not remove or loosen any screws.
- Use care when you remove the node canister from the control enclosure. The node canister is long and its center of gravity is far forward. It can be helpful to have a lift or other sturdy, flat surface ready to receive the node canister during removal.
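The FRU part number check in the first note can be written down as a simple rule. The following is a minimal sketch, not an IBM tool: the function name, the substitute table, and the part numbers are all hypothetical placeholders. The real list of approved substitutes comes from the parts documentation.

```python
from typing import Dict, Set


def is_acceptable_replacement(
    failed_pn: str,
    replacement_pn: str,
    approved_substitutes: Dict[str, Set[str]],
) -> bool:
    """Return True if the replacement FRU P/N matches the failed canister's
    P/N or is listed as an approved substitute for it."""
    # Normalize what was read off the canister label and the FRU packaging.
    failed = failed_pn.strip().upper()
    replacement = replacement_pn.strip().upper()
    if failed == replacement:
        return True
    # Table keys and values are assumed to be stored in uppercase.
    return replacement in approved_substitutes.get(failed, set())


# Placeholder part numbers for illustration only, not real IBM FRU P/Ns.
SUBSTITUTES = {"PN-AAAA": {"PN-BBBB"}}

print(is_acceptable_replacement("PN-AAAA", "pn-aaaa ", SUBSTITUTES))
print(is_acceptable_replacement("PN-AAAA", "PN-CCCC", SUBSTITUTES))
```

The substitute relation is treated as one-directional here: a part approved to replace another is not assumed to be replaceable by it.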
Procedure
- Review the Event Log to identify the faulty node canister.
- Follow Procedure: Powering off the node canister: 2145-SV2 and 2145-SA2 to verify that the hosts will not lose access to data in volumes.
- From the rear of the control enclosure, label each cable and remove it from the node canister.
Removing the faulty node canister
- Remove the node canister, as described in Removing and replacing the node canister in the node: 2145-SV2 and 2145-SA2, and place it on a flat, level surface.
- Remove the new node canister from its packaging. Ensure that the FRU P/N of the replacement node canister matches that of the failed node canister or that the new P/N is an approved substitute. See SAN Volume Controller 2145-SV2 and 2145-SA2 parts for more information.
- Remove the covers from the faulty and replacement node canisters and set them aside, as described in Removing and replacing the cover of the canister: 2145-SV2 and 2145-SA2.
- Complete the following procedures to remove parts from the faulty node canister and install them in the replacement canister.
- Removing and replacing a memory module (DIMM): 2145-SV2 and 2145-SA2
- Removing and replacing the Trusted Platform Module: 2145-SV2 and 2145-SA2
- Removing and replacing a fan module: 2145-SV2 and 2145-SA2
- Removing and replacing the node canister battery: 2145-SV2 and 2145-SA2
- Removing and replacing a PCIe riser: 2145-SV2 and 2145-SA2
Notes:
- You do not need to remove each adapter from its riser. Each assembled riser and adapter are transferred to the replacement node canister.
- When you install each PCIe riser and adapter assembly into the replacement node canister, use the same numbered slot that was used in the faulty node canister.
- Removing and replacing a boot drive: 2145-SV2 and 2145-SA2
Note: Transfer the boot drives one at a time, and install each drive into the same slot in the replacement node canister that it occupied in the faulty node canister.
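The same-slot rule for PCIe riser assemblies and boot drives lends itself to a quick record-and-compare check. The sketch below is illustrative only: it assumes you note a label of your own choosing for each part and its slot number before the transfer, then note what ends up in each slot of the replacement canister. The function name, part labels, and slot numbers are hypothetical.

```python
from typing import Dict, List

# Maps a part kind (for example "riser") to {slot number: technician's label}.
SlotRecord = Dict[str, Dict[int, str]]


def find_slot_mismatches(faulty: SlotRecord, replacement: SlotRecord) -> List[str]:
    """List every slot whose part in the replacement canister differs from
    the part recorded in the same slot of the faulty canister."""
    problems: List[str] = []
    for kind, slots in faulty.items():
        new_slots = replacement.get(kind, {})
        for slot, label in sorted(slots.items()):
            found = new_slots.get(slot)
            if found != label:
                problems.append(
                    f"{kind} slot {slot}: expected {label!r}, found {found!r}"
                )
    return problems


# Hypothetical labels and slot numbers, recorded before the transfer.
before = {
    "riser": {1: "riser-A", 2: "riser-B"},
    "boot_drive": {1: "drive-X", 2: "drive-Y"},
}
# The same notes after the transfer, with the two boot drives swapped by mistake.
after = {
    "riser": {1: "riser-A", 2: "riser-B"},
    "boot_drive": {1: "drive-Y", 2: "drive-X"},
}

for problem in find_slot_mismatches(before, after):
    print(problem)
```

An empty result means every recorded part sits in the slot it came from.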
Installing the new node canister
- Replace the cover of the new node canister, as described in Removing and replacing the cover of the canister: 2145-SV2 and 2145-SA2.
- Install the new node canister into the control enclosure, as described in Removing and replacing the node canister in the node: 2145-SV2 and 2145-SA2.
- Reconnect the cables that were removed in step 3 to the appropriate ports in the replacement node canister.
- If the node canister was communicating with other node canisters using RDMA over Ethernet, then use the Service Assistant Tool or the sainfo lsnodeip command to check if the node IP configuration has been lost. Use the Service Assistant Tool or the satask chnodeip command to set the node IP if needed.
- Use the management GUI or service assistant GUI to check that the node canister is online (or is Active) in the system.
- Enter the service assistant command satask chbootdrive -replacecanister to update the drives to match the serial number of the new node canister.
Note: Node error code 545 is expected. For more information, see 545.
To help identify the node canister, the inside of each release lever is labeled with the serial number.
- Review the management GUI to determine that all errors are resolved.
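The checks that follow installation depend on whether the node canister used RDMA over Ethernet and on whether its node IP configuration was lost. The sketch below lays out that ordering as a checklist generator. It is an illustration under stated assumptions: the function and its arguments are hypothetical, and only the commands named in this procedure appear in it. The parameters of satask chnodeip are deliberately left out, so consult the command reference for them.

```python
from typing import List


def post_install_steps(uses_rdma_over_ethernet: bool,
                       node_ip_lost: bool = False) -> List[str]:
    """Return the post-installation checks in the order given by this procedure."""
    steps: List[str] = []
    if uses_rdma_over_ethernet:
        steps.append(
            "Check the node IP configuration: sainfo lsnodeip "
            "(or the Service Assistant Tool)"
        )
        if node_ip_lost:
            # Parameters intentionally omitted; see the CLI reference.
            steps.append(
                "Set the node IP: satask chnodeip <parameters> "
                "(or the Service Assistant Tool)"
            )
    steps.append(
        "Confirm the node canister is online (Active) in the management GUI "
        "or service assistant GUI"
    )
    steps.append(
        "Update the boot drives: satask chbootdrive -replacecanister "
        "(node error 545 is expected)"
    )
    steps.append("Review the management GUI until all errors are resolved")
    return steps


for number, step in enumerate(post_install_steps(True, node_ip_lost=True), start=1):
    print(f"{number}. {step}")
```

The function only prints a checklist. It does not run any command against the system.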