Replacing the system board: 2145-DH8

All the components that were removed when you removed the system board are reused during the installation of the new system board.

Before you begin

The machine serial number or node serial number is on the MT-M SN label on the front of the SAN Volume Controller 2145-DH8. It was also written to the system board and to each of the boot drives when the node was manufactured. When the system software starts, it reads the node serial number from the system board and uses the serial number as the panel ID for this node. The panel ID can be seen in many places, such as the service assistant GUI, the management GUI, and the output of many CLI commands.

If the system board is replaced with a FRU part, the new board has a machine serial number of 0000000, and the SAN Volume Controller 2145-DH8 node has a panel_id of 0000000. This value does not match the node serial number stored on each of the boot drives, which causes node error 545. If the copies of the node serial number on the boot drives do not match each other, the node error is 543. The procedure for fixing these node errors is described below.
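The relationship between the stored serial numbers and the two node errors can be sketched as follows. This is a simplified illustration of the rule stated above, not the logic of the system software. The function name expected_node_error, the use of None for a copy that cannot be read, and the sample serial number 78ABCDE are assumptions made for this sketch.

```python
from typing import Optional, Sequence

# Serial number carried by a system board from FRU stock, per the text above.
FRU_PLACEHOLDER_SERIAL = "0000000"


def expected_node_error(
    board_serial: str, drive_serials: Sequence[Optional[str]]
) -> Optional[int]:
    """Return the node error that the text associates with a serial mismatch.

    drive_serials holds the copy read from each boot drive. None stands for a
    copy that could not be read (a hypothetical convention for this sketch).
    """
    # The copies on the boot drives disagree, or one is missing: error 543.
    if None in drive_serials or len(set(drive_serials)) != 1:
        return 543
    # The drives agree with each other but not with the system board: error 545.
    if drive_serials[0] != board_serial:
        return 545
    # All copies match, so neither serial-number error is expected.
    return None


if __name__ == "__main__":
    # "78ABCDE" is a made-up serial number used only for illustration.
    print(expected_node_error(FRU_PLACEHOLDER_SERIAL, ["78ABCDE", "78ABCDE"]))
```

A freshly installed FRU board with two intact boot drives falls into the second branch, which is why node error 545 is the expected result of a successful replacement.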

Ensure that the following items are available:
  • A VGA monitor and a USB keyboard, which might be needed.
  • Power cables for the node so that it can be turned on while it is out of the rack.
  • A computer with an Ethernet port and a web browser that can be connected directly to the technician port to access the service assistant GUI. SSH-capable software, such as PuTTY, is required to access the CLI.
  • Alcohol wipes and thermal grease are required to correctly replace the microprocessors. You must remove the microprocessors when you replace the system board.
Note: When you reassemble the components in the node, be sure to route all cables carefully so that they are not exposed to excessive pressure.
DANGER
Multiple power cords. The product might be equipped with multiple power cords. To remove all hazardous voltages, disconnect all power cords. (L003)

About this task

This service action assumes that:
  • The node is turned off.
  • The power cables are disconnected.
  • The node is removed from the rack.
  • The top cover is removed.
  • The air baffle is removed.
  • The PCI express riser-card assemblies are removed.
  • The cables that connect to the battery backplane are removed.
  • The system board is removed.
  • The new system board is from FRU stock and must not come from another SAN Volume Controller 2145-DH8 or from any other machine.
  • The boot drives are not both being replaced at the same time. If both boot drives are replaced together, recovery is not possible without help from IBM remote technical support.

Perform the following steps to install the system board:

Procedure

  1. Align the system board at an angle, as shown in Figure 1.
  2. Rotate and lower the system board so that it is flat and slide it back toward the rear of the server. Make sure that the rear connectors extend through the rear of the chassis.
    Figure 1. Replacing the SAN Volume Controller 2145-DH8 system board
    •  1  Pin
    •  2  Thumbscrew
  3. Reconnect the system board cables that you disconnected.
  4. Rotate the system board thumbscrews toward the rear of the server until the latch clicks.
  5. Reinstall the microprocessor and heat sink, as described in Replacing the microprocessor: 2145-DH8.
  6. Reinstall the DIMMs, as described in Replacing the memory modules: 2145-DH8.
  7. Reinstall the fan bracket, as described in Replacing the SAN Volume Controller 2145-DH8 fan bracket.
  8. Reinstall the hot-swap fans, as described in Replacing the SAN Volume Controller 2145-DH8 fans.
  9. Reinstall the air baffle.
  10. Reinstall the power supply units.
  11. Replace the PCI express riser-card assemblies.
  12. Make sure that all cables, adapters, and other components are installed and seated correctly and that you have not left loose tools or parts inside the node. Make sure that all internal cables are correctly routed. If you disconnected the Fibre Channel and Ethernet cables, make sure that each cable is reconnected to the same port from which it was removed.
  13. Replace the top cover. See Replacing the top cover.
  14. If you removed any Fibre Channel, SAS, or Ethernet cables, use the labels that you placed on each cable to connect the cables to the same ports from which they were removed.
  15. Replace the power cords and the cable-retention brackets.
  16. Lift the locking levers (1 in Figure 2) on the slide rails and push the server (2 in Figure 2) all the way into the rack until it clicks into place.
    Figure 2. Raising the SAN Volume Controller 2145-DH8 locking levers of the slide rails of the rack
  17. Turn on the node. Wait for the node status LEDs to remain stable for at least 5 minutes before taking any further action.
    This procedure might take a service representative up to 2 hours to complete.
    Notes:
    • If the node status, node fault, and battery status LEDs remain off for more than 5 minutes, attach a monitor and a USB keyboard to change the default boot order.
    • If the repair was successful, the node fault LED is on and node error 545 is shown for this node in the service assistant GUI:
      Notes:
      • Node error 545 means that the node serial number on the system board, which is used for the panel_id, does not match the node serial number held on each of the two boot drives.
      • Use the service assistant GUI or the sainfo lsbootdrive CLI command to confirm that:
        • The node serial number on the system board is 0000000 (that is, seven zeros), shown as the panel_id.
        • The node serial number for each boot drive slot is exactly the same as that found on the MT-M SN label on the front of this node.
      • If both of the previous conditions are met, use the service assistant GUI or the following CLI command to change the node serial number on the system board:

        satask chvpd -type 2145-DH8 -serial <the SN value on the MT-M SN label>

      • The node reboots.
      • If there are no node errors, the node starts and rejoins the system if it was previously in the system. The node status LED is on if the node has rejoined the system.
    • If node error 543 is displayed instead of node error 545, check the following:
      Notes:
      • When the machine serial number on the system board is 0000000, node error 543 means that the copies of the node serial number on each boot drive do not match. This can happen, for example, when the node serial number cannot be read from a boot drive because it is missing.
      • Use the service assistant GUI or the sainfo lsbootdrive CLI command to see the state of each boot drive slot. Refer to Boot drive problems to decide what to do next.
      • For example, the output from the sainfo lsbootdrive command might show that:
        • The node serial number on the system board is 0000000 (that is, seven zeros), shown as the panel_id.
        • The node serial number for one boot drive slot is exactly the same as that found on the MT-M SN label on the front of this node.
        • The status of the other boot drive slot is uninitialized.
      • Use the service assistant GUI or the following CLI command to initialize the uninitialized boot drive only if all three of the previous conditions are met:

        satask rescuenode

      • The node reboots.
      • Node error 545 is displayed for this node in the service assistant GUI.
      • Write the node serial number to the system board as described for node error 545.
    • If the repair was successful but the node was not able to save its state data before shutting down, the node displays node error 578. Follow the procedures in Deleting a node from a clustered system by using the management GUI to delete the node from the cluster and then add it back into the cluster. If more than one node failed, ensure that the node is added back into its original I/O group.
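The serial-number checks in step 17 can be summarized as decision logic. The following Python sketch is illustrative only and is not an IBM tool. BootDriveSlot, next_action, the status string values, and the serial number 78ABCDE are hypothetical names and sample values, and the sketch does not parse real sainfo lsbootdrive output. Only the two satask commands are taken from this procedure.

```python
from dataclasses import dataclass
from typing import List, Optional

# Serial number carried by a system board from FRU stock, per this procedure.
FRU_PLACEHOLDER_SERIAL = "0000000"


@dataclass
class BootDriveSlot:
    """Hypothetical summary of one boot drive slot, entered by hand."""
    serial: Optional[str]  # node serial number stored on the drive, if any
    status: str            # assumed status text, for example "uninitialized"


def next_action(panel_id: str, label_serial: str, slots: List[BootDriveSlot]) -> str:
    """Suggest the next step from the conditions listed in step 17."""
    # Every branch in the procedure requires the FRU placeholder as panel_id.
    if panel_id != FRU_PLACEHOLDER_SERIAL or len(slots) != 2:
        return "stop: conditions not met, refer to Boot drive problems"

    matching = [s for s in slots if s.serial == label_serial]
    uninitialized = [s for s in slots if s.status == "uninitialized"]

    # Node error 545 case: both drive copies equal the MT-M SN label value.
    if len(matching) == 2:
        return f"satask chvpd -type 2145-DH8 -serial {label_serial}"

    # Node error 543 case: one drive matches the label, the other is uninitialized.
    if (
        len(matching) == 1
        and len(uninitialized) == 1
        and uninitialized[0] is not matching[0]
    ):
        return "satask rescuenode"

    return "stop: conditions not met, refer to Boot drive problems"


if __name__ == "__main__":
    drives = [BootDriveSlot("78ABCDE", "online"), BootDriveSlot(None, "uninitialized")]
    print(next_action(FRU_PLACEHOLDER_SERIAL, "78ABCDE", drives))
```

The sketch returns a stop message in every case that the procedure does not explicitly cover, because running either command outside the stated conditions is not supported by this procedure. After satask rescuenode completes and node error 545 appears, the same checks lead to the satask chvpd command.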