Replacing nodes nondisruptively with a SAN Volume Controller 2145-SV2 or 2145-SA2 node

The following procedures describe how to nondisruptively replace most nodes with SAN Volume Controller 2145-SV2 or 2145-SA2 nodes.

Before you begin

The replacement procedures are nondisruptive if no changes are required to your networking environment. The replacement node uses the same worldwide node name (WWNN) as the node that you are replacing. An alternative to this procedure is to replace nodes disruptively either by moving volumes to a new I/O group or by rezoning the SAN. However, the disruptive procedures require more work on the hosts.

SAN Volume Controller 2145-SV2 and 2145-SA2 nodes have some restrictions compared to older SAN Volume Controller nodes.
  • They cannot do real-time compression.
  • They cannot be connected to SAS expansion enclosures.
  • The maximum number of network ports on SAN Volume Controller 2145-SV2 / 2145-SA2 nodes is less than the maximum number of ports on SAN Volume Controller 2145-DH8 / 2145-SV1 nodes.
If the system has fewer than four I/O groups, then the simplest approach to work around these restrictions might be to add the new SAN Volume Controller 2145-SV2 or 2145-SA2 nodes as a new I/O group. In other cases, the following items apply:
  • If real-time compression is in use by the I/O group that is being updated, then do not upgrade a node until the compressed volumes are removed from the I/O group, moved to another I/O group, or decompressed.
  • The volumes in the old I/O group can be moved to the new I/O group by using the nondisruptive volume move (VDisk move) function, and then the old I/O group can be removed.
  • Decompress all compressed volumes in the I/O group before you replace the nodes.
    • Use the "change capacity savings" function in the GUI to convert to either thin or fully allocated volumes.
    • When no compressed volumes are in the I/O group, the nodes can be replaced. After both nodes are replaced, the volumes can be converted to compressed volumes in a data reduction pool. Note that some managed disks, such as IBM® NVMe FlashCore Module drives, might be self-compressing.
  • SAS drive arrays and expansion enclosures can be removed from the I/O group or moved to another I/O group. See Moving expansion enclosures.
  • You can use the Spectrum Virtualize Port Configurator to help plan the port configuration so that the new nodes provide as many ports as the maximum number that is in use on the existing SAN Volume Controller 2145-DH8 or 2145-SV1 nodes in the I/O group.

Some system performance might be lost when the nodes are being replaced. Volumes that are managed by the I/O group that contains the node to be replaced become degraded when one of the nodes is shut down at the start of this procedure. System performance returns when both nodes are running and accessing the backend storage.

This task assumes that the following conditions are met. If any conditions are not met, do not continue this task unless you are instructed to do so by IBM Remote technical support.

  • Real-time compression is not in use by the I/O group that you are updating.
  • No SAS expansion enclosures are connected to or managed by the I/O group that you are updating.
  • The system is upgraded to software level 8.3.1.0 or later.
    Important: Ensure that all other nodes in the system are running system software level 8.3.1 or later. Otherwise, the replacement node is not recognized. Use the management GUI to display information about the system level or enter the lssystem command. For more information, see Updating the system software.
  • If encryption is enabled on the system, a new encryption license must be installed on each new SAN Volume Controller 2145-SV2 or 2145-SA2 node before it can be added to the system. Use the management GUI to install the new license; see Activating encryption license for more information.
  • The replacement SAN Volume Controller 2145-SV2 or 2145-SA2 node must have at least as many Fibre Channel and Ethernet ports as the node that is being replaced.
  • All nodes that are configured in the system are present and online.
  • All errors in the system event log are addressed and marked as fixed.
  • No volumes, managed disks (MDisks), or external storage systems have a status of degraded or offline.
  • You backed up the system configuration and saved the svc.config.backup.xml file.
  • The Fibre Channel device driver on each Fibre Channel-attached host is set to time out a missing fiber path in 3 seconds or less. If it is not practical to check the parameters of the Fibre Channel driver on each host, you must reboot the new SAN Volume Controller 2145-SV2 or 2145-SA2 node shortly after it is added to the system. The fiber paths to the host then stop long enough to ensure that they are recovered properly when the node is active again.
    Tip: The timeout setting for the Emulex Fibre Channel device driver might default to 30 seconds, so it needs to be changed.
Important Notes:
  1. Review all of the following steps before you proceed with this task. If you are not familiar with the system environment or the tasks that are described, do not continue this procedure.
  2. Review the detailed information in Setting the Fibre Channel port mapping: 2145-SV2 or 2145-SA2. You need to use this information to complete this task.
  3. Ensure that the replacement SAN Volume Controller 2145-SV2 or 2145-SA2 node has at least as much RAM as the node that is being replaced.
  4. The node ID might change during this task; the node name might also change. After the system assigns the node ID, the ID cannot be changed. However, you can change the node name after this task is complete.
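Several of the conditions listed above reduce to checking that nothing reports a status other than online. The following Python sketch illustrates that check only; it assumes that the node, volume, and MDisk listings were already captured and parsed into records with name and status fields, and the function name and the sample records are hypothetical.

```python
def not_online(records, kind):
    """Describe every record whose status is anything other than 'online'."""
    return [
        f"{kind} {record['name']}: {record['status']}"
        for record in records
        if record["status"] != "online"
    ]


# Hypothetical records, as they might look after parsing the listings.
nodes = [{"name": "node1", "status": "online"},
         {"name": "node2", "status": "online"}]
volumes = [{"name": "vdisk0", "status": "online"},
           {"name": "vdisk1", "status": "degraded"}]
mdisks = [{"name": "mdisk0", "status": "online"}]

problems = (not_online(nodes, "node")
            + not_online(volumes, "volume")
            + not_online(mdisks, "MDisk"))

if problems:
    print("Do not continue until these are resolved:")
    for problem in problems:
        print("  " + problem)
else:
    print("All listed objects are online.")
```

An empty result covers only the status conditions; the remaining conditions, such as the configuration backup and the host driver timeout, still need to be confirmed separately.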

Procedure

  1. Confirm that the node you are replacing is running software level 8.3.1.0 or later. If the node is not running system software level 8.3.1 or later, the system software must be upgraded before you continue this procedure.
    You can use the management GUI to view and update the software level. For more information, see Updating the system software.
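The level comparison in this step can be sketched as a tuple comparison. The following Python sketch is illustrative only: it assumes that the software level is available as a string that begins with a dotted version number, optionally followed by build information, and the function names are hypothetical.

```python
import re

MINIMUM_LEVEL = (8, 3, 1, 0)  # level that this procedure requires


def parse_level(text):
    """Return the leading dotted version in text as a 4-tuple of integers."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    parts = [int(part) for part in match.group(1).split(".")]
    parts += [0] * (4 - len(parts))  # treat 8.3.1 as 8.3.1.0
    return tuple(parts[:4])


def meets_minimum(text, minimum=MINIMUM_LEVEL):
    """True if the level in text is at least the required level."""
    return parse_level(text) >= minimum


# Hypothetical level strings for illustration.
for level in ("8.2.1.11", "8.3.1", "8.4.0.2 (build hypothetical)"):
    verdict = "ok" if meets_minimum(level) else "upgrade required"
    print(f"{level}: {verdict}")
```

Comparing integer tuples avoids the error that a plain string comparison would make, where "8.10" sorts before "8.3".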

Stop remote copy partnerships

  1. To avoid potential disruption caused by maintenance, stop any remote copy partnerships between the local and remote systems that connect to the node that you are replacing. When you stop a remote copy partnership, consistency groups are also stopped.
    1. To complete this task by entering CLI commands, complete the following steps:
      1. Enter the lspartnership command to display information about the systems that are associated with the node you are replacing. Then, record the cluster IDs that are displayed in the command output.
      2. Stop the remote copy partnership by entering the chpartnership -stop cluster_id command, where cluster_id is the ID of the local or remote system.
      3. Continue to "Collect important information about the node you are replacing."
    2. To use the management GUI, complete the following steps:
      1. Select Copy Services > Partnerships to display the system information about the node you are replacing.
      2. Highlight the appropriate system name, right-click the entry, then select Stop.
      3. On the partner node, repeat the two preceding steps.
      4. Continue to "Collect important information about the node you are replacing."
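The CLI path above amounts to recording the cluster IDs and issuing one stop command for each. The following Python sketch shows that bookkeeping under stated assumptions: the lspartnership output is assumed to have been captured with a colon delimiter and a header row that contains an id column, the sample output is hypothetical and abbreviated, and the function names are illustrative.

```python
def recorded_ids(lspartnership_output, delim=":"):
    """Return the value of the id column from each row of the output."""
    lines = [line.strip() for line in lspartnership_output.splitlines()
             if line.strip()]
    header = lines[0].split(delim)
    id_column = header.index("id")
    return [line.split(delim)[id_column] for line in lines[1:]]


def stop_commands(cluster_ids):
    """Build one chpartnership -stop command for each recorded cluster ID."""
    return [f"chpartnership -stop {cluster_id}" for cluster_id in cluster_ids]


# Hypothetical, abbreviated sample; real output has more columns.
SAMPLE_LSPARTNERSHIP = """
id:name
0000000000000A01:system_a
0000000000000B02:system_b
"""

ids = recorded_ids(SAMPLE_LSPARTNERSHIP)
for command in stop_commands(ids):
    print(command)
```

Keeping the recorded IDs makes it easier to confirm later that every partnership that was stopped is also restarted.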

Collect important information about the node you are replacing

  1. Determine the ID, name, I/O group ID, I/O group name, and system configuration node status for the node that you want to replace.

    To determine this information, you can use the management GUI or complete the following steps.

    1. Issue the lsnode command from the command line interface.
      svcinfo lsnode -delim : 
      The system displays information about the nodes that are currently defined in the system.
    2. Record the information from the lsnode command output in Table 1. This information identifies the node, the I/O group in which it belongs, and iSCSI information.
      Tip: If one of the nodes that you want to replace is the system configuration node (config_node:yes), replace it last.
      Table 1. Configuration information about the nodes to be replaced
      Source command   Columns to record
      lsnode           id, name, WWNN, IO_group_id, IO_group_name, config_node, iscsi_name
      lsnodevpd        front_panel_id
    3. Find the front panel ID of the node you want to replace. Use this ID to determine the physical location of the node.
      If you do not already know the physical location of the node that you want to replace, issue the lsnodevpd command, where node_name_or_node_ID is the name or ID of the node. Otherwise, go to the next step.
      lsnodevpd node_name_or_node_ID
      The system displays detailed information about the node.
    4. Record the value in the front_panel_id column in Table 1.
  2. Confirm that no hosts depend on the node that you are replacing. Use either the management GUI or enter a command.
    If you used the management GUI to collect the node information in the previous step, complete these steps:
    1. In the management GUI, select Monitoring > System.
    2. On the System -- Overview page, use the directional arrow near the node to open the Node Details page.
    3. Select Node Actions > Dependent Volumes.
    If you entered commands to collect the node information in the previous step, enter the following command, where node_name_or_node_id is the name or ID of the node.
    lsdependentvdisks -node node_name_or_node_id
    The results display all the volumes that depend on that node.
    1. If dependent volumes exist, determine whether the volumes are being used.
      If the volumes are being used, either restore the redundant configuration or suspend the host application.
    2. If a dependent quorum disk is reported, repair the access to the quorum disk or modify the quorum disk configuration.
  3. Issue the lsservicestatus command to display information about the Fibre Channel ports of the node to be replaced.
    sainfo lsservicestatus
  4. Record the fc_io_port_id and fc_io_port_WWPN for each port in Table 2. This information is required to check the port mapping when you add the new node.
    Table 2. Information about the Fibre Channel ports of the node to be replaced
    lsservicestatus command output
    fc_io_port_id fc_io_port_WWPN
       
       
       
       
  5. If Ethernet port IP addresses are configured on the system, enter the lsportip command to display the current settings so that they can be applied to the replacement nodes.
    lsportip -delim : 
    The system displays information about the Ethernet ports that are defined on the specified node.
  6. Record the information about the Ethernet ports on the node that you want to replace in Table 3.
    Table 3. Information about the Ethernet ports of the node to be replaced
    lsportip command output
    node_id node_name IP_address subnet_mask IP_address_6 prefix gateway_port_id
                 
                 
                 
                 
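The recording of lsnode output and the tip about replacing the configuration node last can be sketched as follows. This Python sketch is illustrative only: the sample output is hypothetical and abbreviated to a few columns, the WWNN values are invented, and the function names are not part of the product.

```python
def parse_delimited(output, delim=":"):
    """Turn header-plus-rows CLI output into a list of dicts keyed by column."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    header = lines[0].split(delim)
    return [dict(zip(header, line.split(delim))) for line in lines[1:]]


def replacement_order(nodes):
    """Order nodes so that the configuration node is replaced last."""
    # sorted() is stable, and False sorts before True.
    return sorted(nodes, key=lambda node: node["config_node"] == "yes")


# Hypothetical, abbreviated sample: real lsnode output has more columns.
SAMPLE_LSNODE = """
id:name:WWNN:IO_group_id:IO_group_name:config_node
1:node1:500507680C000001:0:io_grp0:yes
2:node2:500507680C000002:0:io_grp0:no
"""

nodes = parse_delimited(SAMPLE_LSNODE)
for node in replacement_order(nodes):
    print(node["name"], node["WWNN"], node["IO_group_name"])
```

A simple split of this kind breaks when a field contains the delimiter itself. iSCSI qualified names and IPv6 addresses both contain colons, so output that includes them would need a delimiter that does not occur in the data.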

Remove the node from the system

If RDMA over Ethernet is being used for node to node communication, then use the Service Assistant GUI or enter the sainfo lsnodeip command to display the current node IP settings so that they can be applied to the replacement node.

  1. Record and mark the order of the Fibre Channel or Ethernet cables with the node port number before you remove the cables from the back of the node.
    Important: Do not connect the replacement node to different ports on the switch or to a different switch.

    You must reconnect the cables in the exact order on the replacement node to avoid problems when the replacement node is added to the system. If the cables are not connected in the same order, the port IDs can change. If the port IDs change, the host system might not be able to access volumes. See the hardware documentation specific to your model to determine how the ports are numbered.

  2. If the node has 10 Gbps Ethernet IP addresses configured, delete these settings by using the rmportip command, ensuring that you note the current settings.
    rmportip -node node_ID_or_name port_ID
  3. If encryption is active on the node you are replacing, enter the following command to deactivate the feature.
    deactivatefeature feature_id

    Issue the lsfeature command to determine the correct feature_id value. See Disabling encryption for more details.

  4. Issue the rmnode command to delete this node from the system and I/O group. The node_name_or_node_ID value identifies the node that you want to delete.
    rmnode node_name_or_node_ID
  5. Enter the lsnode command to ensure that the node is no longer a member of the system:
    lsnode
    The system displays a list of nodes. Before you continue to the next step, ensure that the removed node is not listed in the command output.
  6. Use the service assistant to change the WWNN and iSCSI name of the removed node to 1FFFF for CF8/CG8 nodes, or 500507680c00FFFF for DH8/SV1 nodes.
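The cabling rule in the first step of this section, that every cable must return to the same node port, can be checked against the record that you made before you removed the cables. The following Python sketch is a minimal illustration; the port numbers, the switch port labels, and the function name are hypothetical.

```python
def cabling_mismatches(recorded, reconnected):
    """Return (node_port, recorded_link, current_link) for each difference."""
    problems = []
    for port in sorted(set(recorded) | set(reconnected)):
        before = recorded.get(port)    # None if the port was not cabled
        after = reconnected.get(port)  # None if the port is not cabled now
        if before != after:
            problems.append((port, before, after))
    return problems


# Hypothetical record: node port number -> switch port label.
recorded = {1: "switchA/port4", 2: "switchB/port4", 3: "switchA/port5"}
# Hypothetical state after recabling, with ports 1 and 2 swapped.
reconnected = {1: "switchB/port4", 2: "switchA/port4", 3: "switchA/port5"}

for port, before, after in cabling_mismatches(recorded, reconnected):
    print(f"node port {port}: recorded {before}, now {after}")
```

An empty result means that the cabling matches the record; any entry identifies a port to correct before the node is added to the system.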

Prepare the replacement SAN Volume Controller 2145-SV2 or 2145-SA2 node

  1. Install the replacement node in the rack. For more information, see Installing a SAN Volume Controller 2145-SV2 or 2145-SA2 node.
    Important: Do not connect the Fibre Channel or Ethernet cables during this step.
  2. Power on the replacement node.
  3. Use a CAT 5 Ethernet cable to directly attach a computer with a web browser to the technician port of the replacement node.
  4. Assign the WWNN and a hardware location in the new SAN Volume Controller 2145-SV2 or 2145-SA2 node for each FC port that is defined on the node you are replacing.
    Note: You must create the port mapping before you can add the new node to the system. For more information, see Setting the Fibre Channel port mapping: 2145-SV2 or 2145-SA2.
    When the command completes, the system creates the new port mappings on the replacement SAN Volume Controller 2145-SV2 or 2145-SA2 node. The node then reboots to apply the new settings.
  5. Attach the Fibre Channel and Ethernet cables to the replacement node.
  6. Verify that the last 5 characters of the WWNN are correct.

    To do so, use the management GUI or enter the lsnodecandidate command on the system command line.

    lsnodecandidate
  7. If encryption is active on the system, it must also be installed and active on the replacement node. To activate the feature, enter the following command, where key is the encryption license key.
    activatefeature -licensekey key 

    If you do not activate the license on the new node, you receive message CMMVC8784E.

  8. If the node was communicating with other nodes by using RDMA over Ethernet, then use the Service Assistant Tool or the satask chnodeip command to set the Node IP.
  9. Add the new replacement node to the system. You can use the management GUI or enter the addnode command, where WWNN and iogroup_name_or_id are the values that you recorded for the original node.
    addnode -wwnodename WWNN -iogrp iogroup_name_or_id
    Ensure that the new node has the same name as the original node and is in the same I/O group as the original node. Refer to the data that you recorded in Table 1.

    The system reassigns the SAN Volume Controller 2145-SV2 or 2145-SA2 node with the name that was used originally for the node that was replaced. If the original name of the node was automatically assigned by the system, it is not possible to reuse that name. If the name starts with node, it was automatically assigned. In this case, either specify a different name that does not start with node or do not use the name parameter so that the system automatically assigns a new name to the node.

    Important: Ensure that all other nodes in the system are running system software level 8.3.1 or later. Otherwise, the replacement SAN Volume Controller 2145-SV2 or 2145-SA2 node is not recognized. For more information, see Updating the system software.
  10. If Ethernet IP addresses were previously configured on the replaced node, configure the Ethernet ports on the new node to reuse those settings.
    Ethernet port IP addresses can be configured by using the management GUI or the cfgportip command. Specify the appropriate values that you noted in Table 3.
    • For IPv4 IP addresses
      cfgportip -node node_name_or_node_ID -ip IPv4_addr -mask subnet_mask -gw gateway port_ID
    • For IPv6 IP addresses
      cfgportip -node node_name_or_node_ID -ip_6 IPv6_addr -prefix_6 prefix -gw_6 gateway port_ID
  11. Important: Ask the host administrator to query the paths on each host to ensure that all paths to the replacement node are active before you proceed to the next step.
    If you are using the IBM Multipath Subsystem Device Driver (SDD), enter the datapath query device command to query the paths. Documentation that is provided with your multipathing device driver shows how to query paths. If the expected paths are not active, force the multipath driver to rescan for paths.
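Reapplying the recorded Ethernet settings with cfgportip can be sketched as follows. This Python sketch is illustrative only: it assumes that the port ID is the final positional argument of cfgportip, it uses the standard ipaddress module to select the IPv4 or IPv6 form, and the function name and the sample addresses (taken from the documentation address ranges) are hypothetical.

```python
import ipaddress


def cfgportip_command(node, port_id, address, gateway, mask_or_prefix):
    """Build the cfgportip command for one recorded port setting."""
    ip = ipaddress.ip_address(address)  # raises ValueError if not valid
    gw = ipaddress.ip_address(gateway)
    if ip.version != gw.version:
        raise ValueError("address and gateway must use the same IP version")
    if ip.version == 4:
        return (f"cfgportip -node {node} -ip {ip} "
                f"-mask {mask_or_prefix} -gw {gw} {port_id}")
    return (f"cfgportip -node {node} -ip_6 {ip} "
            f"-prefix_6 {mask_or_prefix} -gw_6 {gw} {port_id}")


# Hypothetical recorded settings for two ports of one node.
print(cfgportip_command("node1", 1, "192.0.2.10", "192.0.2.1", "255.255.255.0"))
print(cfgportip_command("node1", 2, "2001:db8::10", "2001:db8::1", 64))
```

Validating each address before the command is built catches transcription errors in the recorded values before they reach the system.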

Restart remote copy partnerships

  1. When the maintenance process completes, you must restart the remote copy partnerships that were stopped. When you restart a remote copy partnership, consistency groups also start again. To restart a remote copy partnership, you can enter a CLI command or use the management GUI.
    1. To use the CLI, enter the chpartnership -start cluster_id command, where cluster_id is the ID of the local or remote system.
    2. To use the management GUI, complete the following steps:
      1. Select Copy Services > Partnerships to display the system information about the node that you replaced.
      2. Highlight the appropriate system name, right-click the entry, and select Restart.
      3. Repeat the preceding steps on the partner node.
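  2. Repeat the steps from "Collect important information about the node you are replacing" through "Restart remote copy partnerships" for each node that you replace.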