Replacing nodes nondisruptively with a 2145-SV1 node
The following procedures describe how to nondisruptively replace most nodes with SAN Volume Controller 2145-SV1 nodes.
Before you begin
The replacement procedures are nondisruptive because no changes are required to your networking environment. The replacement 2145-SV1 node uses the same worldwide node name (WWNN) as the node that you are replacing. An alternative to this procedure is to replace nodes disruptively either by moving volumes to a new I/O group or by rezoning the SAN. However, the disruptive procedures require more work on the hosts.
Some system performance might be lost while the nodes are being replaced. Volumes that are managed by the I/O group that contains the node to be replaced become degraded when one of the nodes is shut down at the start of this procedure. System performance returns when both nodes are running and accessing the backend storage.
This task assumes that the following conditions are met. If any conditions are not met, do not continue this task unless you are instructed to do so by IBM® support.
- Important: Ensure that all other nodes in the system are running system software level 7.7.1 or later. Otherwise, the replacement 2145-SV1 node will not be recognized. Use the management GUI to display information about the system level or enter the lssystem command. For more information, see Updating the system software.
- If encryption is enabled on the system, a new encryption license must be installed on each new node before it can be added to the system. Use the management GUI to install the new license; for more information, see Activating encryption license.
- The replacement 2145-SV1 node must have at least as many Fibre Channel, Fibre Channel over Ethernet (FCoE), and Ethernet ports as the node that is being replaced.
- All nodes that are configured in the system are present and online.
- All errors in the system event log are addressed and marked as fixed.
- No volumes, managed disks (MDisks), or external storage systems have a status of degraded or offline.
- You backed up the system configuration and saved the svc.config.backup.xml file.
- 2145-SV1 nodes support 4-port 16 Gbps Fibre Channel and 10 Gbps Ethernet adapters. 2145-SV1 can also support optional 2-port 25 Gbps Ethernet adapters (RoCE or iWARP) for iSCSI.
- Set the Fibre Channel device driver on each Fibre Channel attached host to time out a missing fiber path in 3 seconds or less. If it is not practical to check the parameters of the Fibre Channel driver on each host, you must reboot the new 2145-SV1 node shortly after it is added to the system. The fiber paths to the host then stop long enough to ensure that they are recovered properly when the 2145-SV1 is active again.
Tip: The timeout setting for the Emulex Fibre Channel device driver might default to 30 seconds, so it needs to be changed.
- Review all of the following steps before you proceed with this task. If you are not familiar with the system environment or the tasks that are described, do not continue this procedure.
- Review the detailed information in Setting the Fibre Channel port mapping: 2145-SV1. You need to use this information to complete this task.
- Ensure that the replacement 2145-SV1 node has at least as much RAM as the node that is being replaced.
- The node ID might change during this task; the node name might also change. After the system assigns the node ID, the ID cannot be changed. However, you can change the node name after this task is complete.
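The software-level prerequisite above (7.7.1 or later) can be checked with a small script once the level has been read from the management GUI or the lssystem output. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes the level is a dotted number that might be followed by extra text such as a build string.

```python
import re

# Minimum level stated in the prerequisites above.
MINIMUM_LEVEL = (7, 7, 1)


def parse_level(code_level: str) -> tuple:
    """Return the leading dotted number as a tuple of ints.

    Assumption: the string starts with digits separated by dots and may be
    followed by other text, for example "7.8.1.0 (build ...)".
    """
    match = re.match(r"\s*(\d+(?:\.\d+)*)", code_level)
    if match is None:
        raise ValueError(f"unrecognized software level: {code_level!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(code_level: str, minimum: tuple = MINIMUM_LEVEL) -> bool:
    """True if code_level is at or above the minimum level."""
    level = parse_level(code_level)
    # Pad with zeros so that 7.7.1 and 7.7.1.0 compare as equal.
    width = max(len(level), len(minimum))
    padded_level = level + (0,) * (width - len(level))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_level >= padded_minimum


if __name__ == "__main__":
    for sample in ("7.6.1.8", "7.7.1", "7.10.0"):
        print(sample, meets_minimum(sample))
```

Comparing integer tuples rather than raw strings avoids ranking a level such as 7.10.0 below 7.7.1.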
Procedure
- Confirm that the node you are replacing is running software level 7.7.1 or later. If the node is not running system software level 7.7.1 or later, the system software must be upgraded before you continue this procedure.
You can use the management GUI to view and update the software level. For more information, see Updating the system software.
Stop remote copy partnerships
- To avoid potential disruption caused by maintenance, it is recommended that you
stop any remote copy partnerships between the local and remote systems that connect to the node that
you are replacing. When you stop a remote copy partnership, consistency groups are also
stopped.
- To complete this task by entering CLI commands, complete the following steps:
- Enter the lspartnership command to display information about the systems that are associated with the node you are replacing. Then, record the cluster IDs that are displayed in the command output.
- Stop the remote copy partnership by entering the chpartnership -stop cluster_id command, where cluster_id is the ID of the local or remote system.
- Continue to Step 3.
- To use the management GUI, complete the following steps:
Collect important information about the node you are replacing
- Determine the ID, name, I/O group ID, I/O group name, and system configuration node status for the node that you want to replace.
To determine this information, you can use the management GUI or complete the following steps.
- Issue the lsnode command from the command line interface.
    svcinfo lsnode -delim :
The system displays information about the nodes that are currently defined in the system.
- Record the information from the lsnode command output in Table 1. This information identifies the node, the I/O group to which it belongs, and iSCSI information.
Tip: If one of the nodes that you want to replace is the system configuration node (config_node:yes), replace it last.
Table 1. Configuration information about the nodes to be replaced
    lsnode command output: id, name, WWNN, IO_group_id, IO_group_name, config_node, iscsi_name
    lsnodevpd command output: front_panel_id
- Find the front panel ID of the node you want to replace. Use this ID to determine the physical location of the node.
Issue the lsnodevpd command, where node_name_or_node_ID is the name or ID of the node. (If you already know the physical location of the node that you want to replace, you can go to the next step.)
    lsnodevpd node_name_or_node_ID
The system displays detailed information about the node.
- Record the value in the front_panel_id column in Table 1.
- Confirm that no hosts depend on the node that you are replacing. Use either the management GUI or enter a command.
If you used the management GUI in Step 3, complete these steps:
- In the management GUI, select Monitoring > System.
- On the System -- Overview page, use the directional arrow near the node to open the Node Details page.
- Select Node Actions > Dependent Volumes.
If you entered commands in Step 3, enter the following command, where node_name_or_node_ID is the name or ID of the node.
    lsdependentvdisks -node node_name_or_node_ID
The results display all the volumes that depend on that node.
- If dependent volumes exist, determine whether the volumes are being used. If the volumes are being used, either restore the redundant configuration or suspend the host application.
- If a dependent quorum disk is reported, repair the access to the quorum disk or modify the quorum disk configuration.
- Issue the lsservicestatus command to display information about the Fibre Channel ports of the node to be replaced.
    sainfo lsservicestatus
- Record the fc_io_port_id and fc_io_port_WWPN for each port in Table 2. This information is required to check the port mapping when you add the new node.
Table 2. Information about the Fibre Channel ports of the node to be replaced
    lsservicestatus command output: fc_io_port_id, fc_io_port_WWPN
- If Ethernet port IP addresses are configured on the system, enter the lsportip command to display the current settings so that they can be applied to the replacement nodes.
    lsportip -delim :
The system displays information about the Ethernet ports that are defined on the specified node.
- Record the information about the Ethernet ports on the node that you want to replace in Table 3.
Table 3. Information about the Ethernet ports of the node to be replaced
    lsportip command output: node_id, node_name, IP_address, subnet_mask, IP_address_6, prefix, gateway_port_id
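The recording steps in this section lend themselves to a small helper that turns delimited command output into records and applies the tip about replacing the configuration node last. The Python sketch below is an illustration, not part of the product: the function names and the sample text are hypothetical, the sample shows only a subset of columns with made-up WWNN values, and it assumes the first line of the output is a header row. Because iSCSI qualified names normally contain a colon, a delimiter other than ":" is likely to be safer when the iscsi_name column is included.

```python
def parse_delimited(output: str, delim: str = ":") -> list:
    """Parse header-plus-rows text into a list of dicts keyed by column name."""
    lines = [line for line in output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split(delim)
    rows = []
    for line in lines[1:]:
        values = line.split(delim)
        if len(values) != len(header):
            # A field probably contains the delimiter (for example an IQN).
            raise ValueError(f"column count mismatch in line: {line!r}")
        rows.append(dict(zip(header, values)))
    return rows


def replacement_order(nodes: list) -> list:
    """Return the nodes with any configuration node moved to the end.

    sorted() is stable and False sorts before True, so the relative order
    of the other nodes is preserved.
    """
    return sorted(nodes, key=lambda node: node.get("config_node") == "yes")


# Hypothetical sample; real lsnode output has more columns.
SAMPLE = """id:name:WWNN:IO_group_id:IO_group_name:config_node
1:node1:500507680C000001:0:io_grp0:yes
2:node2:500507680C000002:0:io_grp0:no
"""

if __name__ == "__main__":
    for node in replacement_order(parse_delimited(SAMPLE)):
        print(node["name"], node["WWNN"], node["IO_group_name"], node["config_node"])
```

Keeping the parsed records alongside the handwritten tables gives a second copy of the WWNN and I/O group values that are needed when the replacement node is added.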
Remove the node from the system
If RDMA over Ethernet is being used for node to node communication, then use the Service Assistant GUI or enter the sainfo lsnodeip command to display the current node IP settings so that they can be applied to the replacement node.
- Record and mark the order of the Fibre Channel or Ethernet cables with the node port number before you remove the cables from the back of the node.
Important: Do not connect the replacement node to different ports on the switch or to a different switch.
You must reconnect the cables in the exact order on the replacement node to avoid problems when the replacement node is added to the system. If the cables are not connected in the same order, the port IDs can change. If the port IDs change, the host system might not be able to access volumes. See the hardware documentation specific to your model to determine how the ports are numbered.
- If the node has 10 Gbps Ethernet IP addresses configured, note the current settings, and then delete them by using the rmportip command.
    rmportip -node node_ID_or_name port_ID
- If encryption is active on the node you are replacing, enter the following command to deactivate the feature.
    deactivatefeature feature_id
Issue the lsfeature command to determine the correct feature_id value. See Disabling encryption for more details.
- Issue the rmnode command to delete this node from the system and I/O group. The node_name_or_node_ID value identifies the node that you want to delete.
    rmnode node_name_or_node_ID
- Enter the lsnode command to ensure that the node is no longer a member of the system:
    lsnode
The system displays a list of nodes. Before you continue to the next step, ensure that the removed node is not listed in the command output.
- Optional: If you want to use the removed node as a spare node, change the WWNN and iSCSI name of each node that you deleted to 1FFFF.
- Power on the node.
- Enter the following chvpd command.
    satask chvpd -wwnn FFFFFFFFFFFFFFFF
Prepare the replacement 2145-SV1 node
- Install the replacement node in the rack. For more information, see Installing the SAN Volume Controller 2145-SV1 hardware.
Important: Do not connect the Fibre Channel or Ethernet cables during this step.
- Power on the replacement node.
- Use a CAT 5 Ethernet cable to directly attach a computer with a web browser to the technician port of the replacement node.
- If DHCP is configured on the computer, the installation GUI automatically displays when the new web page opens. For more information, see Technician port for node access.
To access the service assistant GUI, select the wrench (spanner) icon in the installation GUI.
- If Secure Shell (SSH) software is installed on the computer, you can also access the command line interface at 192.168.0.1. You can then log on as superuser, where the default superuser password is passw0rd.
- Find the WWNN of the replacement 2145-SV1 node. This name can be reused by another 2145-SV1 node.
To find the WWNN, use the service assistant GUI or enter the following command.
    sainfo lsservicestatus
- Assign the WWNN and a hardware location in the new 2145-SV1 node for each FC port that is defined on the node you are replacing.
To do so, use the service assistant GUI or enter the appropriate chvpd command for the port mapping information.
    satask chvpd -wwnn wwnn -fcportmap AB-CD,AB-CD,AB-CD,AB-CD
Note: You must create the port mapping before you can add the new node to the system. For more information, see Setting the Fibre Channel port mapping: 2145-SV1.
When the command completes, the system creates the new port mappings on the replacement 2145-SV1 node. The node then reboots to apply the new settings.
- Attach the Fibre Channel and Ethernet cables to the replacement node.
- Verify that the last 5 characters of the WWNN are correct.
To do so, use the management GUI or enter the lsnodecandidate command on the system command line.
    lsnodecandidate
- If encryption is active on the system, it must also be installed and active on the replacement node. To activate the feature, enter the following command, where key is the encryption license key.
    activatefeature -licensekey key
If you do not activate the license on the new node, you receive message CMMVC8784E.
- Enter the lsservicestatus command to verify that the fc_io_port_id and fc_io_port_WWPN on the 2145-SV1 node match the values that are recorded from the lsservicestatus output from the original node.
    sainfo lsservicestatus
- If there are differences, review Setting the Fibre Channel port mapping: 2145-SV1 and correct the mapping, as needed.
- If the values match, connect the Fibre Channel or Ethernet cables to the host adapters.
- If the node was communicating with other nodes by using RDMA over Ethernet, then use the Service Assistant Tool or the satask chnodeip command to set the Node IP.
- Add the new 2145-SV1 replacement node to the system. You can use the management GUI or enter the addnode command, where WWNN and iogroup_name_or_id are the values that you recorded for the original node.
    addnode -wwnodename WWNN -iogrp iogroup_name_or_id
Ensure that the new node has the same name as the original node and is in the same I/O group as the original node. Refer to the data that you recorded in Table 1 in Step 3.b.
The system assigns the 2145-SV1 node the name that was originally used for the node that was replaced. If the original name of the node was automatically assigned by the system, it is not possible to reuse that name. If the name starts with node, it was automatically assigned. In this case, either specify a different name that does not start with node or do not use the name parameter so that the system automatically assigns a new name to the node.
Important: Ensure that all other nodes in the system are running system software level 7.7.1 or later. Otherwise, the replacement 2145-SV1 node will not be recognized. For more information, see Updating the system software.
- If Ethernet IP addresses were previously configured on the replaced node, configure the Ethernet ports on the new node to reuse those settings.
Ethernet port IP addresses can be configured by using the management GUI or the cfgportip command. Specify the appropriate values that you noted in Table 3 in Step 8.
- For IPv4 IP addresses:
    cfgportip -node node_name_or_node_ID -ip IPv4_addr -mask subnet_mask -gw gateway port_ID
- For IPv6 IP addresses:
    cfgportip -node node_name_or_node_ID -ip_6 IPv6_addr -prefix_6 prefix -gw_6 gateway port_ID
Important:
- Both nodes in the I/O group cache data; however, the cache sizes are asymmetric. The replacement node is limited by the cache size of the partner node in the I/O group. Therefore, it is possible that the replacement node does not use the full cache size until you replace the other node in the I/O group.
- You do not need to reconfigure the host multipathing device drivers because the replacement node uses the same WWNN and WWPN as the previous node. The multipathing device drivers detect the recovery of paths that are available to the replacement node.
- The host multipathing device drivers take approximately 30 minutes to recover the paths. Do not update the other node in the I/O group for at least 30 minutes after you successfully update the first node in the I/O group. If you have other nodes in different I/O groups to update, you can update the nodes while you wait.
- If you are unable to check that the Fibre Channel device driver of every host is set to time out a Fibre Channel path in 3 seconds or less, reboot the new 2145-SV1 node now to ensure that the fiber path becomes active when the node becomes active again.
- Important: Ask the host administrator to query the paths on each host to ensure that all paths to the replacement node are active before you proceed to the next step.
If you are using the IBM Multipath Subsystem Device Driver (SDD), enter the datapath query device command to query the paths. Documentation that is provided with your multipathing device driver shows how to query paths. Force the multipath driver to rescan for paths if the expected paths are not active.
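The port check in this section, where the fc_io_port_id and fc_io_port_WWPN values on the replacement node are compared with the values recorded in Table 2, can be sketched as a dictionary comparison. This Python example is a hedged illustration: the function name and the WWPN values are hypothetical, and it assumes the two sets of values have already been transcribed by hand into mappings of port ID to WWPN.

```python
def port_mismatches(original: dict, replacement: dict) -> dict:
    """Return {port_id: (original_wwpn, replacement_wwpn)} for ports that differ.

    Only ports recorded for the original node are checked, because the
    replacement node is allowed to have additional ports. A port that is
    missing on the replacement is reported with None as its WWPN.
    """
    mismatches = {}
    for port_id in sorted(original):
        old = original[port_id]
        new = replacement.get(port_id)
        # Compare case-insensitively so hex-digit case does not matter.
        if new is None or old.lower() != new.lower():
            mismatches[port_id] = (old, new)
    return mismatches


if __name__ == "__main__":
    # Hypothetical values standing in for the Table 2 records.
    table_2 = {"1": "500507680C110001", "2": "500507680C120001"}
    new_node = {"1": "500507680c110001", "2": "500507680C130001", "3": "500507680C140001"}
    for port_id, (old, new) in port_mismatches(table_2, new_node).items():
        print(f"port {port_id}: recorded {old}, replacement reports {new}")
```

An empty result corresponds to the case in which the values match; any entry points to a port whose mapping needs to be reviewed before the node is added.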
Restart remote copy partnerships
- When the maintenance process completes, you must restart the remote
copy partnerships that were stopped. When you restart a remote copy partnership, consistency groups
also start again. To restart a remote copy partnership, you can enter a CLI command or use the
management GUI.
- To use the CLI, enter the chpartnership -start cluster_id command, where cluster_id is the ID of the local or remote system.
- To use the management GUI, complete the following steps:
- Select Copy Services > Partnerships to display the system information about the node that you replaced.
- Highlight the appropriate system name, right-click the entry, and select Restart.
- Repeat the preceding steps on the partner node.
- Repeat Step 3 through Step 28 for each node that you replace.