MAP 5350: Powering off a node

MAP 5350: Powering off a node helps you power off a single node to perform a service action without disrupting host access to volumes.

Before you begin

If the solution is set up correctly, powering off a single node does not disrupt the normal operation of a SAN Volume Controller system. A system has nodes in pairs called I/O groups. An I/O group continues to handle I/O to the disks it manages with only a single node powered on. However, performance degrades and resilience to error is reduced.

Take care when you power off a SAN Volume Controller node so that you affect the system no more than necessary. If you do not follow the procedures outlined here, your application hosts might lose access to their data or, in the worst case, lose data.

You can use the following preferred methods to power off a node that is a member of a system and not offline:
  1. Use the Power off option in the management GUI or in the service assistant interface.
  2. Use the CLI command stopsystem -node name.

It is preferable to use either the management GUI or the command-line interface (CLI) to power off a node, as these methods provide a controlled handover to the partner node and provide better resilience to other faults in the system.

Use the power button to power off a node only if the node is offline or is not a member of a system.

About this task

To provide the least disruption when powering off a node, all of the following conditions should apply:
  • The other node in the I/O group is powered on and active in the system.
  • The other node in the I/O group has SAN Fibre Channel connections to all hosts and disk controllers managed by the I/O group.
  • All volumes handled by this I/O group are online.
  • Host multipathing is online to the other node in the I/O group.

In some circumstances, the reason you power off the node might make these conditions impossible to meet. For instance, if you replace a failed Fibre Channel adapter, volumes do not show an online status. Use your judgment to decide whether it is safe to proceed when a condition is not met. Always check with the system administrator before proceeding with a power off that you know disrupts I/O access, as the system administrator might prefer to wait for a more suitable time or suspend host applications.
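
The four conditions above can be recorded as a simple checklist before a power off. The following Python sketch is illustrative only: the IoGroupState class and the unmet_conditions function are hypothetical names, not part of any SAN Volume Controller tool, and the values are assumed to be filled in by hand from the management GUI, the CLI, and the host multipathing software.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class IoGroupState:
    """Hypothetical snapshot of an I/O group, filled in by the person servicing it."""
    partner_online: bool          # other node is powered on and active in the system
    partner_san_connected: bool   # other node has Fibre Channel paths to all hosts and controllers
    all_volumes_online: bool      # all volumes handled by this I/O group are online
    host_multipath_online: bool   # host multipathing is online to the other node


def unmet_conditions(state: IoGroupState) -> list[str]:
    """Return a description of every condition that is not met."""
    checks = [
        (state.partner_online,
         "the other node is not powered on and active"),
        (state.partner_san_connected,
         "the other node lacks SAN connections to all hosts and disk controllers"),
        (state.all_volumes_online,
         "not all volumes handled by the I/O group are online"),
        (state.host_multipath_online,
         "host multipathing is not online to the other node"),
    ]
    return [message for ok, message in checks if not ok]


if __name__ == "__main__":
    state = IoGroupState(True, True, False, True)
    for problem in unmet_conditions(state):
        print("Check with the system administrator:", problem)
```

An empty list means that all four conditions hold. A non-empty list does not forbid the power off; it marks the points on which to use judgment and consult the system administrator.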

To ensure a smooth restart, a node must save data structures that it cannot recreate to its local, internal disk drive. The amount of data the node saves to local disk can be high, so this operation might take several minutes. Do not attempt to interrupt the controlled power off.

Attention: The following actions do not allow the node to save data to its local disk. Therefore, do not power off a node using the following methods:
  • Removing the power cable between the node and the uninterruptible power supply.

    Normally the uninterruptible power supply provides sufficient power to allow the write to local disk in the event of a power failure, but it cannot provide power when the cable between it and the node is removed.

  • Holding down the power button on the node (unless it is a SAN Volume Controller 2145-SV1).

    When you press and release the power button, the node indicates this to the software so the node can write its data to local disk before the node powers off.

    When you hold down the power button, the hardware interprets this as an emergency power off indication and shuts down immediately. The hardware does not save the data to a local disk before powering down. The emergency power off occurs approximately four seconds after you press and hold down the power button.

  • Pressing the reset button on the light path diagnostics panel.
Important: If you power off a SAN Volume Controller 2145-DH8 node and might not power it back on the same day, follow these steps to prevent the batteries from being discharged too much while the node is connected to power but not powered on:
  1. Pull both batteries out of the node. Keep them out until you're ready to power on the node.
  2. Push the batteries in just before you press the power button to power on the node.
If you disconnect the power from a SAN Volume Controller 2145-DH8 node and might not reconnect power to it again within the next 24 hours, follow these steps to prevent the batteries from being discharged too much while the node is not connected to power:
  1. After both power cords are disconnected from the node, pull both batteries out of the node. This step completely turns off the battery backplane.
  2. Push the batteries back in again.

Using the management GUI to power off a system

Use the management GUI to power off a system.

Procedure

To use the management GUI to power off a system, complete the following steps:

  1. Launch the management GUI for the system that you are servicing.
  2. Select Monitoring > System.

    If the nodes to power off are shown as Offline, the nodes are not participating in the system. In such circumstances, use the power button on the offline nodes to power off the nodes.

    If the nodes to power off are shown as Online, powering off the nodes can result in their dependent volumes also going offline:

    1. Select the node and click Show Dependent Volumes.
    2. Make sure the status of each volume in the I/O group is Online. You might need to view more than one page.

      If any volumes are Degraded, only one node in the I/O group is processing I/O requests for that volume. If that node is powered off, it impacts all the hosts that are submitting I/O requests to the degraded volume.

      If any volumes are degraded and you believe that this might be because the partner node in the I/O group has been powered off recently, wait until a refresh of the screen shows all volumes online. All the volumes should be online within 30 minutes of the partner node being powered off.

      Note: After waiting 30 minutes, if you have a degraded volume and all of the associated nodes and MDisks are online, contact support for assistance.

      Ensure that all volumes that are used by hosts are online before continuing.

    3. If possible, check that all hosts that access volumes managed by this I/O group are able to fail over to use paths that are provided by the other node in the group.

      Perform this check using the multipathing device driver software of the host system. Commands to use differ, depending on the multipathing device driver being used.

      If you use the System Storage® Multipath Subsystem Device Driver (SDD), the command to query paths is datapath query device.

      It can take some time for the multipathing device drivers to rediscover paths after a node is powered on. If you are unable to check on the host that all paths to both nodes in the I/O group are available, do not power off a node within 30 minutes of the partner node being powered on or you might lose access to the volume.

    4. If you decide that it is okay to continue with powering off the nodes, select the node to power off and click Shut Down System.
    5. Click OK. If the node that you select is the last remaining node that provides access to a volume, for example a node that contains flash drives with unmirrored volumes, the Shutting Down a Node-Force panel is displayed with a list of volumes that will go offline if the node is shut down.
    6. Check that no host applications access the volumes that are going offline. Continue with the shut down only if the loss of access to these volumes is acceptable. To continue with shutting down the node, click Force Shutdown.
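
The 30-minute guidance in the steps above amounts to simple time arithmetic. The following Python sketch shows one way to work it out when host paths cannot be verified directly. The function names are hypothetical, and the time at which the partner node was powered on is assumed to be known from your own service records.

```python
from datetime import datetime, timedelta

# Waiting period taken from the procedure text: do not power off a node
# within 30 minutes of the partner node being powered on.
PATH_REDISCOVERY_WAIT = timedelta(minutes=30)


def earliest_power_off(partner_powered_on_at: datetime) -> datetime:
    """Earliest time at which the 30-minute waiting period has elapsed."""
    return partner_powered_on_at + PATH_REDISCOVERY_WAIT


def waited_long_enough(partner_powered_on_at: datetime, now: datetime) -> bool:
    """True once at least 30 minutes have passed since the partner powered on."""
    return now >= earliest_power_off(partner_powered_on_at)


if __name__ == "__main__":
    powered_on = datetime(2024, 1, 1, 10, 0)  # example value, not real data
    print("Earliest power off:", earliest_power_off(powered_on))
```

The elapsed time is only a fallback. Where the host multipathing software can be queried, checking the paths directly remains the better test.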

What to do next

During the shut down, the node saves its data structures to its local disk and destages all write data held in cache to the SAN disks. Such processing can take several minutes.

At the end of this processing, the system powers off.

Using the SAN Volume Controller CLI to power off a node

Use the command-line interface (CLI) to power off a node.

Procedure

  1. Issue the lsnode CLI command to display a list of nodes in the system and their properties. Find the node to shut down and write down the name of its I/O group. Confirm that the other node in the I/O group is online.
    lsnode -delim : 
    
    id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id
    1:group1node1:10L3ASH:500507680100002C:online:0:io_grp0:yes:202378101C0D18D8 
    2:group1node2:10L3ANF:5005076801000009:online:0:io_grp0:no:202378101C0D1796 
    3:group2node1:10L3ASH:5005076801000001:online:1:io_grp1:no:202378101C0D18D8 
    4:group2node2:10L3ANF:50050768010000F4:online:1:io_grp1:no:202378101C0D1796

    If the node to power off is shown as Offline, the node is not participating in the system and is not processing I/O requests. In such circumstances, use the power button on the node to power off the node.

    If the node to power off is shown as Online, but the other node in the I/O group is not online, powering off the node impacts all hosts that are submitting I/O requests to the volumes that are managed by the I/O group. Ensure that the other node in the I/O group is online before you continue.

  2. Issue the lsdependentvdisks CLI command to list the volumes that are dependent on the status of a specified node.
    lsdependentvdisks group1node1 
    
    vdisk_id       vdisk_name
    0              vdisk0
    1              vdisk1

    If the node goes offline or is removed from the system, the dependent volumes also go offline. Before taking a node offline or removing it from the system, you can use the command to ensure that you do not lose access to any volumes.

  3. If you decide that it is okay to continue powering off the node, issue the stopsystem -node <name> CLI command to power off the node. Use the -node parameter to avoid powering off the whole system:
    stopsystem -node group1node1
    Are you sure that you want to continue with the shut down? yes
    Note: To shut down the node even though there are dependent volumes, add the -force parameter to the stopsystem command. The force parameter forces continuation of the command even though any node-dependent volumes will be taken offline. Use the force parameter with caution; access to data on node-dependent volumes will be lost.

    During the shut down, the node saves its data structures to its local disk and destages all write data held in the cache to the SAN disks, which can take several minutes.

    At the end of this process, the node powers off.
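
The partner-node check in step 1 can be scripted against saved lsnode output. The following Python sketch is a minimal illustration, not an IBM-provided tool. It assumes that the output was captured with lsnode -delim :, that the header row arrives on a single line, and that no field value contains a colon. The function names parse_lsnode and partner_is_online are hypothetical.

```python
from __future__ import annotations

# Sample copied from the lsnode example in step 1, with the header on one line.
SAMPLE = """\
id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id
1:group1node1:10L3ASH:500507680100002C:online:0:io_grp0:yes:202378101C0D18D8
2:group1node2:10L3ANF:5005076801000009:online:0:io_grp0:no:202378101C0D1796
3:group2node1:10L3ASH:5005076801000001:online:1:io_grp1:no:202378101C0D18D8
4:group2node2:10L3ANF:50050768010000F4:online:1:io_grp1:no:202378101C0D1796
"""


def parse_lsnode(output: str) -> list[dict[str, str]]:
    """Split colon-delimited lsnode output into one dict per node."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    header = lines[0].split(":")
    return [dict(zip(header, line.split(":"))) for line in lines[1:]]


def partner_is_online(nodes: list[dict[str, str]], node_name: str) -> bool:
    """True if another node in the same I/O group reports status 'online'."""
    matches = [n for n in nodes if n["name"] == node_name]
    if not matches:
        raise ValueError(f"node {node_name!r} not found in lsnode output")
    group = matches[0]["IO_group_id"]
    return any(
        n["IO_group_id"] == group
        and n["name"] != node_name
        and n["status"] == "online"
        for n in nodes
    )


if __name__ == "__main__":
    nodes = parse_lsnode(SAMPLE)
    if partner_is_online(nodes, "group1node1"):
        print("Partner node is online; continue with the dependent volume check.")
    else:
        print("Partner node is not online; powering off impacts host I/O.")
```

The script only reads text that has already been captured, so it cannot change the system. It does not replace the lsdependentvdisks check in step 2.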

Using the SAN Volume Controller power control button

Do not use the power control button to power off a node unless an emergency exists or another procedure directs you to do so.

Before you begin

With this method, you cannot check the system status from the front panel, so you cannot tell whether the power off is likely to cause excessive disruption to the system. Instead, use the management GUI or the CLI commands, described in the previous topics, to power off an active node.

About this task

If you must use this method, notice in Figure 1 and Figure 2 that each model type has a power control button  1  on the front.

Figure 1. Power control button on the SAN Volume Controller 2145-CF8, 2145-CG8, and 2145-DH8 models
Figure 2. Power control button and LED lights on the SAN Volume Controller 2145-SV1 model
  •  1  Power-control button and power-on LED
  •  2  Identify LED
  •  3  Node status LED
  •  4  Node fault LED
  •  5  Battery status LED

When you determine it is safe to do so, press and immediately release the power button. On models other than the 2145-DH8 and 2145-SV1, the front panel display changes to display Powering Off and displays a progress bar.

Note: The 2145-DH8 and 2145-SV1 do not have a front panel display, but status LEDs  2 ,  3 ,  4 , and  5  in Figure 2 all turn off, and the power-on LED  1  changes from on to flashing.

The 2145-CG8 or the 2145-CF8 requires that you remove a power button cover before you can press the power button.

If you press the power button for too long, the node immediately powers down and cannot write all data to its local disk. An extended service procedure is required to restart the node, which involves deleting the node from the system before it is added back.

The following graphic shows how Powering Off is displayed on the front panel:
This figure shows how Powering off is displayed on the front panel

Results

The node saves its data structures to disk while it is powering off. The power off process can take up to 5 minutes.

When a node is powered off by using the power button (or because of a power failure), the partner node in its I/O group immediately stops using its cache for new write data and destages any write data already in its cache to the SAN-attached disks.

The time that the destage takes depends on the speed and utilization of the disk controllers. The destage typically completes in less than 15 minutes, but it might take longer. If data is waiting to be written to a disk that is offline, the destaging cannot complete.

A node that powers off and restarts while its partner node continues to process I/O might not be able to become an active member of the I/O group immediately. The node must wait until the partner node completes its destage of the cache.

If the partner node powers off during this period, access to the SAN storage that is managed by this I/O group is lost. If one of the nodes in the I/O group is unable to service any I/O, for example because the partner node in the I/O group is still flushing its write cache, volumes that are managed by that I/O group have a status of Degraded.