MAP 5350: Powering off a node

MAP 5350: Powering off a node helps you power off a single node to complete a service action without disrupting host access to volumes.

Before you begin

If the solution is set up correctly, powering off a single node does not disrupt the normal operation of a system. A system has nodes in pairs called I/O groups. An I/O group continues to handle I/O to the disks it manages with only a single node that is powered on. However, performance degrades and resilience to error is reduced.

When you power off a system node, take care to impact the system no more than necessary.
Note: If you do not follow the procedures that are outlined here, your application hosts might lose access to their data or they might lose data in the worst case.
You can use the following preferred methods to power off a node that is a member of a system and not offline:
  1. Use the Power off option in the management GUI or in the service assistant interface.
  2. Use the CLI command stopsystem -node name.

It is preferable to use either the management GUI or the command-line interface (CLI) to power off a node. These methods provide a controlled handover to the partner node and provide better resilience to other faults in the system.

Use the power push-button to power off a node only if the node is offline or is not a member of a system.

About this task

To provide the least disruption when you power off a node, all of the following conditions must apply.
  • The other node in the I/O group is powered on and active in the system.
  • The other node in the I/O group has SAN Fibre Channel connections to all hosts and disk controllers that are managed by the I/O group.
  • All volumes that are handled by this I/O group are online.
  • Host multipathing is online to the other node in the I/O group.

In some circumstances, the reason you power off the node might make these conditions impossible. For instance, if you replace a failed Fibre Channel adapter, volumes do not show an online status. Use your judgment to decide that it is safe to proceed when a condition is not met. Always check with the system administrator before you proceed with power off as that might disrupt I/O access. The system administrator might prefer to wait for a more suitable time or suspend host applications.
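The four conditions above can be recorded as a simple checklist before you talk to the system administrator. The following is a minimal sketch in Python, not part of the product: the IoGroupState class, its field names, and the unmet_conditions function are hypothetical, and the values are assumed to be filled in by hand from the management GUI, the CLI, and the host multipathing tools.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IoGroupState:
    """Hypothetical snapshot of an I/O group, filled in by the operator."""
    partner_online: bool             # other node is powered on and active
    partner_has_san_paths: bool      # partner reaches all hosts and disk controllers
    all_volumes_online: bool         # every volume handled by the I/O group is online
    host_multipath_to_partner: bool  # host multipathing is online to the partner


def unmet_conditions(state: IoGroupState) -> List[str]:
    """Return a description of each least-disruption condition that is not met."""
    checks = [
        (state.partner_online,
         "partner node is not powered on and active"),
        (state.partner_has_san_paths,
         "partner node lacks SAN connections to all hosts and disk controllers"),
        (state.all_volumes_online,
         "not all volumes in the I/O group are online"),
        (state.host_multipath_to_partner,
         "host multipathing is not online to the partner node"),
    ]
    return [message for ok, message in checks if not ok]


if __name__ == "__main__":
    # Example: a failed Fibre Channel adapter leaves some volumes not online.
    for problem in unmet_conditions(IoGroupState(True, True, False, True)):
        print("Condition not met:", problem)
```

An empty list means that all four conditions hold. A non-empty list does not forbid the power off; it lists the points to raise with the system administrator before you proceed.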

To ensure a smooth restart, a node must save data structures that it cannot re-create to its local, internal disk drive. The amount of data the node saves to local disk might be high, so this operation might take several minutes. Do not attempt to interrupt the controlled power off.

Attention: The following actions do not allow the node to save data to its local disk. Therefore, do not power off a node by using the following methods:
  • Holding down the power push-button on the node (unless it is a SAN Volume Controller 2145-SV1).

    When you press and release the power push-button, the node indicates this action to the software so the node can write its data to local disk before the node powers off.

    When you hold down the power push-button, the hardware interprets this action as an emergency power off indication and shuts down immediately. The hardware does not save the data to a local disk before you power down. The emergency power off occurs approximately 4 seconds after you press and hold down the power push-button.

  • Pressing the reset push-button on the light path diagnostics panel.
Important: If a SAN Volume Controller 2145-DH8 node stays powered off for an extended period, possibly until the next day, the batteries can drain. Follow these steps to prevent the batteries from being discharged too much while the node is connected to power but not powered on.
  1. Pull both batteries out of the node. Keep them out until you're ready to power on the node.
  2. Push the batteries in just before you press the power push-button to power on the node.
If you disconnect the power from a SAN Volume Controller 2145-DH8 node and might not reconnect power to it again within the next 24 hours, follow these steps to prevent the batteries from being discharged too much while the node is not connected to power:
  1. After both power cords are disconnected from the node, pull both batteries out of the node. This step completely turns off the battery backplane.
  2. Push the batteries back in again.

Using the management GUI to power off a system

Use the management GUI to power off a system.

Procedure

To use the management GUI to power off a system, complete the following steps:

  1. Start the management GUI for the system that you are servicing.
  2. Select Monitoring > System.

    If the nodes to power off are shown as Offline, the nodes are not participating in the system. In such circumstances, use the power push-button on the offline nodes to power off the nodes.

    If the nodes to power off are shown as Online, powering off the nodes can result in their dependent volumes also going offline:

    1. Select the node and click Show Dependent Volumes.
    2. Make sure the status of each volume in the I/O group is Online. You might need to view more than one page.

      If any volumes are Degraded, only one node in the I/O group is processing I/O requests for that volume. If that node is powered off, it impacts all the hosts that are submitting I/O requests to the degraded volume.

      If any volumes are degraded and you believe that it might be because the partner node in the I/O group was powered on recently, wait until a refresh of the screen shows all volumes online. All the volumes must be online within 30 minutes of the partner node being powered on.

      Note: After you wait 30 minutes, if you have a degraded volume and all of the associated nodes and MDisks are online, contact support for assistance.

      Ensure that all volumes that are used by hosts are online before you continue.

    3. If possible, check that all hosts that access volumes that are managed by this I/O group are able to fail over to use paths that are provided by the other node in the group.

      Complete this check by using the multipathing device driver software of the host system. Commands to use differ, depending on the multipathing device driver that is being used.

      If you use the System Storage® Multipath Subsystem Device Driver (SDD), the command to query paths is datapath query device.

      It can take some time for the multipathing device drivers to rediscover paths after a node is powered on. If you are unable to check on the host that all paths to both nodes in the I/O group are available, do not power off a node within 30 minutes of its partner node being powered on, or you might lose access to the volume.

    4. If you decide that it is okay to continue with powering off the nodes, select the node to power off and click Shut Down System.
    5. Click OK. If the node that you select is the last remaining node that provides access to a volume, for example a node that contains flash drives with unmirrored volumes, the Shutting Down a Node-Force panel is displayed with a list of volumes that go offline if the node is shut down.
    6. Check that no host applications access the volumes that are going offline. Continue with the shutdown only if the loss of access to these volumes is acceptable. To continue with shutting down the node, click Force Shutdown.
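The handling of degraded volumes in step 2 amounts to a small decision rule. The following Python sketch restates it under stated assumptions: the Action enumeration, the next_action function, and its parameter names are hypothetical, and the INVESTIGATE outcome is an assumption for the case that the procedure does not describe (volumes still degraded while a node or MDisk is offline).

```python
from enum import Enum

WAIT_LIMIT_MINUTES = 30  # limit stated in the procedure


class Action(Enum):
    PROCEED = "all volumes online: continue with the power off"
    WAIT = "wait and refresh the volume list"
    CONTACT_SUPPORT = "contact support for assistance"
    # Assumption: not described in the procedure itself.
    INVESTIGATE = "bring offline nodes or MDisks online first"


def next_action(degraded_volumes: int,
                minutes_waited: float,
                nodes_and_mdisks_online: bool) -> Action:
    """Suggest the next step based on the volume status check."""
    if degraded_volumes == 0:
        return Action.PROCEED
    if minutes_waited < WAIT_LIMIT_MINUTES:
        return Action.WAIT
    if nodes_and_mdisks_online:
        return Action.CONTACT_SUPPORT
    return Action.INVESTIGATE


if __name__ == "__main__":
    # Two volumes still degraded after 10 minutes of waiting.
    print(next_action(2, 10, True).value)
```

The sketch only suggests a next step; the decision to power off still rests with you and the system administrator.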

What to do next

During the shutdown procedure, the node saves its data structures to its local disk and destages all write data that is held in cache to the SAN disks. Such processing can take several minutes.

At the end of this processing, the system powers off.

Using the system CLI to power off a node

Use the command-line interface (CLI) to power off a node.

Procedure

  1. Issue the lsnode CLI command to display a list of nodes in the system and their properties. Find the node to shut down and write down the name of its I/O group. Confirm that the other node in the I/O group is online.
    lsnode -delim : 
    
    id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id
    1:group1node1:10L3ASH:500507680100002C:online:0:io_grp0:yes:202381001C0D18D8 
    2:group1node2:10L3ANF:5005076801000009:online:0:io_grp0:no:202381001C0D1796 
    3:group2node1:10L3ASH:5005076801000001:online:1:io_grp1:no:202381001C0D18D8 
    4:group2node2:10L3ANF:50050768010000F4:online:1:io_grp1:no:202381001C0D1796
    

    If the node to power off is shown as Offline, the node is not participating in the system and is not processing I/O requests. In such circumstances, use the power push-button on the node to power off the node.

    Powering off a node that is Online while its partner node is not online impacts all hosts with I/O requests to volumes that are managed by the I/O group. Ensure that the other node in the I/O group is online before you continue.

  2. Issue the lsdependentvdisks -node <name> CLI command to list the volumes that depend on the status of a specified node.
    lsdependentvdisks -node group1node1 
    
    vdisk_id       vdisk_name
    0              vdisk0
    1              vdisk1

    If the node goes offline or is removed from the system, the dependent volumes also go offline. Before you take a node offline or remove it from the system, you can use the command to ensure that you do not lose access to any volumes.

  3. If you decide that it is okay to continue powering off the node, enter the stopsystem -node <name> CLI command to power off the node. Use the -node parameter to avoid powering off the whole system:
    stopsystem -node group1node1
    Are you sure that you want to continue with the shut down? yes
    
    Note: To shut down a node with dependent volumes, add the -force parameter to the stopsystem command. The force parameter forces continuation of the command even though any node-dependent volumes will be taken offline. Use the force parameter with caution; access to data on node-dependent volumes will be lost.

    The node saves its data structures to its local disk as it shuts down and destages all write data in the cache to the SAN disks. Shutting down can take several minutes.

    At the end of this process, the node powers off.
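The partner check in step 1 can be scripted against saved lsnode -delim : output. The following Python sketch is an illustration only: the function names parse_lsnode, partner_of, and stop_command are hypothetical, the sample text is the example output from step 1 with the header assumed to arrive on a single line, and the script only builds the stopsystem command string without running it.

```python
LSNODE_OUTPUT = """\
id:name:UPS_serial_number:WWNN:status:IO_group_id:IO_group_name:config_node:UPS_unique_id
1:group1node1:10L3ASH:500507680100002C:online:0:io_grp0:yes:202381001C0D18D8
2:group1node2:10L3ANF:5005076801000009:online:0:io_grp0:no:202381001C0D1796
3:group2node1:10L3ASH:5005076801000001:online:1:io_grp1:no:202381001C0D18D8
4:group2node2:10L3ANF:50050768010000F4:online:1:io_grp1:no:202381001C0D1796
"""


def parse_lsnode(text):
    """Turn colon-delimited lsnode output into a list of dicts keyed by header."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    header = lines[0].split(":")
    return [dict(zip(header, line.split(":"))) for line in lines[1:]]


def partner_of(nodes, name):
    """Return the other node in the same I/O group, or None if there is none."""
    target = next((n for n in nodes if n["name"] == name), None)
    if target is None:
        raise ValueError(f"node {name!r} not found in lsnode output")
    for node in nodes:
        if node["IO_group_id"] == target["IO_group_id"] and node["name"] != name:
            return node
    return None


def stop_command(nodes, name):
    """Build the stopsystem command only if the partner node is online."""
    partner = partner_of(nodes, name)
    if partner is None or partner["status"].lower() != "online":
        raise RuntimeError(f"partner of {name} is not online; do not power off")
    return f"stopsystem -node {name}"


if __name__ == "__main__":
    nodes = parse_lsnode(LSNODE_OUTPUT)
    print(stop_command(nodes, "group2node1"))  # prints: stopsystem -node group2node1
```

The sketch does not replace step 2: run lsdependentvdisks for the node and review the dependent volumes before you issue the command.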

Shutting down by using the system power control-button

Do not use the power control-button to power off a node unless an emergency exists or another procedure directs you to do so.

Before you begin

With this method, you cannot check the system status from the front panel, so you cannot tell whether the power off is likely to cause excessive disruption to the system. Instead, use the management GUI or the CLI commands, described in the previous topics, to power off an active node.

About this task

If you must use this method, notice in Figure 1 and Figure 2 that each model type has a power control button (1) on the front.

Figure 1. Power control-button on the SAN Volume Controller 2145-CF8, 2145-CG8, and 2145-DH8 models
Figure 2. Power control button and LED lights on the SAN Volume Controller 2145-SV1 model
  •  1  Power-control button and power-on LED
  •  2  Identify LED
  •  3  Node status LED
  •  4  Node fault LED
  •  5  Battery status LED

When you determine it is safe to do so, press and immediately release the power button. On models other than the 2145-DH8 and 2145-SV1, the front panel display changes to display Powering Off and displays a progress bar.

Note: The 2145-DH8 and 2145-SV1 do not have a front panel display, but status LEDs 2, 3, 4, and 5 in Figure 2 all turn off, and the power-on LED 1 goes from on to flashing.

Results

The node saves its data structures to disk while it is powering off. The power off process can take up to 5 minutes.

When a node is powered off by using the power button (or because of a power failure), the partner node in its I/O group immediately stops using its cache for new write data and destages any write data already in its cache to the SAN-attached disks.

The destaging duration depends on the speed and utilization of the disk controllers. Destaging typically completes in less than 15 minutes, but it might take longer. If data is waiting to be written to a disk that is offline, the destaging cannot complete.

A node that powers off and restarts while its partner node continues to process I/O might not be able to become an active member of the I/O group immediately. The node must wait until the partner node completes destaging the cache.

If the partner node powers off during this period, access to the SAN storage that is managed by this I/O group is lost. If one of the nodes in the I/O group is unable to service any I/O, volumes that are managed by that I/O group have a status of Degraded. For example, while the partner node in the I/O group is still flushing its write cache, the volumes have a status of Degraded.