Stretched system configuration details

You can create an enhanced stretched system configuration in which the nodes of the system are physically located at different sites. When combined with mirroring technologies, such as volume mirroring or Copy Services, these configurations can maintain access to data on the system in the event of power failures or site-wide outages.

Note: If the objective of your solution design is high availability, it is better to use an IBM® HyperSwap® topology instead of an enhanced stretched system configuration. However, if your objectives include disaster recovery, complex Copy Services configurations, or the highest scalability, consider the restrictions of the current version of HyperSwap. For more information, see Planning for high availability.

This topic describes the enhanced stretched system configuration, in which the topology attribute of the system is set to stretched. Older methods of configuring a stretched system are still supported; they are described in previous versions of the IBM Knowledge Center. You can non-disruptively move to the current enhanced stretched system configuration, and gain better availability and disaster recovery, by following the configuration steps that are presented here. You can also non-disruptively move from a stretched system configuration to a HyperSwap system configuration for even better availability, performance, and disaster recovery. Contact the IBM Remote Technical Support Center for guidance on changing the topology of an existing system.

In a stretched system configuration, each site is defined as an independent failure domain. If one site experiences a failure, the other site can continue to operate without disruption. You must also configure a third site to host a quorum device that provides an automatic tie-break in the event of a link failure between the two main sites. The main sites can be in the same room, in different rooms in the data center, in buildings on the same campus, or in buildings in different cities. Different kinds of sites protect against different types of failures.
Sites within a single location
If each site is a different power phase within a single location or data center, the system can survive the failure of any single power domain. For example, one node can be placed in one rack installation and the other node in another rack. Each rack is considered a separate site with its own power phase. In this case, if power is lost to one of the racks, the partner node in the other rack can continue to process requests and effectively provide access to data even while the other node is offline due to the power disruption.
Sites at separate locations
If each site is at a different physical location, the system can survive the failure of any single location. The sites can be relatively close, such as two sites in the same city, or geographically dispersed, such as sites in separate cities. If one site experiences a site-wide disaster, the remaining site remains available to process requests.
If configured properly, the system continues to operate after the loss of one site. The key prerequisite is that each site contains only one node from each pair of nodes. Simply placing a pair of nodes from the same system in different sites does not, by itself, provide high availability. You must also configure the appropriate mirroring technology and ensure that all configuration requirements for that technology are met.
Notes:
  • Stretched systems can be used with N_Port ID Virtualization (NPIV). In the event of a site loss, the Fibre Channel failover ports on the nodes at the remote site open and present to the fabric the worldwide port names (WWPNs) of the Fibre Channel host ports of the local nodes. NPIV enables hosts to log back in to these ports without rerouting by the multipath driver. In this case, additional latency might be introduced by the round-trip data transit time to the ports that are physically at the remote site.
  • Stretched system Fibre Channel configurations with active/passive controllers, such as IBM DS5000™, IBM DS4000®, and IBM DS3000 systems, must be configured with sufficient connections so that all sites have direct access to both external storage systems. The same applies to iSCSI configurations with two or more active/passive controllers, such as Storwize® family systems. Quorum access for a stretched system is possible only through the current owner of the MDisk that is being used as the active quorum disk.
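
For example, the NPIV target port mode can be checked and enabled for each I/O group from the CLI. This is a minimal sketch, assuming a code level that supports the fctargetportmode setting; the I/O group ID is a placeholder:

    # Display the detailed view of I/O group 0, which includes the NPIV target port mode
    lsiogrp 0

    # Enable NPIV in two stages: transitional first, then enabled
    chiogrp -fctargetportmode transitional 0
    chiogrp -fctargetportmode enabled 0

    # List the physical and virtualized (NPIV) Fibre Channel target ports
    lstargetportfc
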
You must configure a stretched system to meet the following requirements:
  • For Fibre Channel connections, directly connect each node to two or more SAN fabrics at the primary and secondary sites (2 - 8 fabrics are supported). For iSCSI connections, connect each node to two or more Ethernet fabrics at the primary and secondary sites. Sites are defined as independent failure domains. A failure domain is a part of the system within a boundary such that any failure within that boundary (such as a power failure, fire, or flood) is contained within the boundary and does not propagate to or affect parts outside of that boundary. Failure domains can be in the same room, in different rooms in the data center, in buildings on the same campus, or in buildings in different towns. Different kinds of failure domains protect against different types of faults.
  • Use a third site to house a quorum disk or IP quorum application. Quorum disks cannot be located on iSCSI-attached storage systems; therefore, iSCSI storage cannot be configured on a third site.
  • If a storage system is used at the third site, it must support extended quorum disks. More information is available in the interoperability matrixes that are available at the following website:
    www.ibm.com/support
  • Place independent storage systems at the primary and secondary sites, and use volume mirroring to mirror the host data between the storage systems at the two sites (see the example after this list). Where possible, set the preferred node of each volume to a node at the same site as the host to which the volume is mapped.
  • Connections can vary based on fibre type and small form-factor pluggable (SFP) transceiver (longwave and shortwave).
  • Nodes that are in the same I/O group and separated by more than 100 meters (109 yards) must use longwave Fibre Channel or iSCSI connections. A longwave SFP transceiver can be purchased as an optional component, and must be one of the longwave SFP transceivers listed at the following website:
    www.ibm.com/support
  • Avoid using inter-switch links (ISLs) in paths between nodes and external storage systems. If ISLs are unavoidable, do not oversubscribe them, because Fibre Channel traffic across the ISLs can be substantial. For most configurations, trunking is required. Because ISL problems are difficult to diagnose, collect switch-port error statistics and monitor them regularly to detect failures.
  • Using a single switch at the third site can lead to the creation of a single fabric rather than two independent and redundant fabrics. A single fabric is an unsupported configuration.
  • Ethernet port 1 on every node must be connected to the same subnet or subnets. Ethernet port 2 (if used) of every node must be connected to the same subnet (this might be a different subnet from port 1). The same principle applies to other Ethernet ports.
  • Some service actions require physical access to all nodes in a system. If nodes in a stretched system are separated by more than 100 meters, service actions might require multiple service personnel. Contact your service representative to inquire about multiple site support.
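
For example, volume mirroring between the two sites can be configured from the CLI. This is a minimal sketch; the pool, volume, and node names are placeholders, and the movevdisk command is assumed to be available on your code level for changing the preferred node:

    # Create a volume with one copy in a storage pool at each site
    mkvdisk -iogrp 0 -mdiskgrp pool_site1:pool_site2 -copies 2 -size 100 -unit gb -name vol_app1

    # Set the preferred node to the node at the same site as the host that uses the volume
    movevdisk -node node_site1 vol_app1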

A stretched system locates the active quorum disk or an IP quorum application at a third site. If communication is lost between the primary and secondary sites, the site with access to the active quorum disk continues to process transactions. If communication is lost to the active quorum disk, an alternative quorum disk at another site can become the active quorum disk.

Although a system of nodes can be configured to use up to three quorum disks, only one quorum disk can be elected to resolve a situation where the system is partitioned into two sets of nodes of equal size. The purpose of the other quorum disks is to provide redundancy if a quorum disk fails before the system is partitioned.
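
For example, the quorum assignment can be reviewed and adjusted from the CLI. This is a minimal sketch; the quorum index is a placeholder, and mkquorumapp is assumed to be available on code levels that support IP quorum:

    # List the quorum disk candidates and any IP quorum applications; the active device is flagged
    lsquorum

    # Make quorum index 2 (assumed to be assigned to the MDisk at the third site) the active quorum device
    chquorum -active 2

    # Alternatively, generate the IP quorum application for deployment on a host at the third site
    mkquorumapp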

Figure 1 illustrates an example stretched system configuration. When used with volume mirroring, this configuration provides a high availability solution that is tolerant of a failure at a single site. If either the primary or the secondary site fails, the remaining site can continue to process I/O operations. In this configuration, the IBM Spectrum Virtualize™ nodes in the system are more than 100 meters apart; therefore, the connections between them must be longwave Fibre Channel connections.
Figure 1. A stretched system with a quorum disk at a third site
In Figure 1, the storage system that hosts the third-site quorum disk is attached directly to a switch at both the primary and secondary sites by using longwave Fibre Channel connections. If either the primary site or the secondary site fails, you must ensure that the remaining site retains direct access to the storage system that hosts the quorum disks.
Restriction: Do not connect a storage system in one site directly to a switch fabric in the other site.

An alternative configuration can use an additional Fibre Channel switch at the third site with connections from that switch to the primary site and to the secondary site.

A stretched system configuration is supported only when the storage system that hosts the quorum disks supports extended quorum. Although IBM Spectrum Virtualize can use other types of storage systems to provide quorum disks, access to those quorum disks is always through a single path.

For quorum disk configuration requirements, see the technote Guidance for Identifying and Changing Managed Disks Assigned as Quorum Disk Candidates.

When you set up mirrored volumes in a stretched system configuration, consider whether to set the mirror write priority to redundancy, which maintains synchronization of the copies through temporary delays in write completions. For more details, see the information about mirrored volumes.
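
For example, the setting can be applied to an existing mirrored volume from the CLI; the volume name is a placeholder:

    # Prioritize keeping both copies synchronized over write response time
    # (the alternative setting, latency, favors response time over synchronization)
    chvdisk -mirrorwritepriority redundancy vol_app1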

Stretched system and Metro Mirror or Global Mirror

A stretched system is designed to continue operation after the loss of one failure domain.

The stretched system cannot guarantee that it can operate after the failure of two failure domains. If the enhanced stretched system function is configured, you can enable a manual override for this situation. You can also use Metro Mirror or Global Mirror on a second IBM Spectrum Virtualize system for extended disaster recovery with either an enhanced stretched system or a conventional stretched system. You configure and manage Metro Mirror or Global Mirror partnerships that include a stretched system in the same way as other remote copy relationships. IBM Spectrum Virtualize supports SAN routing technology, which includes FCIP links, for intersystem connections that use Metro Mirror or Global Mirror.

The partner IBM Spectrum Virtualize system must not be located at either production site of the stretched system. However, it can be colocated with the storage system that provides the active quorum disk for the stretched system.
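
For example, a partnership and a relationship might be created as follows. This is a minimal sketch, assuming Fibre Channel connectivity between the systems; the system, volume, and bandwidth values are placeholders:

    # Create a Fibre Channel partnership with the recovery system
    # (run the equivalent command on both systems to fully establish the partnership)
    mkfcpartnership -linkbandwidthmbits 4000 -backgroundcopyrate 50 system_dr

    # Create a Metro Mirror relationship; add the -global flag for Global Mirror
    mkrcrelationship -master vol_app1 -aux vol_app1_dr -cluster system_dr -name rel_app1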

Configuration steps

These additional configuration steps can be done by using the command-line interface (CLI) or the management GUI.
  • Assign each IBM Spectrum Virtualize node in the system to a site by using the chnode CLI command.
  • Assign each back-end storage system to a site by using the chcontroller CLI command.
  • Assign each host to a site by using the chhost CLI command.
  • After all nodes, hosts, and storage systems are assigned to a site, enable the enhanced mode by changing the system topology to stretched, as shown in the example after this list.
  • For best results, configure an enhanced stretched system to include at least two I/O groups (four nodes). A system with only one I/O group cannot guarantee to maintain mirroring of data or uninterrupted host access in the presence of node failures or during system updates.
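
The following is a minimal sketch of this sequence on the CLI. The node, controller, and host names are placeholders, and sites are referenced by their default IDs (site 3 is reserved for the quorum site):

    # Assign each node to the site where it is located
    chnode -site 1 node1
    chnode -site 2 node2

    # Assign each back-end storage system to its site
    chcontroller -site 1 controller0
    chcontroller -site 2 controller1

    # Assign each host to its site
    chhost -site 1 host_siteA
    chhost -site 2 host_siteB

    # Enable the enhanced stretched system by changing the topology
    chsystem -topology stretched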
