Requirements for stretched systems with Fibre Channel connections

If you are configuring a stretched system that uses Fibre Channel connections, ensure that all SAN and Fibre Channel-specific requirements are met.

Notes:
  • Stretched systems are not recommended for models that support nodes with internal flash drives.
  • Stretched system Fibre Channel configurations that use active/passive controllers, such as IBM® DS5000, IBM DS4000®, and IBM DS3000 systems, must be configured with sufficient connections so that all sites have direct access to both external storage systems. The same applies to iSCSI configurations with two or more active/passive controllers, such as Storwize® family systems. Quorum access for a stretched system is possible only through the current owner of the MDisk that is being used as the active quorum disk.
Use the following requirements to configure a stretched system with Fibre Channel connections:
  • Directly connect each node to two or more SAN fabrics at the primary and secondary sites (2 - 8 fabrics are supported). For iSCSI connections, connect each node to two or more Ethernet fabrics at the primary and secondary sites. Sites are defined as independent failure domains. A failure domain is a part of the system within a boundary such that any failure within that boundary (such as a power failure, fire, or flood) is contained within the boundary and does not propagate to or affect parts outside of that boundary. Failure domains can be in the same room, across rooms in the same data center, in buildings on the same campus, or in buildings in different towns. Different kinds of failure domains protect against different types of faults.
  • If a storage system is used at the third site, it must support extended quorum disks. For more information, see the interoperability matrixes at the following website:
    www.ibm.com/support
  • Place independent storage systems at the primary and secondary sites, and use volume mirroring to mirror the host data between storage systems at the two sites. Where possible, set the preferred node of each volume to the node in the same site as the host that the volume is mapped to.
  • Connections can vary based on fibre type and small form-factor pluggable (SFP) transceiver (longwave and shortwave).
  • Nodes that are in the same I/O group and separated by more than 100 meters (109 yards) must use longwave Fibre Channel or Ethernet connections. A longwave small form-factor pluggable (SFP) transceiver can be purchased as an optional component, and must be one of the longwave SFP transceivers listed at the following website:
    www.ibm.com/support
  • Avoid using inter-switch links (ISLs) in paths between nodes and external storage systems. If ISLs are unavoidable, do not oversubscribe them, because Fibre Channel traffic across the ISLs is substantial. For most configurations, trunking is required. Because ISL problems are difficult to diagnose, collect switch-port error statistics and monitor them regularly to detect failures.
  • Using a single switch at the third site can lead to the creation of a single fabric rather than two independent and redundant fabrics. A single fabric is an unsupported configuration.
  • Ethernet port 1 on every node must be connected to the same subnet or subnets. Ethernet port 2 (if used) of every node must be connected to the same subnet (this might be a different subnet from port 1). The same principle applies to other Ethernet ports.
  • Some service actions require physical access to all nodes in a system. If nodes in a stretched system are separated by more than 100 meters, service actions might require multiple service personnel. Contact your service representative to inquire about multiple site support.
  • Use a third, dedicated site to house a quorum disk or an IP quorum application. For a Fibre Channel-connected stretched system, quorum disks or IP quorum applications provide redundancy if communication is lost between the primary and secondary sites. In addition, both contain configuration metadata that is used to recover the system, if necessary. IP quorum applications are used when the stretched system connects to iSCSI-attached storage systems; iSCSI storage systems cannot be configured at a third site.
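Applying the requirements above typically involves assigning each node and external storage system to a site and enabling the stretched topology. The following is a minimal sketch, assuming a Spectrum Virtualize-style CLI; the object names, pool names, and IDs (node1, site1pool, hostvol01, and so on) are illustrative, not taken from this document:

```shell
# Assign each node to its failure domain (site 1 = primary, site 2 = secondary)
chnode -site 1 node1
chnode -site 2 node2

# Assign the external storage systems to their sites, including the
# third-site quorum storage system (site 3)
chcontroller -site 1 controller0
chcontroller -site 2 controller1
chcontroller -site 3 controller2

# Enable the stretched topology
chsystem -topology stretched

# Create a mirrored volume with one copy in each site's pool; set the
# preferred node to the node at the same site as the mapped host
mkvdisk -iogrp io_grp0 -mdiskgrp site1pool:site2pool -copies 2 \
        -size 100 -unit gb -node node1 -name hostvol01
```

The `-node` parameter implements the preferred-node guidance above: host I/O for the volume is then serviced by the node at the same site as the host, which avoids unnecessary inter-site traffic.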

A stretched system locates the active quorum disk or an IP quorum application at a third site. If communication is lost between the primary and secondary sites, the site with access to the active quorum disk continues to process transactions. If communication is lost to the active quorum disk, an alternative quorum disk at another site can become the active quorum disk.

Although a system of nodes can be configured to use up to three quorum disks, only one quorum disk can be elected to resolve a situation where the system is partitioned into two sets of nodes of equal size. The purpose of the other quorum disks is to provide redundancy if a quorum disk fails before the system is partitioned.
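The quorum arrangement described above can be inspected and adjusted from the command line. This is a hedged sketch, again assuming a Spectrum Virtualize-style CLI; the MDisk ID and quorum index used here are illustrative:

```shell
# List the quorum disk candidates; the "active" column shows which one
# would resolve a tie if the system is partitioned into two equal halves
lsquorum

# Force one quorum slot onto the MDisk at the third site (here MDisk 5,
# quorum index 0) so that the active quorum disk survives the loss of
# either the primary or the secondary site
chquorum -override yes -mdisk 5 0
```

After a change, rerun `lsquorum` to confirm that the third-site MDisk is the active quorum disk.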

This figure illustrates an example stretched system configuration. When used with volume mirroring, this configuration provides a high availability solution that tolerates a failure at a single site. If either the primary or secondary site fails, the remaining sites can continue I/O operations. In this configuration, the nodes in the system are separated by more than 100 meters, so the connections between them must be longwave Fibre Channel connections.
Figure 1. A stretched system with a quorum disk at a third site
In Figure 1, the storage system that hosts the third-site quorum disk is attached directly to a switch at both the primary and secondary sites by using longwave Fibre Channel connections. If either the primary site or the secondary site fails, you must ensure that the remaining site retains direct access to the storage system that hosts the quorum disks.
Restriction: Do not connect a storage system in one site directly to a switch fabric in the other site.

An alternative configuration can use an additional Fibre Channel switch at the third site with connections from that switch to the primary site and to the secondary site.

A stretched system configuration is supported only when the storage system that hosts the quorum disks supports extended quorum. Although other types of storage systems can be used to provide quorum disks, access to these quorum disks is always through a single path.

For quorum disk configuration requirements, see the technote Guidance for Identifying and Changing Managed Disks Assigned as Quorum Disk Candidates.