Requirements for stretched systems with RDMA-capable Ethernet ports

If you are configuring a stretched system that uses remote direct memory access (RDMA)-capable Ethernet ports, ensure that all RDMA-specific requirements are met.

Use the following requirements to configure a stretched system that uses RDMA-capable Ethernet ports:

  • Directly connect each node to two or more RDMA-capable Ethernet fabrics at the primary and secondary sites (two to four fabrics are supported). Sites are defined as independent failure domains. A failure domain is a part of the system within a boundary such that any failure within that boundary (such as a power failure, fire, or flood) is contained and does not propagate to or affect parts outside of the boundary. Failure domains can be in the same room, across rooms in the data center, in buildings on the same campus, or in buildings in different towns. Different kinds of failure domains protect against different types of faults.
  • RDMA-capable Ethernet ports can be used for both node-to-node communications and host attachment; however, do not share RDMA-capable Ethernet ports for hosts and node-to-node communications. RDMA-capable ports are not supported for connections to external storage. A variety of other protocols are also supported for host attachment and virtualization of external storage.
  • If a storage system is used at the third site, it must support extended quorum disks. For more information, see the interoperability matrices that are available at the following website:

    www.ibm.com/support

  • Place independent storage systems at the primary and secondary sites and use volume mirroring to mirror the host data between storage systems at the two sites. Where possible, set the preferred node of each volume to the node in the same site as the host that the volume is mapped to.
  • Avoid using interswitch links (ISLs) in paths between nodes and external storage systems. If this configuration is unavoidable, do not oversubscribe the ISLs, because substantial RDMA traffic crosses them. For most configurations, trunking is required. Because ISL problems are difficult to diagnose, collect switch-port error statistics and monitor them regularly to detect failures.
  • Using a single switch at the third site can lead to the creation of a single fabric rather than two independent and redundant fabrics. A single fabric is an unsupported configuration.
  • Ethernet port 1 on every node must be connected to the same subnet or set of subnets. Ethernet port 2 (if used) on every node must be connected to the same subnet, which can be, and preferably is, a different subnet from port 1. The same principle applies to any other Ethernet ports.
  • Some service actions require physical access to all nodes in a system. If nodes in a stretched system are separated by more than 100 meters, service actions might require multiple service personnel. Contact your service representative to inquire about multiple site support.
  • Use a third, dedicated site to house a quorum disk or an IP quorum application. For a stretched system, quorum disks or IP quorum applications provide redundancy if communication is lost between the primary and secondary sites. In addition, both contain configuration metadata that is used to recover the system, if necessary. IP quorum applications are used when the stretched system connects to iSCSI-attached storage systems; iSCSI storage systems cannot be configured at the third site.

    A stretched system locates the active quorum disk or an IP quorum application at a third site. If communication is lost between the primary and secondary sites, the site with access to the active quorum disk continues to process transactions. If communication is lost to the active quorum disk, an alternative quorum disk at another site can become the active quorum disk.

    Although a system of nodes can be configured to use up to three quorum disks, only one quorum disk can be elected to resolve a situation where the system is partitioned into two sets of nodes of equal size. The purpose of the other quorum disks is to provide redundancy if a quorum disk fails before the system is partitioned.
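The subnet requirement for the node Ethernet ports described above can be sanity-checked before deployment. The following sketch verifies that the port-1 address of every node lies in one common subnet; the node names and addresses are hypothetical examples, not values from any real system.

```python
import ipaddress

# Hypothetical port-1 addresses gathered from each node (examples only;
# substitute the addresses that are actually configured on your system).
port1_addresses = {
    "node1": "192.168.10.11/24",
    "node2": "192.168.10.12/24",
    "node3": "192.168.10.13/24",
    "node4": "192.168.10.14/24",
}

def shared_subnet(addresses):
    """Return the common subnet if every interface lies in the same
    network, otherwise None."""
    networks = {ipaddress.ip_interface(a).network for a in addresses.values()}
    return networks.pop() if len(networks) == 1 else None

print(shared_subnet(port1_addresses))  # prints 192.168.10.0/24
```

The same check, run once per port number, also confirms that port 2 of every node shares one subnet that differs from the port-1 subnet, as recommended above.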
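The tie-break behavior described above can be summarized in a few lines. This is a minimal sketch of the decision rule only, not the product's actual quorum protocol: when the sites partition into two equal halves, the half that can still reach the single active quorum device continues, and if neither half can, the system stops I/O rather than risk a split brain.

```python
def surviving_site(site_a_reaches_quorum: bool,
                   site_b_reaches_quorum: bool):
    """Decide which equal-sized partition continues after the link
    between sites fails, based on access to the active quorum device."""
    if site_a_reaches_quorum and not site_b_reaches_quorum:
        return "site A"
    if site_b_reaches_quorum and not site_a_reaches_quorum:
        return "site B"
    # If neither site can reach the quorum device (or, transiently, both
    # race for it), no winner is declared here: a real system resolves
    # the race atomically at the quorum device and halts I/O otherwise.
    return None
```

Only one quorum device is active at a time precisely so that this decision has a single arbiter; the extra quorum disks exist for redundancy before the partition occurs, not as additional voters during it.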

An example stretched system that uses RDMA-capable Ethernet ports to connect nodes is illustrated in Figure 1.

Figure 1. Stretched system configuration over RDMA-capable Ethernet ports
An illustration of a stretched system configuration that uses RDMA-capable Ethernet ports.

In this configuration, if either the primary site or the secondary site fails, you must ensure that the remaining site retains direct access to the storage system that hosts the quorum disks.

This configuration has the following restrictions:

  • Do not connect a storage system in one site directly to a switch fabric in the other site.
  • An alternative configuration can use an extra RDMA-capable switch at the third site with connections from that switch to the primary site and to the secondary site.
  • A stretched system configuration is supported only when the storage system that hosts the quorum disks supports extended quorum. Although other types of storage systems can be used to provide quorum disks, access to these quorum disks is always through a single path.