You can create an enhanced stretched system
configuration where each node on the system is physically on a different site. When used with
mirroring technologies, such as volume mirroring or Copy Services, these configurations can be used
to maintain access to data on the system in the event of power failures or site-wide
outages.
Note: If the objective of your solution design is high availability, it is better to use an IBM®
HyperSwap® topology instead of an enhanced stretched system configuration. However, if your
objectives include disaster recovery, complex Copy Services configurations, or maximum scalability,
consider the restrictions of the current version of HyperSwap. For more information, see
Planning for high availability.
This topic describes the enhanced stretched system configuration, in which the topology
attribute of the system is set to stretched. Older methods of configuring a stretched system are
still supported and are described in previous versions of IBM Knowledge Center. You can move
non-disruptively to the enhanced stretched system configuration by following the configuration
steps that are presented here, which improves availability and disaster recovery. You can also
move non-disruptively from a stretched system configuration to a HyperSwap system configuration
for even better availability, performance, and disaster recovery. Contact the IBM Remote Technical
Support Center for guidance on changing the topology of an existing system.
In a stretched system configuration, each site is defined as an independent
failure domain. If one site experiences a failure, the other site can continue to operate without
disruption. You must also configure a third site to host a quorum device that provides an automatic
tie-break in the event of a link failure between the two main sites. The main sites can be
in the same room, in different rooms in the data center, in buildings on the same campus, or in
buildings in different cities. Different kinds of sites protect against different types of failures.
- Sites are within a single location
- If each site is on a different power phase within a single location or data center, the system
can survive the failure of any single power domain. For example, one node can be placed in one rack
and the other node in another rack, where each rack is a separate site with its own power phase. If
power is lost to one of the racks, the partner node in the other rack can continue to process
requests, maintaining access to data while the other node is offline due to the power disruption.
- Each site is at separate locations
- If each site is a different physical location, the system can survive the failure of any single
location. The sites can be relatively close together, such as two sites in the same city, or
geographically distant, such as two sites in separate cities. If one site experiences a site-wide
disaster, the other site remains available to process requests.
If configured properly, the system continues to operate after the loss of one site. The key
prerequisite is that each site contains only one node from each pair of nodes. Simply placing a pair
of nodes from the same system in different sites does not provide high availability; you must also
configure the appropriate mirroring technology and ensure that all of its configuration
requirements are met.
Notes: - Nodes with internal flash drives are not recommended in SAN Volume Controller 2145-DH8,
2145-CG8, or 2145-CF8 models.
- Stretched systems can be used with N_Port ID Virtualization
(NPIV). In a site loss, the Fibre Channel failover ports on the nodes at the remote site open and
present to the fabric the worldwide port names (WWPNs) of the Fibre Channel host ports of the local
nodes. NPIV enables hosts to log back in to these ports without requiring the multipath driver to
reroute I/O. In this case, additional latency might be introduced by the round-trip data transit
time to the ports that are physically at the remote site.
- Stretched system Fibre Channel configurations with active/passive controllers such as IBM
DS5000™, IBM
DS4000®, and IBM DS3000 systems must be configured
with sufficient connections such that all sites have direct access to both external storage systems.
For iSCSI configurations with two or more active/passive controllers such as
Storwize® family systems, the systems must
be configured with sufficient connections such that all sites have direct access to both external
storage systems. Quorum access for a stretched system is possible only through the current owner
of the MDisk that is being used as the active quorum disk.
You must configure a stretched system to meet the following requirements:
- For Fibre Channel connections, directly connect each node to two or more SAN fabrics at the
primary and secondary sites (2 - 8 fabrics are supported). For iSCSI connections, connect each node
to two or more Ethernet fabrics at the primary and secondary sites. Sites are defined as
independent failure domains. A failure domain is a part of the system within a boundary such that
any failure within that boundary (such as a power failure, fire, or flood) is contained there and
does not propagate to or affect parts outside the boundary. Failure domains can be in the same
room, in different rooms in the data center, in buildings on the same campus, or in buildings in
different towns. Different kinds of failure domains protect against different types of faults.
- Use a third site to house a quorum disk or IP quorum
application.
Quorum disks cannot be located on iSCSI-attached storage systems; therefore, iSCSI storage cannot be
configured on a third site.
- If a storage system is used at the third site, it must support extended quorum disks. For more
information, see the interoperability matrixes at the following website:
www.ibm.com/support
- Place independent storage systems at the primary and secondary sites, and use volume mirroring
to mirror the host data between storage systems at the two sites. Where possible, set the preferred
node of each volume to the node in the same site as the host that the volume is mapped to.
- Connections can vary based on fibre type and small form-factor pluggable (SFP) transceiver (longwave and shortwave).
- Nodes that are in the same I/O group and separated by more than 100 meters (109
yards) must use longwave Fibre Channel or iSCSI connections. A longwave SFP transceiver can be
purchased as an optional component, and must be one of the longwave SFP transceivers listed at the
following website:
www.ibm.com/support
- Avoid using inter-switch links (ISLs) in paths between nodes and external storage systems. If
ISLs are unavoidable, do not oversubscribe them, because Fibre Channel traffic across the ISLs is
substantial. For most configurations, trunking is required. Because ISL problems are difficult to
diagnose, collect and regularly monitor switch-port error statistics to detect failures.
- Using a single switch at the third site can lead to the creation of a single fabric rather than
two independent and redundant fabrics. A single fabric is an unsupported configuration.
- Ethernet port 1 on every node must be connected to the same subnet or subnets. Ethernet port 2
(if used) of every node must be connected to the same subnet (this might be a different subnet from
port 1). The same principle applies to other Ethernet ports.
- A node must be in the same rack as the 2145 UPS or 2145
UPS-1U that supplies its power.
- Some service actions require physical access to all nodes in a system. If nodes in a stretched
system are separated by more than 100 meters, service actions might require multiple service
personnel. Contact your service representative to inquire about multiple site support.
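As an illustration of the volume mirroring requirement in the list above, a mirrored volume can be
created with one copy in a storage pool at each site. The following is a minimal CLI sketch; the
pool, node, and volume names are examples, not defaults:

```shell
# Create a volume with two synchronized copies, one in a storage pool
# at each site; site1pool and site2pool are example pool names.
# The -node parameter sets the preferred node, which should be the node
# at the same site as the host that the volume is mapped to.
mkvdisk -name hostvol01 -iogrp 0 -mdiskgrp site1pool:site2pool \
        -copies 2 -node node1 -size 100 -unit gb
```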
A stretched system locates the active quorum disk or an
IP quorum application at a third site. If communication is lost between the primary and
secondary sites, the site with access to the active quorum disk continues to process transactions.
If communication is lost to the active quorum disk, an alternative quorum disk at another site can
become the active quorum disk.
Although a system of nodes can be configured to use up to three quorum disks,
only one quorum disk can be elected to resolve a situation where the system is partitioned into two
sets of nodes of equal size. The purpose of the other quorum disks is to provide redundancy if a
quorum disk fails before the system is partitioned.
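Where the third site runs an IP quorum application instead of hosting a quorum disk, the
application is generated on the system and then run on a server at the third site. A hedged sketch;
the file name and path follow typical defaults, and the host names are examples:

```shell
# On the system: generate the IP quorum application (written to /dumps).
mkquorumapp

# On the third-site server: copy the application and start it.
scp superuser@system_ip:/dumps/ip_quorum.jar .
java -jar ip_quorum.jar
```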
Figure 1 illustrates an example stretched system configuration. When used with volume mirroring,
this configuration provides a high availability solution that is tolerant of a failure at a single
site. If either the primary or secondary site fails, the remaining sites can continue I/O
operations. In this configuration, the SAN Volume Controller nodes in the system are more than 100
meters apart, so the connections between them must be longwave Fibre Channel connections.
Figure 1. A stretched system with a quorum disk at a third site
In
Figure 1, the storage
system that hosts the third-site quorum disk is attached directly
to a switch at both the primary and secondary sites by using longwave
Fibre Channel connections. If either the primary
site or the secondary site fails, you must ensure that the remaining
site retains direct access to the storage system that hosts the quorum
disks.
Restriction: Do not connect a storage
system in
one site directly to a switch fabric in the other site.
An alternative configuration can use an additional Fibre Channel switch at the third site with
connections from that switch to the primary site and to the secondary
site.
A stretched system configuration is supported
only when the storage system that hosts the quorum disks supports
extended quorum. Although SAN Volume Controller can
use other types of storage systems to provide quorum disks, access
to those quorum disks is always through a single path.
For quorum disk configuration requirements, see the
technote Guidance for Identifying and Changing Managed Disks
Assigned as Quorum Disk Candidates.
When you set up mirrored volumes in a stretched system configuration, consider whether to set the
mirror write priority of each volume to redundancy, which maintains synchronization of the copies
at the cost of temporary delays in write completions. For more details, see the information about
mirrored volumes.
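For example, the mirror write priority of an existing mirrored volume can be changed with the
chvdisk CLI command; the volume name here is an example:

```shell
# Favor keeping both copies synchronized over write latency.
chvdisk -mirrorwritepriority redundancy hostvol01

# The alternative setting completes writes quickly but allows the
# copies to fall out of sync during temporary delays:
# chvdisk -mirrorwritepriority latency hostvol01
```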
Stretched system and Metro
Mirror or Global Mirror
A stretched system is designed to continue operation after the loss of one failure
domain.
The stretched system cannot guarantee that it can operate after the failure of two
failure domains. If the enhanced stretched system function is configured, you can enable a manual
override for this situation. You can also use Metro
Mirror or Global Mirror on a second SAN Volume Controller system for extended disaster
recovery with either an enhanced stretched system or a conventional stretched system. You configure
and manage Metro
Mirror or
Global Mirror partnerships
that include a stretched system in the same way as other remote copy relationships. SAN Volume Controller supports SAN routing technology,
which includes FCIP links, for intersystem connections that use Metro
Mirror or Global Mirror.
The partner SAN Volume Controller stretched
system must not be located at a production site of the SAN Volume Controller stretched
system. However, it can be collocated with the storage system that
provides the active quorum disk for the stretched system.
Configuration steps
These additional configuration steps can be done by using the command-line interface (CLI) or the
management GUI.
- Each SAN Volume Controller node in the system must be assigned to a site. Use the chnode CLI
command.
- Each back-end storage system must be assigned to a site. Use the chcontroller CLI command.
- Each host must be assigned to a site. Use the chhost CLI command.
- After all nodes, hosts, and storage systems are
assigned to a site, the enhanced mode must be enabled by changing
the system topology to stretched.
- For best results, configure an enhanced stretched
system to include at least two I/O groups (four nodes). A system with
just one I/O group cannot guarantee to maintain mirroring of data
or uninterrupted host access in the presence of node failures or system
updates.
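The steps above can be sketched with the CLI as follows; the object names and site IDs are
examples for a two-site system:

```shell
# Assign each node, back-end storage system, and host to a site.
chnode -site 1 node1
chnode -site 2 node2
chcontroller -site 1 controller0
chcontroller -site 2 controller1
chhost -site 1 host0

# After every object has a site assignment, enable the enhanced
# stretched mode by changing the system topology.
chsystem -topology stretched
```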