Compressed volumes

When you create volumes, you can specify compression as a method to save capacity for the volume. With compressed volumes, data is compressed as it is written to disk, so the same data consumes less physical capacity.

Like thin-provisioned volumes, compressed volumes have virtual, real, and used capacities. Use the following guidelines before you work with compressed volumes.

  • Real capacity is the extent space that is allocated from the pool. The real capacity is set when the volume is created and, as with thin-provisioned volumes, can be expanded or shrunk down to the used capacity.
  • Virtual capacity is available to hosts. The virtual capacity is set when the volume is created and can be expanded or shrunk afterward. The virtual capacity of a HyperSwap® volume cannot be changed. To expand or shrink the volume, you must convert the HyperSwap volume to a basic volume by removing a copy from the volume. The volume can then be resized and a new copy can be added again to convert the volume back into a HyperSwap volume.
  • Used capacity is the amount of real capacity that is used to store customer data and metadata after compression.
  • Capacity before compression is the amount of customer data that was written to the volume and then compressed. The capacity before compression does not include regions where zero data is written to unallocated space.
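
The relationships between these capacities can be sketched as a small model. This is an illustrative sketch only: the class name, field names, and the choice of GiB as the unit are assumptions and do not correspond to any system interface.

```python
from dataclasses import dataclass


@dataclass
class CompressedVolume:
    """Hypothetical model of the capacities described above (values in GiB)."""

    virtual_gib: float             # capacity that is available to hosts
    real_gib: float                # extent space allocated from the pool
    used_gib: float                # real capacity holding data and metadata after compression
    before_compression_gib: float  # customer data written, measured before compression

    def validate(self) -> None:
        # Used capacity is the part of real capacity that is in use,
        # so it cannot be larger than the real capacity.
        if self.used_gib > self.real_gib:
            raise ValueError("used capacity cannot exceed real capacity")

    def shrink_real(self, new_real_gib: float) -> None:
        # Real capacity can be shrunk only down to the used capacity.
        if new_real_gib < self.used_gib:
            raise ValueError("real capacity cannot shrink below used capacity")
        self.real_gib = new_real_gib
```

For example, a volume with 40 GiB of real capacity and 25 GiB of used capacity can have its real capacity shrunk to 25 GiB in this model, but not to 10 GiB.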

You can also monitor information on compression usage to determine the savings to your storage capacity when volumes are compressed. To monitor system-wide compression savings and capacity, select Monitoring > System. You can compare the amount of capacity that is used before compression is applied to the capacity that is used for all compressed volumes. In addition, you can view the total percentage of capacity savings when compression is used on the system. You can also monitor compression savings across individual pools and volumes. For volumes, you can use these compression values to determine the volumes that achieved the highest compression savings.
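
As a rough illustration of how a savings percentage relates the capacity before compression to the used capacity, the following sketch computes the percentage per volume and ranks volumes by it. The function names and sample values are hypothetical, and the calculation is an assumption about how such a percentage can be derived; the management GUI might calculate its figures differently.

```python
def savings_percent(before_gib: float, used_gib: float) -> float:
    """Percentage of capacity saved, relative to the capacity before compression."""
    if before_gib <= 0:
        return 0.0  # nothing written yet, so there is nothing to save
    return 100.0 * (before_gib - used_gib) / before_gib


def rank_by_savings(volumes: dict) -> list:
    """Sort volumes by savings percentage, highest first.

    `volumes` maps a volume name to a (before_gib, used_gib) tuple.
    """
    ranked = [(name, savings_percent(before, used))
              for name, (before, used) in volumes.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical sample values, for illustration only.
sample = {"db_vol": (100, 20), "video_vol": (100, 95), "mail_vol": (200, 80)}
ranking = rank_by_savings(sample)
```

In this sample, db_vol ranks first and video_vol ranks last, which matches the expectation that already-compressed data such as video yields little savings.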

Benefits of compression

Using compression reduces the amount of physical storage that is required across your environment. You can reuse free disk space in the existing storage without archiving or deleting data.

Compressing data as it is written to the volume also reduces the environmental requirements per unit of storage. After compression is applied to stored data, the required power and cooling per unit of logical storage is reduced because more logical data is stored on the same amount of physical storage. Within a particular storage system, more data can be stored, which reduces overall rack unit requirements.

Compression can be implemented without impacting the existing environment and can be used with other storage processes, such as mirrored volumes and Copy Services functions.

Compressed volumes provide the same level of availability as regular volumes. Compression can be implemented in an existing environment without an impact to service, and existing data can be compressed transparently while it is being accessed by users and applications.

When you use compression, monitor overall performance and CPU utilization to ensure that other system functions have adequate bandwidth. If compression is used excessively, overall bandwidth for the system might be impacted. To view performance statistics that are related to compression, select Monitoring > Performance and then select Compression % on the CPU Utilization graph.

Common uses for compressed volumes

Compression can be used to consolidate storage in both block storage and file system environments. Compressing data reduces the amount of capacity that is needed for volumes and directories. Compression can also be used to minimize storage utilization of logged data. Many applications, such as those that record lab test results, require constant recording of application or user status. Logs are typically represented as text files or binary files that contain a high repetition of the same data patterns.

By using volume mirroring, you can convert an existing fully allocated volume to a compressed volume without disrupting access to the original volume content. The management GUI contains specific directions on converting a generic volume to a compressed volume.

Planning for compressed volumes

Before you implement compressed volumes on your system, assess the current types of data and volumes that are used on your system. Do not compress data that is already compressed as part of its normal workload. Data such as video, compressed file formats (.zip files), or compressed user productivity file formats (.pdf files), is compressed as it is saved. It is not effective to spend system resources for compression on these types of files since little extra savings can be achieved. Encrypted data also cannot be compressed.

There are two types of volumes to consider: homogeneous and heterogeneous. Homogeneous volumes are typically better candidates for compression. Homogeneous volumes contain data that was created by a single application, and these volumes store the same kind of data. Examples of homogeneous volumes include volumes for database applications, email, and server virtualization data. Heterogeneous volumes contain data that was created by several different applications and contain different types of data. Because different data types populate such volumes, compressed or encrypted data is sometimes stored on them. In such cases, system resources can be spent on data that cannot be compressed. Avoid compressing heterogeneous volumes, unless the heterogeneous volumes contain only compressible, unencrypted data.
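
The screening guidance above can be sketched as a simple check. This is a minimal illustration, not a system feature: the function names are hypothetical, the .zip and .pdf suffixes come from the examples above, and the video suffixes are assumptions added for illustration.

```python
from pathlib import PurePosixPath

# .zip and .pdf are named in the guidance above; the video suffixes are assumed examples.
ALREADY_COMPRESSED_SUFFIXES = {".zip", ".pdf", ".mp4", ".mkv"}


def is_poor_candidate(filename: str, encrypted: bool = False) -> bool:
    """Return True if a file is unlikely to benefit from compression."""
    suffix = PurePosixPath(filename).suffix.lower()
    return encrypted or suffix in ALREADY_COMPRESSED_SUFFIXES


def volume_is_candidate(files) -> bool:
    """Return True only if every file is compressible and unencrypted.

    `files` is an iterable of (filename, encrypted) pairs that describe
    the data on a volume.
    """
    return all(not is_poor_candidate(name, enc) for name, enc in files)
```

A heterogeneous volume that holds even one encrypted or already-compressed file fails this check, which mirrors the recommendation to compress such volumes only when all of their data is compressible and unencrypted.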

If your system currently does not use compression, the system automatically analyzes your configuration to determine the potential storage savings if compression is used. The management GUI incorporates the Comprestimator utility, which uses mathematical and statistical algorithms to estimate potential compression savings for the system. The analysis for potential savings can be used to determine whether purchasing a compression license for the system is necessary to reduce the cost of extra storage devices. To estimate compression savings in the management GUI, select Volumes > Actions > Space Savings > Estimate Compression Savings. From the command-line interface (CLI), you can run the analyzevdisk command on a single volume, or use the analyzevdiskbysystem command to analyze all of the volumes that are on the system. Any volumes that are created after the compression analysis completes can be evaluated individually for compression savings. Ensure that the volumes to be analyzed contain as much active data as possible; avoid analyzing volumes that are mostly empty. Analyzing active data increases accuracy and reduces the risk of analyzing old data that is already deleted but can still have traces on the device.

After the analysis completes, you can download a savings report that shows estimated savings for all the volumes with enough data to be analyzed. This report lists all currently configured volumes on the system and their potential compression savings. To download a report, select Volumes > Volumes > Actions > Space Savings > Download Savings Report. You can also display the results by using the lsvdiskanalysis command. You can display results for all the volumes, or for a single volume by specifying a volume name or identifier.
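
The command-line workflow can be sketched as a helper that assembles the command lines to run. The helper name is hypothetical, and the sketch assumes that the volume name or identifier is passed as a positional argument to analyzevdisk and lsvdiskanalysis; check the CLI reference for the exact syntax on your code level. The helper only builds strings and does not connect to a system.

```python
import shlex
from typing import List, Optional, Union


def analysis_commands(volume: Optional[Union[str, int]] = None) -> List[str]:
    """Build the CLI lines to start a compression analysis and list its results.

    With no argument, every volume on the system is analyzed. Otherwise,
    `volume` is a volume name or identifier (assumed to be positional).
    """
    if volume is None:
        return ["analyzevdiskbysystem", "lsvdiskanalysis"]
    target = shlex.quote(str(volume))  # guard against shell metacharacters
    return [f"analyzevdisk {target}", f"lsvdiskanalysis {target}"]
```

The analysis must complete before lsvdiskanalysis shows results, so the two commands are typically run some time apart rather than back to back.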

Various configuration items affect the performance of compression on the system. To attain high compression ratios and performance on your system, ensure that the following guidelines are met.
  • If you have only a small number (10 - 20) of compressed volumes, configure them on one I/O group and do not split compressed volumes between different I/O groups.
  • For larger numbers of compressed volumes on systems with more than one I/O group, distribute compressed volumes across I/O groups to ensure that access to these volumes is evenly distributed among the I/O groups.
  • Identify and use compressible data only. Different data types have different compression ratios, and it is important to determine the compressible data currently on your system. You can use tools that estimate the compressible data or use commonly known ratios for common applications and data types. Storing these data types on compressed volumes saves disk capacity and improves the benefit of using compression on your system. The following table shows the compression ratio for common applications and data types.
    Table 1. Compression ratio for data types. Table 1 describes the compression ratio of common data types and applications that provide high compression ratios.
    Data Types/Applications             Compression Ratios
    Databases                           Up to 80%
    Server or Desktop Virtualization    Up to 75%
    Engineering Data                    Up to 70%
    Email                               Up to 80%
  • Ensure that you have an extra 10% of capacity in the pools that are used for compressed volumes for the additional metadata and to provide an error margin in the compression ratio.
  • Use compression on homogeneous volumes.
  • Avoid using any client-based, file system-based, or application-based compression together with the system compression.
  • Do not compress encrypted data.
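
Two of the guidelines above lend themselves to a quick calculation. The following sketch is illustrative only: the function names are hypothetical, the 10% headroom is assumed to apply to the estimated compressed size, the threshold of 20 volumes is taken from the upper end of the "10 - 20" range, and round-robin placement is just one assumed way to spread volumes evenly.

```python
def required_pool_capacity(logical_tb: float,
                           expected_savings_percent: float,
                           headroom_percent: float = 10.0) -> float:
    """Estimate the pool capacity that is needed for compressed data.

    Assumes the extra 10% is applied on top of the estimated compressed size.
    """
    compressed_tb = logical_tb * (1 - expected_savings_percent / 100)
    return compressed_tb * (1 + headroom_percent / 100)


def assign_io_groups(volume_names: list, io_groups: list,
                     small_limit: int = 20) -> dict:
    """Map each compressed volume to an I/O group.

    A small number of volumes stays on one I/O group; a larger number is
    spread round-robin (an assumed placement policy) across all I/O groups.
    """
    if len(volume_names) <= small_limit or len(io_groups) == 1:
        return {name: io_groups[0] for name in volume_names}
    return {name: io_groups[i % len(io_groups)]
            for i, name in enumerate(volume_names)}
```

For example, 100 TB of data with an expected 50% savings needs about 55 TB of pool capacity under these assumptions. Because the ratios in Table 1 are upper bounds, a measured estimate for your own data is a safer input than the table values.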

To use compressed volumes without affecting performance of existing non-compressed volumes in a pre-existing system, ensure that you understand the way that resources are reallocated when the first compressed volume is created.

Compression requires dedicated hardware resources within the nodes. These resources are assigned when compression is enabled and de-assigned when compression is disabled. Compression is enabled when the first compressed volume in an I/O group is created and is disabled when the last compressed volume is removed from the I/O group.

As a result of the reduced hardware resources available to process non-compressed host-to-disk I/O, you should not create compressed volumes if the CPU utilization of nodes in an I/O group is consistently above certain values. Performance might be degraded for existing non-compressed volumes in the I/O group if compressed volumes are created.

Use Monitoring > Performance in the management GUI during periods of high host workload to measure CPU utilization.
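
One way to interpret "consistently above" a CPU utilization value is to check what fraction of samples taken during peak workload exceeds it. The following sketch is an assumption-laden illustration: the function name is hypothetical, the threshold must be supplied by the caller because no specific value is stated here, and the 90% fraction is an arbitrary stand-in for "consistently".

```python
def cpu_consistently_above(samples: list, threshold_percent: float,
                           min_fraction: float = 0.9) -> bool:
    """Return True if at least `min_fraction` of samples exceed the threshold.

    `samples` holds CPU utilization percentages that were recorded during
    periods of high host workload. `threshold_percent` has no default
    because the applicable value depends on the system.
    """
    if not samples:
        raise ValueError("at least one utilization sample is required")
    above = sum(1 for value in samples if value > threshold_percent)
    return above / len(samples) >= min_fraction
```

If this check returns True for the nodes in an I/O group, creating compressed volumes in that I/O group risks degrading the performance of its existing non-compressed volumes.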

Size limits

Compressed volumes have the following size limits. If a new or existing compressed volume approaches the maximum size, the system issues an alert.

96 TB
Maximum virtual size of a new, individual compressed volume. You cannot create a new compressed volume that exceeds this size. In addition, you cannot increase the size of an existing compressed volume beyond this value. If one or more compressed volumes in a cluster exceed this limit, you receive an alert. To reduce the risk of losing or corrupting data, take action soon to remove data from the compressed volume.
120 TB
Maximum virtual size of an existing compressed volume in a cluster. If any compressed volumes in the cluster approach or exceed this value, the system issues an alert.
Important: Immediate action is required to remove all data from the compressed volume and prevent the loss of data.
128 TB
Maximum physical size of a compressed volume.
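
The limits above can be captured in a small check, for example in a provisioning script. This is a sketch under assumptions: the constant and function names and the returned labels are hypothetical, and because "approaches" is not quantified here, the sketch reacts only when a limit is reached or exceeded.

```python
MAX_NEW_VIRTUAL_TB = 96        # largest virtual size for a new or expanded volume
MAX_EXISTING_VIRTUAL_TB = 120  # largest virtual size for an existing volume
MAX_PHYSICAL_TB = 128          # largest physical size of a compressed volume


def can_create_or_expand(virtual_tb: float) -> bool:
    """A compressed volume cannot be created at, or expanded to, more than 96 TB."""
    return virtual_tb <= MAX_NEW_VIRTUAL_TB


def alert_level(virtual_tb: float) -> str:
    """Classify an existing compressed volume by its virtual size."""
    if virtual_tb >= MAX_EXISTING_VIRTUAL_TB:
        return "act_immediately"  # remove all data to prevent data loss
    if virtual_tb > MAX_NEW_VIRTUAL_TB:
        return "act_soon"         # remove data to reduce the risk of loss
    return "ok"
```

For example, an existing 100 TB compressed volume is classified as "act_soon", and a request to expand a volume to 100 TB is rejected by can_create_or_expand.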

For information about how to move data off a compressed volume, see the Flashes, alerts and bulletins website.