MAP 6001: This procedure replaces a flash drive that
has failed while it is still a member of a storage pool.
Before you begin
If you are not familiar with these maintenance analysis
procedures (MAPs), first read Using the maintenance analysis procedures.
This MAP applies to models with internal flash drives.
Be sure that you know which model you are using before you start this
procedure. To determine which model you are working on, look for the
label that identifies the model type on the front of the node.
Attention:
- Back up your SAN Volume Controller configuration before you begin these steps.
- If the drive use property is member and the drive must be replaced, contact IBM support before taking any actions.
About this task
Perform the following steps only if a drive in a RAID 0 (striped)
array has failed:
Procedure
1. Record the properties of all volume copies, MDisks, and storage pools that are dependent on the failed drive:
a. Identify the drive ID and the error sequence number of the drive whose status equals offline and whose use equals failed by using the lsdrive CLI command.
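For example, assuming that your code level supports filtering on the status attribute, you can list only the offline drives; the error sequence number appears in the error_sequence_number field of the output:
lsdrive -filtervalue status=offline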
b. Review the offline reason using the lsevent <seq_no> CLI command.
c. Obtain detailed information about the offline drive or drives using the lsdrive <drive_id> CLI command.
d. Record the mdisk_id, mdisk_name, node_id, node_name, and slot_id for each offline drive.
e. Obtain the storage pools of the failed drives by using the lsmdisk <mdisk_id> CLI command for each MDisk that was identified in substep 1c. Continue with the following steps by replacing all the failed drives in one of the storage pools. Make a note of the node, slot, and ID of the selected drives.
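For example, if substep 1c showed that an offline drive belongs to the MDisk with ID 5 (a hypothetical value), the mdisk_grp_id and mdisk_grp_name fields in the output identify the storage pool:
lsmdisk 5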
f. Determine all the MDisks in the storage pool using the lsmdisk -filtervalue mdisk_grp_id=<grp id> CLI command.
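For example, if the storage pool that was identified in substep 1e has ID 2 (a hypothetical value), enter:
lsmdisk -filtervalue mdisk_grp_id=2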
g. Identify which MDisks are internal (ctrl_type equals 4) and which MDisks contain SSDs (ctrl_type equals 6).
h. Find the volumes with extents in the storage pool using the lsmdiskmember <mdisk_id> CLI command for each MDisk found in substep 1f. It is likely that the same volumes will be returned for each MDisk.
i. Record all the properties of each volume that is listed in substep 1h by using the lsvdisk <vdisk_id> CLI command. For each volume, check whether it has online volume copies, which indicate that it is mirrored. Use this information in step 9.
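For example, for a hypothetical volume with ID 12, enter the following command and check the copy_count field and the status and sync fields of each copy:
lsvdisk 12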
j. Obtain a list of all the drives in each internal MDisk in the storage pool using the lsdrive -filtervalue mdisk_id=<mdisk_id> CLI command. Use this information in step 8.
k. Record all the properties of all the MDisks in the storage pool using the lsmdisk <mdisk_id> CLI command. Use this information in step 8.
l. Record all the properties of the storage pool using the lsmdiskgrp <mdiskgrp_id> CLI command. Use this information in step 7.
Note: If a listed volume has a mirrored, online, and in-sync copy, you can recover the volume data from that copy. All data on unmirrored volumes will be lost and must be restored from backup.
2. Delete the storage pool using the rmmdiskgrp -force <mdiskgrp id> CLI command. All MDisks and volume copies in the storage pool are also deleted. If any of the volume copies were the last in-sync copy of a volume, all the copies that are not in sync are also deleted, even if they are not in the storage pool.
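For example, if the storage pool that you recorded in substep 1l has ID 2 (a hypothetical value), enter:
rmmdiskgrp -force 2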
3. Using the drive ID that you recorded in substep 1e, set the use property of the drive to unused using the chdrive command:
chdrive -use unused <id of offline drive>
The drive is removed from the drive listing.
4. Follow the physical instructions to replace or remove a drive. See the "Replacing a SAN Volume Controller 2145-CG8 flash drive" documentation or the "Removing a SAN Volume Controller 2145-CG8 flash drive" documentation for the procedures.
5. A new drive object is created with the use attribute set to unused. This action might take several minutes. Obtain the ID of the new drive using the lsdrive CLI command.
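For example, assuming that your code level supports filtering on the use attribute, you can list only the unused drives to find the new drive ID:
lsdrive -filtervalue use=unused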
6. Change the use property for the new drive to candidate:
chdrive -use candidate <drive id of new drive>
7. Create a new storage pool with the same properties as the deleted storage pool. Use the properties that you recorded in substep 1l:
mkmdiskgrp -name <mdiskgrp name as before> -ext <extent size as before>
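For example, if the deleted pool was named ssd_pool and used a 256 MB extent size (hypothetical values taken from substep 1l), enter:
mkmdiskgrp -name ssd_pool -ext 256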
8. Create again all the MDisks that were previously in the storage pool using the information from substeps 1j and 1k.
- For internal RAID 0 MDisks, use this command:
mkarray -level raid0 -drive <list of drive IDs> -name <mdisk_name> <mdiskgrp id or name>
where -name <mdisk_name> is optional, but you can use the parameter to make the new array have the same MDisk name as the old array. An example follows this list.
- For external MDisks, use the addmdisk CLI command.
- For non-RAID 0 MDisks, use the mkarray CLI command.
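For example, to re-create a RAID 0 array named mdisk3 from hypothetical drive IDs 7, 8, and 9 in a pool named ssd_pool, enter the following command; the drive IDs in the -drive list are separated by colons:
mkarray -level raid0 -drive 7:8:9 -name mdisk3 ssd_pool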
9. For all the volumes that had online, in-sync, mirrored volume copies before the storage pool was deleted, add a new volume copy in the new storage pool to restore redundancy using the following command:
addvdiskcopy -mdiskgrp <mdiskgrp id> -vtype striped -easytier <on or off as before> <vdisk_id>
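For example, to add a copy of a hypothetical volume with ID 12 in a new pool named ssd_pool, with Easy Tier off as recorded in substep 1i, enter:
addvdiskcopy -mdiskgrp ssd_pool -vtype striped -easytier off 12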
10. For any volumes that did not have an online, in-sync, mirrored copy, create the volume again and restore the data from a backup, or use other methods.
11. Mark the drive error as fixed using the error sequence number from substep 1b:
cherrstate -sequencenumber <error_sequence_number>
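For example, if the error sequence number that you recorded in substep 1b is 400 (a hypothetical value), enter:
cherrstate -sequencenumber 400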