device mapper raid
© 2010, 2011, 2012 Dennis Leeuw dleeuw at made-it dot com
License: GPLv2 or later
RAID level | Technology | Min disks | Max failures | Capacity | Benefit | Downside |
---|---|---|---|---|---|---|
0 | Striping | 2 | 0 | N*s | Speed | No redundancy |
1 | Mirroring | 2 | 1 | s | Read double speed, write single speed | |
3 | Parity | 3 | 1 | (N-1)*s | Safety is gained by a parity disk, which only costs one disk | Array serves only one read or write request at a time |
4 | Parity per stripe | 3 | 1 | (N-1)*s | Safety is gained by a parity disk, which only costs one disk, read and write access | |
5 | Parity distributed over disks | 3 | 1 | (N-1)*s | No single parity disk that limits the speed | |
6 | 2 parity blocks distributed over disks | 4 | 2 | (N-2)*s | No single parity disk that limits the speed | High write penalty |
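Combinations of the above levels are possible. One could create e.g. RAID 10, where 2 times 2 mirrored drives hold a stripe set, or RAID 01, where a stripe set is mirrored as a whole. In both cases the capacity is (N*s)/2. Such an array is guaranteed to survive the failure of 1 disk; it survives more failures only as long as one complete copy of the data remains (for RAID 01, the loss of one entire stripe set).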
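The Capacity column can be turned into a small calculator. This is only a sketch: `raid_capacity` is a hypothetical helper name, N is the number of disks and s the size of one disk in whatever unit you like, and all disks are assumed to be of equal size.

```shell
# raid_capacity LEVEL N S
# Prints the usable capacity for N disks of size S,
# following the Capacity column of the table above.
raid_capacity() {
    level=$1 n=$2 s=$3
    case "$level" in
        0)      echo $(( n * s )) ;;
        1)      echo "$s" ;;
        3|4|5)  echo $(( (n - 1) * s )) ;;
        6)      echo $(( (n - 2) * s )) ;;
        10|01)  echo $(( n * s / 2 )) ;;
        *)      echo "unknown RAID level: $level" >&2; return 1 ;;
    esac
}

raid_capacity 5 4 1000    # four 1000 GB disks in RAID 5  -> 3000
raid_capacity 10 4 1000   # four 1000 GB disks in RAID 10 -> 2000
```

The same four disks thus give 3000 GB in RAID 5 and 2000 GB in RAID 10 or RAID 6.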
To get an overview of your RAID configuration use:
mdadm -D /dev/md0
This will output something like:
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Mar 10 11:16:54 2009
     Raid Level : raid5
     Array Size : 2930287488 (2794.54 GiB 3000.61 GB)
  Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
   Raid Devices : 4
  Total Devices : 3
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Mon Apr 23 10:39:03 2011
          State : clean, degraded
 Active Devices : 3
Working Devices : 3
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-symmetric
     Chunk Size : 64K
           UUID : f23d7914:e24c3d1b:46cbea72:d9f44c76
         Events : 0.146842
    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
       2       0        0        2      removed
       3       8       64        3      active sync   /dev/sde

This array was built with RAID5, containing 4 disks of 1T. One disk (/dev/sdd) is broken.
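When checking an array from a script, the State field of this output is the interesting part. The sketch below assumes the output format shown above; `md_state` and `md_is_degraded` are hypothetical helper names, and the demo runs on a captured fragment instead of a live array.

```shell
# md_state: read `mdadm -D` output on stdin and print the State field.
md_state() {
    sed -n 's/^ *State : //p'
}

# md_is_degraded: succeed when the State field contains "degraded".
md_is_degraded() {
    md_state | grep -q 'degraded'
}

# Demo on a captured fragment; on a live system one would pipe in
# the output of: mdadm -D /dev/md0
sample='          State : clean, degraded
 Active Devices : 3
    Number   Major   Minor   RaidDevice State
       2       0        0        2      removed'

printf '%s\n' "$sample" | md_state
if printf '%s\n' "$sample" | md_is_degraded; then
    echo "array is degraded"
fi
```

The column header line that ends in "State" is not matched, because the pattern requires the " : " separator directly after the field name.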
The striped type can be used to stripe data across two or more disks.
A striped map is defined as: start length striped #stripes chunk_size device1 offset1 ... deviceN offsetN
As described in the Introduction, the table for a striped setup has only one parameter for the number of sectors. This means that the smallest disk is the limiting factor in creating a striped disk set. If you have disks of unequal size, the remaining disk space can be used as described in the linear section, so nothing needs to go to waste.
New options in the table for a striped setup are the number of stripes and the chunk size. The chunk size is given in 512-byte sectors, like the other sizes in a device-mapper table, so a chunk size of 32 means 16 KiB.
SIZE1=$(blockdev --getsize /dev/loop1)
echo "0 ${SIZE1} striped 2 32 /dev/loop1 0 /dev/loop2 0" |\
    dmsetup create striped
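The table line can also be generated for any number of devices. The following is a sketch under a few assumptions: `striped_table` is a hypothetical helper, the chunk size is taken to be in 512-byte sectors, every member is rounded down to a whole number of chunks, and the length is the sum over all members so that the full striped capacity is mapped. It only prints the line; nothing is passed to dmsetup.

```shell
# striped_table SIZE CHUNK DEVICE...
#   SIZE   usable sectors on the smallest device
#   CHUNK  chunk size in sectors
# Prints a table line suitable for `dmsetup create`.
striped_table() {
    size=$1 chunk=$2
    shift 2
    stripes=$#
    # Round each member down to a whole number of chunks.
    per_dev=$(( size - size % chunk ))
    length=$(( per_dev * stripes ))
    line="0 $length striped $stripes $chunk"
    for dev in "$@"; do
        line="$line $dev 0"
    done
    echo "$line"
}

striped_table 1000 32 /dev/loop1 /dev/loop2
# prints: 0 1984 striped 2 32 /dev/loop1 0 /dev/loop2 0
```

With 1000 sectors per device and a chunk of 32 sectors, 992 sectors per device are usable, which gives a striped device of 1984 sectors. On a real system the size would come from blockdev and the output would be piped into dmsetup create.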
A mirror map is defined as: start length mirror log_type #logargs logarg1 ... logargN #devs device1 offset1 ... deviceN offsetN
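As an illustration, a mirror table line could be built as follows. This is a sketch, not a tested recipe: `mirror_table` is a hypothetical helper, it assumes the in-memory `core` log with a single log argument (the region size in sectors), and it places the device count directly before the device list. It only prints the line.

```shell
# mirror_table LENGTH REGION DEVICE...
#   LENGTH  size of the mirror in sectors
#   REGION  region size in sectors for the "core" log
# Prints a table line suitable for `dmsetup create`.
mirror_table() {
    length=$1 region=$2
    shift 2
    # "core 1 REGION": log type, number of log arguments, the argument.
    line="0 $length mirror core 1 $region $#"
    for dev in "$@"; do
        line="$line $dev 0"
    done
    echo "$line"
}

mirror_table 204800 1024 /dev/loop1 /dev/loop2
# prints: 0 204800 mirror core 1 1024 2 /dev/loop1 0 /dev/loop2 0
```

Because the core log lives in memory only, a mirror created this way would presumably resynchronise after every activation; a persistent log needs a different log type and arguments.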
To add the new device to the software RAID configuration use:
mdadm --manage /dev/md0 --add /dev/sdd
This notifies our RAID that there is a replacement disk, and it will start to rebuild the array onto /dev/sdd. During the rebuild we are still vulnerable to a crash of one of the other disks. One can monitor the current status with:
mdadm -D /dev/md0
/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Mar 10 11:16:54 2009
     Raid Level : raid5
     Array Size : 2930287488 (2794.54 GiB 3000.61 GB)
  Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Mon Apr 23 11:19:00 2011
          State : clean, degraded, recovering
 Active Devices : 3
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 1
         Layout : left-symmetric
     Chunk Size : 64K
 Rebuild Status : 1% complete
           UUID : f23d7914:e23c3d1b:46cbea62:d9f44b76
         Events : 0.146916
    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
       4       8       48        2      spare rebuilding   /dev/sdd
       3       8       64        3      active sync   /dev/sde

The new disk is being rebuilt, and the rebuild is 1% complete.
https://raid.wiki.kernel.org/index.php/RAID_Recovery
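To follow the rebuild from a script, the Rebuild Status line can be extracted. A minimal sketch, assuming the output format shown above; `rebuild_percent` is a hypothetical helper and the demo feeds it a captured line instead of a live array.

```shell
# rebuild_percent: read `mdadm -D` output on stdin and print the
# rebuild percentage; prints nothing when no rebuild is running.
rebuild_percent() {
    sed -n 's/^ *Rebuild Status : \([0-9][0-9]*\)% complete.*/\1/p'
}

printf ' Rebuild Status : 1%% complete\n' | rebuild_percent   # prints 1

# On a live system one could poll until the line disappears:
#   while pct=$(mdadm -D /dev/md0 | rebuild_percent) && [ -n "$pct" ]; do
#       echo "rebuild at ${pct}%"
#       sleep 60
#   done
```

The progress of a running rebuild is also visible in /proc/mdstat.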
This machine was not one of my own; it belonged to a friend of a colleague. So there are no exact specs, just a single disk from a RAID1 set. The box crashed, so now let's recover the data. The software we need can be installed through apt-get.
We hooked up the device through a USB cradle, and it got detected by the kernel and was assigned /dev/sdb (see dmesg for your local system information).
Using lsblk we tried to figure out what is what:
root@host:~# lsblk -f /dev/sdb
NAME     FSTYPE            LABEL              MOUNTPOINT
sdb
├─sdb1
├─sdb2   linux_raid_member (none):0
├─sdb3   ext2                                 /media/blaat/73ade21c-3a3e-4f03-984d-910532c7b3c8
├─sdb4   linux_raid_member (none):2
├─sdb5   linux_raid_member (none):3
├─sdb6   linux_raid_member (none):4
├─sdb7   linux_raid_member (none):5
├─sdb8   linux_raid_member (none):6
├─sdb9   linux_raid_member (none):7
└─sdb10  linux_raid_member BA-0010753892FE:8

As you can see, sdb3 was auto-detected and mounted. But the real data is on sdb10 (this was deduced from the partition sizes, which we got from parted -l).
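With this many partitions it can help to list only the RAID members. The sketch below assumes lsblk's raw output mode with two columns (`lsblk -rno NAME,FSTYPE`); `raid_members` is a hypothetical helper and the demo feeds it sample lines instead of a real disk.

```shell
# raid_members: read "NAME FSTYPE" pairs on stdin and print the
# device path of every partition that is a Linux RAID member.
raid_members() {
    awk '$2 == "linux_raid_member" { print "/dev/" $1 }'
}

# On a live system: lsblk -rno NAME,FSTYPE /dev/sdb | raid_members
printf '%s\n' 'sdb ' 'sdb1 ' 'sdb2 linux_raid_member' \
    'sdb3 ext2' 'sdb10 linux_raid_member' | raid_members
```

For the sample input this prints /dev/sdb2 and /dev/sdb10. Each candidate could then be inspected with mdadm --examine to see which array it belonged to, as an alternative to guessing from the partition size.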
Let's recreate a RAID1 with just this one disk:
root@host:~# mdadm --assemble --force /dev/md0 /dev/sdb10
Our new raid-system should show up in mdstat:
root@host:~# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb10[1]
      1949260819 blocks super 1.2 [2/1] [_U]
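In the mdstat output, [2/1] means that 2 devices are expected and 1 is active, and the underscore in [_U] marks the missing member. A rough sketch that lists all arrays with a missing member, assuming array names of the form mdN; `degraded_arrays` is a hypothetical helper and the demo uses the output shown above.

```shell
# degraded_arrays: read /proc/mdstat on stdin and print the name and
# member map of every array that has a missing member.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /\[[U_]+\]/ && name != "" {
            status = $NF
            if (index(status, "_") > 0) print name, status
            name = ""
        }
    '
}

# On a live system: degraded_arrays < /proc/mdstat
printf '%s\n' 'Personalities : [raid1]' \
    'md0 : active raid1 sdb10[1]' \
    '      1949260819 blocks super 1.2 [2/1] [_U]' | degraded_arrays
# prints: md0 [_U]
```

For our single-disk RAID1 this is expected: the array runs, but without redundancy.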
If we try to mount /dev/md0 on /mnt, mount complains that md0 is an LVM2_member. So we have to recover our LVM:
root@host:~# pvscan
root@host:~# vgscan
root@host:~# lvscan
root@host:~# lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg8/lv8
  LV Name                lv8
  VG Name                vg8
  LV UUID                mZEj8v-qjpm-kinw-qyJa-8g7T-qi8V-KfiENS
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 1
  LV Size                1,82 TiB
  Current LE             475893
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:2

That looks like something we could use... but what is the FS that is on this system?
root@host:~# blkid /dev/vg8/lv8
/dev/vg8/lv8: UUID="9b39ef6d-8553-4d0e-ac82-f0789fca0bbf" TYPE="ext4"

So it is an ext4 file system, and it thus might have a journal on board.
Since the disk comes from a crashed box, we first want to recover what can be recovered:
root@host:~# fsck.ext4 /dev/vg8/lv8
So now that all is fixed (I typed yes to all questions), let's mount it...:
root@host:~# mount -t ext4 /dev/vg8/lv8 /mnt/
mount: wrong fs type, bad option, bad superblock on /dev/mapper/vg8-lv8,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

Oops... dmesg reports:
EXT4-fs (dm-2): bad block size 65536
Google knows that this means we have a file system with a block size larger than 4k, which the kernel driver on this machine cannot mount. So fuse to the rescue:
root@host:~# fuseext2 -o ro -o sync_read /dev/vg8/lv8 /mnt/
Note the fact that we have mounted the filesystem read-only! Now we can finally access the files again.
After rescuing the files we need to tear everything down again:
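The mount failure above could have been predicted by comparing the file system block size with the page size of the machine. The sketch below rests on assumptions: `ext_block_size` and `kernel_can_mount` are hypothetical helpers, the block size is read from the "Block size:" line of `tune2fs -l`, and the kernel is assumed to refuse ext4 file systems whose block size exceeds the page size, as the kernel on this machine did.

```shell
# ext_block_size: read `tune2fs -l` output on stdin, print the block size.
ext_block_size() {
    sed -n 's/^Block size: *//p'
}

# kernel_can_mount BLOCK_SIZE PAGE_SIZE
# Succeeds when the block size does not exceed the page size.
kernel_can_mount() {
    [ "$1" -le "$2" ]
}

# On a live system:
#   bs=$(tune2fs -l /dev/vg8/lv8 | ext_block_size)
#   ps=$(getconf PAGESIZE)
bs=$(printf 'Block size:               65536\n' | ext_block_size)
ps=4096
if kernel_can_mount "$bs" "$ps"; then
    echo "block size $bs fits in page size $ps: try the kernel driver"
else
    echo "block size $bs exceeds page size $ps: use a userspace driver"
fi
```

For the 65536 byte block size reported by dmesg and a 4096 byte page, the check fails, which is why a userspace driver such as fuseext2 was needed here.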
fusermount -u /mnt
lvchange -an /dev/vg8/lv8
mdadm -S /dev/md0
umount /dev/sdb3
udisks --detach /dev/sdb