Today I was reading RHCE Red Hat Certified Engineer Linux Study Guide (Exam RH302) and got to the software RAID chapter, which includes a lab. I had only ever built a linear RAID before, so now that I have qemu I quickly set up four virtual disks to try it out. I experimented with creating the various software RAID levels using both raidtools and mdadm, and at the end also tried creating LVM on top of RAID. It felt great. :)
RAID 0
This level of RAID makes it faster to read and write to the hard drives. However, RAID 0 provides no data redundancy. It requires at least two hard disks.
Reads and writes to the hard disks are done in parallel, in other words, to two or more hard disks simultaneously. All hard drives in a RAID 0 array are filled equally. But since RAID 0 does not provide data redundancy, a failure of any one of the drives will result in total data loss. RAID 0 is also known as 'striping without parity.'
Features: data is read and written in parallel, so performance is high, but there is no redundancy; if any one disk in the array fails, all data is lost.
Capacity: the sum of all member disks.
Requirements: at least two disks; the partitions used for the RAID must be nearly identical in size.
First, set the partition type of each member partition to FD:
[root@LFS ~]#fdisk /dev/hda
Command (m for help):t
Partition number (1-4):1
Hex code (type L to list codes):fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help):p
/dev/hda1 1 646 325552+ fd Linux raid autodetect
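The same type change can also be scripted instead of answering fdisk's prompts. A sketch using the sfdisk syntax of util-linux from that era (check sfdisk(8) on your system, as the option changed in later releases):

```shell
# Set partition 1 on each member disk to type fd (Linux raid autodetect)
# non-interactively. Older sfdisk accepts:
#   sfdisk --id <device> <partition-number> <hex-id>
sfdisk --id /dev/hda 1 fd
sfdisk --id /dev/hdb 1 fd
```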
Creating RAID-0 with raidtools-1.00.3:
Write the RAID configuration file /etc/raidtab:
(sample files can be found under /usr/share/doc/raidtools-1.00.3)
raiddev /dev/md0
raid-level 0
nr-raid-disks 2
nr-spare-disks 0
persistent-superblock 1
chunk-size 4
device /dev/hda1
raid-disk 0
device /dev/hdb1
raid-disk 1
mkraid creates the array according to raidtab:
[root@LFS ~]#mkraid /dev/md0
......
raid0: done.
raid0 : md_size is 650880 blocks
raid0 : conf ->hash_spacing is 650880 blocks
raid0 : nb_zone is 1.
raid0 : Allocating 4 byte for hash
Creating RAID-0 with mdadm:
[root@LFS ~]#mdadm --create --verbose /dev/md0 --level=raid0 \
--raid-devices=2 --chunk=4 /dev/hda1 /dev/hdb1
......
raid0: done.
raid0 : md_size is 650880 blocks
raid0 : conf ->hash_spacing is 650880 blocks
raid0 : nb_zone is 1.
raid0 : Allocating 4 byte for hash
mdadm: array /dev/md0 started .
[root@LFS ~]#
Check the status:
[root@LFS ~]#cat /proc/mdstat
Personalities : [raid0]
md0 : active raid0 hdb1[1] hda1[0]
650880 blocks 4k rounding
unused devices: <none>
[root@LFS ~]#
Create a filesystem and mount it:
[root@LFS ~]#mkreiserfs /dev/md0
[root@LFS ~]#mount -t reiserfs /dev/md0 /mnt/raid0
Add it to /etc/fstab so it is mounted automatically at boot:
/dev/md0 /mnt/raid0 reiserfs defaults 1 2
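mdadm can also record the running array in its own configuration file, so the array can be reassembled later without retyping the member list. A sketch (the config file path is an assumption; some distributions use /etc/mdadm/mdadm.conf instead):

```shell
# Append a one-line description of every running array to mdadm's
# config file; the arrays can then be reassembled from that file.
mdadm --detail --scan >> /etc/mdadm.conf
mdadm --assemble --scan      # assemble all arrays listed in the config file
```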
Commands in raidtab

  Command                 Description
  nr-raid-disks           Number of RAID disks to use
  nr-spare-disks          Number of spare disks to use
  persistent-superblock   Required for autodetection
  chunk-size              Amount of data to read/write
  parity-algorithm        How RAID 5 should use parity
RAID 1
This level of RAID mirrors information to two or more other disks. In other words, the same set of information is written to two different hard disks. If one disk is damaged or removed, you still have all of the data on the other hard disk. The disadvantage of RAID 1 is that data has to be written twice, which can reduce performance. You can come close to maintaining the same level of performance if you also use separate hard disk controllers. That prevents the hard disk controller from becoming a bottleneck.
And it is expensive. To support RAID 1, you need an additional hard disk for every hard disk's worth of data. RAID 1 is also known as disk mirroring.
Features: redundant data, high reliability; either disk can fail without losing data. Writes are slower, reads are fast.
Capacity: half the total disk capacity.
Requirements: at least two disks; the partitions used for the RAID must be nearly identical in size.
raidtools-1.00.3:
Write /etc/raidtab:
raiddev /dev/md1
raid-level 1
nr-raid-disks 2
nr-spare-disks 1
persistent-superblock 1
chunk-size 4
device /dev/hda2
raid-disk 0
device /dev/hdb2
raid-disk 1
device /dev/hdc2
spare-disk 0
[root@LFS ~]#mkraid /dev/md1
Creating RAID-1 with mdadm:
[root@LFS ~]#mdadm --create --verbose /dev/md1 --level=raid1 \
--raid-devices=2 --spare-devices=1 --chunk=4 /dev/hda2 /dev/hdb2 /dev/hdc2
[root@LFS ~]#cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 hdc2[2] hdb2[1] hda2[0]
325440 blocks [2/2] [UU]
unused devices: <none>
[root@LFS ~]#mkreiserfs /dev/md1
[root@LFS ~]#mount -t reiserfs /dev/md1 /mnt/raid1
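With the spare disk /dev/hdc2 configured in this array, you can watch the takeover happen by failing a member on purpose. A sketch (this destroys redundancy until the rebuild finishes, so only do it in the lab):

```shell
# Mark one mirror half as faulty; md should immediately begin
# rebuilding onto the spare /dev/hdc2.
mdadm /dev/md1 --fail /dev/hda2
cat /proc/mdstat                  # shows the recovery progress
# After the rebuild, drop the failed member; once the disk is
# replaced, add it back and it becomes the new spare.
mdadm /dev/md1 --remove /dev/hda2
mdadm /dev/md1 --add /dev/hda2
```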
device /dev/hdc2
spare-disk 0
These two lines designate /dev/hdc2 as the spare disk: when hda2 or hdb2 fails, the array automatically activates /dev/hdc2 as the replacement mirror.
RAID 5
RAID 5 requires three or more disks. RAID 5 distributes, or 'stripes,' parity information evenly across all the disks. If one disk fails, the data can be reconstructed from the parity data on the remaining disks. RAID does not stop; all data is still available even after a single disk failure. RAID level 5 is the preferred choice in most cases: the performance is good, data integrity is ensured, and only one disk's worth of space is lost to parity data. RAID 5 is also known as disk striping with parity.
Features: uses parity, so reliability is good; data is lost only if a second disk fails. Data is read and written in parallel, so performance is also high.
Capacity: total disk capacity minus one disk.
Requirements: at least three disks; the partitions used for the RAID must be nearly identical in size.
Creating RAID-5 with raidtools-1.00.3:
Write /etc/raidtab:
raiddev /dev/md5
raid-level 5
nr-raid-disks 3
nr-spare-disks 0
persistent-superblock 1
chunk-size 32
parity-algorithm left-symmetric
device /dev/hda3
raid-disk 0
device /dev/hdb3
raid-disk 1
device /dev/hdc3
raid-disk 2
[root@LFS ~]#mkraid /dev/md5
Creating RAID-5 with mdadm:
[root@LFS ~]#mdadm --create --verbose /dev/md5 --level=raid5 \
--raid-devices=3 --chunk=32 /dev/hda3 /dev/hdb3 /dev/hdc3
[root@LFS ~]#mkreiserfs /dev/md5
[root@LFS ~]#mount -t reiserfs /dev/md5 /mnt/raid5
parity-algorithm left-symmetric
parity-algorithm selects how RAID-5 lays out its parity blocks; the available choices are:
left-symmetric left-asymmetric
right-symmetric right-asymmetric
left-symmetric gives the best performance.
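The algorithms differ only in where each stripe's parity block lives and in what order the data blocks follow it. For the "left" variants, parity starts on the last disk and rotates toward disk 0 on each successive stripe; a tiny sketch of that rotation for the three-disk array above:

```shell
# Parity placement for the "left" layouts on a 3-disk RAID-5:
# stripe 0 puts parity on the last disk, and each following
# stripe moves it one disk to the left, wrapping around.
ndisks=3
for stripe in 0 1 2 3 4 5; do
  echo "stripe $stripe: parity on disk $((ndisks - 1 - stripe % ndisks))"
done
```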
LVM + RAID:
An LVM physical volume (PV) can be a regular disk partition or a RAID device, so LVM can be used to manage space on top of RAID.
Create the PV:
[root@LFS ~]#pvcreate /dev/md5
Physical volume "/dev/md5" successfully created
Create the VG:
[root@LFS ~]#vgcreate raid_lvm /dev/md5
Volume group "raid_lvm" successfully created
Create the LV:
[root@LFS ~]#lvcreate -L 300M -n "lv_data" raid_lvm
Logical volume "lv_data" created
Create a reiserfs on it:
[root@LFS ~]#mkreiserfs /dev/raid_lvm/lv_data
[root@LFS ~]#mkdir /mnt/data
[root@LFS ~]#mount -t reiserfs /dev/raid_lvm/lv_data /mnt/data
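The usual LVM benefit applies on top of RAID too: the volume can be grown later. A sketch, assuming the volume group still has free extents (resize_reiserfs is part of reiserfsprogs and can enlarge a mounted filesystem):

```shell
# Grow lv_data by 100 MB, then grow the filesystem to fill it.
lvextend -L +100M /dev/raid_lvm/lv_data
resize_reiserfs /dev/raid_lvm/lv_data
```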
About chunk-size: The chunk-size deserves an explanation. You can never write completely in parallel to a set of disks. If you had two disks and wanted to write a byte, you would have to write four bits on each disk: every second bit would go to disk 0 and the rest to disk 1. Hardware just doesn't support that. Instead, we choose some chunk-size, which we define as the smallest "atomic" mass of data that can be written to the devices. A write of 16 kB with a chunk-size of 4 kB will cause the first and third 4 kB chunks to be written to the first disk and the second and fourth chunks to the second disk, in the RAID-0 case with two disks. Thus, for large writes, you may see lower overhead by having fairly large chunks, whereas arrays that primarily hold small files may benefit more from a smaller chunk-size.
Chunk sizes must be specified for all RAID levels, including linear mode. However, the chunk-size does not make any difference for linear mode.
For optimal performance, you should experiment with the value, as well as with the block-size of the filesystem you put on the array.
The argument to the chunk-size option in /etc/raidtab specifies the chunk-size in kilobytes. So "4" means "4 kB".
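The 16 kB example above can be written down directly: on a two-disk RAID-0 with a chunk-size of 4 kB, chunk i of a write simply lands on disk (i mod 2):

```shell
# Map each 4 kB chunk of a 16 kB write onto a 2-disk RAID-0.
ndisks=2
for chunk in 0 1 2 3; do
  echo "chunk $chunk (4 kB) -> disk $((chunk % ndisks))"
done
```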
A few follow-up questions:
Q: How do I tell whether the running kernel supports RAID?
A: If cat /proc/mdstat produces output, the kernel supports it.
dmesg | grep -i raid or dmesg | grep -i md will show it as well.
Q: How do I tell which RAID levels the running kernel supports?
A: Install the source package for the running kernel and copy the current kernel config into the source tree:
cp /boot/config-xxx /usr/src/linux && make menuconfig
then look at the options under Device Drivers ---> Multi-device support (RAID and LVM).
Alternatively: cat /lib/modules/`uname -r`/modules.alias | grep raid0
(raid0 here is the level being checked; substitute raid1, raid5, ...)
If there is output, the kernel supports that level, built as a module that must be loaded before use.
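If a level exists only as a module, it has to be loaded before /proc/mdstat will list it under Personalities. A quick sketch:

```shell
# Load the raid5 personality by hand, then confirm it registered.
modprobe raid5
cat /proc/mdstat     # the "Personalities :" line should now include [raid5]
```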
Q: Which should I use, raidtools or mdadm? Where can I download them?
A: mdadm is probably more convenient. The differences between mdadm and raidtools:
The key differences between mdadm and raidtools are:
mdadm is a single program and not a collection of programs.
mdadm can perform (almost) all of its functions without having
a configuration file and does not use one by default. Also
mdadm helps with management of the configuration file.
mdadm can provide information about your arrays (through
Query, Detail, and Examine) that raidtools cannot.
mdadm does not use /etc/raidtab, the raidtools configuration file,
at all. It has a different configuration file with a different
format and a different purpose.
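The Query, Detail, and Examine modes mentioned above look like this, using the devices from this lab:

```shell
mdadm --query /dev/md0      # one-line summary of an array
mdadm --detail /dev/md0     # full array state: level, members, spares
mdadm --examine /dev/hda1   # read the md superblock stored on a member disk
```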
Also, the book I am reading covers raidtools, which is what RHEL3 used; but doing this lab on RHEL4, I found no raidtools on the four CDs, only mdadm, so it seems Red Hat is also leaning toward mdadm. :)
Download locations:
mdadm:
http://www.cse.unsw.edu.au/~neilb/source/mdadm/RPM/
http://www.cse.unsw.edu.au/~neilb/source/mdadm/
raidtools:
ftp://194.199.20.114/linux/fedora/core/2/i386/os/Fedora/RPMS/raidtools-1.00.3-8.i386.rpm
OK, that completes the lab.
Some day I really must try raid-0 + LVM + reiserfs on real disks and see how it feels ^_^