RAID Explained

Introduction

Modern hard disks come in sizes as large as 1 TB (terabyte = 1000 GB) and beyond.

However, sometimes even the largest or fastest hard disks are not enough for certain applications. This is where RAID comes in. The acronym ‘RAID’ stands for Redundant Array of Independent (or Inexpensive) Disks: several physical disks combined so that they act as a single unit. It is generally recommended that all disks in a RAID be identical (or at the very least, the same size and speed). There are several variations designed to meet different needs. Some are for making larger, faster storage solutions. Others trade off size for increased reliability. Yet others try to accomplish both. Here is a rundown of the basic types of RAID available today.

RAID 0 – a.k.a. Striping

This is a setup where data is striped across two or more disks. It is not redundant at all, but is still considered a variation of RAID. Data is broken into small, equal-sized chunks, and the first chunk is placed on the first disk, the second on the second disk, the third on the third disk (or back on the first disk if there are only two), and so on. This allows a computer to access data much faster than it could from a single disk, because the data is split equally between multiple physical disks and they can all be put to use at once when retrieving it. Unfortunately, with this boost in speed comes a significant danger of data loss. If any disk in the array fails, you lose ALL the data in the array. So, even though more disks in the array means faster speed (up to a point), each additional disk also brings a greater threat of total information loss. Since there is no redundancy, all disk space is available, allowing very large drive spaces for non-critical data.
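The round-robin placement described above can be sketched in a few lines of Python. This is only an illustration: the function names `stripe` and `unstripe`, the in-memory lists standing in for disks, and the tiny 4-byte chunk size are all assumptions made for the example, not how any particular controller works.

```python
def stripe(data, num_disks, chunk_size=4):
    """Split data into fixed-size chunks and deal them out round-robin."""
    if num_disks < 2:
        raise ValueError("striping needs at least two disks")
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, ... then wrap around.
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks):
    """Read the chunks back in the same round-robin order."""
    out = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJKLMNOPQRST", num_disks=3)
print(disks[0])         # [b'ABCD', b'MNOP']
print(unstripe(disks))  # b'ABCDEFGHIJKLMNOPQRST'
```

Notice that every disk holds only some of the chunks, which is why losing any one disk leaves nothing but fragments.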

RAID 1 – a.k.a. Mirroring

This setup is basically the inverse of striping. Here, rather than putting different bits of data on each disk, a copy of every single piece of information is written to every disk in the array. This is usually only done with two disks, and it gives you a complete copy of everything so that if one disk fails you have a complete and up-to-date backup of everything. In fact, the array can continue to function as long as a single disk (with all the data) is intact and operational. For the record, since all the data is being written to every disk, there is no write performance benefit, and it is conceivable there might be a slight performance decrease when writing to the array (though some controllers can speed up reads by reading from more than one disk at a time). Additionally, with two disks, only half the total disk space is available because of the mirroring process.
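As a rough sketch of the idea, here is a toy mirror in Python. The `Mirror` class, its method names, and the use of dictionaries as stand-in disks are all hypothetical and exist only for this example.

```python
class Mirror:
    """Toy RAID 1: every write goes to every working disk."""

    def __init__(self, num_disks=2):
        self.disks = [dict() for _ in range(num_disks)]
        self.failed = set()

    def write(self, block, data):
        # The same data is written to each disk that is still working.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def fail(self, index):
        # Simulate a dead disk: its contents are gone.
        self.failed.add(index)
        self.disks[index].clear()

    def read(self, block):
        # Any surviving disk can answer, since each holds a full copy.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise IOError("all disks in the mirror have failed")


mirror = Mirror()
mirror.write(0, b"important data")
mirror.fail(0)
print(mirror.read(0))  # b'important data'
```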

RAID 0+1 – Mirroring Two RAID 0 Stripes

This is our first RAID style that attempts to give both a performance and reliability boost. It requires four or more disks to function, and effectively creates two RAID 0 arrays, and then mirrors them. Hence, either array could fail (either one or two disks in it) and the other would continue to function. This allows you a similar performance boost to a two-disk RAID 0 array, but with a good measure of the stability of a RAID 1 array. However, just like RAID 1, only half the total amount of disk space is usable (since all data is written twice).

RAID 10 – Striping Two RAID 1 Mirrors

Similar to the results of a RAID 0+1 (mentioned above), this setup provides both an increase in performance and reliability over a single disk drive. This is accomplished by creating two RAID 1 Mirror arrays and then striping data across them. Again, a single disk can fail (or even one disk in each mirror) and the array should continue to function. Four or more disks are required, and only half the total disk space is usable (the rest being taken up by redundant data).
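To see how RAID 0+1 and RAID 10 differ when disks fail, the sketch below counts which failure combinations a four-disk array survives. It uses a deliberately simple model: disks 0 and 1 form one group and disks 2 and 3 the other, a RAID 0+1 array needs one stripe set fully intact, and a RAID 10 array needs at least one working disk in each mirror. The function names are made up for the example, and real controllers may behave differently.

```python
from itertools import combinations

# Four disks arranged as two groups of two.
GROUPS = [(0, 1), (2, 3)]


def raid01_survives(failed):
    # Mirrored stripes: at least one stripe set must be fully intact.
    return any(all(d not in failed for d in group) for group in GROUPS)


def raid10_survives(failed):
    # Striped mirrors: every mirror pair needs at least one working disk.
    return all(any(d not in failed for d in group) for group in GROUPS)


def count_survivable(survives, num_failures):
    """Count the failure combinations (out of four disks) the array survives."""
    return sum(
        1
        for failed in combinations(range(4), num_failures)
        if survives(set(failed))
    )


print(count_survivable(raid01_survives, 1))  # 4 (any single failure)
print(count_survivable(raid10_survives, 1))  # 4 (any single failure)
print(count_survivable(raid01_survives, 2))  # 2 of the 6 possible pairs
print(count_survivable(raid10_survives, 2))  # 4 of the 6 possible pairs
```

Under this model both layouts handle any single failure, but RAID 10 survives more of the possible two-disk failures.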

RAID 5 – Striping with Parity

RAID 5 requires at least three disks, making it more affordable (in terms of disks) to operate than either RAID 0+1 or 10. The key to RAID 5 is ‘parity’ data. Parity data is extra information calculated when data is written to the array, and it allows the array to rebuild the contents of a whole disk if one should fail. The array operates by striping a given amount of data across all the disks except one, and using that one to store parity data. The next piece of data is treated the same, except that a different disk is used to store the parity data, and so on. This way, the total storage available is the total amount of disk space less one disk’s worth (that gets used up for parity). Reading data from a RAID 5 array is not as fast as it would be from a RAID 0 (as part of each disk’s space and bandwidth goes to parity rather than data), but is still slightly faster than a single disk drive.
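Parity in RAID 5 is conventionally calculated with XOR, which has the handy property that XOR-ing all the surviving blocks in a stripe gives back the missing one. The sketch below shows this for a three-disk array. The function names and the exact order in which the parity slot rotates are assumptions for illustration; real layouts vary.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks, stripe_number):
    """Lay out one stripe: the data blocks plus one parity block.

    The parity slot moves to a different disk for each stripe. The
    rotation order used here is just one possible choice.
    """
    num_disks = len(data_blocks) + 1
    parity_disk = (num_disks - 1 - stripe_number) % num_disks
    stripe = list(data_blocks)
    stripe.insert(parity_disk, xor_blocks(data_blocks))
    return stripe, parity_disk


def rebuild(stripe, lost_disk):
    """Recover the block on the lost disk from the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_disk]
    return xor_blocks(survivors)


stripe, parity_disk = write_stripe([b"AAAA", b"BBBB"], stripe_number=0)
print(parity_disk)                      # 2
print(rebuild(stripe, 0))               # b'AAAA'
print(rebuild(stripe, 2) == stripe[2])  # True (parity can be rebuilt too)
```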

RAID 6 – RAID 5 on Steroids

This configuration uses the same basic idea as RAID 5, but creates two separate parity sets. This means it needs at least four disks to function, and loses two disks’ worth of storage space to parity. However, it also means that any two disks can fail, and the array can still be rebuilt. Additionally, RAID 6 (and, to a certain extent, RAID 5) can scale up easily and give very large storage arrays while only losing a small portion of their overall drive space. For example, a 10-disk RAID 6 array would still have 8 disks’ worth of space and be able to handle two complete disk failures. Reading data from a RAID 6 array is not quite as fast as it would be from a RAID 5 (as there are two sets of parity to maintain), but is still faster than a single disk.

JBOD – Just a Bunch Of Disks

While not a real RAID style, JBOD is an option on many disk controllers built into motherboards. It is simply a compilation of multiple physical disks into one big, fully usable drive. Sometimes this process is known as ‘spanning’. As it is in no way redundant, it offers no data safety. This setup is only useful if you have multiple hard disks but only want to work with a single, larger drive within your operating system. Also, unlike other true RAID forms, this allows you to use disks of different sizes at whim – and without wasting space.
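Spanning can be pictured as laying the disks end to end. The sketch below assumes exactly that (simple concatenation) and shows how a block number on the big combined drive maps to a particular physical disk. The function name `locate` and the disk sizes are invented for the example.

```python
def locate(block, disk_sizes):
    """Map a block number on the spanned drive to (disk index, block on that disk)."""
    if block < 0:
        raise IndexError("block numbers start at 0")
    for disk, size in enumerate(disk_sizes):
        if block < size:
            return disk, block
        # Not on this disk: skip past it and try the next one.
        block -= size
    raise IndexError("block is beyond the end of the spanned drive")


# Three disks of different sizes (in blocks), 400 blocks in total.
sizes = [100, 250, 50]
print(locate(99, sizes))   # (0, 99)  last block of the first disk
print(locate(100, sizes))  # (1, 0)   first block of the second disk
print(locate(399, sizes))  # (2, 49)  last block of the third disk
```

Because each disk is simply used up in turn, disks of any size can be mixed without wasting space.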

Hot Spare Disks

In larger arrays, where data integrity is of paramount importance, a disk failure can still be a dangerous thing. For example, if a single disk in a RAID 5 fails, then the data is at additional risk until that drive is replaced. Until a new replacement disk can be installed and rebuilt into the array, another disk failure would cause all data to be lost. While it is unlikely that a second failure would occur, sometimes it just isn’t worth the risk. The best option when this is the case is to have an extra disk already installed and ready to take over for whichever disk dies. This is called having a Hot Spare disk. Sometimes it is denoted by a +1 after the RAID title (e.g. RAID 5 +1). With a Hot Spare system that is properly set up, if a failure occurs then the spare is immediately rebuilt in place of the failed disk. Then, when the failed disk is replaced, it becomes the new spare.
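The hand-over described above can be modelled as a small piece of bookkeeping. This is a sketch only: the `ArrayWithSpare` class, its method names, and the disk labels are hypothetical, and the actual rebuilding of data onto the spare is left out.

```python
class ArrayWithSpare:
    """Tracks which disks are active and which one is waiting as a spare."""

    def __init__(self, active, spare=None):
        self.active = list(active)
        self.spare = spare

    def disk_failed(self, name):
        """Handle a failure. Returns True if a spare took over."""
        slot = self.active.index(name)  # raises ValueError for unknown disks
        if self.spare is None:
            # No spare on hand: the slot stays empty and the array is degraded.
            self.active[slot] = None
            return False
        # The spare takes the failed disk's place (data rebuild not modelled).
        self.active[slot] = self.spare
        self.spare = None
        return True

    def disk_replaced(self, new_name):
        """A replacement fills an empty slot if there is one, else becomes the spare."""
        if None in self.active:
            self.active[self.active.index(None)] = new_name
        else:
            self.spare = new_name


array = ArrayWithSpare(["disk0", "disk1", "disk2"], spare="spare0")
array.disk_failed("disk1")
print(array.active)  # ['disk0', 'spare0', 'disk2']
array.disk_replaced("disk3")
print(array.spare)   # disk3
```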

Other types of RAID

There are other forms of RAID that are not as commonly used. Those variations include RAID 2, 3, 4, and 7. More detailed information on these can be found at other websites, including these:

http://www.pcguide.com/ref/hdd/perf/raid/levels/index.htm

http://www.acnc.com/04_00.html – good illustrations!

Additionally, there are variations on RAID 5 and 6 that are worth noting. RAID 50 and 60 are basically striped pairs of RAID 5 or 6 arrays, giving yet again increased performance at the cost of a small amount of statistical reliability. However, they require 6 or 8 disks (respectively) at a minimum and require advanced and often expensive controller cards, making them very rarely used options.
 

Informational Chart

RAID Type   Minimum # of Disks   Space Lost to Redundancy   Read/Write Performance   Data Safety
RAID 0      2                    None                       Excellent                Poor
RAID 1      2                    50%                        Average                  Good
RAID 0+1    4                    50%                        Good                     Good
RAID 10     4                    50%                        Good                     Good
RAID 5      3                    1 Disk                     Good                     Good
RAID 6      4                    2 Disks                    Good                     Excellent
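The space figures in the chart can be turned into a small calculator. The sketch below assumes all disks are the same size, as recommended earlier, and for RAID 0+1 and RAID 10 assumes an even number of disks. The function name `usable_disks` and the level labels are made up for the example.

```python
# Minimum disk counts, taken from the chart above.
MINIMUM_DISKS = {"0": 2, "1": 2, "0+1": 4, "10": 4, "5": 3, "6": 4}


def usable_disks(level, num_disks):
    """Return how many disks' worth of space is left for data."""
    if level not in MINIMUM_DISKS:
        raise ValueError("unknown RAID level: " + level)
    if num_disks < MINIMUM_DISKS[level]:
        raise ValueError("RAID " + level + " needs more disks")
    if level == "0":
        return num_disks        # no redundancy, nothing lost
    if level == "1":
        return 1                # every disk holds the same copy
    if level in ("0+1", "10"):
        if num_disks % 2:
            raise ValueError("this sketch assumes an even number of disks")
        return num_disks // 2   # all data is stored twice
    if level == "5":
        return num_disks - 1    # one disk's worth of parity
    return num_disks - 2        # RAID 6: two disks' worth of parity


print(usable_disks("6", 10))         # 8
print(usable_disks("5", 3))          # 2
print(usable_disks("10", 4) * 1000)  # 2000 (GB, using four 1000 GB disks)
```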

 

Summary

As you can see, there are a lot of options when it comes to RAID. But whether you need speed, data protection, or a combination of the two, there are solutions available.

Author: admin