RAID is an acronym for Redundant Array of Independent (or Inexpensive) Disks. Fundamentally, RAID combines multiple hard disks into a single logical unit. This can offer fault tolerance, higher throughput, or both, compared with a single hard drive or a group of independent hard drives. RAID can provide real-time data recovery when a hard drive fails, increasing system uptime and network availability while protecting against loss of data. Multiple drives working together can also increase system performance.
JBOD or "Just a Bunch Of Disks":
JBOD, or spanning of disks, is not one of the numbered RAID levels, but it is a method for combining multiple physical disk drives into a single virtual disk. As the name implies, disks are merely concatenated together, end to beginning, so they appear to be a single large disk. Some RAID controllers instead use the term JBOD for drives configured without any RAID features, in which case each drive shows up separately in the OS. That usage of JBOD is not the same as the concatenation described above.
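The concatenation described above can be pictured as a simple address translation. The following is a minimal sketch, not any controller's actual implementation; the function name locate_block and the disk sizes (counted in blocks) are assumptions made for illustration.

```python
def locate_block(disk_sizes, logical_block):
    """Map a logical block on a spanned volume to (disk index, block on that disk).

    disk_sizes is a list of per-disk capacities in blocks (hypothetical values).
    """
    if logical_block < 0:
        raise IndexError("logical block must not be negative")
    start = 0  # first logical block that lives on the current disk
    for index, size in enumerate(disk_sizes):
        if logical_block < start + size:
            return index, logical_block - start
        start += size
    raise IndexError("logical block is beyond the end of the spanned volume")


# Three disks of different sizes appear as one 400-block disk.
disk_sizes = [100, 250, 50]
print(locate_block(disk_sizes, 0))    # (0, 0)
print(locate_block(disk_sizes, 100))  # (1, 0)
print(locate_block(disk_sizes, 399))  # (2, 49)
```

Because each disk is filled before the next one is used, a single large read usually touches only one disk, which is why spanning adds capacity but not bandwidth.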
RAID 0:
Striped set of at least two disks without parity. RAID 0 provides improved performance and additional storage but no fault tolerance against disk errors or disk failure. Any disk failure destroys the array, and that becomes more likely as disks are added. A single disk failure destroys the entire array because data written to a RAID 0 array is broken into "fragments". The number of fragments is dictated by the number of disks in the array. Each fragment is written to its respective disk simultaneously, at the same position on each disk. This allows the entire chunk of data to be read off the array in parallel, giving this arrangement very high bandwidth. When one sector on one of the disks fails, though, the corresponding sectors on every other disk are rendered useless because part of the data is now missing. RAID 0 does not implement error checking, so any error is unrecoverable. More disks in the array means higher bandwidth, but greater risk of data loss.
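Both halves of the RAID 0 trade-off can be put into numbers. The sketch below assumes simple round-robin striping and independent disk failures. The function names and the 3% per-disk failure probability are illustrative assumptions, not measured figures.

```python
def stripe_location(logical_chunk, num_disks):
    """Round-robin striping: return (disk index, stripe row) for a chunk."""
    return logical_chunk % num_disks, logical_chunk // num_disks


def array_failure_probability(p_disk, num_disks):
    """Chance that at least one of num_disks independent disks fails.

    In RAID 0 a single failed disk destroys the whole array.
    """
    return 1 - (1 - p_disk) ** num_disks


# Consecutive chunks land on different disks, so they can be read in parallel.
for chunk in range(6):
    print(chunk, stripe_location(chunk, 3))

# Assumed 3% chance that any one disk fails over some period.
for disks in (1, 2, 4, 8):
    print(disks, round(array_failure_probability(0.03, disks), 4))
```

With these assumed figures, a four-disk stripe is roughly four times as likely to be lost as a single disk.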
RAID 1:
Mirrored set of at least two disks without parity. RAID 1 provides fault tolerance against disk errors and single disk failure. Read performance increases when using a multi-threaded operating system that supports split seeks, and there is only a very small performance reduction when writing. The array continues to operate so long as at least one drive is functioning.
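As a rough illustration of mirroring, the toy model below writes every block to all working disks and reads from the first one that is still alive. The Mirror class and its in-memory dictionaries are hypothetical stand-ins for real drives, and split seeks are not modelled.

```python
class Mirror:
    """Toy RAID 1 set: every working disk holds a full copy of the data."""

    def __init__(self, num_disks=2):
        self.disks = [dict() for _ in range(num_disks)]  # block number -> data
        self.failed = set()                              # indexes of dead disks

    def fail(self, index):
        self.failed.add(index)

    def write(self, block, data):
        # A write goes to every disk that is still working.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def read(self, block):
        # Any surviving copy can satisfy the read.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise OSError("all mirrors have failed")


mirror = Mirror()
mirror.write(7, b"payload")
mirror.fail(0)
print(mirror.read(7))  # still readable from the second disk
```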
RAID 3 and RAID 4:
Striped set of at least three disks with dedicated parity. RAID 3/4 provides improved performance and fault tolerance similar to RAID 5, but with a dedicated parity disk rather than rotated parity stripes. The single parity disk is a bottleneck for writing, since every write requires updating the parity data. One minor benefit is that if the dedicated parity disk fails, the array continues to operate without parity protection and without a performance penalty.
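The write bottleneck follows from how parity is maintained. The sketch below assumes byte-wise XOR parity over equal-length blocks, the usual scheme for single-parity RAID. The helper names are invented for illustration.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equal-length blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def update_parity(old_parity, old_data, new_data):
    """Small write: remove the old data from the parity, then add the new data."""
    return xor_blocks(old_parity, old_data, new_data)


data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]  # one block per data disk
parity = xor_blocks(*data)                      # stored on the parity disk

# Overwriting a single data block still forces a write to the parity disk.
new_block = b"\xaa\x55"
parity = update_parity(parity, data[1], new_block)
data[1] = new_block
print(parity == xor_blocks(*data))  # True
```

Whichever data disk is written, the same parity disk must be read and rewritten, so concurrent writes queue up behind it.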
RAID 5:
Striped set of at least three disks with distributed parity. Distributed parity requires all but one drive to be present to operate; a failed drive must be replaced, but the array is not destroyed by a single drive failure. After a drive failure, any subsequent reads of data that was on the failed drive are calculated from the remaining data and the distributed parity, so the drive failure is masked from the end user. The array will lose data in the event of a second drive failure and is vulnerable until the data that was on the failed drive is rebuilt onto a replacement drive.
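How a read is "calculated from the distributed parity" can be sketched with XOR. The parity rotation used here is only one possible layout, chosen for illustration, and real controllers differ. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe, num_disks):
    """Assumed rotation: parity moves one disk to the left on each stripe."""
    return (num_disks - 1 - stripe) % num_disks


def rebuild(stripe_blocks, missing):
    """Recover the block of the failed disk from all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe_blocks) if i != missing]
    return xor_blocks(survivors)


# One stripe on a four-disk array: three data blocks plus their parity.
data = [b"AB", b"CD", b"EF"]
stripe = data + [xor_blocks(data)]

print(rebuild(stripe, 1))                      # b'CD', recovered without disk 1
print([parity_disk(s, 4) for s in range(5)])   # [3, 2, 1, 0, 3]
```

The same XOR that serves a degraded read is what a rebuild runs for every stripe, which is why the array stays exposed until the rebuild finishes.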
RAID 6:
Striped set of at least four disks with dual distributed parity. RAID 6 provides fault tolerance against two drive failures; the array continues to operate with up to two failed drives. This makes larger RAID groups more practical. It is becoming a popular choice for SATA drives as they approach 1 terabyte in size, because the single-parity RAID levels are vulnerable to data loss until the failed drive is rebuilt, and the larger the drive, the longer the rebuild takes. Dual parity gives the array time to rebuild onto a large drive while still being able to sustain another drive failure.
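The cost of the second parity drive and the length of the vulnerable window can be estimated with simple arithmetic. In the sketch below the 50 MB/s sustained rebuild rate is an assumed, illustrative figure. Real rebuild speeds depend on the controller and on how busy the array is.

```python
def usable_capacity(num_disks, disk_size, parity_disks):
    """Capacity left for data once parity_disks worth of space holds parity."""
    if num_disks <= parity_disks:
        raise ValueError("array needs more disks than parity disks")
    return (num_disks - parity_disks) * disk_size


def rebuild_hours(disk_size_bytes, rebuild_rate_bytes_per_s):
    """Best-case time to rewrite one whole replacement drive."""
    return disk_size_bytes / rebuild_rate_bytes_per_s / 3600


TB = 10**12
ASSUMED_REBUILD_RATE = 50 * 10**6  # 50 MB/s, a hypothetical sustained rate

print(usable_capacity(8, TB, 1) // TB)  # 7 TB usable with single parity
print(usable_capacity(8, TB, 2) // TB)  # 6 TB usable with dual parity
print(round(rebuild_hours(TB, ASSUMED_REBUILD_RATE), 1))  # 5.6 hours
```

Under these assumptions an eight-drive group gives up one more drive of capacity in exchange for surviving a second failure during the hours a rebuild takes.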
Nested RAID Levels:
In nested RAID levels one RAID can use another as its basic element, instead of using physical drives. It is instructive to think of these arrays as layered on top of each other, with physical drives at the bottom.
RAID 1+0:
Mirrored Set + Striped Set. It requires at least four disks and always needs an even number of disks. It provides fault tolerance and improved performance but increases complexity. The array continues to operate with one or more failed drives, as long as no mirrored pair loses both of its drives. The key difference from RAID 0+1 is that RAID 1+0 creates a striped set from a series of mirrored drives.
RAID 0+1:
Striped Set + Mirrored Set. It requires at least four disks and always needs an even number of disks. It provides fault tolerance and improved performance but increases complexity. The array continues to operate with one failed drive. The key difference from RAID 1+0 is that RAID 0+1 creates a second striped set to mirror a primary striped set. As a result it is only guaranteed to survive a single disk loss, whereas RAID 1+0 can sustain multiple drive losses as long as no two lost drives belong to the same mirrored pair.
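The difference in failure tolerance can be checked by enumerating every two-drive failure on a four-disk array. This sketch assumes a simple model in which a stripe set is lost as soon as any one of its members fails. Under that model a second failure inside an already broken stripe set does no further harm, which is why RAID 0+1 is only guaranteed to survive one loss. The function names and disk numbering are illustrative.

```python
from itertools import combinations


def raid10_survives(failed, mirror_pairs):
    """RAID 1+0 survives while every mirrored pair keeps at least one disk."""
    return all(not set(pair) <= failed for pair in mirror_pairs)


def raid01_survives(failed, stripe_sets):
    """RAID 0+1 survives while at least one stripe set is completely intact."""
    return any(not (set(stripe) & failed) for stripe in stripe_sets)


# Disks 0-3, grouped as (0, 1) and (2, 3) in both layouts.
groups = [(0, 1), (2, 3)]
two_disk_failures = [set(c) for c in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(f, groups) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f, groups) for f in two_disk_failures)
print(raid10_ok, raid01_ok, len(two_disk_failures))  # 4 2 6
```

With four disks, RAID 1+0 survives four of the six possible double failures, while RAID 0+1 survives only the two in which both failures fall in the same stripe set.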
RAID 5+0:
A stripe across distributed parity RAID systems.
RAID 5+1:
A mirrored striped set with distributed parity.
Nested RAIDs are usually signified by joining the numbers indicating the RAID levels into a single number, sometimes with a '+' in between. For example, RAID 10 (or RAID 1+0) conceptually consists of multiple level 1 arrays stored on physical drives with a level 0 array on top, striped over the level 1 arrays. RAID 0+1 is most often written with the '+' rather than as RAID 01, to avoid confusion with RAID 1. However, when the top array is a RAID 0 (such as in RAID 10 and RAID 50), most vendors choose to omit the '+', though RAID 5+0 is more informative.
A hardware implementation of RAID requires at a minimum a special-purpose RAID controller. On a desktop system, this may be a PCI expansion card, or it may be a capability built into the motherboard. In industrial applications, the controller and drives are provided as a stand-alone enclosure. The drives may be IDE/ATA, SATA, SCSI, SAS, Fibre Channel, or a combination thereof. The host system can be directly attached to the controller or, more commonly, connected via a SAN. The controller hardware handles the management of the drives and performs any parity calculations required by the chosen RAID level. Most hardware implementations provide a non-volatile read/write cache which, depending on the I/O workload, will improve performance. Cached RAID controllers are most commonly used in industrial applications. Hardware implementations provide guaranteed performance, add no overhead to the local CPU and can support many operating systems, as the controller simply presents a logical disk to the operating system.
Hybrid RAID implementations have become very popular with the introduction of inexpensive RAID controllers. These are built on a standard disk controller, with the RAID logic implemented in the controller's BIOS extension and the operating system driver. These controllers actually do all calculations in software, not hardware. Like hardware RAID, they are typically proprietary to a given RAID controller manufacturer and typically cannot span multiple controllers. Both hardware and hybrid implementations may support the use of hot spare drives: a pre-installed drive which is used to immediately (and almost always automatically) replace a drive that has failed. This shortens the mean time to repair, the period during which a second drive failure in the same RAID redundancy group can result in loss of data. It also helps prevent data loss when multiple drives fail in a short period of time, which can happen when all drives in an array have undergone very similar use patterns and experience wear-out failures.
Zero Channel RAID:
A type of RAID implementation in which a PCI RAID controller is designed to use the on-board SCSI channels of a motherboard to implement a cost-effective hardware RAID solution.
Which is better, software RAID or hardware RAID?
Hardware RAID, by definition, requires a separate controller card, often costing several hundred dollars, and thus is more expensive than a software RAID solution. However, in its favor, hardware RAID is usually operating system independent, which allows for use in dual-booting scenarios. Furthermore, since all the RAID functions (such as calculating parity) are performed on the RAID controller, there is no noticeable drain on system resources such as CPU or memory. Software RAID does not require additional hardware and is sometimes bundled with the operating system, so it is less expensive than many hardware RAID solutions. The one area where software RAID cannot compare to certain hardware RAID solutions is data reliability and integrity across power losses. Some hardware RAID controllers feature a battery backup unit that keeps any write requests sent to the controller but not yet stored to disk in the RAID controller memory in the event of a power outage. Upon reboot, the RAID controller will pick up where it left off and write this data to the array. Software RAID has no such functionality: any cached writes are lost as soon as the system loses power or the operating system crashes.
Copyright © 2011 HPC Systems