RAID Levels
Which RAID Level is Right for You?
Introduction
For any organization, whether it is a small business or a data center, lost data means lost business. There are two common practices for protecting that data: backups
(protecting your data against total system failure, viruses, corruption, etc.), and RAID (protecting your data against drive failure). Both are necessary to ensure your data is
secure.
This white paper discusses the various types of RAID configurations available, their uses, and how they should be implemented into data servers.
NOTE: RAID is not a substitute for regularly-scheduled backups. All organizations and users should always have a solid backup strategy in place.
What is RAID?
RAID (Redundant Array of Inexpensive Disks) is a data storage structure that allows a system administrator/designer/builder/user to combine two or more physical storage
devices (HDDs, SSDs, or both) into a logical unit (an array) that is seen by the attached system as a single drive.
1. Striping (RAID 0) writes some data to one drive and some data to another, minimizing read and write access times and improving I/O performance.
2. Mirroring (RAID 1) replicates data on two drives, preventing loss of data in the event of a drive failure.
3. Parity (RAID 5 & 6) provides fault tolerance by examining the data on two drives and storing the results on a third. When a failed drive is replaced, the lost data is
rebuilt from the remaining drives.
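The parity idea in item 3 can be illustrated with a short sketch. It assumes the parity is a bytewise XOR of the data blocks, which is the usual choice for single-parity RAID but is not spelled out in this paper; the function name and the sample blocks are hypothetical.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, one byte position at a time."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Two data blocks, as they might sit on two drives in one stripe (sample data).
drive_a = b"RAID"
drive_b = b"demo"

# The parity block that would be stored on the third drive.
parity = xor_blocks(drive_a, drive_b)

# If drive_b fails, its block is recomputed from the survivor and the parity.
rebuilt_b = xor_blocks(drive_a, parity)
```

Because XOR is its own inverse, the same function both creates the parity and rebuilds a missing block, whichever drive is lost.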
It is possible to configure these RAID levels into combination levels — called RAID 10, 50 and 60.
The RAID controller handles the combining of drives into these different configurations to balance performance, capacity, redundancy (safety) and cost to suit the user's needs.
Software RAID runs entirely on the CPU of the host computer system.
In hardware RAID, a RAID controller has a processor, memory and multiple drive connectors that allow drives to be attached either directly to the controller, or placed in hot-
swap backplanes.
In both cases, the RAID system combines the individual drives into one logical disk. The OS treats the drive like any other drive in the computer — it does not know the
difference between a single drive connected to a motherboard or a RAID array being presented by the RAID controller.
Given its performance benefits and flexibility, hardware RAID is better suited for the typical modern server system.
From a RAID perspective, HDDs and SSDs only differ in their performance and capacity capabilities. To the RAID controller they are all drives, but it is important to take note of
the performance characteristics of the RAID controller to ensure it is capable of fully accommodating the performance capabilities of the SSD. Most modern RAID controllers
are fast enough to allow SSDs to run at their full potential, but a slow RAID controller could bottleneck data and negatively impact system performance.
Hybrid RAID
Hybrid RAID is a redundant storage solution that combines high-capacity, low-cost SATA or higher-performance SAS HDDs with low-latency, high-IOPS SSDs and an SSD-aware RAID adapter card (Figure 1).
In Hybrid RAID, read operations are done from the faster SSD and write operations happen on both SSD and HDD for redundancy purposes.
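The read and write routing described above can be pictured with a toy model. This is only a sketch of the policy, assuming a single SSD mirrored with a single HDD; the class and attribute names are hypothetical and do not correspond to any controller API.

```python
class HybridMirror:
    """Toy model of a hybrid mirror: one SSD and one HDD holding the same blocks."""

    def __init__(self) -> None:
        self.ssd: dict[int, bytes] = {}
        self.hdd: dict[int, bytes] = {}
        self.ssd_reads = 0

    def write(self, block: int, data: bytes) -> None:
        # Every write lands on both devices, so either one can fail safely.
        self.ssd[block] = data
        self.hdd[block] = data

    def read(self, block: int) -> bytes:
        # Reads are served from the faster SSD only.
        self.ssd_reads += 1
        return self.ssd[block]


array = HybridMirror()
array.write(0, b"payload")
result = array.read(0)
```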
Hybrid RAID arrays offer tremendous performance gains over standard HDD arrays at a much lower cost than SSD-only RAID arrays. Compared to HDD-only RAID arrays,
hybrid arrays increase IOPS and reduce latency, allowing any server system to host more users and perform more transactions per second on each server, which reduces
the number of servers required to support any given workload.
The common use cases for Hybrid RAID are not obvious at first glance. They range from simple mirrors in workstations to high-performance,
read-intensive applications in the small to medium business arena. Hybrid RAID is also used extensively in the data center to provide greater capacity in storage servers while
providing fast boot for those servers.
At some point in the life of a server, at least one drive will fail. Without some form of RAID protection, a failed drive’s data would have to be restored from backups, likely at the
loss of some data and a considerable amount of time. With a RAID controller in the system, a failed drive can simply be replaced and the RAID controller will automatically
rebuild the missing data from the rest of the drives onto the newly inserted drive. This means that your system can survive a drive failure without the complex and long-winded
task of restoring data from backups.
The factors to consider when choosing the right RAID level include:
Capacity
Performance
Redundancy (reliability/safety)
Price
There is no one-size-fits-all approach to RAID because focus on one factor typically comes at the expense of another. Some RAID levels designate drives to be used for
redundancy, which means they can’t be used for capacity. Other RAID levels focus on performance but not on redundancy. A large, fast, highly redundant array will be
expensive. Conversely, a small, average-speed redundant array won’t cost much, but will not be anywhere near as fast as the previous expensive array.
With that in mind, here is a look at the different RAID levels and how they may meet your requirements.
RAID 0 (Striping)
In RAID 0, all drives are combined into one logical disk (Figure 2). This configuration offers low cost and maximum performance, but no data protection — a single drive failure
results in total data loss.
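A minimal sketch of how striping spreads data, assuming the simplest possible layout in which each logical block is one stripe unit and blocks are dealt out round-robin. Real controllers use configurable stripe sizes; the function name here is hypothetical.

```python
def stripe_location(block: int, drives: int) -> tuple[int, int]:
    """Map a logical block number to (drive index, block offset on that drive)."""
    if drives < 1 or block < 0:
        raise ValueError("need at least one drive and a non-negative block number")
    return block % drives, block // drives


# Where the first six logical blocks land on a three-drive RAID 0.
layout = [stripe_location(b, 3) for b in range(6)]
```

Consecutive blocks land on different drives, which is why sequential transfers can use all drives at once, and also why losing any single drive leaves gaps throughout the whole logical disk.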
As such, RAID 0 is not recommended. As SSDs become more affordable and grow in capacity, RAID 0 has declined in popularity. The benefits of fast read/write access are far
outweighed by the threat of losing all data in the event of a drive failure.
Usage: Suited only for situations where data isn’t mission critical, such as video/audio post-production, multimedia imaging, CAD, data logging, etc. where it’s OK to lose a
complete drive because the data can be quickly re-copied from the source. Generally speaking, RAID 0 is not recommended.
RAID 1 (Mirroring)
RAID 1 maintains duplicate sets of all data on two separate drives while showing just one set of data as a logical disk (Figure 3). RAID 1 is about protection, not performance
or capacity.
Since each drive holds copies of the same data, the usable capacity is 50% of the available drives in the RAID set.
Usage: Generally only used in cases where there is not a large capacity requirement, but the user wants to make sure the data is 100% recoverable in the case of a drive
failure, such as accounting systems, video editing, gaming etc.
RAID 1E (Striped Mirroring)
As in RAID 1, usable drive capacity in RAID 1E is 50% of the total available capacity of all drives in the RAID set.
Usage: Small servers, high-end workstations, and other environments with no large capacity requirements, but where the user wants to make sure the data is 100%
recoverable in the case of a drive failure.
Pros: » Redundant with better performance and capacity than RAID 1. In effect, RAID 1E is a mirror of an odd number of drives.
Cons: » Cost is high because only half the capacity of the physical drives is available.
NOTE: RAID 1E is best suited for systems with three drives. For scenarios with four or more drives, RAID 10 is recommended.
RAID 5 (Striping with Parity)
RAID 5 read performance is comparable to that of RAID 0, but there is a penalty for writes since the system must write both the data block and the parity data before the
operation is complete.
RAID 5 parity consumes one drive's worth of capacity per RAID set, so usable capacity will always be one drive less than the total number of drives in the configuration.
Usage: Often used in fileservers, general storage servers, backup servers, streaming data, and other environments that call for good performance but best value for the
money. Not suited to database applications due to poor random write performance.
RAID 6 (Striping with Dual Parity)
RAID 6 requires a minimum of 4 drives and supports a maximum of 32 drives. Usable capacity is always two drives less than the number of available drives in the RAID
set.
Usage: Similar to RAID 5, including fileservers, general storage servers, backup servers, etc. Poor random write performance makes RAID 6 unsuitable for database
applications.
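The capacity rules stated for the single-level arrays above (RAID 0, 1, 1E, 5 and 6) can be collected into one small calculator. This is a sketch that assumes identical drives and ignores formatting overhead; the function name and level labels are hypothetical.

```python
def usable_capacity_tb(level: str, drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for a single array built from identical drives."""
    data_drive_equivalents = {
        "0": drives,       # striping only, nothing reserved
        "1": drives / 2,   # mirror: 50% of the drives
        "1E": drives / 2,  # striped mirror: 50% of the drives
        "5": drives - 1,   # one drive's worth of parity
        "6": drives - 2,   # two drives' worth of parity
    }
    if level not in data_drive_equivalents:
        raise ValueError(f"unsupported RAID level: {level}")
    return data_drive_equivalents[level] * drive_tb


raid5_small = usable_capacity_tb("5", drives=3, drive_tb=6)  # three 6TB drives
raid5_wide = usable_capacity_tb("5", drives=5, drive_tb=3)   # five 3TB drives
raid6_min = usable_capacity_tb("6", drives=4, drive_tb=2)    # minimum RAID 6
```

Both RAID 5 examples give 12TB, which shows that the same usable capacity can be reached with different drive counts.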
RAID 10 (Striping and Mirroring)
RAID 10 (sometimes referred to as RAID 1+0) combines RAID 1 and RAID 0 to offer multiple sets of mirrors striped together (Figures 7 and 8). RAID 10 offers very good
performance with good data protection and no parity calculations.
RAID 10 requires a minimum of four drives, and usable capacity is 50% of available drives. It should be noted, however, that RAID 10 can use more than four drives in
multiples of two. Each mirror in RAID 10 is called a “leg” of the array. A RAID 10 array using, say, eight drives (four “legs,” with four drives as capacity) will offer extreme
performance in both spinning media and SSD environments as there are many more drives splitting the reads and writes into smaller chunks across each drive.
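The leg structure can be sketched as a mapping from logical blocks to drive pairs. The sketch assumes drives are paired in order (0 and 1, 2 and 3, and so on) and that blocks are dealt round-robin across legs; actual controllers may order drives differently, and the function name is hypothetical.

```python
def raid10_targets(block: int, drives: int) -> tuple[int, int]:
    """Return the two drive indexes (one leg) that hold a logical block."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs four or more drives, in multiples of two")
    legs = drives // 2
    leg = block % legs             # stripe across the legs (RAID 0 part)
    return 2 * leg, 2 * leg + 1    # both drives of the leg get a copy (RAID 1 part)


# First five logical blocks on an eight-drive, four-leg array.
placement = [raid10_targets(b, 8) for b in range(5)]
```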
Usage: Ideal for database servers and any environment with many small random data writes.
RAID 50 (Striping with Parity)
RAID 50 (sometimes referred to as RAID 5+0) combines multiple RAID 5 sets (striping with parity) with RAID 0 (striping) (Figures 9 and 10). The benefits of RAID 5 are gained
while the spanned RAID 0 allows the incorporation of many more drives into a single logical disk. Up to one drive in each sub-array may fail without loss of data. Also, rebuild
times are substantially shorter than those of a single large RAID 5 array.
A RAID 50 configuration can accommodate 6 or more drives, but should only be used with configurations of more than 16 drives. The usable capacity of RAID 50 is 67%-94%,
depending on the number of data drives in the RAID set.
It should be noted that you can have more than two legs in a RAID 50. For example, with 24 drives you could have a RAID 50 of two legs of 12 drives each, or a RAID 50 of
three legs of eight drives each. The first of these two arrays would offer greater capacity as only two drives are lost to parity, but the second array would have greater
performance and much quicker rebuild times as only the drives in the leg with the failed drive are involved in the rebuild function of the entire array.
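The trade-off between leg count, capacity and rebuild scope can be checked with a small model. This is a sketch assuming equal-sized legs with one drive's worth of parity each; the class name is hypothetical, and the 32-drive case used to reach the 94% figure is an assumed configuration rather than one stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raid50Layout:
    total_drives: int
    legs: int

    def __post_init__(self) -> None:
        if self.legs < 2 or self.total_drives % self.legs:
            raise ValueError("drives must divide evenly into two or more legs")
        if self.drives_per_leg < 3:
            raise ValueError("each RAID 5 leg needs at least three drives")

    @property
    def drives_per_leg(self) -> int:
        # Also the number of drives involved when one leg rebuilds.
        return self.total_drives // self.legs

    @property
    def data_drives(self) -> int:
        # One drive's worth of capacity per leg goes to parity.
        return self.total_drives - self.legs

    @property
    def usable_fraction(self) -> float:
        return self.data_drives / self.total_drives


two_legs = Raid50Layout(24, 2)    # 22 data drives, 12 drives per rebuild
three_legs = Raid50Layout(24, 3)  # 21 data drives, 8 drives per rebuild
```

With 24 drives, moving from two legs to three costs one more drive of capacity but shrinks the set of drives touched by a rebuild from 12 to 8.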
Usage: Good configuration for cases where many drives need to be in a single array but capacity is too large for RAID 10, such as in very large capacity servers.
RAID 60 (Striping with Dual Parity)
RAID 60 (sometimes referred to as RAID 6+0) combines multiple RAID 6 sets (striping with dual parity) with RAID 0 (striping) (Figures 11 and 12). Dual parity allows the failure
of two drives in each RAID 6 array while striping increases capacity and performance without adding drives to each RAID 6 array.
Like RAID 50, a RAID 60 configuration can accommodate 8 or more drives, but should only be used with configurations of more than 16 drives. The usable capacity of RAID
60 is between 50%-88%, depending on the number of data drives in the RAID set.
Note that all of the above multiple-leg configurations that are possible with RAID 10 and RAID 50 are also possible with RAID 60. With 36 drives, for example, you can have a
RAID 60 comprising two legs of 18 drives each, or a RAID 60 of three legs with 12 drives in each.
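Whether a RAID 60 survives a given set of failures depends on how the failed drives are distributed across the legs, not only on how many have failed. The sketch below assumes drives are numbered consecutively leg by leg; the function name and numbering scheme are hypothetical.

```python
from collections import Counter


def raid60_survives(failed: set[int], legs: int, drives_per_leg: int) -> bool:
    """True if no RAID 6 leg has lost more than two drives."""
    total = legs * drives_per_leg
    if any(d < 0 or d >= total for d in failed):
        raise ValueError("failed drive index out of range")
    failures_per_leg = Counter(d // drives_per_leg for d in failed)
    return all(count <= 2 for count in failures_per_leg.values())


# 36 drives arranged as three legs of 12 drives each.
spread_out = raid60_survives({0, 1, 12, 13}, legs=3, drives_per_leg=12)
same_leg = raid60_survives({0, 1, 2}, legs=3, drives_per_leg=12)
```

Four failures spread two per leg are survivable, while three failures inside one leg are not.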
Usage: RAID 60 is similar to RAID 50 but offers more redundancy, making it good for very large capacity servers, especially those that will not be backed up (e.g. video
surveillance servers handling large numbers of cameras).
Pros: » Can sustain two drive failures per RAID 6 array within the set, so it is very safe.
» Supports very large arrays at reasonable value for money, considering this RAID level won’t be used unless there are a large number of drives.
Cons: » Requires a lot of drives.
» Slightly more expensive than RAID 50 due to losing more drives to parity calculations.
When to use which RAID level
We can classify data into two basic types: random and streaming. As indicated previously, there are two general types of RAID arrays: non-parity (RAID 1, 10) and parity
(RAID 5, 6, 50, 60).
Random data is generally small in nature (i.e., small blocks), with a large number of small reads and writes making up the data pattern. This is typified by database-type data.
Streaming data is large in nature, and is typified by video, images, and large files in general.
While it is not possible to accurately determine all of a server’s data usage, and servers often change their usage patterns over time, the general rule of thumb is that random
data is best suited to non-parity RAID, while streaming data works best and is most cost-effective on parity RAID.
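The rule of thumb can be written down as a simple lookup. This is only a restatement of the guidance above in code form, with hypothetical names; it is a starting point, not a sizing tool.

```python
RAID_FAMILIES = {
    "random": ("non-parity", ("RAID 1", "RAID 10")),
    "streaming": ("parity", ("RAID 5", "RAID 6", "RAID 50", "RAID 60")),
}


def suggest_raid_family(workload: str) -> tuple[str, tuple[str, ...]]:
    """Return (family, candidate levels) for a 'random' or 'streaming' workload."""
    try:
        return RAID_FAMILIES[workload]
    except KeyError:
        raise ValueError("workload must be 'random' or 'streaming'") from None


family, levels = suggest_raid_family("random")
```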
Note that it is possible to set up both RAID types on the same controller, and even possible to set up both RAID types on the same set of drives. So if, for example, you
have eight 2TB drives, you can make a RAID 10 of 1TB for your database-type data, and a RAID 5 of the capacity that is left on the drives for your general and/or streaming
type data (approximately 12TB). Having these two different arrays spanning the same drives will not impact performance, but your data will benefit in performance from being
situated on the right RAID level.
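The arithmetic behind the eight-drive example can be reproduced as follows. The sketch assumes both arrays span all drives evenly and uses decimal terabytes; the function name is hypothetical.

```python
def split_arrays(drives: int, drive_tb: float, raid10_usable_tb: float) -> tuple[float, float]:
    """Return (TB taken from each drive by the RAID 10, usable TB of a RAID 5 on the rest)."""
    raid10_raw_tb = raid10_usable_tb * 2        # mirroring stores every block twice
    per_drive_raid10 = raid10_raw_tb / drives   # spread evenly over all drives
    if per_drive_raid10 > drive_tb:
        raise ValueError("RAID 10 does not fit on these drives")
    per_drive_left = drive_tb - per_drive_raid10
    raid5_usable_tb = per_drive_left * (drives - 1)  # one drive's worth goes to parity
    return per_drive_raid10, raid5_usable_tb


per_drive_tb, raid5_tb = split_arrays(drives=8, drive_tb=2.0, raid10_usable_tb=1.0)
```

The 1TB RAID 10 takes 0.25TB from each drive, and the remaining 1.75TB per drive yields a 12.25TB RAID 5, in line with the "approximately 12TB" quoted above.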
Conversely, SSDs are often faster in larger capacities, so an 80GB SSD and an 800GB SSD from the same product family will have quite different performance characteristics.
This should be checked carefully with the product specifications from the drive vendor to make sure you are getting the performance you think you are getting from your drives.
With HDDs it is generally better to create an array with more, rather than fewer, drives. A RAID 5 of three 6TB HDDs (12TB capacity) will not have the same performance as a
RAID 5 array made from five 3TB HDDs (12TB capacity).
With SSDs, however, it is advisable to achieve the capacity required from as few drives as possible by using larger capacity SSDs. These will have higher throughput than their
smaller counterparts and will yield better system performance.
During the creation process, you can reduce the size of the array. The unused space on the drives will be available for creating additional RAID arrays.
A good example of this would be when creating a large server and keeping the operating system and data on separate RAID arrays. Typically you would make a RAID 10 of,
say, 200GB for your OS installation spread across all drives in the server. This would use a minimal amount of capacity from each drive. You can then create a RAID 5 for your
general data across the unused space on the drives.
This has an added benefit of getting around drive size limitations for boot arrays on non-UEFI servers as the OS will believe it is only dealing with a 200GB drive when
installing the operating system.
For example, a RAID 5 made of 32 6TB drives (186TB) will have very poor build and rebuild times due to the size, speed and number of drives. In this scenario, it would be
advisable to build a RAID 50 with two legs from those drives (180TB capacity). When a drive fails and is replaced, only 16 of the drives (15 existing plus the new drive) will be
involved in the rebuild. This will improve rebuild performance and reduce system performance impact during the rebuild process.
Note, however, that no matter what you do, when it comes to rebuilding arrays with 6TB+ SATA drives, rebuild times will exceed 24 hours even in an absolutely perfect
environment (no load on server). In a real-world environment with a heavily loaded system, the rebuild times will be even longer.
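To put the 24-hour figure in perspective, the relationship between drive size, average rebuild rate and rebuild time can be sketched with simple arithmetic. The model assumes the replacement drive is written once from start to finish at a constant average rate, in decimal units; real rebuild rates vary by controller and load, and the function names are hypothetical.

```python
def rebuild_hours(drive_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to write a full replacement drive at a sustained average rate."""
    seconds = drive_tb * 1_000_000 / rate_mb_per_s  # 1 TB = 1,000,000 MB
    return seconds / 3600


def implied_rate_mb_per_s(drive_tb: float, hours: float) -> float:
    """Average rate that a given rebuild duration corresponds to."""
    return drive_tb * 1_000_000 / (hours * 3600)


# A 24-hour rebuild of a 6TB drive corresponds to roughly 69 MB/s on average.
rate_for_24h = implied_rate_mb_per_s(6, 24)
```

Any extra load on the server lowers the average rate available to the rebuild and stretches the time proportionally.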
Of course, rebuild times on SSD arrays are dramatically quicker because the drives are smaller and the write speed of SSDs is much faster than that of their spinning
media counterparts.
Types of RAID
Software-Based
Description: Included in the OS, such as Windows® and Linux. All RAID functions are handled by the host CPU, which can severely tax its ability to perform other computations.
Typical Usage: Best used for large block applications such as data warehousing or video streaming. Also where servers have the available CPU cycles to manage the I/O intensive operations certain RAID levels require.
Pros: » Lower cost due to lack of RAID-dedicated hardware.
Cons: » Lower RAID performance as CPU also powers the OS and applications.
Motherboard-Based
Description: Processor-intensive RAID operations are off-loaded from the host CPU to a RAID processor integrated into the motherboard.
Typical Usage: Inexpensive.
Pros: » Lower cost than adapter-based RAID.
Cons: » No ability to upgrade or replace the RAID processor in the event of hardware failure.
» May only support a few RAID levels.
Adapter-Based
Description: Processor-intensive RAID operations are off-loaded from the host CPU to an external PCIe adapter. Battery-backed write-back cache can dramatically increase performance without adding risk of data loss.
Typical Usage: Best used for small block applications such as transaction oriented databases and web servers.
Pros: » Offloads RAID tasks from the host system, yielding better performance than software RAID.
» Controller cards can be easily swapped out for replacement and upgrades.
» Data can be backed up to prevent loss in a power failure.
Cons: » More expensive than software and integrated RAID.