1. RAID Overview
In 1988, researchers at the University of California, Berkeley, proposed the concept of RAID (Redundant Array of Inexpensive Disks). As disk prices continued to fall, the name became Redundant Array of Independent Disks, but the substance did not change. SNIA, Berkeley, and other organizations defined seven standard levels, RAID0 through RAID6. Standard levels can be combined into further levels, and the most commonly used are RAID0, RAID1, RAID3, RAID5, RAID6, and RAID10. Each RAID level represents a particular implementation method and technique, and the level numbers do not imply a ranking. In practice, the RAID level and its implementation should be chosen according to the characteristics of the user's data and workload, weighing availability, performance, and cost.
In terms of implementation, RAID falls into three types: software (soft) RAID, hardware (hard) RAID, and hybrid RAID. In soft RAID, all functions are performed by the operating system and the CPU, so it is the least efficient of the three. Hard RAID has dedicated RAID control/processing chips, I/O processing chips, and an array buffer; it does not consume CPU resources, but it is costly. Hybrid RAID has a RAID control/processing chip but no I/O processing chip, so the CPU and drivers must handle I/O; its performance and cost fall between those of soft and hard RAID.
2. Basic principles
RAID is a disk subsystem built from multiple independent, high-performance disk drives that provides higher storage performance than a single disk together with data redundancy. It is a class of multi-disk management technology that offers the host environment high-performance, highly reliable storage at an affordable cost. The two key goals of RAID are to improve data reliability and I/O performance. In a disk array, data is spread across multiple disks, but to the computer system the array looks like a single disk. Redundancy is achieved either by writing the same data to multiple disks at the same time (typically mirroring) or by writing computed parity data to the array, so that no data is lost when a single disk fails.
RAID rests on three key concepts and techniques: mirroring, data striping, and data parity.
Mirroring replicates data to multiple disks. This improves reliability and can also improve read performance, because data can be read from two or more replicas concurrently. Write performance is somewhat lower, since it takes more time to ensure that the data is written correctly to every disk.
Data striping stores shards of the data on several different disks; the shards together form one complete copy of the data, unlike the multiple copies kept by mirroring. Striping is usually chosen for performance. Because it allows a higher degree of concurrency, reads and writes can proceed on different disks at the same time, which yields a very significant I/O performance improvement.
Data parity uses redundant data for error detection and repair. The redundant data is usually computed with Hamming codes, XOR operations, or similar algorithms. Parity can greatly improve the reliability, robustness, and fault tolerance of a disk array. However, it requires data to be read from multiple places, computed, and compared, which can affect system performance.
Different RAID levels employ one or more of these three techniques to achieve different degrees of data reliability, availability, and I/O performance. Deciding which RAID level to design (possibly even a new level or type) or which RAID mode to use requires a thorough understanding of the system's needs and a balanced evaluation of reliability, performance, and cost.
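The striping and parity techniques described above can be illustrated with a minimal sketch. It assumes XOR parity over three equally sized shards; the helper name `xor_blocks` and the sample payload are hypothetical and chosen only for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Stripe a payload into three equally sized shards (one per data disk).
data = b"RAIDDEMO1234"
shard_size = 4
shards = [data[i:i + shard_size] for i in range(0, len(data), shard_size)]

# The parity block is the XOR of all data shards.
parity = xor_blocks(shards)

# Simulate losing disk 1 and rebuild its shard from the survivors plus parity.
survivors = [s for i, s in enumerate(shards) if i != 1]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == shards[1]
```

Because XOR is its own inverse, any single missing shard equals the XOR of everything that remains, which is why one parity block is enough to survive one disk failure.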
Generally speaking, the main advantages of RAID are: large capacity, high performance, reliability, and manageability.
3. RAID levels
JBOD (Just a Bunch Of Disks) is not a standard RAID level. The term usually refers to a collection of disks with no control software to coordinate them. JBOD concatenates multiple physical disks to present one large logical disk. Its storage performance is the same as that of a single disk, and it provides no data protection. The usable capacity equals the sum of the capacities of all member disks.
RAID0, called striping, is simple data striping without parity. It offers the highest performance of all RAID levels but provides no redundancy of any kind. Storage space utilization is 100%.
RAID1, called mirroring, writes identical data to a working disk and a mirror disk, so disk space utilization is 50%. Write performance is affected, but read performance is not. It provides the best data protection: once the working disk fails, the system automatically reads from the mirror disk, and the user's work is not affected.
RAID2 is called the Hamming code disk array; its design idea is to use Hamming codes for parity redundancy. The wider the data, the higher the storage space utilization, but the more disks are needed. It can correct errors, but the redundancy overhead of Hamming codes is large and data reconstruction is very time-consuming, so RAID2 is rarely used in practice.
RAID3, called striping with dedicated parity, uses one dedicated disk as the parity disk and the remaining disks as data disks; data is interleaved across the data disks at the bit or byte level. RAID3 requires at least three disks.
RAID4 works on much the same principle as RAID3, but stripes data in blocks rather than bits or bytes. It provides very good read performance but poor write performance, and as the number of member disks grows, the parity disk becomes an increasingly prominent bottleneck. It is rare in real-world applications, and mainstream storage products seldom use RAID4 protection.
RAID5, called striping with distributed parity, is probably the most common RAID level today. Its principle is similar to that of RAID4, but the parity is distributed across all member disks, which removes the parity-disk bottleneck that RAID4 suffers during concurrent writes.
RAID6, called striping with double parity, introduces a second, independent parity to preserve data integrity when two disks fail at the same time, a case the other standard levels cannot handle. However, it costs considerably more than RAID5, has poor write performance, and is complex to design and implement. RAID6 is therefore used less often in practice and generally serves as an economical alternative to RAID10.
Each standard RAID level has its strengths and weaknesses. Combining several levels lets them complement one another and compensate for each other's shortcomings, producing a RAID system with better performance, data security, and other characteristics. Combined levels are generally expensive to implement and are used only in specific cases; in practice, only RAID01 and RAID10 are widely used.
RAID01 stripes first and then mirrors, which in essence mirrors the virtual (striped) disk; RAID10 mirrors first and then stripes, which mirrors the physical disks. With the same configuration, RAID10 usually has better fault tolerance than RAID01, because a single disk failure in RAID01 takes an entire stripe set out of service. Both combine the advantages of RAID0 and RAID1, with an overall disk utilization of only 50%.
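How striping spreads consecutive blocks across disks can be sketched as an address mapping. This is only an illustration assuming a simple round-robin layout; the function `raid0_locate` and its parameter `blocks_per_stripe_unit` are hypothetical names, not part of any particular product.

```python
def raid0_locate(logical_block, num_disks, blocks_per_stripe_unit=1):
    """Map a logical block number to (disk index, block offset on that disk)."""
    unit = logical_block // blocks_per_stripe_unit    # which stripe unit
    within = logical_block % blocks_per_stripe_unit   # offset inside the unit
    disk = unit % num_disks                           # round-robin across disks
    row = unit // num_disks                           # stripe row on that disk
    return disk, row * blocks_per_stripe_unit + within

layout = [raid0_locate(b, num_disks=3) for b in range(6)]
print(layout)  # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

Consecutive logical blocks land on different disks, so a large sequential request can be served by all member disks in parallel.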
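The mirroring behaviour can be sketched with a toy in-memory model. The class `Raid1` and its dictionary-based "disks" are hypothetical and serve only to show that every write goes to all replicas while a read can be served by any healthy one.

```python
class Raid1:
    """Toy mirror over in-memory 'disks' (dicts mapping block number -> bytes)."""

    def __init__(self, copies=2):
        self.disks = [dict() for _ in range(copies)]
        self.failed = set()

    def write(self, block, data):
        # Every healthy replica is updated, so a write costs one I/O per copy.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block] = data

    def read(self, block):
        # Any healthy replica can serve the read.
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                return disk[block]
        raise IOError("all mirrors have failed")

array = Raid1()
array.write(0, b"payload")
array.failed.add(0)                  # the working disk fails
assert array.read(0) == b"payload"   # the mirror disk still serves the data
```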
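As an illustration of the kind of code RAID2 relies on, the following sketch implements the classic Hamming(7,4) code, which protects 4 data bits with 3 check bits and can correct any single-bit error. Treating each bit position as one disk is an assumption made here for illustration; the function names are hypothetical.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] as 7 bits in positions 1..7:
    p1 p2 d1 p3 d2 d3 d4 (check bits sit at positions 1, 2 and 4)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(code):
    """Return (corrected codeword, 1-based error position, or 0 if none)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # parity over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # parity over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # parity over positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3       # the syndrome spells out the bad position
    if pos:
        c[pos - 1] ^= 1
    return c, pos

codeword = hamming74_encode([1, 0, 1, 1])
damaged = list(codeword)
damaged[4] ^= 1                      # flip the bit at position 5
repaired, position = hamming74_correct(damaged)
assert repaired == codeword and position == 5
```

Three check bits for four data bits is a 75% overhead, which gives a sense of why the text calls the redundancy cost of this approach too large.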
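The idea of spreading parity over all member disks can be sketched by printing a stripe layout. The rotation rule used here (parity moves one disk to the left on each stripe) is an assumption for illustration; real implementations use several different rotation schemes, and `raid5_layout` is a hypothetical helper.

```python
def raid5_layout(num_disks, num_stripes):
    """Return one row of labels per stripe, with the parity block rotated."""
    rows = []
    block = 0
    for stripe in range(num_stripes):
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(f"P{stripe}")   # parity for this stripe
            else:
                row.append(f"D{block}")    # next data block
                block += 1
        rows.append(row)
    return rows

for row in raid5_layout(4, 4):
    print(" ".join(f"{cell:>4}" for cell in row))
```

Over four stripes each of the four disks holds exactly one parity block, so parity updates from concurrent writes are spread across the array instead of queuing on one disk.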
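A double-parity scheme can be sketched with two checks, commonly called P and Q: P is the plain XOR of the data blocks, and Q is a weighted sum in the finite field GF(2^8). The field polynomial 0x11D and the generator 2 are assumptions chosen for illustration, and all function names are hypothetical; actual products may use different codes.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with the reducing polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    return gf_pow(a, 254)   # a^255 == 1 for nonzero a, so a^254 is the inverse

def compute_pq(blocks):
    """P = XOR of all blocks; Q = sum of 2^i * block_i in GF(2^8)."""
    size = len(blocks[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(blocks):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)

def recover_two(blocks, x, y, p, q):
    """Rebuild data blocks x and y (passed as None in `blocks`) from P and Q."""
    size = len(p)
    known = [b if b is not None else bytes(size) for b in blocks]
    p_part, q_part = compute_pq(known)   # contribution of the surviving blocks
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(size), bytearray(size)
    for j in range(size):
        pxy = p[j] ^ p_part[j]           # equals Dx ^ Dy
        qxy = q[j] ^ q_part[j]           # equals 2^x * Dx ^ 2^y * Dy
        dx[j] = gf_mul(qxy ^ gf_mul(gy, pxy), denom_inv)
        dy[j] = pxy ^ dx[j]
    return bytes(dx), bytes(dy)

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
p, q = compute_pq(data)
damaged = [data[0], None, data[2], None]          # disks 1 and 3 are lost
assert recover_two(damaged, 1, 3, p, q) == (data[1], data[3])
```

The per-byte field arithmetic needed for Q on every write is one reason write performance is lower than with single XOR parity.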
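To compare the fault tolerance of the two layouts, one can enumerate every possible two-disk failure in a four-disk array. The sketch assumes disks 0 and 1 form one group and disks 2 and 3 the other, and that a RAID01 stripe set goes offline as soon as any of its disks fails; the function names are hypothetical.

```python
from itertools import combinations

DISKS = range(4)
GROUPS = [(0, 1), (2, 3)]

def raid10_survives(failed):
    # Groups are mirror pairs: every pair needs at least one working disk.
    return all(any(d not in failed for d in pair) for pair in GROUPS)

def raid01_survives(failed):
    # Groups are stripe sets: at least one set must be entirely intact.
    return any(all(d not in failed for d in stripe_set) for stripe_set in GROUPS)

two_disk_failures = [set(c) for c in combinations(DISKS, 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)
print(f"RAID10 survives {raid10_ok} of {len(two_disk_failures)} double failures")
print(f"RAID01 survives {raid01_ok} of {len(two_disk_failures)} double failures")
```

Under these assumptions RAID10 survives 4 of the 6 possible double failures, while RAID01 survives only 2.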
4. Comparison of mainstream RAID levels
RAID configuration
| Level | Description | Fault tolerance | Advantages | Disadvantages |
| --- | --- | --- | --- | --- |
| RAID 0 | Stripes data across drives to create a large virtual disk. Because each physical disk handles only part of each request, performance is higher. However, if one drive fails, the virtual disk (VD) becomes inaccessible and the data is permanently lost. | None | Better performance; additional storage | Must not be used for critical data |
| RAID 1 | Mirrors data, storing it redundantly on two drives. If one disk fails, the other takes over as the primary drive. | Disk errors; single disk failure | High read performance; fast recovery after a drive failure; data redundancy | High disk overhead; limited capacity |
| RAID 5 | Stripes data across drives and stores the parity for each data stripe on a different drive in the VD. The parity contains the information needed to reconstruct the data of a failed disk from the remaining disks after a single disk failure. | Disk errors; single disk failure | Efficient use of drive capacity; high read performance; medium to high write performance | Moderate impact from a disk failure; longer rebuild time because parity must be recalculated |
| RAID 6 | Stripes data across drives and stores the parity for each data stripe on different drives in the VD. Unlike RAID 5, RAID 6 performs two parity calculations (P and Q), allowing it to withstand dual-disk failures. | Disk errors; dual disk failure | Data redundancy; high read performance | Write performance is reduced by the two parity calculations; additional cost because the equivalent of 2 disks is used for parity |
| RAID 10 | Stripes across mirror sets. Disk overhead is high, but it is a good solution for high performance, redundancy, and fast recovery after a drive failure. | Disk errors; one disk failure per mirror set | High read performance; RAID groups with up to 192 drives can be supported | Highest cost |
| RAID 50 | Stripes across RAID 5 sets. Because fewer disks are read per parity calculation, performance can be better than with RAID 5, depending on the configuration. | Disk errors; one disk failure per span | High read performance; medium to high write performance; RAID groups with up to 192 drives can be supported | Moderate impact from a disk failure; longer rebuild time because parity must be recalculated |
| RAID 60 | Stripes across RAID 6 sets. Because fewer disks are read per parity calculation, performance can be better than with RAID 6, depending on the configuration. | Disk errors; two disk failures per span | High read performance; RAID groups with up to 192 drives can be supported | Write performance is reduced by the two parity calculations; additional cost because the equivalent of 2 disks is used for parity |
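The capacity cost of each level in the comparison above can be estimated with a small calculation. This sketch assumes equally sized disks, two-way mirrors for RAID 1 and RAID 10, and that `spans` counts the RAID 5 or RAID 6 sets inside RAID 50 or RAID 60; the function name and parameters are hypothetical.

```python
def usable_capacity(level, num_disks, disk_size_tb, spans=1):
    """Usable capacity in TB; `spans` only matters for RAID50 and RAID60."""
    redundancy_disks = {
        "RAID0": 0,
        "RAID1": num_disks - 1,     # every disk holds the same data
        "RAID5": 1,                 # one disk's worth of parity
        "RAID6": 2,                 # two disks' worth of parity (P and Q)
        "RAID10": num_disks // 2,   # half the disks are mirror copies
        "RAID50": spans,            # one parity disk per RAID 5 span
        "RAID60": 2 * spans,        # two parity disks per RAID 6 span
    }
    return (num_disks - redundancy_disks[level]) * disk_size_tb

for level in ("RAID0", "RAID5", "RAID6", "RAID10"):
    print(level, usable_capacity(level, num_disks=8, disk_size_tb=2), "TB")
```

With eight 2 TB disks this prints 16, 14, 12, and 8 TB respectively, which makes the trade-off between capacity and protection explicit.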
5. Differences between software and hardware RAID
Soft RAID
Soft RAID has no dedicated control chip or I/O chip; the operating system and the CPU implement all RAID functions. Most modern operating systems support soft RAID, providing an abstraction between physical and logical drives by adding a software layer on top of the disk device drivers. The RAID levels most commonly supported by operating systems are RAID0, RAID1, RAID10, RAID01, and RAID5. For example, Windows Server supports RAID0, RAID1, and RAID5; Linux supports RAID0, RAID1, RAID4, RAID5, RAID6, and others; and Mac OS X Server, FreeBSD, NetBSD, OpenBSD, Solaris, and other operating systems also support corresponding RAID levels.
Configuration management and data recovery are relatively simple with soft RAID, but all RAID tasks, such as calculating parity values, are performed entirely by the CPU, so execution efficiency is relatively low.
Because soft RAID is implemented by the operating system, the partition that holds the system cannot be used as a logical member disk of the array, and soft RAID therefore cannot protect the system disk. On some operating systems, the RAID configuration is stored in system information rather than as a separate file on disk, so the RAID information is lost when the system crashes unexpectedly and has to be reinstalled. In addition, the disk fault tolerance of soft RAID does not fully support online replacement, hot swapping, or hot plugging; whether a failed disk can be hot-swapped depends on the operating system's implementation.
Hard RAID
Hard RAID has its own RAID control/processing and I/O processing chips, and even an array buffer. Of the three implementation types, it is the best in terms of CPU usage and overall performance, but also the most expensive. Hard RAID typically supports hot swapping, so failed disks can be replaced while the system is running.
Hard RAID comes in two forms: RAID cards and RAID chips integrated on the motherboard. Server platforms usually use RAID cards. A RAID card consists of four parts: the RAID core processing chip (the CPU of the RAID card), the ports, the cache, and the battery. The ports determine the disk interface types the RAID card supports, such as IDE/ATA, SCSI, SATA, SAS, and FC.
Hybrid (software and hardware) RAID
Soft RAID performs poorly and cannot protect system partitions, which makes it difficult to apply to desktop systems. Hard RAID is very expensive, and different hard RAID implementations are independent of each other and not interoperable. A combination of software and hardware is therefore used to implement RAID as a compromise between performance and cost, that is, to achieve a good price-performance ratio.
Although hybrid RAID uses a processing/control chip, the chip is usually a cheaper one with weaker processing power in order to save costs, and most RAID processing is still done by the CPU through firmware and drivers.
6. RAID application selection
Three main factors determine the choice of a RAID level: data availability, I/O performance, and cost. If availability is not required, choose RAID0 for high performance. If availability and performance are important and cost is not a major factor, choose RAID1, depending on the number of disks. If availability, cost, and performance are all equally important, choose RAID3 or RAID5, depending on the typical data transfer pattern and the number of disks. In practice, the RAID level should be selected according to the characteristics and specific conditions of the user's data and workload, weighing availability, performance, and cost.
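The rules of thumb above can be written down as a small decision helper. This is only a sketch of the selection logic stated in the text, not a sizing tool; the function name and its two boolean parameters are hypothetical simplifications.

```python
def suggest_raid_level(needs_availability, cost_sensitive):
    """Return the level suggested by the rules of thumb in the text."""
    if not needs_availability:
        return "RAID0"             # performance only, no redundancy
    if not cost_sensitive:
        return "RAID1"             # availability and performance, cost secondary
    return "RAID3 or RAID5"        # balance availability, cost, and performance

print(suggest_raid_level(needs_availability=True, cost_sensitive=True))
```

A real decision would also weigh the number of disks and the data transfer pattern, which this helper deliberately leaves out.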