What is RAID technology?

The concept of RAID was developed by a group of scientists at the University of California at Berkeley in 1987. The scientists investigated using small disk drives clustered into an array (defined as two or more disks grouped together to appear as a single device to the host system) and compared the performance and cost of this type of storage configuration to the use of a Single Large Expensive Disk (SLED), common in mainframe applications.
Their conclusion was that arrays of smaller, less expensive disks offered performance equal or superior to that of the SLED. However, because an array uses more disks, the Mean Time Before Data Loss (MTBDL) - calculated by dividing the single-drive Mean Time Between Failures (MTBF) by the number of disks in the array - would be unacceptably low.
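The MTBDL calculation described above can be sketched in a few lines of Python. This is only an illustration of the division the text describes; the function name and the 500,000-hour drive MTBF are assumptions chosen for the example, not figures from the Berkeley study.

```python
def array_mtbdl_hours(drive_mtbf_hours: float, drive_count: int) -> float:
    """MTBDL of a non-redundant array: single-drive MTBF divided by the drive count."""
    if drive_count < 1:
        raise ValueError("drive_count must be at least 1")
    return drive_mtbf_hours / drive_count


if __name__ == "__main__":
    # Hypothetical single-drive MTBF, used only to show the trend.
    drive_mtbf = 500_000
    for drives in (1, 2, 5, 10):
        hours = array_mtbdl_hours(drive_mtbf, drives)
        print(f"{drives:2d} drive(s): MTBDL = {hours:,.0f} hours")
```

Each drive added to a non-redundant array lowers the MTBDL in proportion, which is the problem the redundant RAID levels were proposed to solve.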
The problem, then, became how to manage MTBF and prevent any single drive failure from causing data loss within an array. To address this, the UC Berkeley scientists proposed five types of redundant array architectures, defining them as RAID levels 1 through 5. Simply put, the RAID level is the architecture that determines how redundancy is achieved and how data is distributed across the drives in the array. In addition to RAID 1 through 5, a non-redundant array configuration that employs data striping (that is, breaking files into smaller blocks and distributing these blocks evenly across multiple disks in the array) has become known as RAID 0. The name RAID 0 is something of a misnomer because this configuration does not provide data protection. However, RAID 0 does offer maximum throughput for some data-intensive applications, such as desktop digital video production.
Table 1 illustrates the key characteristics of each RAID level, including the minimum number of drives required, a basic description of how fault tolerance is achieved, and the relative strengths and weaknesses of each architecture.
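As a rough illustration of data striping, the following Python sketch breaks a byte string into fixed-size blocks and deals them round-robin across a number of disks, then reassembles them. The function names, the in-memory lists standing in for disks, and the block size are all assumptions made for the example; a real controller works on disk sectors, not Python lists.

```python
def stripe(data: bytes, disk_count: int, block_size: int) -> list[list[bytes]]:
    """Split data into blocks and distribute them round-robin across the disks."""
    if disk_count < 1 or block_size < 1:
        raise ValueError("disk_count and block_size must be positive")
    disks: list[list[bytes]] = [[] for _ in range(disk_count)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        disks[block_index % disk_count].append(data[offset:offset + block_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading one block from each disk in turn."""
    blocks = []
    depth = max(len(disk) for disk in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return b"".join(blocks)


if __name__ == "__main__":
    layout = stripe(b"ABCDEFGHIJ", disk_count=3, block_size=2)
    for number, blocks in enumerate(layout):
        print(f"disk {number}: {blocks}")
    print(unstripe(layout))
```

Because consecutive blocks sit on different disks, a large sequential transfer can keep every disk busy at once, which is where the throughput gain comes from. Nothing in this layout is redundant, so losing any one list loses part of every large file.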

Table 1 Comparing RAID-level architectures

RAID 1 (two drives minimum)

RAID 1, also known as disk mirroring or duplexing (when using two host bus adapters), provides high reliability through full data redundancy. This reliability is obtained by storing two copies of all data - one on a primary disk, the other on a secondary disk - enabling on-line backup. Read performance for RAID 1 is very good because data can be read from either the primary or the mirrored disk. However, because two requests must be issued to write the same data to both drives, write performance is slightly slower than with single disks.
A weakness of RAID 1 is that all data is stored in duplicate. This solution requires twice as much storage capacity, making large systems very expensive. When increasing capacity, drives must be added in pairs: one for primary storage and one for backup data. For entry-level servers and low-end midrange servers that are being expanded beyond two drives, a strong business case can be made to migrate to a RAID 5 solution.
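The mirroring behavior described above can be modeled with a small Python sketch. The class and attribute names are hypothetical, and two dictionaries stand in for the primary and secondary disks; the point is only to show that each write is issued twice while a read can be served by whichever disk is still working.

```python
class MirroredPair:
    """Toy model of a RAID 1 pair: every block is stored on both disks."""

    def __init__(self) -> None:
        self.disks = [{}, {}]          # block number -> data, one dict per disk
        self.failed = [False, False]   # set an entry to True to simulate a failure

    def write(self, block_no: int, data: bytes) -> int:
        """Write the block to every working disk; return the number of requests issued."""
        requests = 0
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block_no] = data
                requests += 1
        if requests == 0:
            raise OSError("both disks have failed")
        return requests

    def read(self, block_no: int) -> bytes:
        """Read the block from the first working disk."""
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block_no]
        raise OSError("both disks have failed")


if __name__ == "__main__":
    pair = MirroredPair()
    print("write requests issued:", pair.write(0, b"payroll"))
    pair.failed[0] = True              # simulate losing the primary disk
    print("read after failure:", pair.read(0))
```

The sketch also makes the capacity cost visible: both dictionaries hold identical contents, so only half of the raw storage is usable.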

RAID 2 (not used in LAN environments)

RAID 2 achieves data protection by using multiple, dedicated parity drives and incorporating bit-level data striping across mirrored, synchronized drive spindles. This RAID level was originally used for both RAM error correction (known as Hamming code) and disk drive error correction, before that function could be performed at the drive level. With little to offer LAN managers, RAID 2 solutions are not used in network environments.

RAID 3 (three drives minimum)

RAID 3 combines byte-level data striping across multiple disks (usually synchronized for performance) with a single, dedicated parity drive for data protection. RAID 3 provides excellent performance for applications requiring high throughput of large, sequential data files. Typical applications include CAD/CAM, video, image and signal processing. Because RAID 3 systems are optimized for large, sequential throughput, every drive in the array is accessed for each write request. This can cause rotational delay penalties if the drives are not well synchronized. As a result, this RAID architecture is not suited for transaction-oriented network applications that require processing multiple, simultaneous reads and writes.

RAID 4 (three drives minimum)

RAID 4 architecture is similar to RAID 5 except that it relies on a dedicated parity disk. This parity disk often creates a write request bottleneck. As a result, RAID 4 is not used for many transaction-oriented network applications.

RAID 5 (three drives minimum)

RAID 5 stripes blocks of data as well as parity data across all drives in the array, ensuring that no data will be lost in the event of a single drive failure. Unlike most other RAID levels, RAID 5 also delivers improved performance by allowing multiple, simultaneous read and write requests. RAID 5 offers greater usable storage by requiring only the equivalent of one disk's worth of capacity for parity information, regardless of the number of disks in the array. (Think of the RAID 5 architecture as being similar to the spare tire carried in the trunk of a car; even though the car has four tires, only one spare is needed.) These characteristics make RAID 5 systems ideal for network servers, while providing an excellent cost advantage over RAID 1 solutions that have more than two drives (see Figure 2).
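To make the parity idea concrete, here is a minimal Python sketch of XOR parity for a single stripe, together with one possible way of rotating the parity block from drive to drive. The function names are hypothetical, and the rotation formula is just one simple layout chosen for illustration; actual RAID 5 products may place parity differently.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_drive_for_stripe(stripe_no: int, drive_count: int) -> int:
    """Assumed rotation: parity moves one drive to the left on each successive stripe."""
    return (drive_count - 1 - stripe_no) % drive_count


if __name__ == "__main__":
    data_blocks = [b"AAAA", b"BBBB", b"CCCC"]    # one stripe on a four-drive array
    parity = xor_blocks(data_blocks)

    # Simulate losing the drive that held the second data block, then rebuild it
    # from the surviving data blocks plus the parity block.
    survivors = [data_blocks[0], data_blocks[2], parity]
    rebuilt = xor_blocks(survivors)
    print("rebuilt block matches original:", rebuilt == data_blocks[1])

    for stripe_no in range(4):
        print(f"stripe {stripe_no}: parity on drive {parity_drive_for_stripe(stripe_no, 4)}")
```

XOR-ing the surviving blocks with the parity block cancels out everything except the missing block, which is why one drive's worth of parity is enough to survive a single failure. Rotating the parity location spreads parity writes over all drives instead of concentrating them on one.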
When expanding a RAID 5 system, each drive added beyond the initial three-drive configuration increases the total storage capacity. What's more, RAID 5 is the only fault-tolerant RAID level that provides scaleable performance as capacity is increased. RAID 5 read performance is excellent because multiple requests are handled in parallel, although it offers somewhat slower write performance than both RAID 0 and RAID 1 due to parity calculation overhead. This performance tradeoff can be minimized through the use of more efficient RAID 5 software code. In addition, RAID 5 reduces the simultaneous write bottlenecks associated with RAID 3 and RAID 4.
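The capacity difference between the levels can be checked with simple arithmetic. The Python sketch below assumes identical drives, treats RAID 1 beyond two drives as a set of mirrored pairs, and uses a hypothetical 4 GB drive size; the function name and level labels are illustrative only.

```python
def usable_capacity_gb(level: str, drive_count: int, drive_size_gb: float) -> float:
    """Usable capacity for an array of identical drives at the given RAID level."""
    minimum_drives = {"0": 2, "1": 2, "5": 3, "0/1": 4}
    if level not in minimum_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    if drive_count < minimum_drives[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_drives[level]} drives")

    if level == "0":
        data_drives = drive_count            # no redundancy at all
    elif level == "5":
        data_drives = drive_count - 1        # one drive's worth of parity
    else:                                    # "1" and "0/1": everything is mirrored
        if drive_count % 2:
            raise ValueError("mirrored arrays need an even number of drives")
        data_drives = drive_count // 2
    return data_drives * drive_size_gb


if __name__ == "__main__":
    size_gb = 4  # hypothetical drive size
    for drives in (4, 6, 8):
        mirrored = usable_capacity_gb("1", drives, size_gb)
        parity = usable_capacity_gb("5", drives, size_gb)
        print(f"{drives} drives: RAID 1 = {mirrored} GB, RAID 5 = {parity} GB")
```

Under these assumptions the mirrored array always yields half of the raw capacity, while the RAID 5 overhead stays fixed at one drive, so the gap widens as drives are added.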

RAID 0 (two drives minimum)

RAID 0 is achieved by creating an array of striped disks. Striping is done at the block level (the same as RAID 4 and RAID 5) but without any redundancy. If a drive in a RAID 0 system fails, all data on the array will be lost. Used primarily to boost performance in certain types of applications, RAID 0 is typically not used in network applications. See RAID 0/1.

RAID 0/1 (four drives minimum)

RAID 0/1, also known as RAID 0+1 or RAID 10, combines the performance of data striping (RAID 0) with the fault tolerance of RAID 1. Offering the highest performance of all RAID architectures, RAID 0/1 is also the only RAID level that can tolerate multiple drive failures. Up to half of the disks in an array can fail, provided the failures do not include the same data. However, RAID 0/1 suffers from even more severe cost overhead disadvantages than RAID 1. RAID 0/1 requires a minimum of four drives (only two of which are used for data storage) and drives must be added in pairs when increasing capacity.
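The condition that failures must "not include the same data" can be expressed as a short check. This Python sketch assumes the array is built from mirrored pairs in which drives 0 and 1 hold the same data, drives 2 and 3 hold the same data, and so on; that numbering and the function name are assumptions for the example.

```python
from itertools import combinations


def array_survives(failed_drives: set[int], drive_count: int) -> bool:
    """True if no mirrored pair (drives 2k and 2k+1) has lost both of its members."""
    if drive_count < 4 or drive_count % 2:
        raise ValueError("RAID 0/1 needs an even number of drives, four or more")
    for first in range(0, drive_count, 2):
        if first in failed_drives and first + 1 in failed_drives:
            return False
    return True


if __name__ == "__main__":
    drive_count = 4
    for failed in combinations(range(drive_count), 2):
        outcome = "survives" if array_survives(set(failed), drive_count) else "data lost"
        print(f"drives {failed} fail: {outcome}")
```

In a four-drive array, a second failure is survivable only when it hits the other mirrored pair, so tolerance of multiple failures depends on which drives fail, not just on how many.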

What does RAID provide?

Given that hard disk drives will eventually fail, the question is: will RAID prevent a hard drive failure? The answer is: absolutely not. However, RAID technology does provide a valuable insurance policy by enabling real-time recovery from a drive failure without data loss, while maintaining access to the information users need. Can users afford to have data uninsured? Considering that industry analysts forecast that by 1997 hard disk drive failures will cost organizations worldwide more than $100 billion, the answer is a resounding no.
Most disk drive manufacturers measure reliability in terms of MTBF. But as the UC Berkeley research showed, the MTBF of multiple disks - arranged in a Just a Bunch Of Drives (JBOD) configuration - is unacceptable for storing critical data. As a result, RAID storage systems are measured in terms of MTBDL, Mean Time of Data Availability (MTDA), or Mean Time To Repair (MTTR). The first two measurements are expressed in the millions of hours and are significantly better than the MTBF of a non-redundant disk array or a single disk drive. MTTR is the amount of time needed to bring the RAID storage system back to full redundancy after a component - such as a disk, power supply or fan - fails.
The ratings assigned to the various types of RAID solutions can be enhanced by the storage enclosure that is used. For example, using components such as redundant, hot-swappable power supplies and fans can greatly increase the data availability delivered by any RAID level system by maximizing uptime.
Each RAID architecture provides excellent results when used in the proper computing environment. In networking environments, systems based on RAID 5, RAID 1 and RAID 0/1 have become the most common. This is because most multi-user network operating systems - such as Novell NetWare and Windows NT - manage data in ways that are similar to how these RAID architectures perform. For example, Novell NetWare 3.xx manages data by breaking it into 4K blocks (default setting), packetizing the information and interleaving multiple user requests so the network does not get bogged down with one large request. RAID 5 is ideal for this type of network environment because the drives in the disk array perform multiple requests simultaneously. Conversely, RAID 3 is not well-suited for these tasks because any request for data is read or written in parallel to synchronized disks. Also, RAID 3 systems have a single, dedicated parity (XOR) drive that must be accessed for every write request, thereby prohibiting the simultaneous processing of multiple read and write requests. As mentioned earlier, RAID levels 2 and 4 are not typically used in LAN environments because of the way these RAID architectures were designed.

The driving factors behind RAID

A number of factors are responsible for RAID's growing popularity among LAN managers. As today's newest applications create ever larger files, network storage needs are increasing proportionally. To accommodate expanding storage requirements, users need to add drives - raising the odds that a drive failure will occur at some time. In addition, with processor speeds spiraling upward, data transfer rates to the storage media have lagged, creating troublesome bottlenecks. RAID addresses these issues by offering a combination of outstanding data availability, extraordinary performance and high capacity that single drives cannot match. Further, RAID has gained added credibility because many new operating system companies - such as Novell - include RAID 1 mirroring as part of their native software functions.
The following is an overview of the benefits of RAID technology.

Outstanding availability

Today, many organizations run the majority of their business on a network to improve information flow and productivity. A network may include departmental, workgroup or enterprise servers, depending on the size and needs of the company. While the distributed data stored on these servers provides substantial cost benefits, the savings can be quickly offset if data is lost or becomes inaccessible repeatedly. RAID solutions provide critical reliability by assuming there will be disk failures from time to time and enabling recovery with no loss of data or interruption of user access. The right RAID solution can also protect data against soft media failures and defects which are not factored into disk drive MTBF, but can contribute to data loss.

Scaleable performance

While disk drive performance has increased substantially, it has not achieved the same gains as processor speeds. As a result, I/O processing time and sustainable throughput are limited by the capabilities of any single disk. RAID solutions provide much greater performance scalability than individual drives as capacity is increased. In RAID systems, disks work together to handle multiple I/O requests simultaneously. Also, sustainable throughput can be improved because the disks can be read in parallel.

High capacity

By integrating multiple drives into a single array, organizations can create cost-effective, minicomputer-sized solutions of a terabyte or more of RAID 5 storage.
These factors clearly offer solid reasons for implementing RAID storage systems in virtually every type of network environment. The question, then, is: why are PCI RAID implementations not more common in networks supported by entry-level and midrange servers? The answer is simple. In the past, the price/performance capabilities of PCI RAID solutions have been targeted primarily at the high-end server segment. Now, INSC, in cooperation with Adaptec and DPT, has designed a PCI RAID solution specifically to meet the requirements of entry-level and low-end midrange Novell and Windows NT servers. Let INSC technical staff design a new RAID level 5 server to replace your current fileserver today. (The above information was obtained from Adaptec RAID Ioware Guide.)