The concept of RAID was developed by a group of scientists at
the University of California at Berkeley in 1987. The scientists
investigated using small disk drives clustered into an array (defined
as two or more disks grouped together to appear as a single device
to the host system) and compared the performance and cost of this
type of storage configuration to the use of a Single Large Expensive
Disk (SLED), common in mainframe applications.
Their conclusion was that arrays of smaller, less expensive
disks offered the same or better performance than a SLED. However,
because an array uses more disks, the Mean Time Before Data Loss
(MTBDL) - calculated by dividing the single-drive Mean Time Between
Failures (MTBF) by the number of disks in the array - would be
unacceptably low.
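As a rough illustration of that arithmetic, the short Python sketch below estimates MTBDL for a non-redundant array by dividing a single drive's MTBF by the number of drives; the MTBF figure and drive counts are hypothetical.

# Rough sketch of the MTBDL argument for a non-redundant array.
# Hypothetical numbers: real drive MTBF ratings vary by product.

def mtbdl_hours(single_drive_mtbf_hours: float, drive_count: int) -> float:
    """Approximate MTBDL when any single drive failure loses data."""
    return single_drive_mtbf_hours / drive_count

mtbf = 500_000  # assumed MTBF of one drive, in hours
for n in (1, 4, 8, 16):
    print(f"{n:2d} drives -> MTBDL ~ {mtbdl_hours(mtbf, n):,.0f} hours")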
The problem, then, became how to manage MTBF and prevent any single
drive failure from causing data loss within an array. To address
this, the UC Berkeley scientists proposed five types of redundant
array architectures, defining them as RAID levels 1 through 5.
Simply put, the RAID level is the architecture that determines
how redundancy is achieved and data is distributed across the
drives in the array. In addition to RAID 1 through 5, a non-redundant
array configuration that employs data striping (that is, breaking
files into smaller blocks and distributing these blocks evenly
across multiple disks in the array) has become known as RAID 0.
RAID 0 is somewhat of a misnomer because it does not provide data
protection. However, RAID 0 does offer maximum throughput for
some data-intensive applications, such as desktop digital video
production.
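As a minimal sketch of how striping distributes data (assuming a hypothetical 4 KB block size and a four-drive array), the Python fragment below deals blocks out round-robin with no redundancy:

# Minimal sketch of RAID 0 block striping: data is split into fixed-size
# blocks and dealt out round-robin across the drives. No parity or mirror
# copy exists, so losing any one drive loses the whole data set.

BLOCK_SIZE = 4 * 1024  # hypothetical 4 KB block size

def stripe(data: bytes, drive_count: int) -> list:
    """Return a per-drive list of blocks, distributed round-robin."""
    drives = [[] for _ in range(drive_count)]
    for i in range(0, len(data), BLOCK_SIZE):
        drives[(i // BLOCK_SIZE) % drive_count].append(data[i:i + BLOCK_SIZE])
    return drives

layout = stripe(bytes(20 * 1024), drive_count=4)  # 20 KB of sample data
print([len(blocks) for blocks in layout])         # blocks placed on each drive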
Table 1 illustrates the key characteristics of each RAID level,
including the minimum number of drives required, a basic description
of how fault tolerance is achieved, and the relative strengths
and weaknesses of each architecture.
Table 1: Comparing RAID-level architectures

RAID 1, also known as disk mirroring or duplexing (when using two host
bus adapters), provides high reliability through full data redundancy.
This reliability is obtained by storing two copies of all data
- one on a primary disk, the other on a secondary disk - enabling
on-line backup. Read performance for RAID 1 is very good because
data can be read from either the primary or mirrored disk. However,
because two requests must be issued to write the same data to
both drives, write performance is slightly slower than with single
disks.
A weakness of RAID 1 is that all data is stored in duplicate.
This solution requires twice as much storage capacity, making
large systems very expensive. When increasing capacity, drives
must be added in pairs: one for primary storage and one for backup
data. For entry-level servers and low-end midrange servers that
are being expanded beyond two drives, a strong business case can
be made to migrate to a RAID 5 solution.
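The cost argument can be seen in a simple capacity comparison; the sketch below assumes a hypothetical 9 GB drive size and shows usable space for RAID 1 mirrored pairs versus a RAID 5 array of the same drive count:

# Usable-capacity comparison (assumed 9 GB drives, counts for illustration).
# RAID 1 keeps a full mirror copy, so only half the raw capacity is usable;
# RAID 5 gives up just one drive's worth of capacity to parity.

DRIVE_GB = 9  # hypothetical drive size

def usable_raid1(drives: int) -> int:
    return (drives // 2) * DRIVE_GB          # mirrored pairs

def usable_raid5(drives: int) -> int:
    return (drives - 1) * DRIVE_GB if drives >= 3 else 0

for n in (2, 3, 5, 8):
    print(f"{n} drives: RAID 1 = {usable_raid1(n)} GB, RAID 5 = {usable_raid5(n)} GB")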
RAID 2 achieves data protection by using multiple, dedicated parity
drives and incorporating bit-level data striping across mirrored,
synchronized drive spindles. This RAID level was originally used
both for RAM error correction (known as Hamming code) and for
disk drive error correction, before that function could be performed
at the drive level. With little to offer LAN managers,
RAID 2 solutions are not used in network environments.
RAID 3 combines byte-level data striping across multiple disks
(usually synchronized for performance) with a single, dedicated
parity drive for data protection. RAID 3 provides excellent performance
for applications requiring high throughput of large, sequential
data files. Typical applications include CAD/CAM, video, image
and signal processing. Because RAID 3 systems are optimized for
large, sequential throughput, every drive in the array is accessed
for each write request. This can cause rotational delay penalties
if the drives are not well synchronized. As a result, this RAID
architecture is not suited for transaction-oriented network applications
that require processing multiple, simultaneous reads and writes.
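The parity protection used by RAID 3 (and by RAID 4 and 5) is a simple XOR across the data drives; the sketch below, using made-up byte values, shows how a failed drive's contents can be rebuilt from the surviving drives plus the parity:

# XOR parity sketch: the parity byte is the XOR of the corresponding bytes
# on every data drive, so any single missing drive can be reconstructed from
# the survivors plus parity. The byte values here are made up.

def xor_parity(chunks):
    """XOR corresponding bytes across a list of equal-length byte strings."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)

data_drives = [b"\x11\x22\x33", b"\x44\x55\x66", b"\x77\x88\x99"]
parity = xor_parity(data_drives)

# Simulate losing drive 1 and rebuilding it from the other drives plus parity.
rebuilt = xor_parity([data_drives[0], data_drives[2], parity])
assert rebuilt == data_drives[1]
print("drive 1 rebuilt:", rebuilt.hex())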
RAID 4 architecture is similar to RAID 5 except that it relies
on a dedicated parity disk. This parity disk often creates a write
request bottleneck. As a result, RAID 4 is not used for many transaction-oriented
network applications.
RAID 5 stripes blocks of data as well as parity data across all
drives in the array, ensuring that no data will be lost in the
event of a single drive failure. However, unlike most other RAID
levels, RAID 5 also delivers improved performance by allowing
multiple, simultaneous read and write requests. RAID 5 offers
greater usable storage by requiring only the equivalent of one
disk's worth of capacity for parity information, regardless of
the number of disks in the array. (Think of the RAID 5 architecture
as being similar to the spare tire carried in the trunk of a car;
even though the car has four tires, only one spare is needed.)
These characteristics make RAID 5 systems ideal for network servers,
while providing an excellent cost advantage over RAID 1 solutions
that have more than two drives (see Figure 2).
When expanding a RAID 5 system, each drive added beyond the initial
three-drive configuration increases the total storage capacity.
What's more, RAID 5 is the only fault-tolerant RAID level that
provides scalable performance as capacity is increased. RAID
5 read performance is excellent because multiple requests are
handled in parallel, although it offers somewhat slower write
performance than both RAID 0 and RAID 1 due to parity calculation
overhead. This performance tradeoff can be minimized through the
use of more efficient RAID 5 software code. In addition, RAID
5 reduces the simultaneous write bottlenecks associated with RAID
3 and RAID 4.
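To illustrate how RAID 5 avoids the dedicated-parity bottleneck, the sketch below rotates the parity block to a different drive on each stripe; the rotation rule is one common left-symmetric style and the four-drive layout is assumed for illustration:

# Sketch of RAID 5 parity rotation: the parity block moves to a different
# drive on each stripe, so parity writes are spread across the whole array
# instead of hitting one dedicated parity disk (as in RAID 3 and RAID 4).

def parity_drive(stripe_index: int, drive_count: int) -> int:
    """One common rotation rule: parity walks backwards through the drives."""
    return (drive_count - 1 - stripe_index) % drive_count

DRIVES = 4
for stripe in range(6):
    row = ["P" if d == parity_drive(stripe, DRIVES) else "D" for d in range(DRIVES)]
    print(f"stripe {stripe}: {' '.join(row)}")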
RAID 0 is achieved by creating an array of striped disks. Striping
is done at the block level (the same as RAID 4 and RAID 5) but
without any redundancy. If a drive in a RAID 0 system fails, all
data on the array will be lost. Used primarily to boost performance
in certain types of applications, RAID 0 is typically not used
in network applications. See RAID 0/1.
RAID 0/1, also known as RAID 0+1 or RAID 10, combines the performance
of data striping (RAID 0) with the fault tolerance of RAID 1.
Offering the highest performance of all RAID architectures, RAID
0/1 is also the only RAID level that can tolerate multiple drive
failures. Up to half of the disks in an array can fail, provided
no two failed drives hold copies of the same data (that is, both
members of a mirrored pair). However, RAID 0/1 suffers from even
greater cost overhead than RAID 1.
RAID 0/1 requires a minimum of four drives (only two of which
are used for data storage) and drives must be added in pairs when
increasing capacity.
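A small sketch makes the failure-tolerance rule concrete; it assumes an illustrative layout in which drive 2i mirrors drive 2i+1 and checks whether a given set of failed drives still leaves every mirrored pair with a survivor:

# RAID 0/1 failure-tolerance sketch: the array survives any set of failures
# as long as no mirrored pair loses both of its members. The pairing used
# here (drive 2i mirrors drive 2i+1) is just an illustrative layout.

def survives(failed_drives: set, drive_count: int) -> bool:
    pairs = [(i, i + 1) for i in range(0, drive_count, 2)]
    return all(not (a in failed_drives and b in failed_drives) for a, b in pairs)

print(survives({0, 3}, drive_count=4))  # True: failures hit different pairs
print(survives({0, 1}, drive_count=4))  # False: both copies of one pair lost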
Given that hard disk drives will eventually fail, the question
is: will RAID prevent a hard drive failure? The answer is: absolutely
not. However, RAID technology does provide a valuable insurance
policy by enabling real-time recovery from a drive failure without
data loss, while maintaining access to the information users need.
Can users afford to have data uninsured? Considering that industry
analysts forecast that by 1997 hard disk drive failures will cost
organizations worldwide more than $100 billion, the answer is a
resounding no.
Most disk drive manufacturers measure reliability in terms of
MTBF. But as the UC Berkeley research showed, the MTBF of multiple
disks - arranged in a Just a Bunch Of Drives (JBOD) configuration
- is unacceptable for storing critical data. As a result, RAID
storage systems are measured in terms of MTBDL, Mean Time of Data
Availability (MTDA), or Mean Time To Repair (MTTR). The first
two measurements are expressed in millions of hours and are
significantly better than the MTBF of a non-redundant disk array
or a single disk drive. MTTR is the amount of time needed to bring
the RAID storage system back to full redundancy after a component
- such as a disk, power supply or fan - fails.
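To give a feel for why these figures reach into the millions of hours, the sketch below uses a commonly cited approximation for a single-parity array, MTBDL of roughly MTBF squared divided by N x (N - 1) x MTTR; the MTBF and MTTR values are hypothetical.

# Rough MTBDL estimate for a single-parity (RAID 5 style) array using a
# commonly cited approximation: MTBF**2 / (N * (N - 1) * MTTR).
# All figures are hypothetical and for illustration only.

def raid5_mtbdl_hours(mtbf: float, drives: int, mttr: float) -> float:
    return mtbf ** 2 / (drives * (drives - 1) * mttr)

mtbf_hours = 500_000  # assumed single-drive MTBF
mttr_hours = 24       # assumed time to replace and rebuild a failed drive
for n in (3, 5, 8):
    print(f"{n} drives: MTBDL ~ {raid5_mtbdl_hours(mtbf_hours, n, mttr_hours):,.0f} hours")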
The ratings assigned to the various types of RAID solutions can
be enhanced by the storage enclosure that is used. For example,
using components such as redundant, hot-swappable power supplies
and fans can greatly increase the data availability delivered
by any RAID level system by maximizing uptime. Each RAID architecture
provides excellent results when used in the proper computing environment.
In networking environments, systems based on RAID 5, RAID 1 and
RAID 0/1 have become the most common. This is because most multi-user
network operating systems - such as Novell NetWare and Windows
NT - manage data in ways that are similar to how these RAID architectures
perform. For example, Novell NetWare 3.xx manages data by breaking
it into 4K blocks (the default setting), packeting the information
and interleaving multiple user requests so the network does not
get bogged down with one large request. RAID 5 is ideal for this
type of network environment because the drives in the disk array
perform multiple requests simultaneously. Conversely, RAID 3 is
not well-suited for these tasks because any request for data is
read or written in parallel to synchronized disks. Also, RAID
3 systems have a single, dedicated parity (XOR) drive that must
be accessed for every write request, thereby prohibiting the simultaneous
processing of multiple read and write requests. As mentioned earlier,
RAID levels 2 and 4 are not typically used in LAN environments
because of the way these RAID architectures were designed.
A number of factors are responsible for RAID's growing popularity
among LAN managers. As today's newest applications create ever
larger files, network storage needs are increasing proportionally.
To accommodate expanding storage requirements, users need to add
drives - raising the odds that a drive failure will occur at some
time. In addition, with processor speeds spiraling upward, data
transfer rates to the storage media have lagged, creating troublesome
bottlenecks. RAID addresses these issues by offering a combination
of outstanding data availability, extraordinary performance and
high capacity that single drives cannot meet. Further, RAID has
gained added credibility because many network operating system companies
- such as Novell - include RAID 1 mirroring as part of their native
software functions.
The following is an overview of the benefits of RAID technology.
Today, many organizations run the majority of their business on
a network to improve information flow and productivity. A network
may include departmental, workgroup or enterprise servers, depending
on the size and needs of the company. While the distributed data
stored on these servers provides substantial cost benefits, the
savings can be quickly offset if data is lost or becomes inaccessible
repeatedly. RAID solutions provide critical reliability by assuming
there will be disk failures from time to time and enabling recovery
with no loss of data or interruption of user access. The right
RAID solution can also protect data against soft media failures
and defects which are not factored into disk drive MTBF, but can
contribute to data loss.
While disk drive performance has increased substantially, it has
not kept pace with gains in processor speed. As a result, I/O processing
time and sustainable throughput are limited by the capabilities
of any single disk. RAID solutions provide much greater performance
scalability than individual drives as capacity is increased. In
RAID systems, disks work together to handle multiple I/O requests
simultaneously. Also, sustainable throughput can be improved because
the disks can be read in parallel.
By integrating multiple drives into a single array, organizations
can create cost-effective, minicomputer-sized solutions of a
terabyte or more of RAID 5 storage.
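As a rough sizing sketch (assuming, for illustration, 9 GB drives and the one-drive parity overhead described above), the fragment below computes how many drives a RAID 5 array needs to reach a target usable capacity:

# RAID 5 sizing sketch: usable capacity is (drives - 1) * drive size, so the
# drive count for a target capacity is the data drives needed plus one for
# parity. The 9 GB drive size is an assumption for illustration.
import math

def raid5_drives_needed(target_gb: float, drive_gb: float) -> int:
    return max(3, math.ceil(target_gb / drive_gb) + 1)

print(raid5_drives_needed(1000, drive_gb=9))  # drives for roughly 1 TB usable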
These factors clearly offer solid reasons for implementing RAID
storage systems in virtually every type of network environment.
The question, then, is: why are PCI RAID implementations not more
common in networks supported by entry-level and midrange servers?
The answer is simple. In the past, the price/performance capabilities
of PCI RAID solutions have been targeted primarily at the high-end
server segment. Now INSC, in cooperation with Adaptec and DPT,
has designed a PCI RAID solution specifically to meet the
requirements of entry-level and low-end midrange Novell and Windows
NT servers. Let INSC technical staff design a new RAID level 5
server to replace your current fileserver today. (The above information
was obtained from the Adaptec RAID Ioware Guide.)