An Introduction To RAID
Most medical practices are well aware of the importance of performing daily backups. Unfortunately, while these are of great importance in the event of a data disaster, they do nothing to minimise the likelihood of data corruption or loss occurring in the first place.
The installation of a Redundant Array of Independent Disks (RAID) is a preventative technical solution that can minimise the chance of data loss and increase practice server uptime and clinical data availability.
RAID is a term that describes storage schemes that divide and/or replicate data across multiple hard disks. RAID combines multiple physical hard disks and presents a single logical unit to the operating system and applications by using specialised hardware or software.
There are three key concepts that readers need to be aware of:
- Mirroring: the duplication of data to more than one disk.
- Striping: the splitting of data across more than one disk.
- Parity data: extra data written to disk to facilitate error detection and correction.
Common RAID Arrangements
There are several different ways to set up a RAID system, with each configuration offering a different combination of performance and data redundancy. The three most popular configurations (referred to as levels) are outlined below:
Under a RAID 0 arrangement, data is striped over two or more disks. This results in improved disk performance, as it is possible to read and write parts of a file to different disks in parallel. Assuming identical disks are used (highly recommended), the amount of available storage is equal to the combined capacity of the disks in the array.
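To make the idea of striping concrete, the following is a minimal sketch in Python. It is an illustration only, not how any particular RAID controller is implemented: the function names `stripe` and `unstripe`, the use of in-memory lists to stand in for disks, and the four-byte chunk size are all assumptions made for the example.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Distribute fixed-size chunks of data across disks in round-robin order."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        # Chunk 0 goes to disk 0, chunk 1 to disk 1, and so on, wrapping around.
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading chunks back in round-robin order."""
    chunks = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                chunks.append(disk[row])
    return b"".join(chunks)


original = b"ABCDEFGHIJKLMNOP"
disks = stripe(original, num_disks=2)
print(disks)            # the chunks held by each simulated disk
print(unstripe(disks))  # the reassembled data
```

Each simulated disk holds only part of the file, which is why the chunks can be read or written in parallel, and also why losing either disk leaves the data incomplete.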
It is important to note that no extra data protection is provided by RAID 0 systems. Because the failure of any one disk in a RAID 0 array will result in data loss, such an arrangement is less reliable than a single-disk system, and becomes less reliable with every disk added.
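The reliability argument can be illustrated with a short calculation. The sketch below assumes that disks fail independently and that each disk has the same chance of failing over some period; the 3% figure is purely illustrative and is not a measured failure rate. The function name `array_failure_probability` is likewise an assumption made for this example.

```python
def array_failure_probability(disk_failure_probability: float, num_disks: int) -> float:
    """Chance that at least one of num_disks independent disks fails.

    For a RAID 0 array this is also the chance of losing data, because
    the failure of any single disk is enough to break the array.
    """
    return 1 - (1 - disk_failure_probability) ** num_disks


p = 0.03  # illustrative per-disk failure probability (an assumption)
for n in (1, 2, 4):
    print(f"{n} disk(s): {array_failure_probability(p, n):.4f}")
```

Under these assumptions the chance of data loss grows with every disk added to the stripe, which is the opposite of what a practice seeking reliability wants.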
Because of its properties, RAID 0 is commonly used where disk performance is paramount, and data redundancy less important. As such, RAID 0 systems are not suitable for medical practices seeking to improve the reliability of their server.
RAID 1 is a data mirroring scheme that utilises two or more disks to store multiple copies of the same information. The duplication of data provides obvious redundancy, and prevents data loss in the event of a disk failure. As each disk in the RAID set contains the same data, the effective storage capacity of RAID 1 systems is equal to the size of the smallest disk in the array. Compared to a single disk arrangement, RAID 1 offers a slight increase in read performance, as data can be retrieved from either disk. Unfortunately, this gain is offset by marginally slower disk writing performance.
Requiring just two disks, RAID 1 is cost-effective to implement and offers improved redundancy.
Conceptually harder to understand than RAID 0 or RAID 1, RAID 5 systems use striping in conjunction with parity data to provide a good mix of performance and redundancy. RAID 5 requires a minimum of three disks.
If a disk fails, the missing data can be recalculated from the data and distributed parity information held on the remaining disks, so the data remains available to users. A second disk failure, however, will result in data loss.
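The parity calculation can be sketched in a few lines of Python. The example assumes XOR parity, which is the usual basis for RAID 5 although this article does not go into that detail, and it treats short byte strings as stand-ins for the blocks held on each disk. The helper name `xor_blocks` and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data blocks, as they might sit on two disks of a three-disk array.
data_blocks = [b"PATI", b"ENT1"]

# The parity block, as it might be written to the third disk.
parity = xor_blocks(data_blocks)

# Simulate losing the second disk: XOR the surviving data block with the
# parity block to recalculate the missing block.
recovered = xor_blocks([data_blocks[0], parity])
print(recovered == data_blocks[1])
```

With only one parity block per stripe, a single missing block can be recalculated; once two blocks are missing there is no longer enough information, which is why a second disk failure results in data loss.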
RAID 5 combines good performance, good fault tolerance, high capacity and storage efficiency, making it a suitable but expensive solution for hosting clinical software databases.
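The capacity trade-offs of the three levels described above can be compared with a small calculation. This is a rough sketch that assumes identical disks. The RAID 0 and RAID 1 figures follow the descriptions given earlier; the RAID 5 figure (one disk's worth of space set aside for parity) is the commonly quoted rule of thumb rather than something stated in this article. The function name `usable_capacity_gb` is hypothetical.

```python
def usable_capacity_gb(level: int, num_disks: int, disk_size_gb: float) -> float:
    """Approximate usable capacity of an array built from identical disks."""
    minimum_disks = {0: 2, 1: 2, 5: 3}
    if level not in minimum_disks:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")

    if level == 0:
        # Striping only: every disk contributes its full capacity.
        return num_disks * disk_size_gb
    if level == 1:
        # Mirroring: every disk holds the same data.
        return disk_size_gb
    # RAID 5 (assumed rule of thumb): one disk's worth of space holds parity.
    return (num_disks - 1) * disk_size_gb


for level, disks in ((0, 2), (1, 2), (5, 3)):
    print(f"RAID {level} with {disks} x 500 GB disks: "
          f"{usable_capacity_gb(level, disks, 500):.0f} GB usable")
```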
RAID levels can be nested together so that a master array uses other arrays instead of single physical disks. Nested RAID systems are usually signified by joining the numbers indicating the RAID levels together, either directly or by using a + (for example, RAID 10 or RAID 1+0).
Nested RAID systems are usually deployed in an effort to boost the performance of a RAID arrangement designed for redundancy, such as RAID 1 or RAID 5. RAID 0 is the most commonly combined RAID arrangement because of its high-performance characteristics.
Two examples of nested RAID scenarios are outlined below:
In a RAID 0+1 arrangement, two or more striped arrays (RAID 0) are connected using a mirroring array (RAID 1). This setup provides fault tolerance and improved performance, but is more complex than either of the RAID levels it combines. Four or more disks are required to establish a RAID 0+1 system.
The arrangement can survive a single disk failure without data loss, at which point the system assumes the properties of a single RAID 0 system. Unless the compromised disk is replaced, the subsequent failure of any disk in the surviving striped array will result in data loss.
In a RAID 1+0 arrangement, two or more mirrored (RAID 1) arrays are connected using a striped array (RAID 0). RAID 1+0 offers similar performance to RAID 0+1, with slightly better redundancy.
A RAID 1+0 array can sustain multiple disk failures, so long as at least one disk in every mirrored set remains functional. It requires a minimum of four disks to establish.
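The difference in redundancy between the two nested arrangements can be explored with a simplified model. The sketch below assumes four disks labelled A to D, grouped into two pairs, and checks every possible combination of two failed disks. The disk labels, groupings and function names are hypothetical and chosen only for illustration; real controllers may behave differently in the detail.

```python
from itertools import combinations


def survives_raid_01(striped_sets: list[set[str]], failed: set[str]) -> bool:
    """RAID 0+1 survives while at least one striped set has no failed disk."""
    return any(not (striped_set & failed) for striped_set in striped_sets)


def survives_raid_10(mirrored_sets: list[set[str]], failed: set[str]) -> bool:
    """RAID 1+0 survives while every mirrored set keeps at least one working disk."""
    return all(mirrored_set - failed for mirrored_set in mirrored_sets)


disks = ["A", "B", "C", "D"]
groups = [{"A", "B"}, {"C", "D"}]  # striped sets for 0+1, mirrored sets for 1+0

two_disk_failures = [set(pair) for pair in combinations(disks, 2)]
survived_01 = sum(survives_raid_01(groups, failed) for failed in two_disk_failures)
survived_10 = sum(survives_raid_10(groups, failed) for failed in two_disk_failures)

print(f"RAID 0+1 survives {survived_01} of {len(two_disk_failures)} two-disk failures")
print(f"RAID 1+0 survives {survived_10} of {len(two_disk_failures)} two-disk failures")
```

In this model, RAID 1+0 tolerates more of the possible two-disk failure combinations than RAID 0+1, which is consistent with its slightly better redundancy.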
RAID can be implemented using specialised hardware, software, or the RAID functionality built into most modern operating systems. Disks can be housed either inside the computer, or within external disk enclosures.
A hardware RAID controller is simply an expansion card that is inserted into a slot on the computer's motherboard. These cards are specially designed to perform parity calculations, come equipped with high-speed connectors for multiple hard disks, and offer many other features. RAID controller cards are available in port configurations that support either internal hard disks, external hard disk enclosures, or a combination of both.
Another option involves connecting the hard disks to the Serial-ATA ports found on modern computer motherboards, or optionally, installing a Serial-ATA card into an available motherboard expansion slot and plugging the hard disks into the ports on this card. As with dedicated RAID controllers, these Serial-ATA expansion cards are available in configurations that support internal hard disks, external hard disks, or a combination of both.
Unlike dedicated RAID controllers, Serial-ATA cards and the ports found on most motherboards do not have the ability to create RAID sets themselves. Instead, they rely on the server's operating system or a third-party software solution to arrange the connected hard disks into the desired RAID arrangement.
External Disk Enclosures
External disk enclosures are self-contained units that simply contain a power supply and space for hard disks. Some units also contain a RAID controller card, allowing the disks installed in the enclosure to appear outwardly as a single disk, negating the need for any additional configuration to be performed on the computer.
External disk enclosures can be hooked up via a USB port, or via a number of other connection types such as FireWire, SCSI or Serial-ATA. While USB and FireWire are suitable interfaces for external backup solutions (whether they be a RAID or a single disk product), Serial-ATA is the most suitable interface with which to connect a RAID solution hosting live practice data.
Rebuilding an Array
Redundancy-enabled RAID solutions allow the system to continue functioning even when one of the disks in the array has failed. When this occurs, however, performance is adversely affected and the array is described as operating in a degraded state.
If a failed disk is replaced with a functioning disk, the hardware or software controller will proceed to rebuild the RAID set, restoring full redundancy capabilities to the system. This restoration process can be time-consuming, and while the array will function properly during this time, the performance will be diminished.
Replacing an internally installed hard disk is relatively straightforward for an IT-savvy person. External disk enclosures that feature hard disks mounted in trays or drawers simplify the task further, to the point where even the most technologically shy practice staff member could be trained to perform the process unassisted, or guided through the steps over the phone by an IT support professional.
Excess disks in certain RAID arrangements can be designated as hot spares. When such a disk is available, the RAID controller can automatically start rebuilding the array as soon as a disk fails, minimising the amount of time that the array is vulnerable.
Do I Still Need To Back Up?
While the use of an appropriate RAID scheme can reduce the risk of hardware-related data loss, it certainly does not eliminate it. Further, RAID does not protect against user error (e.g. the accidental deletion of files), nor against corruption caused by application software or the operating system. As such, practices still need to follow the backup arrangements they already have in place.
RAID has long been used by large organisations to maintain high levels of server availability. Recent improvements in storage technology and the falling cost of hard disks and associated hardware now make RAID easily accessible to small businesses, and ideal for deployment in medical practices.
Practices that do not yet enjoy the data protection that RAID delivers are advised to discuss the feasibility of implementing an appropriate RAID system with their IT support professional.
Posted in Australian eHealth