Objectives

Upon completion of this lesson, you will be able to:

  • Explain the different RAID levels and their advantages and disadvantages
  • Create a simple RAID implementation (levels 0, 1, 4 or 5) including address translation, parity calculation, and recovery
  • Explain the role of storage-area networks (SANs)
  • Describe the differences between a solid-state drive and a mechanical disk

Acknowledgement

This lesson is a derivative work of Section 5.4 in “How Operating Systems Work” by Peter Desnoyer of Northeastern University, used under the CC4 License. This license grants important freedoms, including the right to copy and distribute the chapter non-commercially without permission or fee. Specifically, it allows one to copy and redistribute the material in any medium or format and to remix, transform, and build upon the material, which is what this lesson does.

Motivation

Most modern computer systems use a single solid-state drive (SSD) or mechano-magnetic disk drive (aka hard disk) to store data, generally in a file system, though the drive may also hold data backups or databases. However, the logical block interface allows a wide variety of technologies to mimic a single disk drive, providing different advantages in performance, capacity, or reliability.

This lesson covers common combinations of multiple storage devices (SSD or mechanical hard disk), such as striping, mirroring, and RAID; solid-state drives, using semiconductor-based flash memory instead of magnetic disk; and data center technologies such as storage-area networks (SANs), cloud drives, and de-duplication.

Overview

This lesson introduces the structure and organization of “disk-like” storage devices: devices that behave like disks but are not actual physical disks, including multi-disk arrays, solid-state drives (SSDs), and other block devices.

Early disk drives used cylinder/head/sector addressing, which required the operating system to be aware of the exact parameters of each disk so that it could store and retrieve data from valid locations. The development of logical block addressing, first in SCSI and then in IDE and SATA drives, allowed drives to be interchangeable: with logical block addressing the operating system only needs to know how big a disk is, and can ignore its internal details. This model is more powerful than that, however, as there is no need for the device on the other end of the SCSI (or SATA) bus to actually be a disk drive.

Instead the device on the other end of the bus can be an array of disk drives, a solid-state drive, a virtual network drive, cloud storage, or any other device which stores and retrieves blocks of data in response to write and read commands. Such disk-like devices are found in many of today’s computer systems, both on the desktop and especially in enterprise and data center systems, and include:

  • Partitions and logical volume management, for flexible division of disk space
  • Disk arrays, especially RAID (redundant arrays of inexpensive disks), for performance and reliability
  • Solid-state drives, which use flash memory instead of magnetic disks
  • Storage-area networks (SANs)
  • Virtual, network, and cloud storage
  • De-duplication, to compress multiple copies of the same data

Almost all of these systems look exactly like a disk to the operating system. Their function, however, is typically (at least in the case of disk arrays) an attempt to overcome one or more deficiencies of disk drives, which include:

  • Performance: Disk transfer speed is determined by (a) how small bits can be made and (b) how fast the disk can spin under the head. Rotational latency is likewise determined by (b), how fast the disk spins. Seek time is determined by (c), how fast the head assembly can move and settle into a final position. For enough money, you can make (b) and (c) about twice as fast as in a desktop drive, although you may need to make the tracks wider, resulting in a lower-capacity drive. Going any faster requires using more disks, or a different technology like SSDs.
  • Reliability: Although disks are surprisingly reliable, they fail from time to time. If your data is worth a lot (like the records from the Bank of Lost Funds), you will be willing to pay for a system which doesn’t lose data, even if one (or more) of the disks fails.
  • Size: The maximum disk size is determined by the available technology at any time – after all, if a manufacturer could build them bigger for an affordable price, they would. If you want to store more data, you need to either wait until larger storage devices can be built, or use more than one. Conversely, in some cases (like dual-booting) a single disk may be more than big enough, but you may need to split it into multiple logical parts, kept separate for different operating systems.

In this lesson we will look at drive re-mappings, where a logical volume is created which is a different size or has different properties than the storage device(s) from which it is built. These mappings are not complex: in most cases a simple mathematical calculation on a logical block address (LBA) within the logical volume determines which storage device or devices the operation will be directed to, and to what LBA on that device. This translation may be done in an external device (for instance, a RAID array), in a host bus adapter transparently to the host (for instance, a RAID controller), or within the operating system itself (as in a “software RAID”), but the translations performed are the same in each case.
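
To make the arithmetic concrete, here is a minimal sketch in Python of a striped (RAID0-style) translation. The parameters are hypothetical (four disks, a three-sector stripe unit) and the function name is illustrative, not taken from any particular implementation.

    # Hypothetical striping parameters: 4 disks, 3 sectors per stripe unit.
    NUM_DISKS = 4
    STRIPE_SECTORS = 3

    def raid0_translate(lba: int) -> tuple[int, int]:
        """Map a logical-volume LBA to (disk number, LBA on that disk)."""
        unit = lba // STRIPE_SECTORS    # which stripe unit the LBA falls in
        offset = lba % STRIPE_SECTORS   # position within that unit
        disk = unit % NUM_DISKS         # units are dealt round-robin to the disks
        physical_lba = (unit // NUM_DISKS) * STRIPE_SECTORS + offset
        return disk, physical_lba

    # Logical sector 13 sits in stripe unit 4 at offset 1: disk 0, physical LBA 4.
    print(raid0_translate(13))          # prints (0, 4)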

RAID10 vs RAID01

RAID10 and RAID01 are combinations of striping and mirroring that provide the speed of RAID0 along with the fault tolerance (without parity calculations) of RAID1.

The video tutorial below summarizes the differences between the two configurations.

RAID10

RAID10 is more aptly described as RAID 1 + 0, i.e., a “stripe of mirrors”. Consequently, it requires a minimum of four storage devices, and the number of devices must be even. The devices are paired (groups of two) and mirrored within each pair. The data is striped across the pairs in sets of some number of sectors, depending on the disks used. For example, an array of four disks has two mirrored pairs with data striped across them; an array of eight disks would have four pairs. Again, within each pair the data is mirrored. The diagram below illustrates four disks in a RAID10 configuration with a stripe set of three sectors.

[Figure: RAID10 Configuration of 4 Disks]

In the above configuration, each write operation to a disk writes three sectors at a time. So, writing 12 sectors starting at logical block 0 would write sectors 0, 1, and 2 to Disk 0 in Group 0. Those sectors are simultaneously mirrored to Disk 1 in Group 0. While sectors 0, 1, and 2 are written to the mirrored pair in Group 0, sectors 3, 4, and 5 are written to Disks 2 and 3 in Group 1. So, the total amount of time it takes to write six sectors in mirrored fashion is the time for a single write operation to one disk. Once sectors 0 to 5 are written, the controller can then write sectors 6 to 8 and 9 to 11 in the same manner. Overall, the total time to write 12 sectors of data is the time it takes to write two sets of sectors.
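
The translation this walkthrough implies can be sketched in a few lines of Python. This is a hedged illustration of the four-disk layout above (two mirrored pairs, stripe sets of three sectors), not the algorithm of any particular controller.

    # RAID10 with 4 disks: Group 0 = Disks 0 and 1, Group 1 = Disks 2 and 3.
    NUM_GROUPS = 2
    STRIPE_SECTORS = 3

    def raid10_translate(lba: int):
        """Return ([disks to write], LBA on those disks); both mirrors get the data."""
        unit = lba // STRIPE_SECTORS
        offset = lba % STRIPE_SECTORS
        group = unit % NUM_GROUPS       # stripe sets alternate between the groups
        physical_lba = (unit // NUM_GROUPS) * STRIPE_SECTORS + offset
        return [2 * group, 2 * group + 1], physical_lba   # both members of the pair

    # Sectors 0-2 land on Disks 0 and 1; sectors 3-5 on Disks 2 and 3.
    print(raid10_translate(4))          # prints ([2, 3], 1)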

RAID01

RAID01, or, more accurately, RAID 0 + 1, is a “mirror of stripes”. It also requires a minimum of four disks, and like RAID10 it requires an even number of disks. The disks are divided into two groups that are mirrors of each other, and within each group the data is striped. For example, eight disks would be configured in a RAID01 as two groups of four disks each, with data striped across the four disks within each group. Likewise, four disks would be divided into two mirrored groups with data striped across the two disks within each group. The diagram below shows a RAID01 configuration.

[Figure: RAID01 Configuration of 4 Disks]

In the above configuration, each write operation to a disk writes three sectors at a time. So, writing 12 sectors starting at logical block 0 would write sectors 0, 1, and 2 to Disk 0 and sectors 3, 4, and 5 to Disk 1 simultaneously in a stripe, while the same data is concurrently written to Disks 2 and 3 in Group 1. The same is then done for sectors 6 to 11. Overall, the total time to write 12 sectors of data is the time it takes to write two sets of sectors.
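
For comparison, a Python sketch of the RAID01 translation for the four-disk layout above (Group 0 = Disks 0 and 1, striped; Group 1 = Disks 2 and 3, its mirror) follows; it assumes the same hypothetical three-sector stripe set.

    # RAID01 with 4 disks: Group 0 = Disks 0 and 1, mirrored by Disks 2 and 3.
    DISKS_PER_GROUP = 2
    STRIPE_SECTORS = 3

    def raid01_translate(lba: int):
        """Return ([disks to write], LBA on those disks); one disk per group."""
        unit = lba // STRIPE_SECTORS
        offset = lba % STRIPE_SECTORS
        stripe_disk = unit % DISKS_PER_GROUP   # position within the stripe
        physical_lba = (unit // DISKS_PER_GROUP) * STRIPE_SECTORS + offset
        # The disk at the same stripe position in the mirror group gets a copy.
        return [stripe_disk, stripe_disk + DISKS_PER_GROUP], physical_lba

    # Sectors 6-8 go to Disk 0 (mirrored on Disk 2) at physical LBAs 3-5.
    print(raid01_translate(7))          # prints ([0, 2], 4)

Note that the arithmetic is identical to the RAID10 sketch; only the mapping from stripe position to physical disks changes.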

Choosing between RAID10 and RAID01

While storage capacity and read/write performance are the same for RAID10 and RAID01, the two differ in fault tolerance. If one disk fails, both configurations continue to work, but if two drives fail, there may be a loss of data. In RAID10, data is lost only when both disks of the same mirrored pair fail. In RAID01, a single failure typically takes the entire striped group offline, so a second failure anywhere in the other group loses data. With four disks, either configuration can be destroyed by an unlucky pair of failures. With six or more disks, however, RAID10 can sustain more combinations of drive losses and is thus, on average and assuming an equal likelihood of any disk failing, a bit more fault tolerant. Since performance is otherwise the same, we always want to choose a RAID10 configuration.
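
To make the comparison concrete, the sketch below enumerates every possible two-disk failure and counts how many destroy data, assuming the common model in which a RAID01 striped group goes offline as soon as any of its members fails; the layouts and disk numbering are hypothetical, following the diagrams above.

    from itertools import combinations

    for n in (4, 6, 8):
        half = n // 2
        pairs = list(combinations(range(n), 2))
        # RAID10: disks (0,1), (2,3), ... form mirrored pairs; data is lost
        # only when both members of one pair fail.
        fatal10 = sum(a // 2 == b // 2 for a, b in pairs)
        # RAID01: disks 0..half-1 form one striped group, mirrored by the
        # rest; one failure in each group kills both copies of the stripe.
        fatal01 = sum((a < half) != (b < half) for a, b in pairs)
        print(f"{n} disks: RAID10 fatal in {fatal10}/{len(pairs)} two-disk "
              f"failures, RAID01 fatal in {fatal01}/{len(pairs)}")

Under this model the fatal fraction for RAID10 is 1/(n-1), shrinking as disks are added, while for RAID01 it stays above one half; this is the quantitative version of the argument above.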

Summary

This lesson showed how the logical block interface allows devices other than physical disks, including disk arrays, solid-state drives, and networked or cloud storage, to present themselves to the operating system as a single drive. It then examined RAID10 (a stripe of mirrors) and RAID01 (a mirror of stripes), which combine striping and mirroring to gain speed and redundancy without parity calculations. The two offer the same capacity and performance, but RAID10 tolerates multiple disk failures better as the array grows, making it the preferred configuration.

Files & Resources

All Files for Lesson 92.502

Errata

Let us know.

Acknowledgements

Desnoyer, Peter (2020). How Operating Systems Work. Content from this manuscript is provided under a CC4 license.