Tuesday June 14, 2005

RAID 0+1 vs. RAID 1+0 and SVM

http://opensolaris.org



Six years ago, when I first started working on Solaris Volume Manager's earlier incarnation (known as SDS), I was confused about whether it implemented RAID 0+1 or RAID 1+0. The answer ended up being more complicated than simply one or the other. The same implementation has carried forward into the current version of SVM. Since this question still comes up with some regularity, I thought it was worth spending some time describing how this particular part of SVM works.

Background

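RAID stands for 'Redundant Array of Inexpensive Disks', and the different numbers correspond to different ways of placing data on the disks. Two basic RAID levels pertain to this subject in general, plus an additional logical device type that's involved when you're dealing with SVM:

RAID 0 (stripe): data is spread in interleaved chunks across several disks, which improves performance but provides no redundancy.
RAID 1 (mirror): the same data is written to two or more devices, which provides redundancy.
Concatenation: disks are joined end to end to form one larger logical device. This is not a RAID level.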

Since RAID0 improves performance, and RAID1 provides redundancy, someone came up with the idea to combine them. Fast and reliable. Two great tastes that taste great together!

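When combining these two types of 'logical' devices there's a choice to be made -- do you mirror two stripes, or do you stripe across multiple mirrors? There are pros and cons to each approach:

RAID 0+1 (a mirror of stripes): relatively simple to administer, but in a straightforward implementation a single disk failure takes an entire stripe out of service, so a later failure on the other side is fatal.
RAID 1+0 (a stripe across mirrors): more resilient when multiple devices fail, since data is lost only if both disks of the same mirror pair fail.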
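To put rough numbers on that trade-off, here is a small Python sketch that counts how many two-disk failures each layout survives. It models a textbook RAID 0+1 in which any disk failure takes its whole stripe offline, which is an assumption about the generic case and not a description of SVM. The function names and the disk numbering are made up for illustration.

```python
from itertools import combinations

def survives_raid01(failed, n):
    """Mirror of two n-disk stripes: disks 0..n-1 are side A, n..2n-1 side B.
    Assumption: one failed disk takes its whole side out of service, so data
    survives only while at least one side is fully intact."""
    side_a_ok = all(d not in failed for d in range(n))
    side_b_ok = all(d not in failed for d in range(n, 2 * n))
    return side_a_ok or side_b_ok

def survives_raid10(failed, n):
    """Stripe across n two-disk mirrors: disk i is paired with disk i + n.
    Data survives unless both disks of some pair have failed."""
    return all(not (i in failed and i + n in failed) for i in range(n))

def surviving_pairs(survives, n):
    """Return (survivable two-disk failures, total two-disk failures)."""
    pairs = list(combinations(range(2 * n), 2))
    return sum(survives(set(p), n) for p in pairs), len(pairs)

if __name__ == "__main__":
    print("RAID 0+1:", surviving_pairs(survives_raid01, 3))
    print("RAID 1+0:", surviving_pairs(survives_raid10, 3))
```

With three disks per side, the mirror of stripes survives 6 of the 15 possible two-disk failures, while the stripe of mirrors survives 12 of 15.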

SVM specifics

So, does SVM do RAID 0+1 or RAID 1+0? The answer is, "Yes." So it gives you a choice between the two? The answer is "No."

Obviously further explanation is necessary...

In SVM, mirror devices cannot be created from "bare" disks. You are required to create the mirror on top of another type of SVM metadevice, known as a concat/stripe*. SVM combines concatenations and stripes into a single metadevice type, in which one or more stripes are concatenated together. When used to build a mirror these concat/stripe logical devices are known as submirrors. If you want to expand the size of a mirror device you can do so by concatenating additional stripe(s) onto the concat/stripe devices that are serving as submirrors.
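As a rough illustration of how a concat/stripe lays out data, the Python sketch below maps a logical block to a disk and an offset on that disk. It is a toy model, not SVM code: the `Stripe` class, the `locate` function, the disk names, and the assumption that each disk's size is a whole multiple of the interlace are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Stripe:
    disks: list           # hypothetical component names
    blocks_per_disk: int  # assumed to be a multiple of the interlace
    interlace: int        # blocks written to one disk before moving on

    @property
    def size(self):
        return len(self.disks) * self.blocks_per_disk

    def locate(self, block):
        """Map a block within this stripe to (disk, block on that disk)."""
        chunk, within = divmod(block, self.interlace)
        row, column = divmod(chunk, len(self.disks))
        return self.disks[column], row * self.interlace + within

def locate(concat, block):
    """Walk the concatenated stripes until the block falls inside one."""
    for stripe in concat:
        if block < stripe.size:
            return stripe.locate(block)
        block -= stripe.size
    raise ValueError("block beyond end of metadevice")

if __name__ == "__main__":
    # One two-way stripe, later grown by concatenating a one-disk stripe.
    submirror = [Stripe(["disk_a", "disk_b"], 8, 4)]
    submirror.append(Stripe(["disk_c"], 8, 4))
    for b in (0, 5, 9, 16, 23):
        print(b, locate(submirror, b))
```

Growing the device is just a matter of appending another `Stripe` to the list, which mirrors the way additional stripes are concatenated onto a submirror to expand a mirror.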

So, in SVM, you are always required to set up a stripe (concat/stripe) in order to create a mirror. On the surface this makes it appear that SVM does RAID 0+1. However, once you understand a bit about the SVM mirror code, you'll find RAID 1+0 lurking under the covers.

SVM mirrors are logically divided up into regions. The state of each mirror region is recorded in state database replicas** stored on disk. By individually recording the state of each region in the mirror, SVM can be smart about how it performs a resync. Following a disk failure or an unusual event (e.g. a power failure that occurs after the first side of a mirror has been written to but before the matching write to the second side can be accomplished), SVM can determine which regions are out-of-sync and synchronize only those regions, not the entire mirror. This is known as an optimized resync.
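The general idea of region tracking can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the SVM implementation: the `RegionMirror` class is hypothetical, an in-memory set stands in for the region state kept in the replicas, side 0 is always treated as the resync source, and writes are assumed to be non-empty.

```python
class RegionMirror:
    """Toy two-way mirror that tracks which regions may be out of sync."""

    def __init__(self, size, region_size):
        self.region_size = region_size
        self.sides = [bytearray(size), bytearray(size)]
        self.dirty = set()  # stands in for region state kept in the replicas

    def write(self, offset, data, crash_before_second_side=False):
        first = offset // self.region_size
        last = (offset + len(data) - 1) // self.region_size
        # Mark the affected regions dirty before touching either side.
        self.dirty.update(range(first, last + 1))
        self.sides[0][offset:offset + len(data)] = data
        if crash_before_second_side:
            return  # simulated power failure: the regions stay dirty
        self.sides[1][offset:offset + len(data)] = data
        self.dirty.difference_update(range(first, last + 1))

    def optimized_resync(self):
        """Copy only dirty regions from side 0 to side 1; return bytes copied."""
        copied = 0
        for region in sorted(self.dirty):
            start = region * self.region_size
            end = min(start + self.region_size, len(self.sides[0]))
            self.sides[1][start:end] = self.sides[0][start:end]
            copied += end - start
        self.dirty.clear()
        return copied

if __name__ == "__main__":
    m = RegionMirror(size=1024, region_size=64)
    m.write(0, b"ok")
    m.write(130, b"hello", crash_before_second_side=True)
    print("dirty regions:", m.dirty)
    print("bytes resynced:", m.optimized_resync(), "of", 1024)
```

In this run only region 2 is dirty after the simulated crash, so the resync copies 64 bytes instead of the full 1024.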

The optimized resync mechanisms allow SVM to gain the redundancy benefits of RAID 1+0 while keeping the administrative benefits of RAID 0+1. If one of the drives in a concat/stripe device fails, only those mirror regions that correspond to data stored on the failed drive will lose redundancy. The SVM mirror code understands the layout of the concat/stripe submirrors and can therefore determine which resync regions reside on which underlying devices. For all regions of the mirror not affected by the failure, SVM will continue to provide redundancy, so a second disk failure won't necessarily prove fatal.
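A small Python sketch can make the per-region view concrete. It assumes, purely for illustration, that each submirror is a single stripe and that every mirror region lines up with exactly one interlace chunk, so region r of a submirror lives entirely on disk r modulo the number of disks. Real layouts are more general, and `region_status` and the disk names are hypothetical.

```python
def region_status(submirrors, failed, num_regions):
    """Classify each mirror region given a set of failed disks.

    submirrors: one list of disk names per submirror (each a simple stripe).
    Assumption: region r of a submirror lives on disk r % len(disks).
    """
    status = []
    for r in range(num_regions):
        good = sum(disks[r % len(disks)] not in failed for disks in submirrors)
        if good == len(submirrors):
            status.append("redundant")  # every copy of the region is readable
        elif good > 0:
            status.append("degraded")   # readable, but no longer redundant
        else:
            status.append("lost")       # every copy sits on a failed disk
    return status

if __name__ == "__main__":
    submirrors = [["a0", "a1", "a2"], ["b0", "b1", "b2"]]
    # One failure on each side, on disks holding different regions.
    print(region_status(submirrors, {"a0", "b1"}, 6))
    # One failure on each side, on disks holding the same regions.
    print(region_status(submirrors, {"a0", "b0"}, 6))
```

In the first case every region is still readable even though both submirrors have lost a disk. In the second case the regions held on a0 and b0 are lost, because both copies of those regions sat on failed disks.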

So, in a nutshell, SVM provides a RAID 0+1 style administrative interface but effectively implements RAID 1+0 functionality. Administrators get the best of each type: the relatively simple administration of RAID 0+1 plus the greater resilience of RAID 1+0 in the case of multiple device failures.


* concat/stripe logical devices (metadevices)

The following example shows a concat/stripe metadevice that's serving as a submirror to a mirror metadevice. Note that the metadevice is a concatenation of three separate stripes:

** State database replicas

SVM stores configuration and state information in a 'state database' in memory. Copies of this state database are stored on disk, where they are referred to as state database replicas. The primary purpose of the state database replicas is to provide non-volatile copies of the state database so that the SVM configuration is persistent across reboots. A secondary purpose of the replicas is to provide a 'scratch pad' to keep track of mirror region states.


OpenSolaris

Solaris ( Jun 14 2005, 08:10:53 AM PDT )
