Multithreaded Musings
Stand back - I'm a scientist!
Tuesday Oct 16, 2007
ZFS and automatically growing pools

The question of replacing disks in ZFS pools comes up every so often. The most common question is whether a pool will grow when its disks are replaced with larger ones. Let's go through an example:

First, we'll create four files to use as pool storage (two of 64MB and two of 128MB), and create a zpool out of the smaller two.

bash-3.00# mkfile 64m /var/tmp/a0 /var/tmp/b0
bash-3.00# mkfile 128m /var/tmp/a1 /var/tmp/b1
bash-3.00# zpool create tank /var/tmp/a0 /var/tmp/b0
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   119M   111K   119M     0%  ONLINE  -
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	  /var/tmp/a0  ONLINE       0     0     0
	  /var/tmp/b0  ONLINE       0     0     0

errors: No known data errors

Here we've striped a pair of 64MB files for our pool. Now we'll replace the two disks in our stripe with their 128MB counterparts:

bash-3.00# zpool replace tank /var/tmp/a0 /var/tmp/a1
bash-3.00# zpool replace tank /var/tmp/b0 /var/tmp/b1

We wait a few moments, and then check to see that we're done:

bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: resilver completed with 0 errors on Mon Oct 15 15:47:58 2007
config:

	NAME           STATE     READ WRITE CKSUM
	tank           ONLINE       0     0     0
	  /var/tmp/a1  ONLINE       0     0     0
	  /var/tmp/b1  ONLINE       0     0     0

errors: No known data errors
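
Rather than checking by hand, the wait could be scripted. This is only a sketch: resilver_done and wait_for_resilver are made-up names, the phrase being matched is copied from the status output above (other ZFS releases may word it differently), and the 5-second polling interval is arbitrary.

```bash
# Hypothetical helper: succeed if the `zpool status` text on stdin
# reports a finished resilver. The phrase comes from the transcripts
# in this post and is an assumption for any other release.
resilver_done() {
    grep 'resilver completed' >/dev/null
}

# Hypothetical wrapper: poll a pool until resilver_done succeeds.
# Assumes zpool is on the PATH and the pool exists; if it does not,
# this loops forever.
wait_for_resilver() {
    pool=$1
    until zpool status "$pool" | resilver_done; do
        sleep 5
    done
}
```

With these defined, `wait_for_resilver tank` would block until the scrub line reports completion. If several replaces are issued back to back, it is worth calling it again after the last one.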

Everything seems to have gone well, and the resilvering is complete. Let's take a look at the pool now:

bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   247M   231K   247M     0%  ONLINE  -

The pool has grown from 119M to 247M with no further action on our part, so it works with stripes. Will it work with raidz? Let's create a few more files and test.

bash-3.00# mkfile 64m /var/tmp/c0 /var/tmp/d0
bash-3.00# mkfile 128m /var/tmp/c1 /var/tmp/d1
bash-3.00# zpool destroy tank
bash-3.00# zpool create tank raidz /var/tmp/a0 /var/tmp/b0 /var/tmp/c0 /var/tmp/d0
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   238M   177K   238M     0%  ONLINE  -
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

	NAME             STATE     READ WRITE CKSUM
	tank             ONLINE       0     0     0
	  raidz1         ONLINE       0     0     0
	    /var/tmp/a0  ONLINE       0     0     0
	    /var/tmp/b0  ONLINE       0     0     0
	    /var/tmp/c0  ONLINE       0     0     0
	    /var/tmp/d0  ONLINE       0     0     0

errors: No known data errors

And now do the replace:

bash-3.00# for f in a b c d; do zpool replace tank /var/tmp/${f}0 /var/tmp/${f}1; done

We wait a little bit for the resilver to complete, and then check the status and size:

bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: resilver completed with 0 errors on Tue Oct 16 08:01:00 2007
config:

	NAME             STATE     READ WRITE CKSUM
	tank             ONLINE       0     0     0
	  raidz1         ONLINE       0     0     0
	    /var/tmp/a1  ONLINE       0     0     0
	    /var/tmp/b1  ONLINE       0     0     0
	    /var/tmp/c1  ONLINE       0     0     0
	    /var/tmp/d1  ONLINE       0     0     0

errors: No known data errors
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   238M   408K   238M     0%  ONLINE  -

OK, so that didn't quite work. The device list is correct, but the size is still 238M. Let's try an export and import to see if that will allow ZFS to see the new size:

bash-3.00# zpool export tank
bash-3.00# zpool import -d /var/tmp tank
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   494M   189K   494M     0%  ONLINE  -
bash-3.00# 

And it works! Of course, if you've got filesystems or volumes shared via NFS or iSCSI, exporting and reimporting is a bit trickier: you'd need to wait until your users have gone home for the day, or just reboot the machine (which does an implicit export/import). It'd be nice if this could happen automatically, as in the striping case above. A bug has been filed for this (6606879).
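
To make the before and after comparison less of an eyeball exercise, the SIZE column can be pulled out of the listing. A minimal sketch follows: pool_size and reimport_and_report are hypothetical names, and the parsing assumes the column order shown in this post (NAME first, SIZE second).

```bash
# Hypothetical helper: print the SIZE column for one pool, reading
# `zpool list` text on stdin. Assumes NAME is the first column and
# SIZE the second, as in the listings in this post.
pool_size() {
    while read -r name size rest; do
        if [ "$name" = "$1" ]; then
            echo "$size"
        fi
    done
}

# Hypothetical wrapper: export and reimport a pool, then report the
# size before and after. Assumes zpool is on the PATH and that nothing
# is using the pool; dir is the directory searched by `zpool import -d`.
reimport_and_report() {
    pool=$1 dir=$2
    before=$(zpool list | pool_size "$pool")
    zpool export "$pool" && zpool import -d "$dir" "$pool" || return 1
    after=$(zpool list | pool_size "$pool")
    echo "$pool: $before -> $after"
}
```

For the pool above this would be called as `reimport_and_report tank /var/tmp`. For pools built on real disks rather than files, the `-d` argument would need to be dropped or changed.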

The final case is mirroring:

bash-3.00# zpool destroy tank
bash-3.00# zpool create tank mirror /var/tmp/a0 /var/tmp/b0
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank  59.5M    94K  59.4M     0%  ONLINE  -
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

	NAME             STATE     READ WRITE CKSUM
	tank             ONLINE       0     0     0
	  mirror         ONLINE       0     0     0
	    /var/tmp/a0  ONLINE       0     0     0
	    /var/tmp/b0  ONLINE       0     0     0

errors: No known data errors

OK, now we'll do the replace:

bash-3.00# zpool replace tank /var/tmp/a0 /var/tmp/a1
bash-3.00# zpool replace tank /var/tmp/b0 /var/tmp/b1
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: resilver completed with 0 errors on Mon Oct 15 16:09:10 2007
config:

	NAME             STATE     READ WRITE CKSUM
	tank             ONLINE       0     0     0
	  mirror         ONLINE       0     0     0
	    /var/tmp/a1  ONLINE       0     0     0
	    /var/tmp/b1  ONLINE       0     0     0

errors: No known data errors
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank  59.5M   218K  59.3M     0%  ONLINE  -

The size is still 59.5M. As in the raidz case above, this will take an export/import in order to effect the size change:

bash-3.00# zpool export tank
bash-3.00# zpool import -d /var/tmp tank
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:

	NAME             STATE     READ WRITE CKSUM
	tank             ONLINE       0     0     0
	  mirror         ONLINE       0     0     0
	    /var/tmp/a1  ONLINE       0     0     0
	    /var/tmp/b1  ONLINE       0     0     0

errors: No known data errors
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
tank   124M   116K   123M     0%  ONLINE  -
bash-3.00# 

To summarise: for plain stripes, also known as RAID-0, ZFS can automatically grow the pool after a replace. For mirroring (a.k.a. RAID-1) and raidz/raidz2 (an improved RAID-5/6), you need to export and reimport (or reboot) to get the new size until 6606879 is fixed.
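
As a sanity check on the sizes reported above, the numbers fit a simple pattern: each file contributes about 4.5MB less than its nominal size, the stripe and raidz listings add up every file (parity included), and the mirror listing counts just one side. The sketch below encodes that fit. The estimated_size helper is hypothetical, and the 4.5MB figure is read off the listings in this post rather than taken from any documented formula.

```bash
# Hypothetical helper: estimate the SIZE that `zpool list` showed in
# this post. layout is stripe, raidz or mirror; count is the number
# of files; mb is the nominal size of each file in MB.
# The 4.5 MB per-file overhead is fitted to the listings in this post.
estimated_size() {
    layout=$1 count=$2 mb=$3
    case $layout in
        stripe|raidz) devices=$count ;;  # listing sums every file
        mirror)       devices=1 ;;       # both sides hold the same data
        *) echo "unknown layout: $layout" >&2; return 1 ;;
    esac
    awk "BEGIN { print $devices * ($mb - 4.5) }" </dev/null
}

estimated_size stripe 2 64     # 119   (listed as 119M)
estimated_size stripe 2 128    # 247   (listed as 247M)
estimated_size raidz  4 64     # 238   (listed as 238M)
estimated_size raidz  4 128    # 494   (listed as 494M)
estimated_size mirror 2 64     # 59.5  (listed as 59.5M)
estimated_size mirror 2 128    # 123.5 (listed as 124M after rounding)
```

If a listing after a replace still matches the estimate for the old file size, the pool has not yet picked up the new size and needs the export/import.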

Posted at 01:10PM Oct 16, 2007 by Mark Musante in ZFS  |  Comments[2]

Comments:

Great post! Can zfs convert a mirror with 2 disks to a raidz1 with 3 disks? I am going to add a disk to my home system and want to increase the space and keep the redundancy.

Posted by Kevin on October 18, 2007 at 03:56 PM BST

Unfortunately, adding devices to mirrors can only make 'wider' mirrors.

The only way I can think of to convert a mirror to a raidz would be to build a raidz separately, and do a zfs send to it.

Posted by Mark J Musante on October 22, 2007 at 02:23 PM BST
