Wednesday Nov 07, 2007

Tuesday Oct 16, 2007
The question of replacing disks in ZFS pools comes up every so often. The most common question is whether ZFS will see the extra space when larger disks replace smaller ones. Let's go through an example:
First, we'll create some files to use as pool storage, and create a zpool out of the smaller two.
bash-3.00# mkfile 64m /var/tmp/a0 /var/tmp/b0
bash-3.00# mkfile 128m /var/tmp/a1 /var/tmp/b1
bash-3.00# zpool create tank /var/tmp/a0 /var/tmp/b0
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank   119M   111K   119M   0%  ONLINE  -
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:
        NAME           STATE     READ WRITE CKSUM
        tank           ONLINE       0     0     0
          /var/tmp/a0  ONLINE       0     0     0
          /var/tmp/b0  ONLINE       0     0     0
errors: No known data errors
Here we've striped a pair of 64MB files for our pool. Now we'll replace the two disks in our stripe with their 128MB counterparts:
bash-3.00# zpool replace tank /var/tmp/a0 /var/tmp/a1
bash-3.00# zpool replace tank /var/tmp/b0 /var/tmp/b1
We wait a few moments, and then check to see that we're done:
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: resilver completed with 0 errors on Mon Oct 15 15:47:58 2007
config:
        NAME           STATE     READ WRITE CKSUM
        tank           ONLINE       0     0     0
          /var/tmp/a1  ONLINE       0     0     0
          /var/tmp/b1  ONLINE       0     0     0
errors: No known data errors
Everything seems to have gone well, and the resilvering is complete. Let's take a look at the pool now:
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank   247M   231K   247M   0%  ONLINE  -
This shows that it works with stripes. Will it work with raidz? Let's create a few more files and test.
bash-3.00# mkfile 64m /var/tmp/c0 /var/tmp/d0
bash-3.00# mkfile 128m /var/tmp/c1 /var/tmp/d1
bash-3.00# zpool destroy tank
bash-3.00# zpool create tank raidz /var/tmp/a0 /var/tmp/b0 /var/tmp/c0 /var/tmp/d0
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank   238M   177K   238M   0%  ONLINE  -
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:
        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          raidz1         ONLINE       0     0     0
            /var/tmp/a0  ONLINE       0     0     0
            /var/tmp/b0  ONLINE       0     0     0
            /var/tmp/c0  ONLINE       0     0     0
            /var/tmp/d0  ONLINE       0     0     0
errors: No known data errors
And now do the replace:
bash-3.00# for f in a b c d; do zpool replace tank /var/tmp/${f}0 /var/tmp/${f}1; done
We wait a little bit for the resilver to complete, and then check the status and size:
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: resilver completed with 0 errors on Tue Oct 16 08:01:00 2007
config:
        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          raidz1         ONLINE       0     0     0
            /var/tmp/a1  ONLINE       0     0     0
            /var/tmp/b1  ONLINE       0     0     0
            /var/tmp/c1  ONLINE       0     0     0
            /var/tmp/d1  ONLINE       0     0     0
errors: No known data errors
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank   238M   408K   238M   0%  ONLINE  -
OK, so that didn't exactly work. The device list is correct, but the size is the same. Let's try export-import to see if that will allow ZFS to see the new size:
bash-3.00# zpool export tank
bash-3.00# zpool import -d /var/tmp tank
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank   494M   189K   494M   0%  ONLINE  -
bash-3.00#
And it works! Of course, if you've got filesystems or volumes shared via NFS or iSCSI, exporting and reimporting is a bit trickier: you'd need to wait until your users have gone home for the day, or just reboot the machine (which does an implicit export/import). It'd be nice if this could happen automatically, as in the striping case above. A bug has been filed for this (6606879).
The final case is mirroring:
bash-3.00# zpool destroy tank
bash-3.00# zpool create tank mirror /var/tmp/a0 /var/tmp/b0
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank  59.5M    94K  59.4M   0%  ONLINE  -
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:
        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            /var/tmp/a0  ONLINE       0     0     0
            /var/tmp/b0  ONLINE       0     0     0
errors: No known data errors
OK, now we'll do the replace:
bash-3.00# zpool replace tank /var/tmp/a0 /var/tmp/a1
bash-3.00# zpool replace tank /var/tmp/b0 /var/tmp/b1
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: resilver completed with 0 errors on Mon Oct 15 16:09:10 2007
config:
        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            /var/tmp/a1  ONLINE       0     0     0
            /var/tmp/b1  ONLINE       0     0     0
errors: No known data errors
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank  59.5M   218K  59.3M   0%  ONLINE  -
The size is still 59.5M. As in the raidz case above, this will take an export/import in order to effect the size change:
bash-3.00# zpool export tank
bash-3.00# zpool import -d /var/tmp tank
bash-3.00# zpool status
  pool: tank
 state: ONLINE
 scrub: none requested
config:
        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            /var/tmp/a1  ONLINE       0     0     0
            /var/tmp/b1  ONLINE       0     0     0
errors: No known data errors
bash-3.00# zpool list
NAME   SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
tank   124M   116K   123M   0%  ONLINE  -
bash-3.00#
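As an aside, the SIZE figures reported in the three experiments fit a simple pattern. The sketch below is only a sanity check of the arithmetic. It assumes that each backing file loses about 4.5 MB (4608 KB) to ZFS labels and reserved space; that figure is inferred from the numbers in this post and is not something the post's commands report. The function name pool_size_mb and the OVERHEAD_KB variable are made up for illustration.

```shell
#!/bin/bash
# Assumption (inferred from the sizes above, not reported by zpool):
# every device gives up roughly 4.5 MB = 4608 KB to ZFS itself.
OVERHEAD_KB=4608

# pool_size_mb DEVICE_MB COUNT
# Prints the summed usable size of COUNT devices, in MB with one decimal.
pool_size_mb() {
    local kb=$(( ($1 * 1024 - OVERHEAD_KB) * $2 ))
    local tenths=$(( kb * 10 / 1024 ))   # size in tenths of a MB
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

pool_size_mb 64 2    # stripe of two 64 MB files
pool_size_mb 128 2   # stripe of two 128 MB files
pool_size_mb 64 4    # raidz of four 64 MB files
pool_size_mb 128 4   # raidz of four 128 MB files
pool_size_mb 64 1    # mirror: capacity of a single side
pool_size_mb 128 1   # mirror after the replace
```

Under that assumption the stripe and raidz figures match the zpool list output exactly (119M, 247M, 238M, 494M), and the mirror figures come out as 59.5 and 123.5, which zpool list shows as 59.5M and 124M. It also suggests that for raidz, zpool list is counting the raw space of all four devices, parity included.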
To summarise: for plain stripes, also known as RAID-0, ZFS can automatically grow the pool after a replace. For mirroring (a.k.a. RAID-1) and raidz/raidz2 (an improved RAID-5/6), you need to export and reimport (or reboot) to get the new size until 6606879 is fixed.
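The whole procedure can be sketched as a small helper that only prints the commands, so they can be reviewed before anything touches a real pool. This is a rough sketch, not a tested tool: the function name plan_grow and its argument layout are hypothetical, and it assumes the old and new devices live in one directory and differ only by a suffix, as the files in this post do. The zpool subcommands it prints are the same ones used above.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) the commands that swap every
# device in a pool for its larger counterpart and then re-import the pool.
# Usage: plan_grow POOL DIR OLD_SUFFIX NEW_SUFFIX NAME...
plan_grow() {
    local pool="$1" dir="$2" old="$3" new="$4" name
    shift 4
    for name in "$@"; do
        echo "zpool replace $pool $dir/$name$old $dir/$name$new"
    done
    echo "# wait until 'zpool status $pool' reports the resilver completed"
    echo "zpool export $pool"
    echo "zpool import -d $dir $pool"
}

# The raidz example from this post:
plan_grow tank /var/tmp 0 1 a b c d
```

For a plain stripe the export and import lines are unnecessary, since the pool grew on its own in that case. For pools built on real disks rather than files, the -d option to zpool import would normally be dropped.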
Friday Sep 14, 2007
Phase One, in which Alec gets his data back after a terror-inducing message.
Lesson learned: Don't Panic!
Thursday Sep 06, 2007
How many people out there have huge PATHs partly because your work requires various tools from different installations, and partly because you end up with multiple instances of the same directory within your PATH?
I suspect more than a few. I just recently added this to the end of my .bashrc, but I welcome more efficient ways of accomplishing the same thing:
PATH="$(echo "$PATH" | tr : '\012' |
perl -n -e 'chomp; if (!defined($h{$_})) { print ":" if $. > 1; print "$_"; $h{$_} = 1; }')"
Invoking perl every time I start a shell is, shall we say, less than satisfying.
Update: From Nevin comes this bash function, which couldn't be formatted properly in the comments section, so I add it here between a couple of pretty <pre> tags:
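strippath()
{
    # Split the argument (default: $PATH) on colons into an array.
    local ifs="$IFS"
    IFS=:
    local path
    path=(${1:-${PATH}})
    local p=${#path[*]}
    # Walk backwards so that the first occurrence of each directory survives.
    until ((0 == p))
    do
        ((--p))
        if [ -d "${path[$p]}" ]
        then
            local e
            for ((e = 0; e != p; ++e))
            do
                if [ "${path[$p]}" == "${path[$e]}" ]
                then
                    unset "path[$p]"
                    break
                fi
            done
        else
            # Empty or nonexistent entries are dropped.
            unset "path[$p]"
        fi
    done
    # IFS is still ":" here, so the remaining entries are joined with colons.
    echo "${path[*]}"
    IFS="$ifs"
}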
PATH="$(strippath)"
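For comparison, here is another perl-free variant that avoids arrays and IFS juggling altogether. It is only a sketch: the name dedupe_path is made up, it keeps the first occurrence of each entry, and unlike the function above it does not check whether a directory exists. It also drops empty entries (which the shell treats as the current directory), an assumption that may not suit everyone.

```shell
#!/bin/bash
# Hypothetical helper: remove duplicate entries from a colon-separated
# list (default: $PATH), keeping the first occurrence of each.
dedupe_path() {
    local rest="${1-$PATH}:" result="" entry
    while [ -n "$rest" ]; do
        entry="${rest%%:*}"      # text before the first colon
        rest="${rest#*:}"        # everything after that colon
        [ -n "$entry" ] || continue   # skip empty entries
        case ":$result:" in
            *":$entry:"*) ;;     # already present, skip it
            *) result="${result:+$result:}$entry" ;;
        esac
    done
    printf '%s\n' "$result"
}

dedupe_path "/usr/bin:/bin:/usr/bin:/opt/tools/bin:/bin"
```

In a .bashrc it would be used the same way as strippath, with PATH="$(dedupe_path)". The membership test is a string match against the result built so far, so it stays within shell built-ins and should also work in the bash 3.00 shown earlier.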
Tuesday Aug 28, 2007
Monday Aug 27, 2007
Well, it looks like we're partway there. The URL http://finance.yahoo.com/q?s=java currently displays "invalid ticker symbol" just beside a flash ad for Scottrade with a description of Sun Microsystems.

Give it a few more hours...
Friday Aug 24, 2007
Jonathan Schwartz has been doing great things for Sun. He seems to know the market well, and he has done some very bold things with our IP that have been paying off.
Once in a while, though, you have to scratch your head in wonder. There have been a fair number of responses to the stock symbol name change, both pro and con, and I've been thinking about it too.
Many of the reactions remind me of similar reactions to relabeling Pluto as a "dwarf planet". Those reactions also varied from positive to negative. Those against the change were saying that they grew up having learnt that Pluto was a planet and nothing was going to change their minds about it. Those for it felt that Pluto just didn't fit in with the other eight somehow.
This isn't the first time the CEO has done something in an attempt to use a popular item in order to increase awareness - consider the time he tried to use the iPod to sell storage racks. So now he's trying to use the more well-known technology, Java, to sell the Sun brand.
Some might dislike giving up 'SUNW', to which I respond 'eh, it's just a sequence of letters'. Some might consider the move an unreasonable one to make. After all, we're not renaming the company. My response is to wait and see. It didn't cost a lot of money (relatively speaking) to implement the stock symbol change. If it does nothing, we haven't lost much. If it works, well then. We'll have gained much for almost nothing.
If you still think it unreasonable, consider again that well-worn saying about unreasonable people.
Tuesday Aug 07, 2007
Monday Jul 16, 2007
I keep dropping my fork.
There's a canteen at the Sun campus in Burlington, MA. They've called it the Liberty Café, but it serves more than one would expect from a café. They've got a wide variety of food, and the quality is ok.
Occasionally I'll buy food there and walk back to my office to eat it. They thoughtfully provide plastic utensils for that very purpose, so I grab a fork before heading back. However, fumbling with a drink, some napkins, a plastic fork, a container of food, and dealing with badge-accessible doors is a nontrivial task for a klutz like myself, and I invariably drop the fork before making it all the way back. Recently I've been trying a work-around: grab two forks whenever I choose to bring food back to my office.
So now, instead of dropping a fork on the way back, I drop two of them.
Friday Jun 29, 2007
Bill just can't resist the allure of ZFS.
I had no idea. It sounds like the folks at Apple are going to release something called an iPod that plays music and video only. Sort of like a stripped-down iPhone. I guess they're trying to follow the initial 'iP' letters of iPhone, otherwise why not call it iMusic or iMP3?
Anyway, I don't see the point. It's basically an iPhone with a smaller screen, no WiFi, and no cellphone capabilities. Who in their right mind would buy one?
Tuesday Jun 26, 2007
An exciting milestone has been reached in ZFS land. Mark Shellenbaum has just checked into the Nevada gate some bits that will allow non-root users to have access to zfs. The short version: with two new zfs subcommands, 'allow' and 'unallow', a sysadmin can grant specific permissions to users (taking snapshots, setting properties, &c). The long version: see Mark's blog entry I just linked to.
Monday Jun 25, 2007
I took last week off.
Good: rest and relaxation
Bad: Thinking about my current project anyway
Good: spending time with my kids
Bad: Coming back to a very full inbox
Good: Getting some work done around the house & garden
Bad: sunburn
Good: Spending time with friends I haven't seen in a week
I think I'll end it there on a high note. After all, it's been far more good than not.
Wednesday Jun 06, 2007
Matthew Lee Hinman posted a perl script for automatically showing differences in files between the 'current' filesystem and any given snapshot. Looks rather handy.
As a recent video showed (thanks Geoff) even if I'm one in a million, that means there are 300 people like me here in the States. And 1,600 people like me in China. And 6,000 people like me in the world.
At Sun, there are over 40 people, including myself, with the first name Mark whose last name starts with M. One of them even works on the same project as myself.
So, it was rather vital that, when the OpenID announcement was made today, I hop onto the site and grab http://openid.sun.com/markm. Bwa ha ha.
*cough*
There are certainly benefits and drawbacks to having such a common first name. As my last name, Musante, is rather unusual here in the States, it takes the edge off. Moreover, there's an odd affinity one feels for those who share your name. By coincidence, my best friend in high school was named Mark, and no doubt our initial friendship was helped by our shared name. It's also fun to tease those with alternate spellings, as those of you with Jeff/Geoff, or Neil/Neal, or Erik/Eric names are aware. (The spelling 'Marc' is clearly incorrect. And, although Marcus is fine, Markus is just plain wrong.)
The main drawback is that it's SO common. There are over 370 employees at Sun with the first name Mark, and almost 60 with Marc (see? it's wrong!). So when we named our children, we deliberately picked names that were relatively uncommon, but common enough to offset the unusualness of Musante: Alec, Samantha, Zoë. Only one bloke here at Sun is named Alec (although there is an Aleck too). Only 9 Samanthas, and only 3 Zoës.