On an internal discussion list, someone mentioned that for a Mac Mini, an external FireWire drive was much faster than the internal drive. As I had no FireWire drive, I decided to buy one and try it out. I went with the WD MyBook 500GB drive.
I downloaded SuperDuper to clone my freshly installed internal drive to the external one. It reformatted the default FAT32 partition on the WD and backed everything up. The SuperDuper documentation did warn that not all firewire drives would boot.
I could reboot and press option to load from the external drive. I used a black and white background for the internal and a vibrant flower for the external. But I couldn't get the drive to automatically boot up. Every time I thought I had it, I got the dull black and white.
I figured out that I needed to use System Preferences -> Startup Disk in order to set the boot disk. But all I could see there was the internal and a network boot option. I could see the firewire disk was loaded on the screen. And depending on how I booted it, I could find it in the namespace.
And quite frankly, Google failed me here. I didn't find much and what I found did not work.
So I decided to start from scratch and install directly onto the drive. But the installer declared that the firewire drive was not bootable. I reformatted it and then figured out that it had an MBR and not a GUID partitioning scheme. Crap! I fixed that and suddenly I could start installing and booting off of the external drive.
I went and saw a neat presentation by Scott McCord at the Oklahoma City OpenSolaris User Group meeting last night. He mainly talked about our Unified Storage 7000 products, building up the rationale for why a customer would want to use them.
The timing was awesome as it dovetails into my exploration of the systems. One of the key takeaways for me is that the morphing of OpenSolaris offerings into Unified Storage was not a small step. Yes, it is built on the shoulders of everyone who has ever worked on Solaris, but there are a lot of small and subtle changes which really transform a collection of hardware into a powerful appliance.
I'm going to highlight what I learned, which may or may not dovetail into what Scott presented.
I had kept on wondering why Sun wasn't exploring NVRAM and the answer is Write Optimized SSD. Sweet! And then Scott showed a slide of a clustered 7410 and the Write Optimized SSD wasn't in the head, it was with the disk shelves. The implications were immediate:
This went over well in the room and I think might actually be one of the hidden definitions of OpenStorage. :-> It is simple, you buy one of our appliances and everything that is supported by the hardware is immediately open for you to start deploying. You don't have to contact a sales rep, you don't have to rush through a PO because your management chain decided to start using your NFS appliance as a CIFS appliance as well.
And when you download the Sun Unified Storage Simulator, you can play with all of the features right away. You don't have to ask for permission.
This is basically a fancy way of saying that we have a graphical frontend to DTrace. And DTrace is the coolest customer support tool ever. It sounds weird, but let me explain.
Before DTrace, if a customer was hitting an issue that didn't present a core dump and static trace points weren't capturing the data, the only solution was to prepare a custom kernel to ship to the customer. This would entail the developer guessing at what data they wanted to collect, unit testing, and passing the resultant kernel off to QA. They would then do unit testing, perhaps try to recreate the customer environment (and a shout-out goes to Bill Snider who was really good at doing this when I was at NetApp), and then do regression testing.
The kernel would then go to the customer. And they would then normally run some of their own regression testing. They would then put it on the production system. If you were lucky, you got the data you needed. If you were unlucky, then you either had to roll another version (with the customer losing faith in your abilities) or you got a core dump. And again, you might still get lucky and catch the problem in the core. But you really had an irate customer either way.
With dynamic tracing, you remove that whole cycle of verifying a custom build. And remember, that cycle could be several weeks long! Instead, you piggyback right off of a well tested product and the resulting quality. You focus right in on the issue at hand. And even better, there are some pretty sharp sysadmins out there who can run DTrace on their own!
Along these lines, I asked Scott to include the video from Brendan Gregg's blog on Unusual disk latency (video is here).
I'd heard out at Connectathon 2009 that our appliances would fall over if you shouted at them. I knew that our competitors were going to use this as FUD. So I was really happy to find Brendan's openness about the issue.
But don't lose sight of Brendan's real message here - Sun has the only tool dynamic enough to measure the impact of shouting at a disk. Any competitor's disks are going to react the exact same way in the face of a shout. But can they measure the impact without that cost of developing special static trace points?
My point to Scott is that the video is exciting to watch (thanks to Brendan's upbeat personality) and drives home the point of what DTrace and Analytics bring to the table.
Someone in the audience stated that they just got a Thor box on a Sun Try and Buy program and wanted to know if they could convert it directly to a Unified Storage product. The short answer is to turn that Sun Fire X4540 Server back in and get a Sun Storage 7210 Unified Storage System out for an evaluation.
Scott didn't blindly recommend this. He asked the audience member what they were doing, and the answer was that the box was intended to be a CIFS home-directory server. If the answer had been that they wanted to run something on the box, then the recommendation would have been to stick with the Thor.
The point here is that the Unified Storage is optimized as an appliance. We aren't going to let you run a program on it. The focus is to leverage the reliability of OpenSolaris, the rock solidness of the underlying hardware, the experience we've had in configuring ZFS, tested hardware configurations, and the ease of use of the BUI.
We haven't moved away from our stance that we have great OpenStorage systems that allow you to run your application right on the server. Instead, we are focusing on taking that same hardware and software base and turning it into an appliance tailored to serving up data. Our value add is the way you interact with the appliance, the testing we put into both the software and hardware configurations, etc. I.e., we make sure the configurations work right out of the box. And in a stroke of genius, we make sure that the configuration of the box is simple.
So I had a workspace I needed to get integrated back to the ON gate. I last worked on it about Feb 14th and I couldn't remember which machine, let alone which workspace it was located in. I found one that looked right and started the merge off:
[th199096@aus-build-x86 smurf]> hg pull -u
pulling from ssh://onnv.eng//export/onnv-clone
searching for changes
adding changesets
adding manifests
adding file changes
added 646 changesets with 5986 changes to 4868 files (+1 heads)
not updating, since new heads added
(run 'hg heads' to see heads, 'hg merge' to merge)
[th199096@aus-build-x86 smurf]> hg merge
abort: outstanding uncommitted changes
I've seen this before; it normally means my workspace has been corrupted. I can at least recover to get the original version of the changes:
[th199096@aus-build-x86 smurf]> hg rollback
rolling back last transaction
[th199096@aus-build-x86 smurf]> hg commit
nothing changed
[th199096@aus-build-x86 smurf]> hg status
! usr/src/pkgdefs/SUNWmptsas/copyright
! usr/src/pkgdefs/SUNWmptsas/depend
But I don't see what I never committed! Try again:
[th199096@aus-build-x86 smurf]> hg merge
abort: there is nothing to merge
[th199096@aus-build-x86 smurf]> hg pull -u
pulling from ssh://onnv.eng//export/onnv-clone
searching for changes
adding changesets
adding manifests
adding file changes
added 646 changesets with 5986 changes to 4868 files (+1 heads)
not updating, since new heads added
(run 'hg heads' to see heads, 'hg merge' to merge)
[th199096@aus-build-x86 smurf]> hg merge
abort: outstanding uncommitted changes
[th199096@aus-build-x86 smurf]> hg rollback
rolling back last transaction
[th199096@aus-build-x86 smurf]> hg list
modified:
usr/src/uts/common/fs/nfs/nfs_server.c
Okay, that looks right for what I need to modify. I then proceeded to create a new workspace into which I was going to copy over the file. Once I did that, I did a diff and realized that the reason I was getting an error was that this set of changes had already been integrated into the ON gate!
Okay, I took the Sun Unified Storage Simulator out for a light walk the other day. And I made sure to avoid the documentation. Now I want to see what I can do with the documentation!
Fun Fact #1: Did you know that the CLI supports tab completion? Way cool.
So, what do I need to do to see all of the network interfaces?
snarky:> configuration
snarky:configuration> net
snarky:configuration net> interfaces show
Interfaces:
INTERFACE STATE CLASS LINKS ADDRS LABEL
e1000g0 up ip e1000g0 192.168.111.128/24 Untitled Interface
e1000g1 up ip e1000g1 192.168.2.139/24 outbound
Too long, can we do better?
snarky:> configuration net interfaces show
Interfaces:
INTERFACE STATE CLASS LINKS ADDRS LABEL
e1000g0 up ip e1000g0 192.168.111.128/24 Untitled Interface
e1000g1 up ip e1000g1 192.168.2.139/24 outbound
I tried to do this with slashes first, like a path, but I should have known better because we use a 'show' command, not a 'dir' like in the LOM interfaces.
Now, we get to see some of the power of this CLI! First, we can add ssh keys.
snarky:> configuration preferences keys
snarky:configuration preferences keys>
I was able to tab complete my way with each sub-context.
snarky:configuration preferences keys> create
snarky:configuration preferences key (uncommitted)> set type=DSA
type = DSA (uncommitted)
snarky:configuration preferences key (uncommitted)> set key="copy from your id_dsa.pub"
key = copy from your id_dsa.pub (uncommitted)
snarky:configuration preferences key (uncommitted)> set comment="thud-key1"
comment = thud-key1 (uncommitted)
snarky:configuration preferences key (uncommitted)> commit
error: An unanticipated system error occurred: Illegal character in key: ' '
This may be due to transient failure, or a software defect. If this problem
persists, contact your service provider.
snarky:configuration preferences key (uncommitted)> set key="copy-from-your-id_dsa.pub"
key = copy-from-your-id_dsa.pub (uncommitted)
snarky:configuration preferences key (uncommitted)> commit
The CLI will deny a command if it has invalid input. This feature really drove me to understand what was going on.
Anyway, the biggest stumbling block I had here was in what went into the 'key' property. Sometimes I am guilty of being too terse and connecting all of the dots in my head. I think the documentation here has done that as well. I really worked my way around the CLI and the BUI (Browser User Interface) trying to figure this one out. Along the way, I found that as I learned the CLI interface, the BUI interface started to make more sense. And that in turn made the CLI more natural. I liked the positive feedback loop here.
The BUI finally clued me in on what to do here:
I needed to be entering the contents of my 'id_dsa.pub' file, which I had previously generated with 'ssh-keygen'. Once I figured that out, and that it was easier to enter with the CLI (using a browser to add a long key is normally error prone because of automatic insertion of spaces), I was good to go:
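For what it's worth, an OpenSSH public key file is a single line of space-separated fields: the key type, the base64 blob, and an optional comment. Here is a minimal sketch of pulling those apart. I am assuming, without having confirmed it against the appliance documentation, that the 'key' property wants a value with no spaces in it, which would fit the error above. The function name and the sample key text are made up for illustration.

```python
def split_public_key(line):
    """Split one line of an OpenSSH public key file into its fields.

    The file format is "<type> <base64-blob> [comment]". Which of these
    the appliance expects in 'key' is an assumption here, not something
    taken from its documentation.
    """
    fields = line.strip().split(None, 2)
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64-blob> [comment]'")
    key_type, blob = fields[0], fields[1]
    comment = fields[2] if len(fields) == 3 else ""
    return key_type, blob, comment


# Hypothetical, truncated key text; a real blob is several hundred characters.
sample = "ssh-dss AAAAB3NzaC1kc3MAAACBAexample thud@adept\n"
key_type, blob, comment = split_public_key(sample)
print(key_type)  # the type field
print(blob)      # the field with no spaces in it
print(comment)   # whatever trails the blob
```

Splitting on whitespace with a limit of two keeps any spaces inside the comment intact, which matters because comments are free-form text.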
[thud@adept ~]> ssh root@snarky configuration net interfaces show
Interfaces:
INTERFACE STATE CLASS LINKS ADDRS LABEL
e1000g0 up ip e1000g0 192.168.111.128/24 Untitled Interface
e1000g1 up ip e1000g1 192.168.2.139/24 outbound
And now, can I do the shares?
[thud@adept ~]> ssh root@snarky shares select default select jaloppy get sharenfs
sharenfs = rw,anon=braves,root=pseries.internal.excfb.com:adept.internal.excfb.com
I struggled for a bit on how to chain the 'select', I wanted to do something like a shell script '(cd dir1; ..)', but I did find this shining example on how to automate ssh scripts to our boxes: Snapshots with PostgreSQL and Sun Storage 7000 Unified Storage Systems.
I'll save for another day creating a script to drill down and get all of the NFS shares from a Sun Storage 7000 Unified Storage System.
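As a starting point for that script, the chained 'select' steps are really just extra words on the ssh command line. Here is a rough sketch of building that command line; the function name is my own invention, and the host, project, and share names are the ones from my session above. It only builds the argument list and does not talk to an appliance.

```python
import shlex


def appliance_argv(host, contexts, command, user="root"):
    """Build the argv for running one appliance CLI command over ssh.

    Each entry in `contexts` is one CLI context step, e.g. "shares" or
    "select default", in the same order they are typed interactively.
    """
    words = []
    for step in list(contexts) + [command]:
        words.extend(step.split())
    return ["ssh", "%s@%s" % (user, host)] + words


argv = appliance_argv(
    "snarky",
    ["shares", "select default", "select jaloppy"],
    "get sharenfs",
)
# Show the command line that would be run.
print(" ".join(shlex.quote(a) for a in argv))
```

Handing the list to subprocess.run(argv, capture_output=True, text=True) would run it for real. Walking every project and share would still need a way to list them first, which I have not worked out yet.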
The CLI for the Sun Storage 7000 Unified Storage System actually blows away my experience with the NetApp CLI!
On March 10th, I'll be heading out to Oklahoma City for the next OKCOSUG meeting: OpenSolaris Project: Oklahoma City OpenSolaris User Group. I find the topic to be interesting and I like meeting people who use our product.
You can click on the link above for directions and such - also, please be sure to register to let Bryan know how many people to plan refreshments for. And a quick summary of the agenda is:
Agenda
5:00PM to 5:30PM Meet and Greet
5:30PM to 7:00PM Sun Unified Storage 7000 Technical Overview
7:00PM to 7:15PM Summary and open for questions
I got an email asking me what I didn't like about the Fishworks CLI (see Impressions of Sun Unified Storage Simulator). I thought I would answer here.
I'd say the main complaint I had with it would be I couldn't issue 'ifconfig' or 'share'. I was trying to help someone over on a Linux mailing list with an issue (you can find the thread here -- Permission problem with NFSv4 mount and [Fwd: Re: Permission problem with NFSv4 mount]) and I wanted to just dive right in like it was any other OpenSolaris box.
It was actually easier for me to use the web interface than the CLI. Was the CLI that hard to use? No - just different. It was easy enough to figure out how to move about in it with a couple of minutes of experimentation. I never pulled out any documentation.
So, it would be easiest for me if the CLI had normal unix commands. That doesn't mean that it would be easiest for everyone. This new product was the perfect time to change the CLI - you do not want local access on an appliance. Within pNFS, we are struggling with having local access for the MDS and DSes. Rob Thurlow is busy working on Proxy IO in case the MDS needs to read/write directly to the DSes (it also is a cheeky way to provide NFSv3 and CIFS access to the community).
So given that you want to deny local access, why not change the interface? And if you are going to change it, why not start off from scratch?
The whole GUI design reeks of object oriented design. Why not also have a CLI that does that?
Hmm, I'm not hitting specifics, which tells me a couple of things:
I can't stress enough that I went into using the sim convinced I was going to hate it. I liked the NetApp sims, in part because I worked on them. But the NetApp sim was in no way as easy to configure or deploy. It took me a couple of minutes to get this one up and running on the VMware network and only a couple more to get it running across the physical wire. In the meantime, I had Debian Lenny fail to install on both a VMware machine and a Mac Mini. I finally got it working on an old laptop.
I walked away from using the simulator very happy with my experience. I think I was able to replicate the issue that a customer was seeing - I'm still waiting on a reply from him to confirm that. I'm confident that I can use it to troubleshoot other issues that might be reported on NFSv3 and NFSv4 for the Sun Storage 7000 Unified Storage Systems.
I also know that a lot of effort was put into observability for these boxes - just look at Bryan's entry on Eulogy for a benchmark or Brendan's entry on Unusual disk latency. My takeaway from both of these is that we can now measure things that other vendors cannot. Before DTrace, you had to systematically add trace points, recompile, QA, and then ship new bits to customer sites. And they may have had to QA the incoming bits. Now you can just start asking the interesting questions right away and getting answers.
Without delving too deeply, I expect all of this to be more integrated with the new UIs. I suspect there is a power here. I'm pretty sure I can quickly dig through this stuff on the web interface. And I'm just as confident that the CLI has the same power, but it just isn't my parents' '>' prompt...
So I have to admit that I hate GUI interfaces to appliances and I love simulators. At NetApp, I loved the really streamlined console you had for a filer and hated FilerView. I would roll my own Perl scripts to maintain a large pool of filers when I was a filer admin for both Corporate and Engineering IT over there.
And when I worked for NetApp's NFS team, I loved the sims we used for quick and painless testing.
So I approached the Sun Unified Storage Simulator with a bit of trepidation. It turns out that you can ssh into the box, but doing remote administration that way is nowhere near as nice as the NetApp filer console was three years ago. (Note: I have no experience with the GX series console.) Update: I've quantified my views on the ssh access over at So what didn't I like about the CLI?.
The better news is that the remote http access is way better than FilerView. I was able to quickly understand the relationship between projects and shares. I was able to quickly configure a complicated share on my storage. I loved it.
I also loved that it worked straight away, right out of the box. I happened to already have VMware Workstation 6.5.1 on my home desktop, so I was up and running. I was a little disappointed that it only supported local access on the desktop, until I realized I had 3 other network interfaces I could configure. Before I knew it, I had mounts going to it from remote computers.
I really found the GUI to be intuitive and I only struggled for a bit wishing for a CLI. I also used the minimal CLI to gather some data on the share. It reminded me a lot of the way our LOM managers work.
The other real plus in comparing this simulator versus the ones NetApp provides is that I don't need any license keys. Sweet!
I totally trashed my old Roller template and went with a simpler scheme. With the experience I got from doing the web sites for Connectathon.org, BlitzUnited.org, and 97Red.com, it was pretty simple.
The hardest issue was getting the Categories to be formatted. I wanted them to have better class names, but in the end I looked in the resulting html file and just changed my style file to match.
I wish I knew how to see others' custom templates. It is pretty easy to see any style files they develop, but I'm afraid that the templates are within a database.
I'm done for now. Oh, and I'd like to figure out why the colors I used for the site banner with Gimp are coming out different than the background color I set. This is pretty consistent across browsers. The only other thing I see is that the custom borders for the content look washed out in Safari.
I'm trying to redesign my blog layout. Here is what I thought was a cool pNFS image that I was going to put in my banner. It didn't work, but I did spend a chunk of time in OmniGraffle, so enjoy!
It kinda reminds me of those stickish soccer figures which are popular in club logos.
My colleague, Piyush Shivam, was slated to give a 15-20 minute presentation at the pNFS BoF at FAST 2009. Due to a miscommunication, this slide deck was never presented: pNFS.
He put a lot of hard work into it, so enjoy!
I'm trying to help someone on nfsv4@linux-nfs.org with an NFSv4 issue between a Linux client and a Sun Storage 7410 system. I don't have my hands on a real 7410, but I found this great resource for letting me simulate one: Sun Unified Storage Simulator.
It downloaded easily and I'm about to configure it...
The client team gave an interesting presentation at cthon: Solaris pNFS Client WIPs. Some useful tips I walked away with are:
[root@pnfs-9-23 ~/cthon04]> nfsstat -c -v 41
...
bind_conn_to_session exchange_id create_session
0 0% 3 0% 11 0%
...
Check to see that your exchange_id is at least the number of servers. In my case, I have 3 of them and we can see that the client has done IO with all of them. This must mean that a layout has been granted.
[root@pnfs-9-23 nfs41]> cat >> foo
kdkdk lddkk
And check with:
[root@pnfs-9-23 ~]> nfsstat -l /net/pnfs-9-26/pnfs1/nfs41/foo
Layout unacquired
Okay, I had a mds panic earlier, so I suspect that it does not know about the ds servers. After a quick reboot, we see:
[root@pnfs-9-23 ~]> nfsstat -l /net/pnfs-9-26/pnfs1/nfs41/loompa
Number of layouts: 1
Proxy I/O count: 0
DS I/O count: 1
Layout [0]:
Layout creation timestamp: Mon Mar 2 12:27:50:25677 2009
Layout [0]:, iomode: LAYOUTIOMODE_RW
offset: 0, length: EOF
num stripes: 2, stripe unit: 32768
Stripe [0]:
tcp:pnfs-9-25.Central.Sun.COM:10.1.233.67:48217 OK
Stripe [1]:
tcp:pnfs-9-24.Central.Sun.COM:10.1.233.66:62363 OK
From this I can tell which machines are my DS, but I can't tell which zfs data sets I am writing to on those machines.
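Pulling the DS machines out of that output is easy to script. Here is a minimal sketch that assumes the stripe lines always look like the two above (transport, hostname, address, port, then a status word). I have only seen this one sample, so the pattern and the function name are guesses rather than a documented format.

```python
import re

# Matches the stripe lines as printed above:
#   <transport>:<hostname>:<ip address>:<port> <status>
# The layout of this line is inferred from one sample, not from a spec.
STRIPE_RE = re.compile(r"^\s*(\w+):([^:\s]+):([\d.]+):(\d+)\s+(\S+)\s*$")


def data_servers(nfsstat_output):
    """Return (hostname, address, port, status) for each stripe line."""
    servers = []
    for line in nfsstat_output.splitlines():
        match = STRIPE_RE.match(line)
        if match:
            _transport, host, addr, port, status = match.groups()
            servers.append((host, addr, int(port), status))
    return servers


sample = """\
Stripe [0]:
tcp:pnfs-9-25.Central.Sun.COM:10.1.233.67:48217 OK
Stripe [1]:
tcp:pnfs-9-24.Central.Sun.COM:10.1.233.66:62363 OK
"""
for host, addr, port, status in data_servers(sample):
    print(host, addr, port, status)
```

This still only answers "which machines", not "which zfs data sets". That part has to come from somewhere other than the client.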
I'm wanting to write about how we would go about creating a consistent snapshot of a pnfs community. But first I want to try and understand how zfs does this for a single filesystem. And I'm going to be braindead about figuring out how. I.e., I'm playing dumb.
The question at hand is whether zfs snapshot flushes writes. I can determine this by:
I wrote a simple Perl script to test this:
[thud@warlock test]> more slam.pl
#!/usr/bin/perl
use Time::HiRes qw(time gettimeofday);
open(FP, ">$ARGV[0]") || die "Can't open for writing $ARGV[0]: $!\n";
print FP time . "\n";
`zfs snapshot tank/test\@$ARGV[0]`;
close(FP);
Now time to test:
[thud@warlock ~]> zfs create tank/test
[thud@warlock ~]> chmod 777 /tank/test/
[thud@warlock test]> cd /tank/test
[thud@warlock test]> ./slam.pl one
[thud@warlock test]> zfs clone tank/test@one tank/one
[thud@warlock test]> more ../one/one
1235973947.93013
Which would indicate that the flush had to occur. What if we add another write after the snapshot?
[thud@warlock test]> vi slam.pl
[thud@warlock test]> ./slam.pl two
[thud@warlock test]> zfs clone tank/test@two tank/two
[thud@warlock test]> more ../one/one
1235973947.93013
[thud@warlock test]> more ../two/one
1235973947.93013
[thud@warlock test]> more ../two/two
1235974891.62171
[thud@warlock test]> more two
1235974891.62171
1235974891.84716
That is a strong indication that zfs is flushing writes. Hmm, what if we add a pause to the script and see if we are flushing the writes to the active filesystem?
[thud@warlock test]> more slam.pl
#!/usr/bin/perl
use Time::HiRes qw(time gettimeofday);
open(FP, ">$ARGV[0]") || die "Can't open for writing $ARGV[0]: $!\n";
print FP time . "\n";
`zfs snapshot tank/test\@$ARGV[0]`;
print FP time . "\n";
print "Type something, hit return\n";
my $pause = <STDIN>;
print "$pause\n";
print FP time . "\n";
close(FP);
[thud@warlock test]> ./slam.pl pause
Type something, hit return
And meanwhile, in another window:
[thud@warlock test]> zfs clone tank/test@pause tank/pause
[thud@warlock test]> more ../pause/pause
1235975458.21851
[thud@warlock test]> more pause
1235975458.21851
So, the zfs snapshot is causing a write to be flushed that wouldn't normally be flushed. I.e., we are waiting on input and haven't flushed the second write. What happens if we take another snapshot manually and look at the contents?
[thud@warlock test]> zfs snapshot tank/test@pause2
[thud@warlock test]> zfs clone tank/test@pause2 tank/pause2
[thud@warlock test]> more ../pause/pause
1235975458.21851
[thud@warlock test]> more ../pause2/pause
1235975458.21851
[thud@warlock test]> more pause
1235975458.21851
Very, very interesting - this dirty write is clearly not flushed.
[thud@warlock test]> ./slam.pl pause
Type something, hit return
unpause
unpause
Time to stop playing dumb, even though it is fun to experiment here. I'll go look at the code tomorrow.
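One thing worth ruling out before reading the code is buffering inside the script itself, as opposed to anything zfs does. Here is a minimal sketch, in Python rather than the Perl above, of how a small write can sit in the process's own buffer where no filesystem snapshot could possibly see it. I am assuming a Perl filehandle buffers in a broadly similar way. The temporary path is made up for the demo and nothing here touches zfs.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pause")

fp = open(path, "w")  # buffered file object
fp.write("1235975458.21851\n")

# The write is still in this process's buffer, so the file is empty
# as far as the filesystem (and any snapshot of it) is concerned.
size_before_flush = os.path.getsize(path)

fp.flush()  # hand the buffered bytes to the kernel
size_after_flush = os.path.getsize(path)

fp.close()
print(size_before_flush, size_after_flush)
```

If the Perl script behaves like this, a timestamp missing from a snapshot may simply never have left the script, which is a different question from what zfs snapshot does with dirty data already in the kernel.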