Saturday Oct 31, 2009

Happy Z-Day everyone!

Breaking with tradition somewhat, I'm not sure there's going to be any fireworks photos here this year: we're down at my parents' house, and are less likely to get the sort of sustained shelling that we normally experience in Raheny each year.

On the plus side, we had homemade pumpkin soup for lunch, so I was at least able to get this shot. If there are any fireworks later on, I'll update this post - but otherwise, here's some Halloween cheer:

The day's been great so far - lovely birthday presents: a Merino base-layer and a copy of Neverwhere from the lovely missus, DVDs of the first two Ice Age films from E (I think she had an ulterior motive there!) and a nice fleece from my folks.

I also popped out for a birthday run around Greystones this afternoon, only 10k but it was enough for me to realise I'm far from being back in running form: the recovery is going to last a few more weeks I think!

My folks are baby-sitting tonight, so myself and the missus get to go out for a grown-up dinner, which I'm really looking forward to. Happy Halloween!

Update: There were fireworks after all, here's a few shots: fantastic, the tradition continues!

Tuesday Oct 27, 2009

I ran the Dublin City Marathon yesterday in approx. 3h 30m (more on that later) - that's my first marathon, but I suspect not my last: to anyone even half-thinking of running 26.2 miles, you have to try it.

My motivation for running started several months back on a low note. The background was that I was increasingly working from home. With work being busy, and with wanting to eat dinner with the family and be around to put the kids to bed, I was noticing that there were days when I wasn't leaving the house at all. At the same time, the rumours started about Sun being in talks with various companies about a possible acquisition.

When the Oracle deal was announced it made things worse - what had previously just been rumours in the press became a lot more believable. Usually when something like this is going on, I'll write my thoughts about it here: but doing so would have been unprofessional. The furthest I ever went was the occasional emoticon on my twitter feed.

The solution for both problems, I decided, was to get out in the mornings and run: exercise, and a little time to think.

As the months went by, I was running further and further, logging my progress to @timfoster as I went, and generally having a whale of a time. One weekend, we had a few friends over for lunch - Kev, Nic, Mike & Maria. A bit of the conversation went something like this:

"That's great running you're doing Tim, you should do the marathon"

Well, that was it, I'd had vague thoughts of doing it one day, but hearing someone else say it out loud was enough to make me register that night - with some trepidation, I might add. The registration form grouped entrants into three categories in order of expected finishing time: 3h or less, 3:30-4:15, 4:15+. I had no clue where I belonged, so popped myself in the middle and got on with it - that was July 28th this year.

I figured I ought to be following some sort of formal training plan. The athletics forum on boards.ie was a great resource, and was pointing newbies like me to Hal Higdon's running site. I chose the Intermediate II schedule, as I figured I was already reasonably fit with the daily commute on bike from time to time, and all the running I'd done so far. I'd missed a few weeks at the beginning of the formal program, so worked out where I should be (which turned out to be a good place to start given the running I'd already done) and stuck to it.

As I got closer to the race, I was dutifully doing my Long Slow Runs each weekend and was coming up to running the last of the three 20 milers when I became anxious about what sort of pace I'd manage in the race - could I go fast over a long distance? I didn't know. I didn't want to risk burning myself out in the early stages of the race, and end up not finishing. I decided to push it, and do a Long Fast Run instead - you're not supposed to do this during training, and I found out why. Yes, I discovered that I actually could manage a 7:15 min/mile pace over 20 miles, but I also managed to injure my leg in the process. To make matters worse, I'd miscalculated where I was joining the schedule, and it left me with only 2 weeks to taper before the race rather than 3, and most of those 2 weeks were spent just resting my leg rather than doing the suggested mileage.

With that, race day was upon me. I was aiming for a 3h15m finish - but given the last few weeks, this was probably unrealistic.

The atmosphere around the start was tense, but an amazing experience - very very well organised I thought: a record turn-out of 12,500 people this year.

After a lot of limbering up, and shedding of bin-liners, the starting gun fired. We moved very slowly at first, eventually getting past the starting line. The race started gently: I was in the middle of the 3:30-4:15 pen and the first few miles were depressingly slow. It was difficult to overtake slower runners, and I was already well down on my target. They say one of the most common mistakes by new marathon runners is starting too fast, so I kept repeating that to myself, and tried to stay calm.

By mile 5, I'd escaped the crowds in Phoenix Park and picked up the pace, passing Kev & Nic at mile 10 who were out cheering for me (thanks!!). I managed to keep pretty much on target till about mile 16, where I started slowing down, only to slow down further on miles 20/21 (the dreaded Roebuck/Foster Avenue hill). The missus & the kids, and Mum & Dad were cheering for me there, and I spotted my friend Barry too, who was marshaling for the race: familiar faces making the run a lot easier.

In general, the support from the crowd on the day was phenomenal: I really hope the people who got up early on an October Bank Holiday Monday appreciate what a difference their cheering really makes to runners - and, to the lady watching the race who gave me a jelly baby around Kimmage, to whom I forgot to say "thank you" - many many thanks, it was yummy, and much needed!

Things took a turn for the worse though around mile 22 - going down Nutley Lane, I felt a twinge in my thighs that I'd felt once before during training: the onset of cramp. I had to stop to stretch and shake out my legs periodically, eat more jelly babies and start again. My pace was tumbling now, and I watched in dismay as the 3:30 pacer balloons passed me on Merrion Road. For the rest of the race, I kept going as fast as I could manage, but it wasn't enough.

Rounding the final corner onto Pearse St. I went for it, eating my remaining sweeties and telling myself it'd soon be over. I crossed the line, and stopped my watch, which told me 3:30:46. I was tired but happy - I'd missed the 3:15 target, but there's always the next marathon.

As for official timing, I'm still a wee bit confused - the timing service that was tied to the chip on my bib told me I'd finished in 3:32:04, yet when I visit the results page on the Dublin Marathon website and search for my number (4871), at the time of writing, it confirms that chip time of 3:34:04, but tells me my finish time is 3:30:34. I know there's a difference between chip time and gun-time, but I'd always thought gun-time should be longer than chip-time (as it doesn't account for the time it takes you to actually reach the starting line, whereas chip time is registered from the moment you cross the start-line). Perhaps they got the numbers mixed up - 3:30:34 is closer to what was on my stopwatch. Anyway - it doesn't really matter.
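The relationship between the two times can be made concrete with a small sketch. The clock readings below are hypothetical (they aren't taken from the race results) and race_times is a made-up helper: gun time runs from the starting gun, chip time from the moment the runner crosses the start line, so gun time shouldn't come out shorter than chip time.

```python
def race_times(gun_fired, crossed_start, crossed_finish):
    """Return (gun_time, chip_time) in seconds.

    All three arguments are clock readings in seconds. A runner can't
    cross the start line before the gun, so crossed_start >= gun_fired.
    """
    gun_time = crossed_finish - gun_fired
    chip_time = crossed_finish - crossed_start
    return gun_time, chip_time


# Hypothetical readings: the runner takes 90 seconds to reach the start line.
gun_time, chip_time = race_times(gun_fired=0, crossed_start=90,
                                 crossed_finish=12690)
print("gun time exceeds chip time by", gun_time - chip_time, "seconds")
```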

I had a ball on my first marathon - I finished 1516th out of a field of 12,500, which I'm happy about. I've got memories I'll never forget and a belief in myself that I never had before. Sure, I'm walking around like Boris Karloff today (legs rather sore) and am wishing our house didn't have stairs, but I'm extremely proud of what I've achieved. I think I'll keep running and would strongly recommend everyone to have a go at completing a marathon: it's a truly unforgettable experience.

Here's the route map, and the splits from my watch - I missed a few mile markers here & there, so just rolled up those times into a single row. As you can see, I lack consistency here - so that's something to work on for next time.

Mile       Time by my watch (min:sec)
1          8:02
2, 3, 4    26:12
5          7:55
6          7:40
7          7:21
8          7:06
9          7:02
10         7:10
11         7:06
12         7:36
13         7:33
14         7:16
15         7:33
16         7:50
17         7:40
18         7:47
19         8:02
20         8:31
21         8:41
22         7:57
23, 24     18:34 (yeah, cramps started about here)
25         9:33
26         8:33
0.2        1:52
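
A quick way to sanity-check splits like these is to add them up. What follows is a minimal Python sketch, not anything the watch produced: the split values are copied from the table above, 26.2 miles is the nominal distance mentioned earlier in the post, and the helper names are invented for the example.

```python
# Splits copied from the table above as (mile label, "min:sec") pairs.
SPLITS = [
    ("1", "8:02"), ("2-4", "26:12"), ("5", "7:55"), ("6", "7:40"),
    ("7", "7:21"), ("8", "7:06"), ("9", "7:02"), ("10", "7:10"),
    ("11", "7:06"), ("12", "7:36"), ("13", "7:33"), ("14", "7:16"),
    ("15", "7:33"), ("16", "7:50"), ("17", "7:40"), ("18", "7:47"),
    ("19", "8:02"), ("20", "8:31"), ("21", "8:41"), ("22", "7:57"),
    ("23-24", "18:34"), ("25", "9:33"), ("26", "8:33"), ("0.2", "1:52"),
]

MARATHON_MILES = 26.2  # nominal distance


def to_seconds(min_sec):
    """Convert a 'min:sec' string to a number of seconds."""
    minutes, seconds = min_sec.split(":")
    return int(minutes) * 60 + int(seconds)


def format_hms(total_seconds):
    """Format a number of seconds as h:mm:ss."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


total = sum(to_seconds(split) for _, split in SPLITS)
pace = round(total / MARATHON_MILES)  # average seconds per mile

print("total:", format_hms(total))
print("average pace:", f"{pace // 60}:{pace % 60:02d}", "min/mile")
```

The splits add up to 3:30:32, within 15 seconds of the 3:30:46 the watch showed at the finish.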

Monday Sep 21, 2009

I presented An Introduction to OpenSolaris last Saturday at OSS BarCamp - my contribution to Software Freedom Day 2009.

You can download the odp presentation or the pdf, which I've exported with my notes for the talk that explain each of the slides a little more - I hope this is useful if you're planning on giving a similar talk.

A few things struck me while preparing for and giving the presentation. Firstly, it seemed odd to be giving an introduction to an operating system that's been around for quite a while now: clearly we haven't been doing enough of this sort of thing (and from personal experience, yes, ie-osug isn't as active as I'd like, I just sadly don't have the bandwidth).

Then secondly, it's really hard to cover all of the interesting features of OpenSolaris in sufficient detail over the course of an hour. My take was to try to whet the appetite, rather than explain every feature fully - and in some cases, go for the features I thought the audience might be interested in (for example, mentioning NWAM as one of the major networking features - perhaps it's not as full of rocket science as other aspects of Solaris networking, but it makes a huge difference to the novice user).

Finally, and slightly embarrassingly, I had to spend about 5 minutes in front of an expectant audience futzing around with the display settings on my R500 laptop to get it to talk to the projector. It was doubly annoying that both xrandr and gnome-display-properties were able to see the separate screen, but try as I might, I couldn't get any output to appear externally. Ultimately, a kind audience member offered me a USB key, which I used to transfer the pdf over to my EeePC (running nv_122). That was able to see the projector, but didn't have any of my demo material set up (some ZFS settings, some zones, crossbow, flows etc.). Oh well.

Here's hoping that at least some of the audience left with an impression that OpenSolaris was worth taking a second (or perhaps a first?) look at, despite the brevity of my talk and the initial teething problems I had. My new mantra:

Never work with children, animals, or weird external VGA projectors.

Monday Sep 07, 2009

Original image by Tim Foster

Discovered via Boing Boing this morning, I found this article particularly enlightening. Bruce Sterling talks about the future, picking five technologies from today, "The Cloud! Web Squared! The Internet of Screens! The Internet of Things! Augmented Reality!" and showing how they would seem completely normal to anyone in the future.

They're phantom far-out notions gobbled up by the real world. They packed in there so deep that nobody notices them. So, yes, I can write about it. It's just: it doesn't look futuristic. It looks way too real.

Why isn't it grand? Why isn't it as fantastically grand as the spectrum of all possibility? Well, why isn't today grand? Why didn’t we wake up this morning in direct confrontation with the entirety of past and future? The present day is the only day we’re ever given.

[ Bruce Sterling, writing for Webstock ]

Definitely worth reading the whole article.

Tuesday Jul 07, 2009

I'm honestly not sure the world needs another "here's how I solved my problem with Crossbow" blog post - it's been covered pretty well already by Nicholas, Chris, Ben, Joerg and likely many others, but still - adding to the collective knowledge of your search engine of choice, here's my version:

Original image by ToastyKen

Introduction

Support for virt-install to initiate PV and HVM installs of OpenSolaris guests using OpenSolaris AI (Automated Installer) is on its way once our 3.3 wad integrates, and for the most part, I did that work on a standalone network of two machines. This meant that I had to go into the office every time I wanted to try something out. Of course, with Crossbow, I needn't have gone to that much trouble - so, in case anyone else finds themselves in the same situation, here's how to do it.

Setting up an AI server for testing on a network where you don't own the dhcp server can be a pain. However, given a .vdi that already houses the AI server, all you need is the right network configuration in your host to make testing AI + xVM easy - thanks Crossbow! I now just need to ship around a domU .vdi with my AI server on it, and the corresponding virsh xml definition and I can have it running on any machine in a jiffy.

The things you want are:

  1. a dedicated etherstub in dom0 that you can create vnics from
  2. vnics on that etherstub, each with a local network address, one of which is plumbed in your host
  3. ipnat rules in dom0 to allow that local network out through your primary interface, and ssh into the guest on defined ports
  4. ipfilter rules in dom0 to block stuff you don't want reaching the outside (and vice versa, perhaps providing port redirection into your guest)

Implementation

Here's what you need to run:

# dladm create-etherstub timswitch0
# dladm set-linkprop -p mtu=1500 timswitch0
# dladm create-vnic -l timswitch0 timnic0
# ifconfig timnic0 plumb 192.168.1.1 netmask 255.255.255.0 up
# (if necessary) ifconfig timnic1 plumb
   otherwise, pass virt-install timswitch0, and the vif-vnic script
   will create a vnic on top of it to plug guests into that etherstub

# routeadm -u -e ipv4-forwarding

Then, having edited /etc/ipf/ipf.conf and /etc/ipf/ipnat.conf (see below):

# svcadm enable ipfilter
# ipnat -f /etc/ipf/ipnat.conf

This gives us the following configuration:

# dladm show-link
LINK        CLASS      MTU    STATE    OVER
ath0        phys       1500   up       --
atge0       phys       1500   unknown  --
timswitch0  etherstub  1500   unknown  --
timnic1     vnic       1500   up       timswitch0
timnic0     vnic       1500   up       timswitch0


# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
	inet 127.0.0.1 netmask ff000000 
ath0: flags=1104843 mtu 1500 index 15
	inet 10.0.0.10 netmask ffffff00 broadcast 10.0.0.255
	ether 0:15:af:6a:64:ec 
atge0: flags=201100843 mtu 1500 index 16
	inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
	ether 0:1e:8c:bb:8e:7a 
timnic0: flags=1100843 mtu 1500 index 18
	inet 192.168.1.1 netmask ffffff00 broadcast 192.168.1.255
	ether 2:8:20:7a:17:4f 
lo0: flags=2002000849 mtu 8252 index 1
	inet6 ::1/128 
ath0: flags=2004841 mtu 1500 index 15
	inet6 fe80::215:afff:fe6a:64ec/10 
	ether 0:15:af:6a:64:ec 

Here's what you need in /etc/ipf. Replace 'ath0' with your primary interface. In ipnat.conf we redirect port 9022 to the ssh port in our guest on 192.168.1.2 (the guest containing the AI server) so to ssh in to the guest, do:

 # ssh -p 9022 <IP address of dom0>

----------- /etc/ipf/ipnat.conf --------------
map ath0 192.168.1.0/24 -> 0/32 portmap tcp/udp auto
map ath0 192.168.1.0/24 -> 0/32
rdr ath0 0.0.0.0/0 port 9022 -> 192.168.1.2 port 22
--
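
To see which addresses the 192.168.1.0/24 network in the map rules covers, Python's standard ipaddress module is handy. This is only an illustrative sketch: the addresses are the ones that appear in this post, and in_guest_network is a made-up helper name.

```python
import ipaddress

# The guest network named in the ipnat.conf map rules above.
GUEST_NET = ipaddress.ip_network("192.168.1.0/24")


def in_guest_network(address):
    """Return True if the address falls inside the guest network."""
    return ipaddress.ip_address(address) in GUEST_NET


for addr in ("192.168.1.1", "192.168.1.2", "192.168.1.45", "10.0.0.10"):
    print(addr, "in guest network:", in_guest_network(addr))

print("netmask:", GUEST_NET.netmask)
print("broadcast:", GUEST_NET.broadcast_address)
```

The netmask and broadcast address it reports match the ffffff00 and 192.168.1.255 that ifconfig shows for timnic0.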

Here are the IP Filter rules applied in dom0. At the moment, we're logging packets that aren't allowed through timnic0. Run 'ipmon' in dom0 to watch this log - useful when debugging (and when choosing what other rules to add).

----------- /etc/ipf/ipf.conf -------------------
#
# ipf.conf
#
# IP Filter rules to be loaded during startup
#
# See ipf(4) manpage for more information on
# IP Filter rules syntax.

# allow ssh into our guest from anywhere
pass out quick on timnic0 from any to any port = 22 keep state

# allow dns, http, ssh in from timnic0, remember we're using NAT here,
# and we're a router, so this is actually traffic from the guest network
pass in quick on timnic0 proto udp from any to any port = 53 keep state
pass in quick on timnic0 proto tcp from any to any port = 22 keep state
pass in quick on timnic0 proto tcp from any to any port = 80 keep state

# allow anything at all inside our guest network
pass in quick on timnic0 from 192.168.1.0/24 to 192.168.1.0/24
pass out quick on timnic0 from 192.168.1.0/24 to 192.168.1.0/24

# allow nothing else through timnic0
block in log quick on timnic0
block out log quick on timnic0
--
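
Because every rule in the file above is marked 'quick', the first rule that matches a packet decides its fate. The sketch below is a deliberately simplified Python model of that first-match behaviour, not IP Filter itself: it ignores 'keep state' and logging, the Rule class and decide function are invented for illustration, and 203.0.113.10 is just a documentation address standing in for "somewhere outside".

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

GUEST_NET = ipaddress.ip_network("192.168.1.0/24")
ANY = ipaddress.ip_network("0.0.0.0/0")


@dataclass
class Rule:
    action: str                  # "pass" or "block"
    direction: str               # "in" or "out", relative to timnic0
    proto: Optional[str] = None  # None means any protocol
    port: Optional[int] = None   # destination port; None means any port
    src: ipaddress.IPv4Network = ANY
    dst: ipaddress.IPv4Network = ANY


# The rules from ipf.conf above, in file order (all of them are "quick").
RULES = [
    Rule("pass", "out", port=22),
    Rule("pass", "in", proto="udp", port=53),
    Rule("pass", "in", proto="tcp", port=22),
    Rule("pass", "in", proto="tcp", port=80),
    Rule("pass", "in", src=GUEST_NET, dst=GUEST_NET),
    Rule("pass", "out", src=GUEST_NET, dst=GUEST_NET),
    Rule("block", "in"),
    Rule("block", "out"),
]


def decide(direction, proto, src, dst, port):
    """Return the action of the first rule that matches the packet."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for rule in RULES:
        if rule.direction != direction:
            continue
        if rule.proto is not None and rule.proto != proto:
            continue
        if rule.port is not None and rule.port != port:
            continue
        if src_ip not in rule.src or dst_ip not in rule.dst:
            continue
        return rule.action
    # Never reached with RULES above: the last two rules match everything.
    return "pass"


print(decide("in", "tcp", "192.168.1.45", "203.0.113.10", 80))
print(decide("in", "tcp", "192.168.1.45", "203.0.113.10", 25))
```

In this model a guest's web request on port 80 passes, while the same guest trying port 25 falls through to the final block rule, which is the kind of packet that would show up in ipmon.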

In a guest, we see:

root@opensolaris:/tmp# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
	inet 127.0.0.1 netmask ff000000 
xnf0: flags=1004843 mtu 1500 index 2
	inet 192.168.1.45 netmask ffffff00 broadcast 192.168.1.255
	ether 0:16:36:5f:7f:c5 
lo0: flags=2002000849 mtu 8252 index 1
	inet6 ::1/128 

root@opensolaris:/tmp# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface 
-------------------- -------------------- ----- ----- ---------- --------- 
default              192.168.1.1          UG        1         76 xnf0      
192.168.1.0          192.168.1.45         U         1         31 xnf0      
127.0.0.1            127.0.0.1            UH        1         28 lo0       

Routing Table: IPv6
  Destination/Mask            Gateway                   Flags Ref   Use    If   
--------------------------- --------------------------- ----- --- ------- ----- 
::1                         ::1                         UH      1       0 lo0 

Update: I had a typo in one of the commands above, thanks for spotting it, seanmcg.

Wednesday Jun 24, 2009

I think I'm about done with the next release of the ZFS Automatic Snapshot service. Here's the Changelog entry:

0.12

  • Add event-based snapshots
  • Add support to change the separator character in snapshot names
    • set the default value of "zfs/sep" to "_"
    • useful for CIFS clients that previously choked on colons in snapshot names
  • Improved shutdown speed via http://blogs.sun.com/dp/entry/speeding_to_a_halt
  • Add support to allow the user to disable auto-snapshots of new pools
  • Bugfix to allow snapshots of datasets with spaces in their names
  • Bugfix to properly deal with namespace clashes in dataset names
  • Exported $LAST_SNAP and $PREV_SNAP variables when performing backups

The main new thing here is the "event-based-snapshot" instance, as described in the README and Defect 9595. It's nothing earth shattering, but a useful feature I think.

I've found that in my day to day use of OpenSolaris, I tend to take the lazy option of running "zfs snapshot -r rpool@snap" whenever I'm about to do something to the system that I might regret later, and don't want to wait for the 15 minute ":frequent" instance to fire. Later, I go to run the same thing again, and find I've already got a snapshot called rpool@snap, so I take a new one, rpool@snap2 - can you tell where this is going?

Eventually, I find myself out of disk-space and with no real clue what was interesting about rpool@snapn in the first place. I torch the snapshots and get on with life. So far, this has worked just fine, but it's a bit manual and results in us snapshotting rpool/swap and rpool/dump, and I don't really want that.

Some of the stuff Erwann added for 2009.06 helps a bit - with the "snapshot this directory" button in Nautilus, we could take a snapshot of a single dataset, but it doesn't group them, and still leaves the name of the snapshot as the only place to describe the snapshot contents.

So, my solution was to add the svc:/system/filesystem/zfs/auto-snapshot:event instance. Most of the code for this was in 0.11, but I'm now including a manifest that uses it. This service isn't managed using cron: instead, you get to manually run the method script each time you want to take a snapshot. You also have the option of supplying a description that gets stored into a user property on the snapshot, com.sun:auto-snapshot-desc (feel free to go all "Web 2.0" here, and add #hashtags if you want!)

Where this wins over a simple "zfs snapshot -r rpool@snap" is that it uses the com.sun:auto-snapshot properties used by the other instances to determine which datasets we want to take snapshots of (these can be overridden by com.sun:auto-snapshot:event).
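
As a rough illustration of that selection logic, here is a small Python sketch. It is a model only, not the service's code: the property values are hypothetical, rpool/scratch and rpool/export/home are made-up dataset names, and it ignores ZFS property inheritance, which the real service gets for free from ZFS.

```python
# Hypothetical property values, as they might be set on each dataset.
DATASETS = {
    "rpool": {"com.sun:auto-snapshot": "true"},
    "rpool/export/home": {"com.sun:auto-snapshot": "true"},
    "rpool/swap": {"com.sun:auto-snapshot": "false"},
    "rpool/dump": {"com.sun:auto-snapshot": "false"},
    "rpool/scratch": {
        "com.sun:auto-snapshot": "true",
        "com.sun:auto-snapshot:event": "false",
    },
}

GENERAL = "com.sun:auto-snapshot"


def wants_snapshot(props, label):
    """The per-instance property, if set, overrides the general one."""
    specific = props.get(f"{GENERAL}:{label}")
    if specific is not None:
        return specific == "true"
    return props.get(GENERAL) == "true"


def datasets_for(label):
    """List the datasets that an instance with this label would snapshot."""
    return sorted(name for name, props in DATASETS.items()
                  if wants_snapshot(props, label))


print("event:", datasets_for("event"))
print("frequent:", datasets_for("frequent"))
```

With these made-up values, rpool/swap and rpool/dump are skipped by every instance, and rpool/scratch is skipped only by the event instance.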

I've hacked together a (flawed) GUI that I've added as a launcher on my GNOME panel which shows a Zenity dialog box asking for a description of the snapshot, runs the method script, then pops up a notification once the snapshots have been taken. I've not included this in the package, since I'm sure someone will do a better job of it, but download from here if you want it. Obviously this is but one use for event-based snapshots: if you set the zfs/backup-save-cmd SMF property on that instance, you'd have a 1-click "backup my stuff" button! :-)

More of my awesome GUI skills...

The snapshot event notification popup

As ever, the README also documents these changes, and you can get the sources and build yourself a package via:

$ hg clone ssh://anon@hg.opensolaris.org/hg/jds/zfs-snapshot

I'm not sure when these changes will land in OpenSolaris - there'll be a 0.12.1 release as soon as I get to write support for the 'zfs list -d' command that Chris added for the bug I filed (6762432), so perhaps we'll wait for that (I thought it polite to wait till everyone was able to run a version of OpenSolaris that included that fix before making the changes).

Comments welcome here, or on the zfs-auto-snapshot mailing list

Monday Jun 15, 2009

Some of the work I've done recently involved some changes to virt-install(1) to teach it how to install xVM guests from OpenSolaris AI servers - work that you'll see going back soon as a patch to virt-install as part of our xVM 3.3 changes.

This ended up shaking a few bugs out of AI and OpenSolaris, two of which became stoppers for the 2009.06 release (which made for a rather exciting weekend): one was fixed, the other was documented in the release notes.

Along the way though, and the point of this blog post, I got to learn a bit more about OpenSolaris and the boot process when we're using AI and what to do when things go wrong.

For x86 (the only thing I really cared about in my case, SPARC differs slightly) AI works by downloading the kernel and a very small boot archive via PXE and TFTP. The client then boots with this image and the svc:/system/filesystem/root:live-media SMF service arranges to download the solaris.zlib and solarismisc.zlib files, and mounts them on the client.

However, should that service fail for some reason - we're left with a pretty unfriendly OpenSolaris environment. There's a tradeoff between fast/low memory installs and easy-to-debug environments so it's a tough one to call.

Still, if you do need to debug stuff early in boot with an AI image, I added a comment to Defect 6851 that explains how you do it. This came from an email I was writing to a colleague today who was running into the same problem - I figured posting those comments to the bug report and writing this short blog post would be a good thing to do. Hope this helps someone out there!

Wednesday Jun 03, 2009

A few weeks ago, we had a Saturday that was everything a Saturday should be - not too early a start, a nice breakfast, an energetic, if slightly damp, walk with Calum and the four of us visiting the Botanic Gardens for a picnic.

We're sort of in a weird state at home at the moment, trying to go around appreciating everything that Dublin has to offer (hence the visit to the Botanic Gardens), unsure of what the future holds.

Ever since our first trip to New Zealand we've always thought it'd be an interesting place to live, and before my most recent trip there for Glynn & Jayne's wedding, we'd talked about my using the time over there to consider more carefully whether it's somewhere we'd want to emigrate to.

Over the course of the two and a bit weeks there, I was gradually leaning towards a "yes" answer to the question above, but it's funny - as soon as I heard the rumours about a supposed deal with IBM to buy Sun, I'd decided that if the deal went through, that'd be it, we'd really seriously look into moving. In some ways, that the deal fell through was a bit of a relief, not only because I think it'd have been the wrong thing for Sun, but also because I was off the hook in terms of facing that big decision to move. Now that there's another deal pending with Oracle, I need to face that question again.

On the other side of the argument, we've got a pretty cushy number in Ireland at the moment: our house is a 25 minute cycle from the office (the missus has a shorter commute to her office) and the creche we use for Ella, and possibly Calum too, is right next door -- but that, in a way, makes us feel even more trapped: should we give up what appears to be a perfect setup and fling ourselves into the unknown? Looking further out, there's good primary schools in the local area for the kids, but getting them to a secondary school would mean quite a commute for them I think, so we'd end up having to move somewhere in a few years anyway.

There's some things that could make moving easier. Already, I'm the only person in Ireland working in the Solaris xVM kernel group, so I'm working remotely wherever I am: working from the other side of the world probably wouldn't be that much different for me, assuming I'm allowed that opportunity with the pending acquisition. And of course, I'm not just moving me - we're a family, so anything we do has to work for everyone, otherwise it's not going to happen.

So is the decision to move made yet? No, not at all - it is a realization though, that we need to make that decision soon, rather than leave it hanging over us. Perhaps the best way is to try to get out there for a year, and see how we settle - a sort of "Try and Buy" approach I suppose.

But, between then and now, there's plenty to enjoy about Ireland, and it seems like considering the question of whether to emigrate or not has made us think a lot about life in general and what we want to get out of it. I think we all should be enjoying it a lot more, dancing in the bluebells as much as we can.

Wednesday May 27, 2009

It's with regret that I say I won't be able to head over to CommunityOne West next week - the OpenSolaris tracks look pretty interesting, but as ever with this sort of thing, it's as much about the people you'll meet as the actual content of the sessions - in fact, I'd almost argue it's more about the people you'll meet than the sessions.

It's been ages since I've chatted to OpenSolaris-folk in person (other than my immediate colleagues in the xVM team and Glynn of course) and while email and IRC are good, they're no real match for having a beer and a natter with like-minded people - the OpenSolaris Party on Monday night looks like a pretty good opportunity for that. I would also quite like to be over there for the launch of 2009.06 having played a bit of a part in this release too (note to self: filing bugs that end up on a stopper list makes for a somewhat exciting weekend), oh well.

My travel plans this year have been put on hold for a while - having been away from the wife + kids for a few weeks while I was down in New Zealand, closely followed by another week in MPK, I've promised to stay put in Ireland for a bit, although more about that in a future blog post I think.

Still, I'll be doing my best to keep up with what's going on at C1, and hope that I can at least read some blog reports on how the event goes next week. Wish I was there! If you wish you were there too, but live a bit closer, and feel like learning more about OpenSolaris, then do fill in the box below :-)

Monday Apr 27, 2009

Original image by James Jordan

I've been rather busy of late, with a recent trip to MPK resulting in a ton of work to bring back home, so haven't had much chance to blog as much as I'd like, apologies.

But, recently, there's been some activity with the ZFS Automatic Snapshot service that I thought I'd publicise a little bit. It seems that great minds think alike: myself, Brock Pytlik from the IPS team and Glenn Brunette (ok, two great minds, and me :-) all seem to have come to the independent conclusion that automatic snapshots on a local machine are good, but snapshots going to a remote machine are great, and have become more interested in dusting off the lesser-known zfs/backup-save-cmd option of the ZFS Automatic Snapshot service.

The timing here is excellent, as this is something I'd been thinking about with the advent of the Sun Cloud API (which relates to my day job at the moment in an interesting kind of way). More work to come in these areas I hope, but after a few mails back & forth with Glenn, he's made it first-past-the-post, with an implementation to send auto snapshots to S3 storage, which looks pretty nifty to me!

There's a heap of other stuff we could do here, though we need a few things for this to really fly:

  • A means to list all snapshots on the remote end
  • A means to choose the most recent common snapshot between the local and remote ends, and send an incremental send stream between that snapshot, and the one we've just taken
  • A means to define what "remote end" means, in an extensible way (be it removable media, network devices, cloud storage etc.)
  • An ability to send/recv into ZFS-based Cloud storage - (storing flat ZFS send streams in the cloud isn't as useful imho - I'd like to be able to browse these from any device)
  • Using the auto-snapshot zfs/interval SMF property set to none, we can take event-driven snapshots, so we could do things like hook the service into nwam, taking an on-demand snapshot whenever we get a network connection (assuming a sensible time period has elapsed since our last snapshot) so we never lose data. The zfs auto backup prototype I'd posted before did this for local disk storage, but I never really took the idea further, waiting for better ZFS removable-media support.
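
The second item in that list (picking the most recent common snapshot and sending an incremental stream from it) can be sketched in a few lines of Python. This is a hedged illustration rather than anything the service does today: the function names are invented, the snapshot lists are assumed to be ordered oldest to newest, and how you'd obtain the remote list is exactly the open problem from the first item. The only real command involved is 'zfs send', with its -i option for incremental streams.

```python
from typing import List, Optional


def most_recent_common(local: List[str], remote: List[str]) -> Optional[str]:
    """Return the newest snapshot name present on both ends, or None.

    Both lists are assumed to be ordered oldest to newest.
    """
    remote_names = set(remote)
    for name in reversed(local):
        if name in remote_names:
            return name
    return None


def send_command(dataset: str, local: List[str], remote: List[str]) -> List[str]:
    """Build a 'zfs send' argument list for the newest local snapshot."""
    newest = local[-1]
    # Look for a base among the older local snapshots only.
    base = most_recent_common(local[:-1], remote)
    if base is None:
        # Nothing in common: fall back to a full stream.
        return ["zfs", "send", f"{dataset}@{newest}"]
    return ["zfs", "send", "-i", f"{dataset}@{base}", f"{dataset}@{newest}"]


# Hypothetical snapshot names, oldest first.
local_snaps = ["mon", "tue", "wed", "thu"]
remote_snaps = ["mon", "tue"]
print(" ".join(send_command("rpool/export", local_snaps, remote_snaps)))
```

The sketch only builds the command line; piping the stream to a remote 'zfs recv', or to whatever "remote end" turns out to mean, is left out on purpose.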

Of course, there's just not enough hours in the day for one person to do all of this, but if you're interested in these sorts of problems, do subscribe yourself to the ZFS Automatic Snapshot email alias and dive in!

But once again, kudos to Glenn for giving this a whirl!

Tuesday Mar 10, 2009

I voted:

RECORDED:  ballot 971674dbf07ccbd25a1bf7935a61ecc1a8b26493 on "Board
Election 2009/Change Constitution" from Tim Foster
Connection to poll.opensolaris.org closed.

- "yes" to what looks like an excellent change to the constitution, and for a set of seven people (I expressed preferences for all sixteen candidates) that I really would love to see on the OGB this year.

Yes, I voted before seeing much in the way of electioneering from several candidates, my view being that people going for posts on the OGB should be pretty well known in their work on the project already and I shouldn't only be hearing about them in the days running up to the election.

There's detail on the above on the 2009 elections page.

Wednesday Jan 21, 2009

We've got the venue confirmed: here are details of the upcoming Irish OpenSolaris User Group meeting:

Topic      General OpenSolaris Discussions
Date       Thursday 29th January 2009
Time       19:30
Location   The Vaults

Look forward to seeing you there! If you can help out with equipment or have ideas for presentations, or just feel like saying "hi", drop mail to the mailing list.

Monday Jan 19, 2009

I sent some mail to the Irish OpenSolaris User Group list today, proposing to kick-start our user group meetings again.

Meetings haven't happened at ie-osug since last February, and we're trying to see if a change of tack would help get things going again.

Our meetings from June '06 - Feb '08 were more like a mini lecture-series about OpenSolaris, and while I think these were interesting, they often came across as a bit formal: yes we had pints afterwards (which were usually great) but there was never the atmosphere of community we'd hoped for.

So, this time, we're going to try the approach the SFOSUG use - try holding the meetings in a pub. The location we're thinking of, The Vaults, serves food & beer and we'll hopefully be able to reserve a small room in the place. We'll bring along a wifi access point and a few laptops, and an LCD projector. We'll still be able to do "feature presentations" if people feel like doing them, but hopefully the more informal atmosphere of a pub will help get people talking a little more, and perhaps grow the user group and get more people participating.

I don't have the exact time/date yet - but will post more when I have it, I suspect it'll be Thursday January 29th. Do please comment here, or send mail to the list, if you think this would be a good way to get more people interested in OpenSolaris in Ireland.

Wednesday Jan 14, 2009

I'm absolutely thrilled to announce that Scott Seighman is continuing to write the OpenSolaris in review posts that I'd been maintaining for a while in this category.

Original image by Dan Taylor

So, without further ado, here's Scott's first post - OpenSolaris in Review - December 2008. Time to update your RSS feeds everyone!

Finally, speaking from experience, these posts are a lot easier to write when people in the community help out by contributing additional content each month - so do please send Scott anything you might have that's worthy of a mention.

Thanks for taking on this job Scott - much appreciated!

Monday Jan 12, 2009

I'm rubbish at this sort of thing and generally don't do Internet memes, but since stevel tagged me, the worry of whether I'm interesting enough to come up with 7 things you may not know about me has been weighing heavily on my mind. Judge for yourself whether they're interesting or not!

  1. As a kid, I lived opposite Rathfarnham shopping centre, which in those days was shut at weekends and at 6pm each night. This made for the perfect place to tear around on my blue Raleigh Grifter, a fantastic machine - which probably weighed as much as I did at the time. I have fond memories of that bike - its motorcycle-gripshift and its wonderful, if sometimes deadly, 3-speed Sturmey Archer hub.

    Much of the time was spent just messing about, playing chicken with brick walls (pretending you were an X-Wing trying to pull up before crashing into the shields of the now-fully-operational Death Star. I still have all my teeth, in case you're wondering), but playing "Squares", a mixture of a slow bicycle race, sumo wrestling and Kick Start, was the best way to spend your time on a bike.

    Here's how you play: decide on a playing area, a square or rectangle, marked out by the white lines of a few empty car parking spaces. Then get a few of your friends/siblings on bikes to cycle around within the confines of that square. If you touch the ground with your feet or go outside the square, you're out. The winner is the last person on their bike. Hilarity ensues. (and the odd grazed knee, I'm guessing)

  2. I'm the most optimistic person I know. I think this tends to eventually get on people's nerves, but I can usually be relied on to put a cheery spin on whatever situation we're in at the time.

  3. Back in the day, I spent way too much time writing mod files using FastTracker and others. This was all on our 16MHz i386 (2MB RAM!) with a handmade Covox thing stuck in LPT1 - I couldn't afford real hardware, which was probably for the best. I've lost most of what I wrote over the years, but have dredged up a few bits of music here, with one converted to mp4, just in case you don't have a mod player handy. If you don't, don't worry, you're not missing much!

  4. From time to time, I wonder what life would be like if I followed the other career ideas I'd had. Over the years, I've wanted to be a cabinet maker and a rally driver. I studied Science in university and was still doing two units of Botany in my final year, before deciding computers were for me.

    Even now, I wonder whether I'd be any good at photography, probably not - though more on my SmugMug page (I like the New Zealand ones - and am looking forward to visiting there this year for Glynn & Jayne's wedding!) On the other hand, without the pressures of deadlines and actually putting bread on the table, perhaps being an amateur isn't so bad after all.

  5. I'm the only Genesis fan I've ever met - I mostly prefer the Peter Gabriel era, but like all of their stuff really. I tend not to really like any other prog rock band I've listened to.

  6. I used to role-play rather a lot: epic games that'd last multiple weekends, with story arcs that went on and on. We played anything we could get our hands on - D&D, Runequest, Call of Cthulhu, Paranoia, Shadowrun, the system didn't really matter to us. Haven't played in a year or two though, but it's still something I really enjoy. I'm not sure whether this qualifies me as "evil" stevel? Depends who you ask I guess? :-)

  7. Do you ever get the feeling that you have a really good idea floating around in your head, but can never quite organise your thoughts enough to express it? I get that all the time. I hold a few software patents (not really worth talking about) and have submitted one or two other patent disclosures, but still feel like I haven't quite had my "Eureka!" moment yet - it's not something I worry about, but it's in there somewhere.

Ok, with that out of the way, here's another list of people who can choose to ignore this Internet meme if they so choose:

Here are the rules:

  • Link to your original tagger(s) and list these rules in your post.
  • Share seven facts about yourself in the post.
  • Tag seven people at the end of your post by leaving their names and the links to their blogs.
  • Let them know they’ve been tagged.

Saturday Dec 27, 2008

One of the many great things about being a parent around Christmas, is that not only do you get to see the looks on the faces of your kids when they're opening their presents (itself, priceless) but you also get to play with the presents with them!

Ella was really lucky this year, receiving lots of lovely presents, but one of the best, in my opinion, came from Duncan & Denise, an Alphabet Jigsaw - handmade by a company in Westport, Co. Mayo. It's one of the most ingenious things I've seen.

Ella thinks it's wonderful too. Here she is playing with it wearing her Upsy Daisy costume (thanks Santa!) with Gramps on the kitchen floor:

Calum, being only 15 days old when Christmas arrived, wasn't quite as much into the spirit of the occasion as Ella was, though he did manage to look quite adorable in his "My first Christmas" outfit. May the joys of parenthood continue!

Friday Dec 19, 2008

So far, parental leave is going pretty well - 9 days in and we're coping ok (and by the way, I'm not reading work email at the moment, which is probably a good thing - in my sleep-deprived state, I'm not likely to have much in the way of coherent responses!)

To lessen the effects of cabin fever we decided to go on our first family trip with Ella and Calum yesterday. This year in the Dublin Docklands, there's a Christmas market, and it sounded a bit easier to get the kids to that, rather than fly to one of the real Christmas markets in Europe.

A few things became clear during this trip - getting all four of us out of the house is going to take a lot more practice, and our car doesn't fit two prams (the missus carried Calum in a sling instead, but he won't stay this size forever: perhaps it's time to consider a mini-van? Our days of 3-door BMW coupes seem to be long gone...)

Was the market worth going to? Well, yes, if only just to see the look on Ella's face when she saw the carousel - so many rocking horses in one place, how fantastic! The Bratwurst and Lebkuchen were also very welcome treats. More photos below.

Thumbnails of Tim's photos taken at the Dublin Docklands 12 Days of Christmas market

Happy Christmas everyone!

Wednesday Dec 10, 2008

(though I'm happy that's being announced too)

Last night, close to midnight, I'd started writing a blog entry about Solaris on the desktop, the different environments we've used, and was going to talk a bit about how far we've come from when I joined Sun back in 1996 up to today's release of OpenSolaris 2008.11 (I was also going to mention my role in writing a chunk of Time Slider for the current desktop environment) but that post's for another day.

There was a much more important delivery today: Calum Henry Foster, 7lbs 2 oz, was born at 6:04am about 30 minutes after we arrived in the hospital. And yes, I broke a few red lights on the way in! Mother & baby are both doing well and I am the proudest father in the world (again!) - I'm grinning from ear to ear right now.

Tuesday Dec 09, 2008

Since I started the zfs-auto-snapshot work back in May 2006, there's been a missing piece of functionality. Well, I'm happy to say - it's missing no longer!

With thanks to Luca Morettoni, we now have a fix for:

6777694 Need ability to control when auto-snapshots are taken

We hashed out some of the details in private mail, then on the mailing list here; after a few test scripts from Luca and some paperwork, the integration was done - thanks for your patience Luca! [ this involved signing an SCA, sending me an hg export patch which I could push to the repository, and waiting for me to push the changes - a few more good patches, and I'd be delighted to give Luca commit privileges to make this a bit easier ]

So now with this changeset, the SMF property "zfs/offset", defined since version 0.1, now actually does something - it takes a value in seconds to specify exactly when the various auto-snapshot cron jobs should fire. As always, bug reports welcome!
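
To get a feel for what an offset in seconds means in cron terms, here is a rough sketch. It is not taken from the service's method script: the function name is invented, and treating the offset as "seconds after midnight" is an assumption made purely for illustration. The README remains the reference for how each schedule actually interprets zfs/offset.

```shell
#!/bin/sh
# Illustration only: turn an offset in seconds into crontab
# "minute hour" fields, assuming the offset counts from midnight.

# offset_to_cron OFFSET_SECONDS
offset_to_cron() {
    offset=$1
    minute=$(( (offset % 3600) / 60 ))
    hour=$(( (offset / 3600) % 24 ))
    echo "$minute $hour"
}

# 5400 seconds is one hour and thirty minutes.
offset_to_cron 5400
```

Under that assumption, an offset of 5400 would correspond to a cron entry firing at 01:30.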

I've updated the README with details about this feature. We'll try to get it included in the wos as soon as we can.

Monday Nov 10, 2008

Original image by mborowick

Today's a very interesting day for storage systems - it's cool to see the Fishworks team are announcing the Sun Storage 7000 series systems: congratulations one and all. Great things are afoot in my opinion, these are fantastic systems.

While I'm not working on storage systems at Sun any more, I do feel an amount of empathy for those guys: I am working on a software appliance [1] in the form of xVM Server, and I can certainly appreciate what it takes to take a perfectly working OpenSolaris install, strip it down to the bare minimum, add stuff to make it shine especially brightly for a given task, and (of particular focus for me at the moment!) get a product out to the market.

That said, in my previous job in the Solaris ZFS test group, I did run into the Fishworks project, and that story might be worth telling. (And if there's rose-coloured glasses coming across in this post, I apologise: I love my current job, as much fun as QE was, it was also pretty grueling at times ;-)

It was coming into October 2007, and PSARC 2007/618 - the addition of L2ARC devices to ZFS was looming. These devices, along with Separate ZFS Intent Log devices (as a pair, affectionately known as ReadZilla and WriteZilla) and their intelligent application in a hybrid storage pool are some of the most exciting things about the products being announced today and I've really been looking forward to today's announcement: it always gives me kicks to see Sun technology hit the market when I've been able to contribute to the product personally, even in the small way that I did in this case.

Anyway, Brendan had got in touch with the ZFS test group to see whether we could do anything to help out.

Our job as QE engineers on ZFS was to write and maintain the ZFS test suite. Clearly we needed to update the test suite to work with these new L2ARC devices. We'd done the same thing for slog devices, but in this case, we were looking for test coverage quickly. There was a ton of other work piling up on my plate: Solaris 10 update testing for ZFS, the Newboot Sparc work for Nevada, test sponsor duties for the fingerprint authentication project, on top of all the other daily stuff going on. Busy busy.

So, I started hacking about to see how quickly I could get us a very general set of tests on the L2ARC. The answer? Pretty quickly indeed.

Rather than start from scratch by coming up with a closed set of assertions about L2ARC devices, discussing those assertions with colleagues, making sure they were carefully worded, before setting about implementing tests to verify each assertion, I decided to just wing it.

Now that's not to say that we shouldn't also go about writing tests properly, but for a quick fix (in every sense of the word), I wrote a 90 line shell wrapper around /usr/sbin/zpool which you can download here, if you're interested.

The wrapper maintained a list of devices that it'd try to add to every zpool created with the wrapper; creating a pool would use up one device from the list, destroying the pool via the wrapper would return the device to the list. Pretty simple. This gave us a phenomenal amount of testing for free.
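
The bookkeeping described above can be sketched in a few lines of shell. This is not the original wrapper: the file name and function names are invented for illustration, and the sketch only prints the zpool command it would run rather than running it.

```shell
#!/bin/sh
# Sketch of the device-list bookkeeping only.
# DEVLIST is a hypothetical file holding one spare device name per line.
DEVLIST=${DEVLIST:-/tmp/l2arc-devices.txt}

# take_device: print the first free device and remove it from the list.
# Returns non-zero if the list is empty or missing.
take_device() {
    dev=$(head -n 1 "$DEVLIST" 2>/dev/null)
    [ -n "$dev" ] || return 1
    tail -n +2 "$DEVLIST" > "$DEVLIST.tmp" && mv "$DEVLIST.tmp" "$DEVLIST"
    echo "$dev"
}

# return_device DEV: put a device back on the list, as the wrapper
# would do when a pool is destroyed.
return_device() {
    echo "$1" >> "$DEVLIST"
}

# create_cmd POOL VDEVS...: print the zpool command the wrapper would run,
# with a cache (L2ARC) device appended when one is free.
create_cmd() {
    if dev=$(take_device); then
        echo "zpool create $* cache $dev"
    else
        echo "zpool create $*"
    fi
}
```

A real wrapper would also need to remember which device went to which pool so that destroying the pool returns the right one; that part is left out here.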

We could use this with our existing test suite, and it would add an L2ARC device to every pool. We could test big and small L2ARC devices, ones based on lofi devices backed by files in /tmp or ramdisks (attempting to simulate really fast disks, despite the weird VM hoops we were jumping through - which resulted in great hilarity when run with our somewhat insane stress tests running on really large machines...) and generally give the code a good run through.

The wrapper found a respectable number of bugs, and was worth its weight in gold, despite its lack of formality in terms of the way we usually write tests. I'm not sure if it's still being used by the ZFS QE team, but I was pretty fond of it.

I think one of the reasons why L2ARC was so pleasant to test was down to its design. Like the intent log devices, they integrate beautifully into the rest of the system, with very little extra work on behalf of the user: and that usually makes test engineers happy too (or at least lets them concentrate on the underlying feature, rather than having to spend extra time making sure the CLI was working properly).

Of course helping on L2ARC testing wasn't all work - I was lucky enough to make it over to the Bay Area for the first OpenSolaris developer summit that month, and while in town Brendan was kind enough to invite me up to the Fishworks office for a quick chat about the testing, a look around, and a rather excellent burger for lunch. I even got the chance to discover that I'm completely dreadful at Fish-pong, perhaps lacking in the basic grounding of American football, table tennis and volley ball rules that my Irish upbringing just didn't provide - but that's another story.

I never got a chance to test on one of the physical Storage 7000 series boxes themselves, nor play with what looks like one of the snappiest web interfaces I've seen in a long time, instead I was focusing on L2ARC itself, and helping to make sure it was solid enough to integrate into Solaris. However, that same operating system is the very one that underpins these appliances, so in that sense - I'm glad I could help!


[1] although yes, today's announcements are software and hardware - indeed, xVM Server's not much without the right hardware to back it up either.

Friday Oct 31, 2008

It's been a pretty hectic Z-Day and Halloween, but a great birthday overall! (Of course, if you were to ask E, she'd maintain it was her birthday today as well - but that's ok, I'm happy to share :-)

I was woken up by herself and the missus this morning, being presented with my birthday present: a set of knee and elbow pads and a unicycle, which I'm absolutely thrilled about!

As a result, when working from home today, coffee breaks were spent wobbling precariously around the kitchen, hanging on to various bits of furniture for dear life - definitely more practice needed, but I think I'm really going to enjoy this particular form of transport: the goal, to commute to work on it at least once, but one step (and fall) at a time - I'm a long way off being able to commute on it.

Work-wise, Halloween has been haunted by a wodge of xVM Server work, a not-too-terrifying zfs-auto-snapshot putback, the creepiness of some of my code getting pushed to pkg.opensolaris.org as part of nv_100a, and the blood-curdling results of more people trying out the service, running into both unknown and known issues along the way. Bug reports are always welcome though, however horrifying!

Tonight though has been entirely work-free: answering the door to trick-or-treaters, some nice pizza and some excellent beer (on an American theme tonight, Sierra Nevada Bigfoot 2008 and Anchor Steam Liberty Ale, yum) and the by now, traditional photographs of fireworks - so, here goes with continuing that tradition!

Luca pointed out some problems with doing a pkg image update to nv_100a bits regarding the new SUNWzfs-auto-snapshot functionality.

You can follow the discussion on the indiana-discuss@ mailing list, but so far, it looks like a few workarounds are needed. On a fresh install, it should all be fine, but if you're upgrading from an older development build of 2008.11 (unless we come up with a better fix) it appears to deliver the zfssnap role as a locked account (*LK* in /etc/shadow) which isn't allowed to execute cron jobs.

To work around this, you need to unlock the zfssnap role (I'd recommend running pfexec passwd -N zfssnap), add the following to /etc/user_attr:

zfssnap::::type=role;auths=solaris.smf.manage.zfs-auto-snapshot;profiles=ZFS File System Management

then clear the maintenance state of the service:

$ pfexec svcadm clear frequent daily hourly weekly monthly

Thanks for the report Luca, and nice screenshots on your blog entry! I'll add comments to this post if we come up with any better solutions. As always, for those following along at home, the latest zfs-auto-snapshot bits are in our mercurial source repository, which you can get with $ hg clone ssh://anon@hg.opensolaris.org/hg/jds/zfs-snapshot

Friday Oct 17, 2008

Having missed August and September's reviews and, by the looks of things, October's news review as well, it seems like now is a good time to call it quits and pass the torch to someone else in the OpenSolaris community. I just don't appear to have enough bandwidth to produce these any more - my day job's super hectic, and at home we're expecting a new arrival in December: something has to give, and it's the monthly review, sorry.

I believe that some sort of fine-grained journalistic role is really important for the OpenSolaris community - the existing newsletters are fantastic, but rely on contributions more so than just digging in and seeing what people are talking about at a community and project level, so I really hope someone will continue on with this work. Glynn did an excellent job before me with weekly news (sample here), and Dan's posts about what's new in build... before that were also fantastic.

From a personal perspective, compiling these reports has also been highly educational, if you're interested in OpenSolaris, this is a great way to get an overview of what's happening and where you yourself might want to contribute to the code, so I strongly urge you to have a go!

All that said, I thought I could pass along some tips on how I put these reports together, in hopes it'll be useful for whoever takes over.

The first place I tend to look for news is the opensolaris-announce mailing list, checking for big announcements. Next up is the ON flag days list and the list of ON putbacks over the past month. Finally, the ARC caselog always makes for interesting reading.

After that, it gets a bit more random. I used some basic scripting to help out - opensolaris-lists.sh. Pass this a month as an argument, and it'll proceed to open the thread list for that month for every OpenSolaris mailing list in your browser - 10 at a time, pausing for Firefox to catch up, and you to hit any key to proceed.

I wasn't reading every email on every opensolaris mailing list (though I did read a lot), rather I scanned the Subject: lines, looking for interesting threads, looked at the length of threads to determine what other people found interesting, and over time, built up ideas in my head as to whose emails were worth reading regardless.

Having done that, I'd started to build up a text file with the following format:

nth October 2008
Some headline text to explain the links
http://opensolaris.org/some/link
http://foo.com/some/related/link

mth October 2008
Another headline
http://opensolaris.org/another/link

Then, I passed that text file through a basic html formatter I threw together, format-monthly-opensolaris.awk and then published. Along with each link, I left a quick plea to have people comment on stuff I'd left out that they thought was interesting over the past month. In months where I was super-organised, I was compiling that list throughout the month, rather than waiting for the end of the month - but in cases where I'd left things too late, it'd take most of an evening to put the list together, 3 or 4 hours I'd say.
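
For anyone picking this up without the original script to hand, a formatter for that text format can be quite small. The following is a sketch and not format-monthly-opensolaris.awk itself: the function name and the HTML layout are guesses, and it assumes entries are separated by blank lines with the date first, the headline second and one link per line after that.

```shell
#!/bin/sh
# Sketch of a formatter for the "date / headline / links" text format.
# It does no HTML escaping, so headlines containing & or < need care.

# format_news [FILE...]: read entries from the files (or stdin) and
# print a simple HTML fragment for each one.
format_news() {
    awk 'BEGIN { RS = ""; FS = "\n" }
    {
        # Paragraph mode: each blank-line-separated entry is one record.
        # $1 is the date line, $2 the headline, $3..$NF the links.
        printf "<h3>%s</h3>\n<p>%s</p>\n<ul>\n", $1, $2
        for (i = 3; i <= NF; i++)
            printf "  <li><a href=\"%s\">%s</a></li>\n", $i, $i
        print "</ul>"
    }' "$@"
}
```

Running `format_news october.txt > october.html` (file names are placeholders) would give a fragment ready to paste into a blog post.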

I think my editorial style tended to veer more towards the technical posts, covering new and notable putbacks, project creations and occasional media happenings. I was admittedly biased towards ON, where I now work :-) I also tried to cover flamewars on the lists with as balanced a view as I could. Most particularly though, I didn't want to just have the monthly news posts turn into marketing material for Sun Microsystems, Inc - this was supposed to be a community service, for everyone contributing to OpenSolaris, so I hope whoever takes over has similar views! Here's all the reviews I've written, from June 2005 to the present (at varying levels of granularity) to get you in the mood.

So there you have it: now that you know how to produce these monthly reports, we just need someone to do it - volunteers? I'll update this post with a link to whoever puts together October's report!

Tuesday Oct 14, 2008

Nice screenshots Erwann! Take a peek here.

Saturday Oct 11, 2008

We got this code into nv_100, as part of LSARC 2008/571 and (at least inside Sun, so far) folks have been starting to play with it.

It's the first time I've been able to use the GNOME Nautilus integration that Erwann came up with, and I think it's pretty cool. Big ups to Niall & Erwann for all their hard work on helping to get this integrated - without them, this stuff would still just be kicking around on my blog!

We've had a few comments so far - most were known bugs and fixed already. I'll list them here, and add comments as we go along.

Services enabled by default

SUNWzfs-auto-snapshot delivers all its instances as disabled, but the accompanying desktop support, SUNWgnome-time-slider (the desktop service that uses SUNWzfs-auto-snapshot, integrates more tightly with the desktop and monitors disk space) had a postrun script that enables the services out of the box. Just run svcadm disable <service> to disable them if you want to, but see below for more ideas if you just don't want to snapshot everything...

Noisy cron job

There were some changes close to integration that made the home directory for the 'zfssnap' role go away, which had an impact on the way we were planning on doing logging. Originally, the cron job would just echo messages onto the end of the SMF instance's log file in /var/svc/log but since the cron job now runs as a non-root user, we aren't able to write to those anymore.

So we changed it to write logs to the zfssnap user directory, but that wasn't good either, so we eventually moved all logging for the cron job to syslog. A small bug though meant that the service is still a bit too noisy, and so cron ends up sending love letters in the form of svcprop errors to /var/mail/zfssnap - sorry about that. This was actually fixed pre-nv_100, but it just missed the integration date.

Details here on how to grab the sources and build your own version of the SUNWzfs-auto-snapshot package if you want the fix sooner rather than later.

Service inexplicably dropping to maintenance mode

This is probably the most common failure - I'd filed 6749498 about this, which turned out to be a duplicate of 6462803. I say "inexplicably", but /var/adm/messages will actually have more detail - as noted above, I don't have a way to explain to SMF why we're dropping the service to maintenance mode, so you just need to look for the log in the right place. Logging during service start/stop gets picked up by SMF, day-to-day log messages (and there's not many of those) get handled by syslog.

Otherwise, a few other words of advice:

The service on startup will arrange to take automatic snapshots of all datasets on all pools on the system. You can have it not do this by setting a ZFS user property at the top level dataset in each pool, e.g.

$ pfexec zfs set com.sun:auto-snapshot=false rpool

This is a much better way than just disabling the service altogether: this way, you get the option to have the service take snapshots of datasets you are interested in, e.g.

$ zfs set com.sun:auto-snapshot=false space
$ zfs set com.sun:auto-snapshot=true space/timf
$ zfs set com.sun:auto-snapshot=false space/timf/foo
$ zfs set com.sun:auto-snapshot:frequent=false space/timf/onnv

Better yet, if you use ZFS Delegation to allow users the userprop permission, they can set user properties on their own filesystems, and choose which of their filesystems get included in the snapshot schedule as above.
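
Since ZFS user properties are inherited by child datasets, it can help to trace through what the four settings in the example above add up to. The sketch below simulates that lookup in plain shell without touching ZFS. The function names are invented, and the rule that a schedule-specific property wins over the general one is an assumption drawn from the example rather than from the service's source.

```shell
#!/bin/sh
# Simulation only: no zfs commands are run.

# explicit_setting DATASET PROP: print the value set directly on DATASET,
# or nothing if PROP is not set there. Values mirror the example above.
explicit_setting() {
    case "$1|$2" in
        "space|com.sun:auto-snapshot")                    echo false ;;
        "space/timf|com.sun:auto-snapshot")               echo true ;;
        "space/timf/foo|com.sun:auto-snapshot")           echo false ;;
        "space/timf/onnv|com.sun:auto-snapshot:frequent") echo false ;;
    esac
}

# inherited DATASET PROP: walk up towards the pool until a value is found.
inherited() {
    ds=$1
    while :; do
        val=$(explicit_setting "$ds" "$2")
        if [ -n "$val" ]; then echo "$val"; return; fi
        case "$ds" in
            */*) ds=${ds%/*} ;;   # strip the last path component
            *)   return ;;        # reached the pool with nothing set
        esac
    done
}

# included DATASET SCHEDULE: assume a schedule-specific value wins over
# the general one; print "unset" when neither is found.
included() {
    val=$(inherited "$1" "com.sun:auto-snapshot:$2")
    [ -n "$val" ] || val=$(inherited "$1" "com.sun:auto-snapshot")
    echo "${val:-unset}"
}
```

With those settings, `included space/timf/onnv daily` answers true while `included space/timf/onnv frequent` answers false, and anything under space but outside space/timf comes back false.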

Have a look at the hg history, the README for more documentation, and the auto-snapshot.xml service manifest if you're really interested in what's going on behind the scenes. Enjoy!

Wednesday Oct 01, 2008

Yow - October already, how did that happen? I'm a bit late with August's OpenSolaris monthly news, and now September's has piled up on me as well. I'll try to get them out soon - not enough hours in the day.

In the meantime, I remain very busy with xVM Server work, and my sideline project of getting ZFS Automatic Snapshots into Solaris is hopefully just about done - the status on that, is that I putback the zfssnap role last week to the ON source tree, and the Desktop consolidation have delivered SUNWzfs-auto-snapshot to the WOS already, so we'll have the ZFS Automatic Snapshot service in nv_100 - w00t :-)

Back to the day-job: one of the things that popped out of the gate work for the xVM Server product, was a growing frustration with the older version of hg we're using to manage the xVM and xVM-Server gates. We need support for webrev -r to be able to produce readable webrevs for source trees being managed by Mercurial MQ, and had been maintaining our own private copy of hg and the cdm module for a long time.

We're still not quite at the level where we can move off it entirely, but I was able to spend some time alleviating the problem by a quick bit of ksh hackery. The attachment I sent to the scm-migration-dev mailing list, patch-webrev.ksh makes for slightly nicer webrevs of MQ patches. So, no longer do you have to try to get your head around diffs of diffs. (yuck!)

I suspect this script won't scale for very large source trees, but it's certainly a step in the right direction. Hope you find it useful.

Update (later that day): johnlev spotted a bug in my script where it was reporting more changes than had actually been made in the patch. I've got a new version here which does the trick. I've fixed the link above too.

Saturday Sep 20, 2008

We're down in my parents' house in Wicklow this weekend - a bit of a family get-together, Lyd and Edu are over from Barcelona, and Duncan & Denise are down from Carlingford - the occasion being Duncan & Lyd's birthday. We had a BBQ, yes in September, and thankfully the Irish weather was kind to us and the Sun was shining all day - gorgeous. Sorry Glynn & Jayne, wish you were here!

One of the conversations over lunch was about our respective blogs (we all have one now, apparently), and everyone was complaining that mine had an almost complete lack of anything interesting at all right now - they've probably got a point. Posts in my "Off-topic" category have been pretty thin on the ground of late. Actually, I'm even slipping with the technical ones too - OpenSolaris monthly news posts are late, it's nearly the end of September, and I haven't done August's yet either. Sorry about that, there's just not enough hours in the day at the moment.

So, to appease some of my less technical readers (hi Mum & Dad!) here's a post that barely mentions computers. [ suffice to say, that this being Software Freedom Day, I'm composing this post on OpenSolaris 2008.11 nv_98, I used GNOME, Gimp, Exiftool and Gedit to write it - scarcely a scrap of proprietary software here, and I like it! Ok, on with the non-technical content]

Apart from hanging around with the family this weekend, I was down here for something just as enjoyable. Recently it was a milestone birthday for my father-in-law, and we had clubbed together to get him a day out experiencing falconry, on a Hawk Walk. The voucher was for two people, and as my mother-in-law isn't terribly fond of birds, I was invited along.

What a fantastic outing it was! A group of eight of us spent a few hours learning about the sport of falconry, then got to spend time handling and flying a pair of Harris hawks in the open, and saw several other large birds-of-prey up close and very personal.

I brought the camera along, and quickly managed to fill a 1GB CF card - here's some of the better shots, but it was a tough choice.

A few emotions strike you when you see one of these birds flying towards the leather glove you're wearing in your left hand. Fear initially - it's all beak & talons arriving awfully fast, as the bird's going for the piece of meat you're holding. The landing is pretty dramatic too, but then wonder takes over. Close up they're absolutely amazing creatures. Surprisingly light too, but then again, they're birds, right?

Back at the centre, we got to see a Snowy Owl, some Ferruginous Buzzards, a pair of Lanner Falcons, and an Eagle Owl and got to bring one of the falcons out to see how vastly it differs in the air from the hawks we were flying earlier in the day.

Would I recommend the day out? Absolutely, yes! The guides were friendly and engaging, informative, and very very passionate about their hobby - a really fascinating experience, which I'd love to repeat sometime. More over on Falconry Ireland's web page and check out their Flickr stream too.

Thursday Sep 11, 2008

I'm probably the last person on earth to discover this - but just today, I used the Mercurial bisect command, and thought I'd write up my experiences in case anyone else hasn't played with it before. I'd read about hg bisect in the hgbook, but never had an opportunity to use it in anger.

Here's the problem I was seeing - in builds of xVM Server that I've been doing, we were producing ISO images, but after installation, the pkg command wasn't working properly. Exploring the image a bit, with some help from the pkg python stack trace, I found the problem was that some items in /var/pkg were symlinks pointing to a non-existent mountpoint on the installed image.

Looking at the build logs from distro constructor, cpio was complaining that there was no space left on the device it was writing to. Digging around a bit more and running another build just to make sure, I found the source of the problem - we were df'ing the source directory for the cpio, then doing a mkfile of that size, creating a lofi device that big, then creating a UFS filesystem on that device. There was the problem - the space overhead incurred by the filesystem meant that we were trying to pour a gallon into a pint pot.
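
The arithmetic behind that mismatch is simple: a backing file sized exactly to the data leaves no room for the filesystem's own metadata, so some headroom has to be added before the mkfile step. A minimal sketch of that calculation follows. The function name and the 1% default are illustrative choices, not the distro constructor's code.

```shell
#!/bin/sh
# padded_size SIZE [PERCENT]: print SIZE plus PERCENT% headroom
# (default 1), rounding the headroom up so small inputs still gain
# at least one unit. Units are whatever SIZE is measured in.
padded_size() {
    size=$1
    percent=${2:-1}
    echo $(( size + (size * percent + 99) / 100 ))
}

# Example: a 1000-unit source directory with the default headroom.
padded_size 1000
```

How much headroom a UFS filesystem really needs depends on its size and newfs parameters, so the percentage is something to tune rather than a fixed rule.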

So - I knew what the problem was, pulling the tip changeset from the distro constructor even showed me that the problem was already fixed (my favourite kind of bug!) - the fix being to make the file which backs the lofi device just a bit bigger. My question was, what changeset introduced this fix? Enter hg bisect.

With it, you just need to identify where you know the code is bad, and where you know the code is good, and a test to determine whether the change is present. In my case, the test was really short:

grep "Add 1%" build_dist.lib

- but you could conceivably have the test build an entire OS image, install it, and check for the change. The bisect command then does a chop through all of the changesets, narrowing down to where the change was introduced.
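As a sketch of what a scripted test might look like: all bisect really needs is something that exits 0 when the change is present and non-zero when it isn't. The has_fix function and the throwaway files below are just stand-ins for illustration. The commented-out hg lines assume a Mercurial that has the --command option to bisect, which I didn't use for the run described here.

```shell
# Sketch of a test whose exit status says whether the fix is present.
# has_fix FILE: exit 0 (good) if the marker comment is there, non-zero if not.
has_fix() {
    grep -q "Add 1%" "$1"
}

# Try it out on two throwaway files standing in for build_dist.lib.
tmpdir=$(mktemp -d)
printf '# Calculate the size of the pkg data directory.  Add 1%% of the\n' > "$tmpdir/fixed"
printf '# Calculate the size of the pkg data directory.\n' > "$tmpdir/unfixed"

has_fix "$tmpdir/fixed"   && echo "fixed: good"
has_fix "$tmpdir/unfixed" || echo "unfixed: bad"

# With a Mercurial that supports it, the same check could drive the search:
#   hg bisect --reset
#   hg bisect --good tip
#   hg bisect --bad 0
#   hg bisect --command 'grep -q "Add 1%" build_dist.lib'
rm -rf "$tmpdir"
```

A slower test (build an image, install it, poke at it) would slot in the same way, as long as its exit status follows that convention.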

In my case, a source tree of 105 changesets resulted in my only having to perform 6 tests to determine where the change occurred. Running grep against all 105 changesets would have completed in no time, but had I actually needed to build an OS image for each test, 105 builds would have taken a very long time indeed.
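If you're wondering why so few tests are needed, here's a rough sketch of the halving idea. It isn't how Mercurial does it internally (real history is a graph, not a straight line), and FIRST_FIXED, has_fix and the 0..104 revision range are made-up stand-ins for my grep test and my tree.

```shell
# Sketch only: halving search over a straight line of revisions.
# FIRST_FIXED and has_fix are made-up stand-ins for the real grep test.
FIRST_FIXED=102

has_fix() {
    # Succeeds when revision $1 already contains the fix.
    [ "$1" -ge "$FIRST_FIXED" ]
}

find_first_good() {
    # $1 = revision known to lack the fix, $2 = revision known to have it.
    bad=$1
    good=$2
    tests=0
    while [ $(( good - bad )) -gt 1 ]; do
        mid=$(( (bad + good) / 2 ))
        tests=$(( tests + 1 ))
        if has_fix "$mid"; then
            good=$mid     # fix is present: the change is here or earlier
        else
            bad=$mid      # fix is absent: the change must be later
        fi
    done
    echo "$good $tests"
}

find_first_good 0 104    # prints "102 7"
```

Each test throws away roughly half of the remaining candidates, so around a hundred revisions need only six or seven tests - in the same ballpark as the estimate hg prints.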

Here's some edited highlights:

timf@haiiro[435] hg bisect -g tip
timf@haiiro[436] hg bisect -b 0  
Testing changeset 52:42e67ad1e103 (105 changesets remaining, ~6 tests)
125 files updated, 0 files merged, 5 files removed, 0 files unresolved
timf@haiiro[438] grep "Add 1%" build_dist.lib   
timf@haiiro[439] hg bisect -b
Testing changeset 78:76e8ef490770 (53 changesets remaining, ~5 tests)
119 files updated, 0 files merged, 95 files removed, 0 files unresolved
.
.
timf@haiiro[447] hg bisect -b                
Testing changeset 103:b8d33c12a531 (4 changesets remaining, ~2 tests)
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
timf@haiiro[448] grep "Add 1%" build_dist.lib
	# Calculate the size of the pkg data directory.  Add 1% of the
timf@haiiro[449] hg bisect -g                
Testing changeset 102:ef08a25b1d1c (2 changesets remaining, ~1 tests)
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
timf@haiiro[450] grep "Add 1%" build_dist.lib
	# Calculate the size of the pkg data directory.  Add 1% of the
timf@haiiro[451] hg bisect -g                
The first good revision is:
changeset:   102:ef08a25b1d1c
user:        Karen Tung 
date:        Wed Aug 06 20:22:36 2008 -0700
summary:     2810 pkg archive size not big enough sometimes

So - I need to update our copy of distro_constructor to be based on changeset ef08a25b1d1c, which gets me the fix for 2810. Yahoo!

Tuesday Aug 26, 2008

I'm just about done with this release of the ZFS Automatic Snapshot SMF service and have just pushed some changes to the mercurial repository on opensolaris.org.

This is a pretty important release, in terms of fixing stuff that's been bugging me about the service since I initially released it. But inevitably, with lots of change comes the possibility of lots of bugs - so, I was hoping to get some feedback on how it's looking before it gets officially released.

So, if you're feeling brave (that means don't use this in production yet!) fire up your favourite source code management system (which is hg, right?) and access the mostly-untested ZFS Automatic Snapshot 0.11 Early Access release via:

hg clone ssh://anon@hg.opensolaris.org/hg/jds/zfs-snapshot

I'm working with Niall & Erwann in the Desktop group here, who have been tasked with DSK-5, to get ZFS Automatic Snapshots on the desktop, and so far, it looks like my code will be providing the back-end service (obviously as well as ZFS :-) so some of the changes are things that make most sense when running this on a desktop or laptop machine.

With that in mind, I've not made any changes to my bundled GUI, since it'll be going away real soon now. However, I've done my best to ensure that there's always ways of turning off the small-system-focused bits, and the service remains backwards-compatible with earlier manifests.

So what's going to be new in 0.11? Well, having seen Nils write up his changes in that form (more on that later), I thought I'd have a go at writing a Changelog too - so here's the annotated Changelog entry so far for 0.11:

0.11

  • Add RBAC support
    • the service now runs under a zfssnap role
    • service start/stop logs stay under /var/svc/log
    • other logs saved to /export/home/zfssnap (and syslog) [ yes, this sucks a bit - better solutions welcome? ]
  • Add a 'zfs/interval' property value 'none' which doesn't use cron
  • Add a cache of svcprops to the method script (good idea Nils!)
  • Add a com.sun:auto-snapshot user property used by all instances, com.sun:auto-snapshot:$LABEL takes precedence
  • Remove the seconds field of the snapshot name - it's not needed (good idea Håkan!)
  • Changed the way // works with recursive snapshots - ignore snapshot-children, and instead automatically determine when we can take recursive snapshots based on which datasets have the zfs user properties
  • Set avoidscrub to false by default (6343667 was fixed in nv_94)
  • Bugfix from Dan (thanks!) - Volumes are datasets too
  • Automatically snapshot everything by setting com.sun:auto-snapshot=true on startup. (this gets done on all top level datasets - an existing property set to false on the top level dataset overrides this)
  • Check for missed snapshots on startup
  • Clean up shell style a bit
  • Clean up preremove script (I need to make these scripts redundant before we move to IPS, I know)
  • Write this Changelog
In terms of user-visibility, the most obvious changes are running under RBAC, and taking snapshots of all filesystems by default - I realise the latter could be controversial, but you can turn it off if you don't like it. I'm also pretty happy with the changes to the "//" schedule - we now ignore "zfs/snapshot-children" for this particular case, and instead use the list of filesystems marked as "com.sun:auto-snapshot=true" to work out which filesystems we can take recursive snapshots of, and which we have to take individual snapshots of. This makes a big difference on large systems.
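To illustrate how the two user properties interact, here's a minimal sketch of the precedence rule. This is not the code from the method script: effective_setting is a made-up name, and it leans on the fact that zfs get reports an unset user property as "-".

```shell
# Sketch only, not the service's actual code. ZFS reports an unset user
# property as "-", so that is treated as "no opinion" here.
# effective_setting LABEL_VALUE GENERIC_VALUE
effective_setting() {
    if [ "$1" != "-" ]; then
        echo "$1"     # com.sun:auto-snapshot:$LABEL wins when it is set
    else
        echo "$2"     # otherwise fall back to com.sun:auto-snapshot
    fi
}

effective_setting false true   # per-label property overrides the generic one
effective_setting - true       # no per-label property, so the generic one applies
```

On a real system the two arguments would come from something like zfs get -H -o value com.sun:auto-snapshot:daily on the dataset in question, where "daily" is just an example label.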

One thing that's missing from this release is Nils Goroll's suggested changes to improve the way the system performs scheduling - more details here. I feel that moving away from cron would result in less familiarity with what the service does: if cron is the problem, certainly one solution is running away from it, but wouldn't it be cool to get cron's shortcomings fixed instead? Yeah, one of those "ample free time" problems.

So, without further ado, there's full documentation in the README - enjoy, and please let me know if you see anything weird - there's still time to fix it before 2008.11 (and yes, all this despite my day-job being super hectic right now! xVM Server is getting 99% of my time at the moment, so I definitely expect bugs in this early access release of zfs-auto-snapshots!)

Tuesday Aug 05, 2008

Life remains as hectic as ever - day-job still amazingly busy (but good-busy) and things are in the same state at home: I'm typing this entry from my in-laws' house in Carrickfergus, where we've retreated for the week while we get some building work done to our house [ we're dry-lining the interiors of all external walls, which hopefully will make for a less chilly winter, but it's a messy job, and E's creche is shut for 2 weeks. Migrating north seemed like the sane thing to do ]

As a result of the above, I haven't been able to give OpenSolaris the attention it deserves, except where my day-job contributes to it of course :-) So, with some amount of guilt, here goes with July's (slightly shorter than usual?) news report - as always, please add missing stuff to the comments section.

This blog copyright 2009 by timf