Nice review at Gizmodo, here, of VirtualBox. Title is Virtualize Any OS for Free. Check it out.
Slides from last week's meeting of the Atlanta OpenSolaris User Group (ATLOSUG) are posted now on the group website - http://opensolaris.org/os/project/atl-osug
We had a good group of about 16 people in attendance and a great discussion around how and why to use COMSTAR.
The next meeting will be held on Sept. 8. The topic will be how COMSTAR and other OpenSolaris technologies fit together in the Sun Unified Storage family of products. Hope to see you there!
Had a great Atlanta OpenSolaris User Group meeting this month. We did an installfest, an update from CommunityOne, and a recap of what's new in OpenSolaris 2009.06. About twenty folks showed up and about half loaded their laptops with the new build while we were there.
We got some great feedback for upcoming topics and are pushing forward with that. We also decided to move back to monthly meetings starting in August. Our next meeting is August 11 when we will talk about COMSTAR. We are also considering a change in venue back to the Sun office in Alpharetta. Matrix Resources has been very gracious in allowing us to use their facility, but I always feel bad that they have to have someone stick around until late at night to babysit us.
We're going to try an experiment to see if we can't get the word out a little better about our merry band via social networks. We've started by creating a Meetup group at http://meetup.com/atlosug. Hopefully this might generate more traffic to our meetings and help us find folks in the area.
What a great opportunity to get together in the same room with folks working to create and sustain OpenSolaris user groups around the world! We had folks from every continent - from Atlanta and Argentina, from Dallas and Serbia, from China and London, and on and on. Something like twenty-five to thirty of the OpenSolaris User Groups were represented.
The whole day was a great experience. It was great to see that as different as each group was, there were a lot of common themes for both successes and for challenges. And a lot of great ideas were shared as to how to boost participation, to improve meetings, and to improve the success of the groups overall. It will be exciting to hear a report back next year on how these ideas have played out.
Be sure to check out Jim Grisanzio's photos to see some of these characters and what all went on at CommunityOne and in the OSUG Bootcamp.
Jeff Jackson, Sr. VP for Solaris Engineering, started the day off with a greeting and charge to get the most out of this opportunity to meet with each other and with the OpenSolaris and Solaris headquarters teams.
Since the thing that brought this group together was a common focus on OpenSolaris User Groups and not the fact that we knew each other, we began the day with a bit of a team-building exercise, courtesy of The Go Game. This is a cross between a scavenger hunt and an improvisational acting class. Teams criss-crossed downtown San Francisco trying to find and photograph places hinted at by clues on web pages. At some venues, the teams had to act out and film various tasks. For example, on the Yerba Buena lawn, the team had to engage in an impromptu Tai Chi exercise in order to find their long-lost phys ed teacher, Ms. Karpanski, who then led the team in creating a new exercise video. Once we all returned, all of our submissions were voted on by the whole group and a winning team was chosen. Supposedly, we can see all these photos and videos. Haven't yet found out how. Perhaps that's for the best!
In order for us to get to know each other's groups, each User Group prepared a poster describing the group, where we were located,
what we do, what sort of members make up the group, and what makes us special. Many of these posters were really well done! We had a
bit of a scavenger hunt for answers to questions found by careful reading of all of the posters. It was really cool to see what
sorts of projects some of the groups had undertaken and how they were working with various university or other organizations.
But the main part of the day was spent in a big brainstorming session. We all identified our successes, our failures, our challenges, and ideas for the future. We put all of these on several hundred post-it notes and placed them on large posters. We grouped them by topic and then went through all of these. Even though this only had an hour on the agenda, it ended up taking the bulk of the day. Since this was the most important thing for us, we decided to rearrange the day to accommodate it.
From these sticky-notes, we found out that some of our groups were mostly focused on administrators but others had a large developer population. We all have some sort of issues around meeting locations - whether it's a matter of access in the evening, finding a convenient location, or providing network access and power. For most groups, having some sort of refreshments was important, though some groups felt like good refreshments attracted too many folks who just show up for the food.
There were a lot of good ideas around using a registration site to get access to the facility and order food, creating and using Facebook, LinkedIn, and Twitter, using IRC, interacting with the Sun Campus Ambassadors, using MeetUp to find new members. Many folks found it useful to video and make available presentations given at their meetings. Some groups (for example in Japan) have special sub-groups for beginners. Other groups are doing large-scale development projects, such as the Belenix project in Bangalore.
For me and the Atlanta OpenSolaris User Group, I have a lot of new ideas that I want to put out to our membership and our leaders - move back to monthly meetings, use a registration site, set up a presence on various social networks.
Many people said that folks come to the user groups in order to network and expand their circle of business acquaintances. In light of the current economic situation, with so many smart people out of work, I am thinking of promoting our group with some of the job networking groups around Atlanta. For example, my church, Roswell United Methodist Church, has one of the largest job networking groups in the Atlanta area. Every two weeks, nearly 500 people meet to network and help each other in their job search. Perhaps the many IT folks in this group might find this a way to get current and stay current in a whole new area.
At any rate, I am inspired to get things cranking at ATLOSUG!
After spending the afternoon working through our hundreds of sticky notes, the OpenSolaris Governing Board had a bit of a roundtable with us to talk about what they do and how we can work better together. It was really helpful for me to hear from them and to get to put faces to some of the names for the folks I did not already know.
We finished out the evening with a great dinner at the Crab House at Pier 39. From what I have seen, many of the photos from dinner and the meeting are already on Facebook, Flickr, and likely blogs.sun.com. Jim Grisanzio, OpenSolaris Chief Photographer, was out in force with his camera!
Thanks so much to Teresa Giacomini, Lynn Rohrer, Dierdre Straughan, Jim Grisanzio, Tina Hartshorn, Wendy Ames, Kris Hake and everyone else who had a hand in organizing this event. Thanks to Jeff Jackson, Bill Franklin, Chris Armes, Dan Roberts and all the other HQ folks who took the time to come and listen and interact with the leaders of these groups. I know that I got a lot out of the meeting and am more eager than ever to promote and push forward with our user group.
West in San Francisco,
along with a number of the other leaders of OpenSolaris User Groups.
(I head up the Atlanta OpenSolaris User Group.)
What a great meeting! Three days of OpenSolaris.
First off, I am sure that Teresa and the OpenSolaris team selected the Hotel Mosser because they knew it was a Solaris focused venue. As Dave Barry would say, I am not making this up! Even the toilet paper was Solaris-based. Bob Netherton and I were speculating that perhaps this was an example of Solaris Roll-Based Dump Management, new in OpenSolaris 2009.06.
Day One was a full day of OpenSolaris and related talks. The OpenSolaris teams maintained tracks around deploying OpenSolaris 2009.06 in the datacenter and around developing applications on OpenSolaris 2009.06. For the most part, I stuck with the operations-focused sessions, though I did step out into a few others. Some of the highlights included:
Day Two was filled with OpenSolaris Deep Dives. These were very helpful, not just in content, but in helping me to hone my own OpenSolaris presentations. For this day, I stuck close to the Deploying OpenSolaris track, having learned in graduate school that I am not a developer. This track included:
table, we had Vasu Karunanithi, Dawit Bereket, Matt Ingenthron,
Scott Dickson (me), Bob Netherton, Isaac Rosenfeld, and Kimberly Chang. It was great to get at least part of the old
gang together and catch up.
Day Three was the OpenSolaris User Group Leaders Bootcamp. But that's for another post....
Sun's Executive Briefing Center is on the road this week. We are visiting with customers in Cleveland, Columbus, and Detroit. Looks like a busy schedule and I am looking forward to the trip. I was asked to fill in as the Solaris Virtualization speaker for this trip.
We fly to Cleveland and fly home from Detroit. Kate has arranged a bus to get us from Cleveland to Columbus to Detroit. My wife calls it Geeks on a Bus and thought it sounded too scary to contemplate!
We'll be talking about Sun's Vision, Systems, Software, OpenStorage, Solaris, Virtualization of Systems, Desktop Virtualization, and Services to support all of these. Hope to see many of you there.
Last week, I blogged about a Jumpstart Survey. I've gotten good comments and some responses to the survey. It's been a week, but I want to collect some more responses before posting an analysis. Take a look at my previous blog and fill out the survey or comment on the blog. I will summarize and report in another week or so.
I'm doing briefings on DTrace and Solaris Performance Tools this week in Atlanta, Ft. Lauderdale, and Tampa. Click the links below to register if this is of interest and you can attend. Each one is a 2 1/2 to 3 hour briefing that stays pretty technical, with lots of examples.
From the flyer:
Join us for our next Solaris 10 Technology Brief featuring DTrace. DTrace, Solaris 10's powerful new framework for system observability, helps system administrators, capacity planners, and application developers improve performance and problem resolution.

DATE: May 12, 2009
LOCATION: Classroom Resource Group, Atlanta
TIME: 8:30 AM Registration, 9:00 am - 12:00 pm Session
DIRECTIONS: http://www.crgatlanta.com/directions.asp
REGISTER AT: http://www.suneventreg.com/cgi-bin/pup_registration.pl?EventID=2705

HOLLYWOOD, FL - May 13, 2009
LOCATION: Seminole Hardrock Hotel
TIME: 8:30 AM Registration, 9:00 am - 12:00 pm Session
DIRECTIONS: http://www.seminolehardrockhollywood.com/getting_here/directions.php
REGISTER: http://www.suneventreg.com/cgi-bin/pup_registration.pl?EventID=2706

TAMPA, FL - May 14, 2009
LOCATION: University of South Florida
TIME: 8:30 AM Registration, 9:00 am - 12:00 pm Session
DIRECTIONS: http://www.msc.usf.edu/directions.htm
REGISTER: https://www.suneventreg.com//cgi-bin/register.pl?EventID=2707

What You'll Learn? You can't improve what you can't see and DTrace provides safe, production-quality, top to bottom observability - from the PHP application scripts down to the device drivers - without modifying applications or the system. This seminar will introduce DTrace and the DTrace Toolkit as key parts of an overall Solaris performance and observability toolkit.

AGENDA:
8:30 AM To 9:00 AM Check In, Continental Breakfast
9:00 AM To 9:10 AM Welcome
9:10 AM To 10:15 AM Dtrace
10:15 AM To 10:30 AM BREAK
10:30 AM To 11:30 AM Dtrace Continued
11:30 AM To 12:00 PM Wrap Up, Q&A, Evaluations

We look forward to seeing you at one of these upcoming Solaris 10 Dtrace sessions!
Jumpstart makes use of rules to decide how to install a particular system, based on its architecture, network connectivity, hostname, disk and memory capacity, or any of a number of other parameters. The rules select a profile that determines what will be installed on that system and where it will come from. Scripts can be inserted before and after the installation for further customization. To help manage the profiles and post-installation customization, Mike Ramchand has produced a fabulous tool, the Jumpstart Enterprise Toolkit (JET).
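To make the rules-to-profile idea a little more concrete, here is a minimal sketch of a rules file and a helper that lists which profile and finish script each rule would pick. Every name in it (the host zfsclient1, the profile names, the finish script copy_phase1.fin, the list_profiles helper, and the temporary file location) is made up for illustration. A real rules file would still need to be validated with the check script shipped with the Jumpstart samples.

```shell
#!/bin/sh
# Hypothetical rules file; host, profile, and finish script names are
# placeholders for illustration only.
RULES=${RULES:-/tmp/rules.sample.$$}

cat > "${RULES}" << 'EOF'
# keyword  value        begin  profile         finish
hostname   zfsclient1   -      zfs_flash_prof  copy_phase1.fin
arch       sparc        -      sparc_prof      -
any        -            -      generic_prof    -
EOF

# list_profiles FILE
# For each non-comment rule, print the match keyword and value followed by
# the profile (second-to-last field) and finish script (last field).
list_profiles() {
    awk '!/^#/ && NF >= 5 { print $1, $2, "profile=" $(NF-1), "finish=" $NF }' "$1"
}

list_profiles "${RULES}"
```

Rules are evaluated from the top down and the first match wins, which is why the catch-all any rule sits at the bottom.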
For example, I once installed 600 systems with SunOS 4.1.4 in less than a week using Jumpstart - remember that Jumpstart never supported SunOS 4.1.4.
But, I am not just looking for the weird stories. I want to know what Jumpstart features you use. I'll follow this up with extra, detailed questions around Jumpstart Flash, WAN Boot, DHCP vs. RARP. But I want to start with just some basics about Jumpstart.
Lacking a polling mechanism here at blogs.sun.com, you can just enter your responses as a comment. Or you can answer these questions at SurveyMonkey here. Or drop me a note at scott.dickson at sun.com.
Just got my copy of Pro OpenSolaris by Harry Foxwell and Christine Tran in the mail today! Can't wait to get a good look and post a review. I wonder if I can get the authors to inscribe it to me!
Also got a copy of OpenSolaris Bible by Nick Solter, Gerry Jelinek, and Dave Miner. Looking forward to cracking into it as well.
Will post reviews shortly.
After a really long and difficult week, we've lost a good friend in our house today. Our Ernie passed away at 16. Ten days ago, everything was good. But, when he went in to get his teeth cleaned, they found a cancerous tumor in his lower jaw. Kathleen and I are losing a friend, a member of our family. He's like our baby.
We will so much miss him. I know he didn't want to go now, either.
I was visiting with a customer last week and they were very excited to move forward quickly with ZFS boot in their Solaris 10 environment, even to the point of using this as a reason to encourage people to upgrade. However, when they realized that it was impossible to use Flash with Jumpstart and ZFS boot, they were disappointed. Their entire deployment infrastructure is built around using not just Flash, but Secure WAN Boot. This means that they have no alternative to Flash; the images deployed via Secure WAN Boot are always flash archives. So, what to do?
It occurred to me that in general, the upgrade procedure from a pre-10/08 update of Solaris 10 to Solaris 10 10/08 with a
ZFS root disk is a two-step process. First, you have to upgrade to Solaris 10 10/08 on UFS and then use lucreate
to copy that environment to a new ZFS ABE. Why not use this approach in Jumpstart?
Turns out that it works quite nicely. This is a framework for how to do that. You likely will want to expand on it, since one thing this does not do is give you any indication of progress once it starts the conversion. Here's the general approach:
Our goal when complete is to have the flash archive installed as it always has been, but to have it running from a ZFS root
pool, preferably a mirrored ZFS pool. The conversion script requires two phases to complete this conversion. The first phase
creates the ZFS boot environment and the second phase mirrors the root pool. In this example, our flash archive
is called s10u6s.flar. We will install the initial flash archive onto the disk c0t1d0 and build our
initial root pool on c0t0d0.
Here is the Jumpstart profile used in this example:
install_type flash_install
archive_location nfs nfsserver:/export/solaris/Solaris10/flash/s10u6s.flar
partitioning explicit
filesys c0t1d0s1 1024 swap
filesys c0t1d0s0 free /
We specify a simple finish script for this system to copy our conversion script into place:
cp ${SI_CONFIG_DIR}/S99xlu-phase1 /a/etc/rc2.d/S99xlu-phase1
You see what we have done: We put a new script into place to run at the end of rc2 during the first boot.
We name the script so that it is the last thing to run. The x in the name makes sure that this will
run after other S99 scripts that might be in place. As it turns out, the luactivate that we will
do puts its own S99 script in place, and we want to come after that. Naming ours S99x makes it happen later in the
boot sequence.
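The naming trick can be checked without touching a real system. This is only a sketch: it assumes rc2 starts scripts by expanding a glob of S* names, which come back in sorted order. Apart from S99xlu-phase1, the script names are invented stand-ins (S99lu-example is not the real name of the script that luactivate installs), and run_order and the demo directory are hypothetical helpers.

```shell
#!/bin/sh
# Scratch directory standing in for /etc/rc2.d. Only S99xlu-phase1 is a
# name taken from the text; the others are hypothetical examples.
demo=${demo:-/tmp/rc2demo.$$}
mkdir -p "${demo}"
for f in S72inetsvc S99dtlogin S99lu-example S99xlu-phase1; do
    : > "${demo}/${f}"
done

# run_order DIR
# Print the start scripts in the order a shell glob expands them, which
# is the order they would be run in.
run_order() {
    for f in "$1"/S*; do
        basename "${f}"
    done
}

run_order "${demo}"
```

The last name printed is S99xlu-phase1. A script whose name sorts even later, such as one starting with S99y or S99z, would still run after ours, so it is worth looking at what is already in rc2.d on your own image.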
So, what does this magic conversion script do? Let me outline it for you:
lucreate
That's Phase 1. Phase 2 has its own script, created at the same time, that finishes the mirroring of the root pool after the next reboot. If you are satisfied with a non-mirrored pool, you can stop here and leave phase 2 out. Or you might prefer to make this step a manual process once the system is built. But, here's what happens in Phase 2:
installboot. For x86, you would do something similar with installgrub.
I have been thinking it might be worthwhile to add a third phase to start a zpool scrub, which will force the newly attached drive to be resilvered when it reboots. The first time something goes to use this drive, it will notice that it has not been synced to the master drive and will resilver it, so this is sort of optional.
The reason we add bootability explicitly to this drive is because currently, when a mirror is attached to a root zpool, a boot block is not automatically installed. If the master drive were to fail and you were left with only the mirror, this would leave the system unbootable. By adding a boot block to it, you can boot from either drive.
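As a rough sketch of how one script might cover both SPARC and x86, the helper below prints the boot block command for a given architecture instead of running it. The function name bootblock_cmd and its arguments are my own invention, and the installgrub line assumes the usual stage1 and stage2 files under /boot/grub, so check it against your own x86 systems before relying on it.

```shell
#!/bin/sh
# bootblock_cmd ARCH SLICE PLATFORM
#   ARCH     - output of "uname -p" on the target (sparc or i386)
#   SLICE    - slice to make bootable, for example c0t1d0s0
#   PLATFORM - output of "uname -i" (only used on SPARC)
# Prints the command rather than running it, so it can be reviewed first.
bootblock_cmd() {
    arch=$1
    slice=$2
    platform=$3
    case "${arch}" in
    sparc)
        echo "installboot -F zfs /usr/platform/${platform}/lib/fs/zfs/bootblk /dev/rdsk/${slice}"
        ;;
    i386)
        # Assumes the standard GRUB stage files shipped with Solaris 10 x86.
        echo "installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/${slice}"
        ;;
    *)
        echo "unsupported architecture: ${arch}" >&2
        return 1
        ;;
    esac
}
```

On the target system you would pass in the output of uname -p and uname -i, look over the printed line, and then run it.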
So, here's my simple little script that got installed as /etc/rc2.d/S99xlu-phase1. Just to make the code a
little easier for me to follow, I first create the script for phase 2, then do the work of phase 1.
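# Write the phase 2 script first; it runs at the end of rc2 on the next boot.
cat > /etc/rc2.d/S99xlu-phase2 << EOF
# Delete the original UFS boot environment so its disk can become the mirror
ludelete -n s10u6-ufs
# Make the second drive bootable, then attach it to the root pool
installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t1d0s0
zpool attach -f rpool c0t0d0s0 c0t1d0s0
# Run only once, then reboot
rm /etc/rc2.d/S99xlu-phase2
init 6
EOF
# Phase 1: point the dump device at swap and create the root pool
dumpadm -d swap
zpool create -f rpool c0t0d0s0
# Copy the running UFS environment into a new ZFS boot environment
lucreate -c s10u6-ufs -n s10u6 -p rpool
luactivate -n s10u6
# Run only once, then reboot into the new boot environment
rm /etc/rc2.d/S99xlu-phase1
init 6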
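Since the script above gives no indication of progress, one possible addition is a small wrapper that records each step in a log file. This is only a sketch: run_step and the log location are names I made up, and you would wrap whichever commands you care about.

```shell
#!/bin/sh
# Hypothetical log location; pick whatever suits your site.
LOG=${LOG:-/var/tmp/xlu-conversion.log}

# run_step DESCRIPTION COMMAND [ARGS...]
# Log a START line, run the command with its output appended to the log,
# then log OK or FAIL with the exit status and return that status.
run_step() {
    desc=$1
    shift
    now=`date '+%Y-%m-%d %H:%M:%S'`
    echo "${now} START ${desc}" >> "${LOG}"
    "$@" >> "${LOG}" 2>&1
    rc=$?
    now=`date '+%Y-%m-%d %H:%M:%S'`
    if [ ${rc} -eq 0 ]; then
        echo "${now} OK ${desc}" >> "${LOG}"
    else
        echo "${now} FAIL rc=${rc} ${desc}" >> "${LOG}"
    fi
    return ${rc}
}
```

In the phase 1 script, a line such as lucreate -c s10u6-ufs -n s10u6 -p rpool would become run_step "create ZFS BE" lucreate -c s10u6-ufs -n s10u6 -p rpool. You can then watch the log from another session, and decide whether to reboot based on the return status.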
I think that this is a much better approach than the one I offered before, using ZFS send. This approach uses standard tools to create the new environment and it allows you to continue to use Flash as a way to deploy archives. The dependency is that you must have two drives on the target system. I think that's not going to be a hardship, since most folks will use two drives anyway. You will have to keep them as separate drives rather than using hardware mirroring. The underlying assumption is that you previously used SVM or VxVM to mirror those drives.
So, what do you think? Better? Is this helpful? Hopefully, this is a little Christmas present for someone! Merry Christmas and Happy New Year!
Long ago, a customer of mine needed to deploy 600(!) SPARCstation 5 desktops all running SunOS 4.1.4. Even then, this was an old operating system, since Solaris 2.6 had recently been released. But it was what their application required. And we only had a few days to build and deploy these systems.
Remember that Jumpstart did not exist for SunOS 4.1.4, and Flash did not exist for Solaris 2.6. So, our approach was to build a system, a golden image, the way we wanted it to be deployed and then use ufsdump to save the contents of the filesystems. Then, we were able to use Jumpstart from a Solaris 2.6 server to boot each of these workstations. Instead of having a Jumpstart profile, we only used a finish script that partitioned the disks and restored the ufsdump images. So Jumpstart just provided us a clean way to boot these systems and apply the scripts we wanted to them.
But. There's always a but, isn't there.
But, at present, Flash archives are not supported (and in fact do not work) as a way to install into a ZFS boot environment, either via Jumpstart or via Live Upgrade. Turns out, they use the same mechanism under the covers for this. This is CR 6690473.
So, how can I continue to use Jumpstart to deploy systems, and continue to use something akin to Flash archives to speed and simplify the process?
Turns out the lessons we learned years ago can be used, more or less. Combine the idea of the ufsdump with some of the ideas that Bob Netherton recently blogged about (Solaris and OpenSolaris coexistence in the same root zpool), and you can get to a workaround that might be useful enough to get you through until Flash really is supported with ZFS root.
/var as part of the root filesystem rather than a separate
dataset, though this process could certainly be tweaked to accommodate a separate /var.
Once the system to be cloned has been built, you save an image of the system. Rather than using flarcreate, you will create a ZFS send stream and capture this in a file. Then move that file to the jumpstart server, just as you would with a flash archive.
In this example, the ZFS bootfs has the default name - rpool/ROOT/s10s_u6wos_07.
golden# zfs snapshot rpool/ROOT/s10s_u6wos_07@flar
golden# zfs send -v rpool/ROOT/s10s_u6wos_07@flar > s10s_u6wos_07_flar.zfs
golden# scp s10s_u6wos_07_flar.zfs js-server:/flashdirectory
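If you capture streams from more than one golden system, a couple of small helpers can keep the file naming consistent and catch an empty stream before it is copied to the jumpstart server. This is a sketch only; stream_file and stream_ok are hypothetical names, and the _flar.zfs suffix simply follows the example above.

```shell
#!/bin/sh
# stream_file DATASET
# Derive the stream file name from a boot dataset name, following the
# naming used above (last path component plus _flar.zfs).
stream_file() {
    name=`basename "$1"`
    echo "${name}_flar.zfs"
}

# stream_ok FILE
# Succeed only if the captured stream exists and is not empty.
stream_ok() {
    [ -s "$1" ]
}
```

For example, stream_file rpool/ROOT/s10s_u6wos_07 prints s10s_u6wos_07_flar.zfs, and running stream_ok on that file before the scp guards against shipping a stream that failed partway through.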
Then, we will use Jumpstart finish scripts to create a fresh ZFS dataset and restore our saved image into it. Since this new dataset will contain the old identity of the original system, we have to reset our system identity. But once we do that, we are good to go.
So, set up the cloned system as you would for a hands-free jumpstart. Be sure to specify the sysid_config and install_config
bits in the /etc/bootparams. The manual Solaris 10 10/08 Installation Guide: Custom JumpStart and Advanced Installations
covers how to do this. We add to the rules file a finish script (I called mine loadzfs in this case) that will do the
heavy lifting. Once Jumpstart installs Solaris according to the profile provided, it then runs the finish script to finish up
the installation.
Here is the Jumpstart profile I used. This is a basic profile that installs the base, required Solaris packages into a ZFS pool mirrored across two drives.
install_type initial_install
cluster SUNWCreq
system_type standalone
pool rpool auto auto auto mirror c0t0d0s0 c0t1d0s0
bootenv installbe bename s10u6_req
The finish script is a little more interesting since it has to create the new ZFS dataset, set the right properties, fill it up, reset the identity, etc. Below is the finish script that I used.
#!/bin/sh -x
# TBOOTFS is a temporary dataset used to receive the stream
TBOOTFS=rpool/ROOT/s10u6_rcv
# NBOOTFS is the final name for the new ZFS dataset
NBOOTFS=rpool/ROOT/s10u6f
MNT=/tmp/mntz
FLAR=s10s_u6wos_07_flar.zfs
NFS=serverIP:/export/solaris/Solaris10/flash
# Mount directory where archive (send stream) exists
mkdir ${MNT}
mount -o ro -F nfs ${NFS} ${MNT}
# Create file system to receive ZFS send stream &
# receive it. This creates a new ZFS snapshot that
# needs to be promoted into a new filesystem
zfs create ${TBOOTFS}
zfs set canmount=noauto ${TBOOTFS}
zfs set compression=on ${TBOOTFS}
zfs receive -vF ${TBOOTFS} < ${MNT}/${FLAR}
# Create a writeable filesystem from the received snapshot
zfs clone ${TBOOTFS}@flar ${NBOOTFS}
# Make the new filesystem the top of the stack so it is not dependent
# on other filesystems or snapshots
zfs promote ${NBOOTFS}
# Don't automatically mount this new dataset, but allow it to be mounted
# so we can finalize our changes.
zfs set canmount=noauto ${NBOOTFS}
zfs set mountpoint=${MNT} ${NBOOTFS}
# Mount newly created replica filesystem and set up for
# sysidtool. Remove old identity and provide new identity
umount ${MNT}
zfs mount ${NBOOTFS}
# This section essentially forces sysidtool to reset system identity at
# the next boot.
touch /a/${MNT}/reconfigure
touch /a/${MNT}/etc/.UNCONFIGURED
rm /a/${MNT}/etc/nodename
rm /a/${MNT}/etc/.sysIDtool.state
cp ${SI_CONFIG_DIR}/sysidcfg /a/${MNT}/etc/sysidcfg
# Now that we have finished tweaking things, unmount the new filesystem
# and make it ready to become the new root.
zfs umount ${NBOOTFS}
zfs set mountpoint=/ ${NBOOTFS}
zpool set bootfs=${NBOOTFS} rpool
# Get rid of the leftovers
zfs destroy ${TBOOTFS}
zfs destroy ${NBOOTFS}@flar
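The identity-reset portion of the finish script can also be pulled out into a function, which makes it easy to try against a scratch directory before using it on a real root. This is a sketch under that assumption: reset_identity is a name I made up, and it uses rm -f so that a missing nodename or .sysIDtool.state file does not produce an error.

```shell
#!/bin/sh
# reset_identity ROOT SYSIDCFG
#   ROOT     - directory where the new root dataset is mounted
#   SYSIDCFG - sysidcfg file to install into the new root
# Performs the same steps as the finish script above: flag a reconfigure,
# mark the system unconfigured, drop the old identity files, and install
# the new sysidcfg.
reset_identity() {
    root=$1
    sysidcfg=$2
    touch "${root}/reconfigure" "${root}/etc/.UNCONFIGURED" || return 1
    rm -f "${root}/etc/nodename" "${root}/etc/.sysIDtool.state"
    cp "${sysidcfg}" "${root}/etc/sysidcfg"
}
```

In the finish script this would be called as reset_identity /a/${MNT} ${SI_CONFIG_DIR}/sysidcfg, in place of the five individual commands.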
When we jumpstart the system, Solaris is installed, but it really isn't used. Then, we load from the send stream a whole new OS dataset, make it bootable, set our identity in it, and use it. When the system is booted, Jumpstart still takes care of updating the boot archives in the new bootfs.
On the whole, this is a lot more work than Flash, and is really not as flexible or as complete. But hopefully, until Flash is supported with a ZFS root and Jumpstart, this might at least give you an idea of how you can replicate systems and do installations that do not have to revert back to package-based installation.
Many people use Flash as a form of disaster recovery. I think that this same approach might be used there as well. Still not as clean or complete as Flash, but it might work in a pinch.
So, what do you think? I would love to hear comments on this as a stop-gap approach.
I just need to take a minute to brag on my wife, Kathleen. She has taken over as the local coordinator for our food pantry for America's Second Harvest, now called Feeding America. She coordinates the couple of dozen volunteers who glean extra food from the local restaurants and groceries and bring it all back to our food pantry, North Fulton Community Charities. It's amazing how much these places would just discard as leftovers at the end of the day or as they restock the shelves with newer product.
Since she took this on, she has done some really cool stuff. She has started to recruit volunteers from among the people who receive food from the pantry that want to give back to the community in gratitude. She has gotten the pantry to start collecting and distributing pet food for the families who need groceries as well, so that they can continue to look after their pets. Now, she has started working with some local folks who make decorative cut fruit arrangements to provide fresh fruit to the pantry. That's something that really makes a difference to the people who are receiving the food subsidies and groceries from the pantry.
I have to say that I am right proud of her for all of this. And I would encourage folks to get involved with their local charities. Go to the Feeding America page to find out what opportunities there are in your area. It really can make a difference to so many people.
Today is my first day back at Sun.
I am excited to be back at Sun. We have a new group of folks focused on Solaris and OpenSolaris. Now, we need to get our heads together and put together a bit of a business plan for the team.
I am sure that the next few months will be hugely busy and exciting!
Thanks to Hal and everyone who made this possible.
I've not been terribly faithful about blogging here. Once in a while, but this is worth saying.
I am part of the great exodus going on this week from Sun. I was notified yesterday that my position has been eliminated.
This has been a great 13 1/2 year ride. Sun has had great peaks and great valleys in that time. But through it all, it has been a top-notch place to be.
Of the things I have done at Sun, I am proudest of being associated with two groups: Dawit Bereket's Solaris team for the last three years and the OS Ambassador program for the last 13 years. These are both groups of the top flight of Solaris folks in the field, and folks who all wear the SUNW (oops, JAVA) hat, rather than the hat of any parochial group or division.
So, for now, I'm signing off. But hope to be back soon.
Being a long time Sun and Solaris guy, it's not often that I step up to say "Wow, Microsoft did something good." But this time I want to.
Recently, a good friend's son returned home from a tour of duty with the Air Force in Iraq. As the plane unloaded in Baltimore, there was a representative from Microsoft handing each of the servicemen and women a fully tricked out Zune, accessories, speakers - the whole nine yards.
There were no cameras, no press releases, no publicity. Just a nice gesture for these men and women who had been away from home and family doing something that, even though they trained and prepared for it, they would just as soon not have to do.
Thanks for this nice gesture, Microsoft.
So, what was exciting from days one and two? Lots!
This Sunday was World Communion Sunday, where Christians all over the world all celebrate Holy Communion on the same day. For me, this is always a powerful statement of the universality of the church. Being on the Strip in Las Vegas, getting to church is a struggle. But, Web 2.0 to the rescue. Google Maps found for me the University United Methodist Church, across the street from UNLV.
Google Maps told me it was 2.7 miles from the hotel to the church, so off I went. Even with the long walk there (and back!), I was really glad that I went. Lovely, small church, but very nice people, and a service that left me thinking hard all the way back to the hotel.
The text was Luke 17:5-10. The first part of this passage is a familiar one, but the second part is a hard, hard saying. But more than that, I pondered all the way home the word "rehearse" in the liturgy. I think there's a lot there to think about still.
Last night was our Networking Reception. Great to see folks again that I had not seen in a while and to meet lots of new faces.
Today, we start with opening sessions from Hal Stern, Dan Berg, Jim Baty, and a host of others. Then, we get into, for me, the guts of CEC - the breakout sessions. There are over 240 sessions, selected from a pool of over 700 submissions. I'm talking (Tuesday, 6PM, Versailles ballroom 3 & 4) on Dynamic Resource Pools in Solaris 10. I'll post my slides after the talk. If you are at the conference, come on over. I understand my talk will also be available in Second Life. I'm still trying to figure out how all of that works, though.
Here are some of my initial + and - observations from CEC so far:
All in all, though, I am excited about a great conference and expect to be really tired when I get home!
Jason Calacanis has posted his "official" definition of Web 3.0. He says "Web 3.0 is defined as the creation of high-quality content and services produced by gifted individuals using Web 2.0 technology as an enabling platform."
The same day I saw this, I also saw, on Keith Bostic's fabulous /dev/null mailing list, a link to Cracked.com's The 8 Most Needlessly Detailed Wikipedia Entries. Even though all of these folks are clearly authorities in their field, are we really getting the "wisdom of the crowd"? Geek and Poke gets it pretty right.
Sorry for the inconvenience. We will pick up with our meetings in November. Ryan Matteson, from Ning, will be our speaker. Should be a really good meeting. Details on the topic to follow.
In between Solaris workshops, I got to take a week off and go canoeing with my dad. We had planned to go to the Okefenokee Swamp,
but the fires in south Georgia and northern Florida pretty much made
that impossible. So, we just bummed around instead, going over to
Coldwater Creek in northwest Florida, and then over to Wakulla River,
south of Tallahassee.
I have to say that the Wakulla River, with its headwaters in Wakulla Springs State Park,
is way cool! This river is fed by a spring that pumps out 250 million
gallons of water per day. Crystal clear. At the spring, you can see
the bottom at 125 feet! There are mastodon bones on the bottom from
when either the cave that supplies the spring was dry, or when the
furry brute fell in.
There is a fabulous site that talks about the spring, its geology, the land around it, etc. here.
At the state park, there is a lodge, built in the 1920's, formerly frequented by Johnny Weissmuller of Tarzan fame. In fact, several of the original Tarzan films were shot here. As well as the Creature from the Black Lagoon. The lodge looks like a great place to stay - very Art Deco and ornate and old.
But we were there to canoe.
This river has its fair share of wildlife. There are turtles, wading
birds, osprey and birds of prey, mullet leaping, and even manatees.
And there are alligators. Lots of them. Weird thing is that there's swimming right next to the prime alligator areas. They seem to hang out in the marshy edges right around the spring itself. Maybe they are waiting for an unsuspecting teenager to wander too close.
And we found our share of alligators, small and large, as we paddled the river. Dad was in the front taking pictures and my job was to paddle and put him where he could get good pictures. So, we got really close to this one. It was about 8 or 9 feet long, and we got as close as maybe 8 feet to it. I would have gone closer, but there was a log I couldn't get the boat over. I figured that I was okay. It's like the old story of the guy running from the bear. I didn't have to be so far from the alligator that it couldn't get me, just farther away than Dad! I'm working hard to get back to Wakulla River, this time so I can be on the water before light and after dark to really see what goes on there. If you're looking for a great place on the Florida Gulf Coast to escape from most everything, Wakulla River and Wakulla Springs State Park should be on your list.

Spring has arrived! My first iris are blooming right on schedule, actually a couple of days early. The White Flags of Spring, as my grandmother called them, bloomed on the first day of spring, Tuesday of this week. These are small white iris, only about 14" high. They always bloom before the first day of April and this year was no exception.
I am looking forward to a pretty good crop of iris this year, I think. I just cleaned out the winter cruft. It looks like this is the year to dig up several of the beds, split them, give them away, and replant. I think I will get a couple of yards of new good dirt to work in with them, too. It looks like everything is just sand anymore in the beds.
I hope that the purple and bronze iris that my grandmother hybridized come back. I didn't see any last year, so I am afraid I have lost those. But I still have so many of hers that every time I go out I remember being at my grandmother's house in the springtime, having Easter egg hunts among the iris, and the sweet smell of the flowers everywhere.
For some reason, I have the Indigo Girls song Southland in the Springtime running through my head about now. Just call me a sentimental old softie.....
I am amazed and awed by all of the folks on BSC who are able to contribute great content *and* get their jobs done! I find that even when I want to share something, there just don't seem to be enough hours in the day to get the job done, talk to & support the customers, and then put something together that makes enough sense to share.
How do you guys do it? Or do you never sleep?
Continuing with some of the ideas around zvols, I wondered about UFS on a zvol. On the surface, this appears to be sort of redundant and not really very sensible. But thinking about it, there are some real advantages.
Creating a UFS filesystem on a zvol is pretty trivial. In this example, we'll create a mirrored pool and then build a UFS filesystem in a zvol.
bash-3.00# zpool create p mirror c2t10d0 c2t11d0 mirror c2t12d0 c2t13d0
bash-3.00# zfs create -V 2g p/v1
bash-3.00# zfs list
NAME USED AVAIL REFER MOUNTPOINT
p 4.00G 29.0G 24.5K /p
p/v1 22.5K 31.0G 22.5K -
bash-3.00# newfs /dev/zvol/rdsk/p/v1
newfs: construct a new file system /dev/zvol/rdsk/p/v1: (y/n)? y
Warning: 2082 sector(s) in last cylinder unallocated
/dev/zvol/rdsk/p/v1: 4194270 sectors in 683 cylinders of 48 tracks, 128 sectors
2048.0MB in 43 cyl groups (16 c/g, 48.00MB/g, 11648 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
3248288, 3346720, 3445152, 3543584, 3642016, 3740448, 3838880, 3937312,
4035744, 4134176
bash-3.00# mkdir /fs1
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00# df -h /fs1
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 1.9G 2.0M 1.9G 1% /fs1
Nothing much to it.
But, what if I run out of space? Well, just as you can add disks to a volume and grow the size of the volume, you can grow the size of a zvol. Now, since the UFS filesystem is a data structure inside the zvol container, you have to grow it as well. Were I using just ZFS, the size of the file system would grow and shrink dynamically with the size of the data in the file system. But a UFS has a fixed size, so it has to be expanded manually to accommodate the enlarged volume. Now, this seems to have quit working between b45 and b53, so I just filed a bug on this one.
bash-3.00# uname -a
SunOS atl-sewr-158-154 5.11 snv_45 sun4u sparc SUNW,Sun-Fire-480R
bash-3.00# zfs create -V 1g bsd/v1
bash-3.00# newfs /dev/zvol/rdsk/bsd/v1
...
bash-3.00# zfs set volsize=2g bsd/v1
bash-3.00# growfs /dev/zvol/rdsk/bsd/v1
Warning: 2048 sector(s) in last cylinder unallocated
/dev/zvol/rdsk/bsd/v1: 4194304 sectors in 683 cylinders of 48 tracks, 128 sectors
2048.0MB in 49 cyl groups (14 c/g, 42.00MB/g, 20160 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 86176, 172320, 258464, 344608, 430752, 516896, 603040, 689184, 775328,
3359648, 3445792, 3531936, 3618080, 3704224, 3790368, 3876512, 3962656,
4048800, 4134944
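As a rough sketch, the two steps above (grow the zvol, then grow the UFS inside it) can be wrapped in a small dry-run helper. The function name grow_ufs_zvol_cmds is hypothetical, and it only prints the commands rather than running them, so you can look them over before pasting them into a root shell.

```shell
# Hypothetical dry-run helper: print the commands that grow a zvol and
# then grow the UFS file system that lives inside it.
# Nothing is executed; the commands are only echoed.
grow_ufs_zvol_cmds() {
    dataset=$1    # zvol dataset name, e.g. bsd/v1
    newsize=$2    # new volume size, e.g. 2g
    echo "zfs set volsize=$newsize $dataset"
    echo "growfs /dev/zvol/rdsk/$dataset"
}

grow_ufs_zvol_cmds bsd/v1 2g
```

Note that growfs is pointed at the raw (rdsk) device, the same as in the transcript above.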
Along the same lines as growing the file system, I suppose you could turn compression on for the zvol. But since the UFS is of fixed size, it won't help especially, as far as fitting more data in the file system. You can't put more into the filesystem than the filesystem thinks that it can hold. Even if it isn't using that much on the disk. Here's a little demonstration of that.
First, we will loop through, creating 200MB files in a 1GB file system with no compression. We will use blocks of zeros, since these will compress quite a bit the second time round.
bash-3.00# zfs create -V 1g p/v1
bash-3.00# zfs get used,volsize,compressratio p/v1
NAME PROPERTY VALUE SOURCE
p/v1 used 22.5K -
p/v1 volsize 1G -
p/v1 compressratio 1.00x -
bash-3.00# newfs /dev/zvol/rdsk/p/v1
...
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00#
bash-3.00# for f in f1 f2 f3 f4 f5 f6 f7 ; do
> dd if=/dev/zero bs=1024k count=200 of=/fs1/$f
> df -h /fs1
> zfs get used,volsize,compressratio p/v1
> done
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 962M 201M 703M 23% /fs1
NAME PROPERTY VALUE SOURCE
p/v1 used 62.5M -
p/v1 volsize 1G -
p/v1 compressratio 1.00x -
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 962M 401M 503M 45% /fs1
NAME PROPERTY VALUE SOURCE
p/v1 used 149M -
p/v1 volsize 1G -
p/v1 compressratio 1.00x -
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 962M 601M 303M 67% /fs1
NAME PROPERTY VALUE SOURCE
p/v1 used 377M -
p/v1 volsize 1G -
p/v1 compressratio 1.00x -
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 962M 801M 103M 89% /fs1
NAME PROPERTY VALUE SOURCE
p/v1 used 497M -
p/v1 volsize 1G -
p/v1 compressratio 1.00x -
dd: unexpected short write, wrote 507904 bytes, expected 1048576
161+0 records in
161+0 records out
Dec 1 14:53:04 atl-sewr-158-122 ufs: NOTICE: alloc: /fs1: file system full
bash-3.00# zfs get used,volsize,compressratio p/v1
NAME PROPERTY VALUE SOURCE
p/v1 used 1.00G -
p/v1 volsize 1G -
p/v1 compressratio 1.00x -
bash-3.00#
So, you see that it fails as it writes the 5th 200MB chunk, which is what you would expect. Now, let's do the same thing with compression turned on for the volume.
bash-3.00# zfs create -V 1g p/v2
bash-3.00# zfs set compression=on p/v2
bash-3.00# newfs /dev/zvol/rdsk/p/v2
...
bash-3.00#
bash-3.00# mount /dev/zvol/dsk/p/v2 /fs2
bash-3.00# for f in f1 f2 f3 f4 f5 f6 f7 ; do
> dd if=/dev/zero bs=1024k count=200 of=/fs2/$f
> df -h /fs2
> zfs get used,volsize,compressratio p/v2
> done
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v2 962M 201M 703M 23% /fs2
NAME PROPERTY VALUE SOURCE
p/v2 used 8.58M -
p/v2 volsize 1G -
p/v2 compressratio 7.65x -
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v2 962M 401M 503M 45% /fs2
NAME PROPERTY VALUE SOURCE
p/v2 used 8.58M -
p/v2 volsize 1G -
p/v2 compressratio 7.65x -
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v2 962M 601M 303M 67% /fs2
NAME PROPERTY VALUE SOURCE
p/v2 used 8.83M -
p/v2 volsize 1G -
p/v2 compressratio 7.50x -
200+0 records in
200+0 records out
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v2 962M 801M 103M 89% /fs2
NAME PROPERTY VALUE SOURCE
p/v2 used 8.83M -
p/v2 volsize 1G -
p/v2 compressratio 7.50x -
dd: unexpected short write, wrote 507904 bytes, expected 1048576
161+0 records in
161+0 records out
Dec 1 15:16:42 atl-sewr-158-122 ufs: NOTICE: alloc: /fs2: file system full
bash-3.00# zfs get used,volsize,compressratio p/v2
NAME PROPERTY VALUE SOURCE
p/v2 used 9.54M -
p/v2 volsize 1G -
p/v2 compressratio 7.07x -
bash-3.00# df -h /fs2
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v2 962M 962M 0K 100% /fs2
bash-3.00#
This time, even though the volume was not using much space at all, the file system was full. So compression in this case is not especially valuable from a space management standpoint. Depending on the contents of the filesystem, compression may still help performance by converting multiple I/Os into a single I/O or fewer I/Os, though.
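If you want to track these numbers in a script instead of eyeballing them, here is a minimal sketch that pulls a single value out of the zfs get table. The helper name zfs_prop_value is made up for illustration, and the sample text is simply the last zfs get output from the compressed run above.

```shell
# Hypothetical helper: read `zfs get` output on stdin and print the
# VALUE column for the one property named in $1.
zfs_prop_value() {
    awk -v prop="$1" '$2 == prop { print $3 }'
}

# Sample output copied from the compressed-volume run above.
sample='NAME PROPERTY VALUE SOURCE
p/v2 used 9.54M -
p/v2 volsize 1G -
p/v2 compressratio 7.07x -'

printf '%s\n' "$sample" | zfs_prop_value used
printf '%s\n' "$sample" | zfs_prop_value compressratio
```

On a live system you would pipe the real command into it instead, for example zfs get used,volsize,compressratio p/v2 | zfs_prop_value used.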
One of the things that is not available in UFS is the ability to create multiple snapshots quickly and easily. The fssnap(1M) command allows me to create a single, read-only snapshot of a UFS file system. In addition, it requires an additional location to maintain backing store for files changed or deleted in the master image during the lifetime of the snapshot.
ZFS offers the ability to create many snapshots of a ZFS filesystem quickly and easily. This ability extends to zvols, as it turns out.
For this example, we will create a volume, fill it up with some data and then play around with taking some snapshots of it. We will just tar over the Java JDK so there are some files in the file system.
bash-3.00# zfs create -V 2g p/v1
bash-3.00# newfs /dev/zvol/rdsk/p/v1
...
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00# tar cf - ./jdk/ | (cd /fs1 ; tar xf - )
bash-3.00# df -h /fs1
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 1.9G 431M 1.5G 23% /fs1
bash-3.00# zfs list
NAME USED AVAIL REFER MOUNTPOINT
p 4.00G 29.0G 24.5K /p
p/swap 22.5K 31.0G 22.5K -
p/v1 531M 30.5G 531M -
Now, we will create a snapshot of the volume, just like for any other ZFS file system. As it turns out, this creates new device nodes in /dev/zvol for the block and character devices. We can mount them as UFS file systems same as always.
bash-3.00# zfs snapshot p/v1@s1 # Make the snapshot
bash-3.00# zfs list # See that it's really there
NAME USED AVAIL REFER MOUNTPOINT
p 4.00G 29.0G 24.5K /p
p/swap 22.5K 31.0G 22.5K -
p/v1 531M 30.5G 531M -
p/v1@s1 0 - 531M -
bash-3.00# mkdir /fs1-s1
bash-3.00# mount /dev/zvol/dsk/p/v1@s1 /fs1-s1 # Mount it
mount: /dev/zvol/dsk/p/v1@s1 write-protected # Snapshots are read-only, so this fails
bash-3.00# mount -o ro /dev/zvol/dsk/p/v1@s1 /fs1-s1 # Mount again read-only
bash-3.00# df -h /fs1-s1 /fs1
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1@s1
1.9G 431M 1.5G 23% /fs1-s1
/dev/zvol/dsk/p/v1 1.9G 431M 1.5G 23% /fs1
bash-3.00#
At this point /fs1-s1 is a read-only snapshot of /fs1. If I delete files, create files, or change files in /fs1, that change will not be reflected in /fs1-s1.
bash-3.00# ls /fs1/jdk
instances jdk1.5.0_08 jdk1.6.0 latest packages
bash-3.00# rm -rf /fs1/jdk/instances
bash-3.00# df -h /fs1 /fs1-s1
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 1.9G 61M 1.8G 4% /fs1
/dev/zvol/dsk/p/v1@s1
1.9G 431M 1.5G 23% /fs1-s1
bash-3.00#
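The only trick in mounting these is remembering that a snapshot has to be mounted read-only. A small sketch of that rule follows; zvol_mount_cmd is a hypothetical name, and it only prints the mount command it would use.

```shell
# Hypothetical dry-run helper: print the mount command for a UFS file
# system on a zvol. A dataset name containing '@' is a snapshot, which
# is read-only, so it gets "-o ro".
zvol_mount_cmd() {
    dataset=$1       # e.g. p/v1 or p/v1@s1
    mountpoint=$2    # e.g. /fs1
    case "$dataset" in
        *@*) echo "mount -o ro /dev/zvol/dsk/$dataset $mountpoint" ;;
        *)   echo "mount /dev/zvol/dsk/$dataset $mountpoint" ;;
    esac
}

zvol_mount_cmd p/v1 /fs1
zvol_mount_cmd p/v1@s1 /fs1-s1
```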
Just as with any other ZFS file system, you can create multiple snapshots, and you can roll back to a snapshot and make it the master again. You have to unmount the filesystem in order to do this, since the rollback is at the volume level. Changing the volume underneath the UFS filesystem would leave UFS confused about the state of things. But, ZFS catches this, too.
bash-3.00# ls /fs1/jdk/
jdk1.5.0_08 jdk1.6.0 latest packages
bash-3.00# rm /fs1/jdk/jdk1.6.0
bash-3.00# ls /fs1/jdk/
jdk1.5.0_08 latest packages
bash-3.00# zfs list
NAME USED AVAIL REFER MOUNTPOINT
p 4.00G 29.0G 24.5K /p
p/swap 22.5K 31.0G 22.5K -
p/v1 535M 30.5G 531M -
p/v1@s1 4.33M - 531M -
bash-3.00# zfs rollback p/v1@s2 # /fs1 is still mounted.
cannot remove device links for 'p/v1': dataset is busy
bash-3.00# umount /fs1
bash-3.00# zfs rollback p/v1@s2
bash-3.00# mount /dev/zvol/dsk/p/v1 /fs1
bash-3.00# ls /fs1/jdk
jdk1.5.0_08 jdk1.6.0 latest packages
bash-3.00#
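The order matters here: unmount, roll back, remount. As a sketch under that assumption, this hypothetical helper (rollback_ufs_zvol_cmds is not a real command) prints the three steps for a given snapshot and mount point without running any of them.

```shell
# Hypothetical dry-run helper: print the unmount / rollback / remount
# sequence for a UFS file system that lives on a zvol.
rollback_ufs_zvol_cmds() {
    snapshot=$1      # e.g. p/v1@s1
    mountpoint=$2    # e.g. /fs1
    volume=${snapshot%%@*}    # strip "@snapname" to get the zvol name
    echo "umount $mountpoint"
    echo "zfs rollback $snapshot"
    echo "mount /dev/zvol/dsk/$volume $mountpoint"
}

rollback_ufs_zvol_cmds p/v1@s1 /fs1
```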
I can create additional read-write instances of a volume by cloning the snapshot. The clone and the master file system will share the same objects on-disk for data that remains unchanged, while new on-disk objects will be created for any files that are changed either in the master or in the clone.
bash-3.00# ls /fs1/jdk
jdk1.5.0_08 jdk1.6.0 latest packages
bash-3.00# zfs snapshot p/v1@s1
bash-3.00# zfs clone p/v1@s1 p/c1
bash-3.00# zfs list
NAME USED AVAIL REFER MOUNTPOINT
p 4.00G 29.0G 24.5K /p
p/c1 0 29.0G 531M -
p/swap 22.5K 31.0G 22.5K -
p/v1 531M 30.5G 531M -
p/v1@s1 0 - 531M -
bash-3.00# mkdir /c1
bash-3.00# mount /dev/zvol/dsk/p/c1 /c1
bash-3.00# ls /c1/jdk
jdk1.5.0_08 jdk1.6.0 latest packages
bash-3.00# df -h /fs1 /c1
Filesystem size used avail capacity Mounted on
/dev/zvol/dsk/p/v1 1.9G 61M 1.8G 4% /fs1
/dev/zvol/dsk/p/c1 1.9G 61M 1.8G 4% /c1
bash-3.00#
I am pretty sure that this isn't exactly what the ZFS guys had in mind when they set out to build all of this, but this is pretty cool. Now, I can create UFS snapshots without having to specify a backing store. I can create clones, promote the clones to the master, and do the other things that I can do in ZFS. I still have to manage the mounts myself, but I'm better off than before.
I have not tried any sort of performance testing on these. Dominic Kay has just written a nice blog about using filebench to compare ZFS and VxFS. Maybe I can use some of that work to see how things go with UFS on top of ZFS.
As always, comments, etc. are welcome!
I mentioned recently that I just spent a week in a ZFS internals TOI. Got a few ideas to play with there that I will share. Hopefully folks might have suggestions as to how to improve / test / validate some of these things.
The first thing that I thought about was using ZFS as a swap device. Of course, this is right there in the zfs(1M) man page as an example, but it still deserves a mention here. There has been some discussion of this on the zfs-discuss list at opensolaris.org (I just retyped that dot four times thinking it was a comma. Turns out there was crud on my laptop screen). The dump device cannot be on a zvol (at least if you want to catch a crash dump) but this still gives a lot of flexibility. With root on ZFS (coming before too long) ZFS swap makes a lot of sense and is the natural choice. We were talking in class that maybe it would be nice if there were a way to turn off ZFS' caching for the swap device to improve performance, but that remains to be seen.
At any rate, setting up mirrored swap with ZFS is way simple! Much simpler even than with SVM, which in turn is simpler than VxVM. Here's all it takes:
bash-3.00# zpool create -f p mirror c2t10d0 c2t11d0
bash-3.00# zfs create -V 2g p/swap
bash-3.00# swap -a /dev/zvol/dsk/p/swap
Pretty darn simple, if you ask me. You can make it permanent by changing the lines for swap in your /etc/vfstab (below). Notice that you use the path to the zvol in the /dev tree rather than the ZFS dataset name.
bash-3.00# cat /etc/vfstab
#device device mount FS fsck mount mount
#to mount to fsck point type pass at boot options
#
#/dev/dsk/c1t0d0s1 - - swap - no -
/dev/zvol/dsk/p/swap - - swap - no -
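Since the vfstab entry is just the zvol's /dev path plus the usual swap fields, it is easy to generate. Here is a minimal sketch; swap_vfstab_line is a hypothetical helper name, and it only prints the line so you can review it before adding it to /etc/vfstab yourself.

```shell
# Hypothetical helper: print a tab-separated /etc/vfstab swap entry for a
# zvol. Fields: device to mount, device to fsck, mount point, FS type,
# fsck pass, mount at boot, mount options.
swap_vfstab_line() {
    dataset=$1    # e.g. p/swap
    printf '%s\t-\t-\tswap\t-\tno\t-\n' "/dev/zvol/dsk/$dataset"
}

swap_vfstab_line p/swap
```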
I would like to do some performance testing to see what kind of performance you can get with swap on a zvol. I am curious about how this will affect kernel memory usage. I am curious about the effect of things like compression on the swap volume. Thinking about that one, it doesn't make a lot of sense. I am also curious about the ability to dynamically change the size of the swap space. At first glance, changing the size of the volume does not automatically change the amount of available swap space. That makes sense for expanding swap space. But if you reduce the size of the volume and the kernel doesn't notice, that sounds like it could be a problem. Maybe I should file a bug.
Suggestions for things to try and ways to measure overhead and performance for this are welcomed.