Thursday Oct 11, 2007

UltraSPARC T2 & LDoms 1.0.1

Sun is officially announcing the first products in its UltraSPARC T2 (US T2) based platform lineup today, the T5x20 series. You can read all about the details here and here. There are two big stories here related to our Logical Domains technology. For those of you who are new to my blog, Logical Domains (LDoms for short) is the name of Sun's virtualization technology for our SPARC CMT platforms that allows multiple operating systems to run concurrently on a single system.

These products represent the first of our CMT based platforms that are shipping with LDoms technology pre-installed from the factory. All future CMT servers from Sun will ship with the ability to run Logical Domains out of the box. This includes the LDoms-enabled hypervisor (our LDoms hypervisor runs on bare metal, and is embedded in the firmware of the platform), all the necessary Solaris components, and the LDoms Manager package (which is what my team works on).

This further represents the introduction of version 1.0.1 of LDoms technology. Besides support for the new US T2 based platforms (and a slew of bug fixes), this release supports the ability to reset any domain, even one which owns physical I/O devices, while all other domains continue to run. Even the control domain, i.e. the one on which the LDoms Manager runs, can reboot while all other domains stay up. This represents a major step forward in terms of RAS capability for LDoms.

As of today, LDoms version 1.0.1 is only available pre-installed on our newly announced US T2 based servers. Stay tuned here for information on the impending availability of this upgrade on our existing US T1 based platforms!

UPDATE: LDoms version 1.0.1 is now available for download here. This includes the firmware updates for US T1 platforms.

WARNING: There are two important caveats when upgrading from LDoms 1.0 to LDoms 1.0.1:

  • Configurations saved to the service processor under 1.0 are not usable under 1.0.1. The LDoms 1.0.1 Administration Guide describes the upgrade procedure that needs to be applied to work around this. Part of this procedure needs to be carried out BEFORE performing the actual upgrade!
  • You must upgrade both the firmware and LDom Manager components at the same time.

Tuesday Aug 21, 2007

VIO device renaming by LDom Manager

[This is the first in a series of entries I'll write about tips, tricks & other issues with the LDom Manager.]

The LDom Manager allows you to specify a name for each VIO client & server instance you configure. Currently (i.e. in LDoms 1.0 and the upcoming 1.0.1 releases), this information is not stored as part of the machine description (MD) for the associated guest domain. Instead, the device name to instance mapping is stored in the LDom Manager's private constraints database, which is itself persisted as a simple XML file in the control domain's filesystem.

There are cases where the information in the constraints database doesn't match that of the running system, and in those cases, the LDom Manager, on startup, will apply a canonical name to any VIO device(s) for which no name mapping is available. The two main reasons this can happen are:

  • Loss of the constraints database file (as a result of an OS upgrade, for example)
  • Reverting the system to a configuration stored on the SP that contains a different set of VIO devices than the currently running configuration

When the LDom Manager first starts, if it can't find a mapping for a given VIO device in its constraints database, it applies a canonical name using the following heuristics:

For VIO clients: <type><instance #>, where <type> is either "vnet" or "vdisk", and the instance # is incremented for each additional device of type <type> encountered

For VIO servers: <domain-name>-<type><instance #>, where type is "vds", "vsw", or "vcc"
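To make the naming rule above concrete, here is a minimal Python sketch of it. This is only an illustration, not the LDom Manager's actual code: the function name, the starting instance number of 0, and the choice to keep a separate counter per domain and per type are all my assumptions, and the domain names in the usage note are made up.

```python
from collections import defaultdict

CLIENT_TYPES = {"vnet", "vdisk"}        # VIO client types named in the text
SERVER_TYPES = {"vds", "vsw", "vcc"}    # VIO server types named in the text


def canonical_names(devices, first_instance=0):
    """Return canonical names for (domain_name, dev_type) pairs, in order.

    Assumptions (not confirmed by the text): numbering starts at
    first_instance, and each (domain, type) pair has its own counter.
    """
    counters = defaultdict(int)
    names = []
    for domain, dev_type in devices:
        instance = first_instance + counters[(domain, dev_type)]
        counters[(domain, dev_type)] += 1
        if dev_type in CLIENT_TYPES:
            # Clients: <type><instance #>
            names.append(f"{dev_type}{instance}")
        elif dev_type in SERVER_TYPES:
            # Servers: <domain-name>-<type><instance #>
            names.append(f"{domain}-{dev_type}{instance}")
        else:
            raise ValueError(f"unknown VIO device type: {dev_type}")
    return names
```

For example, two vnet clients and one vdisk client in a guest called ldg1, plus a vsw server in the primary domain, would come out as vnet0, vnet1, vdisk0 and primary-vsw0 under these assumptions.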

The LDom Manager's renaming of VIO devices never affects the actual binding of VIO devices to instances in the OS, nor the binding of VIO clients to servers; everything continues to operate normally. The impact is in how the LDom Manager references VIO devices for display and reconfiguration by the user.

There is, however, one more serious problem to note: if a VIO device is configured using a name that matches a potential canonical name, and the LDom Manager subsequently attempts to use that same canonical name on another VIO device, it'll cause the LDom Manager to abort on startup, and eventually enter maintenance mode. This failure can be identified by this message appearing in the LDom Manager's log file:

Assertion failed: 0L != clientp->published_name, file vio_classes.c, line 2471

To work around this, so the LDom Manager can start, its constraints database (stored in /var/opt/SUNWldm/ldom-db.xml) must be hand-edited to rename the offending VIO device to one that doesn't collide with the canonical name namespace. There is a bug open in our bug tracking system for this problem; it's CR #6571091.
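One way to stay clear of this problem is to avoid choosing names that fall inside the canonical namespace in the first place. The following Python sketch flags such names. It is a rough illustration under my own assumptions: the function names are hypothetical, the patterns are simply derived from the naming rule described above, and it works on a plain list of names rather than parsing ldom-db.xml, whose layout I haven't described here.

```python
import re

# Patterns derived from the canonical naming rule described above.
CANONICAL_CLIENT = re.compile(r"(vnet|vdisk)\d+")
CANONICAL_SERVER = re.compile(r".+-(vds|vsw|vcc)\d+")


def collides_with_canonical(name):
    """True if a user-chosen name looks like a canonical VIO device name."""
    return bool(CANONICAL_CLIENT.fullmatch(name)
                or CANONICAL_SERVER.fullmatch(name))


def risky_names(names):
    """Return the names that could clash with a canonical name."""
    return [n for n in names if collides_with_canonical(n)]
```

Any name this reports is a candidate for renaming to something outside the canonical pattern (for instance, adding a descriptive suffix that isn't purely numeric).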

We plan to address these issues in an upcoming release (after the 1.0.1 release), by eliminating the need for the LDom Manager to rename VIO devices altogether.

Tuesday Jul 24, 2007

Updated Beginners Guide to LDoms

Tony Shoumack has just posted an updated version of the excellent Sun Blueprints™ document Beginners Guide to LDoms. This is the perfect resource and reference document for those folks new to Logical Domains. It provides both a conceptual background of the technology, and specific guidance on configuring LDoms. This represents a significant update to the previous version, incorporating feedback from LDoms experts within Sun as well as our customers. It is also now current with the 1.0 release. If you had downloaded the previous version (which was targeted at our old, pre-1.0 release candidate builds of LDoms), please update to this latest version.

Tuesday May 01, 2007

LDoms 1.0 is out!

As Ash and Narayan have already announced, Logical Domains 1.0 is now a reality!

Tuesday Mar 06, 2007

Logical Domains 1.0 Early Access Now Available!

It's here! LDoms 1.0 Early Access has just been posted to the Sun Download Center. This 68+ MB tarball includes new firmware images for T1000, T2000, Netra T2000 Server & Netra CP3060 Blade systems; mandatory patches to Solaris 10 11/06 (aka Update 3); the unbundled LDoms Manager & security packages; documentation; and an install script. In addition, a Sun BluePrints article entitled Beginners Guide to LDoms: Understanding and Deploying Logical Domains is available separately.

This is Early Access software. As stated on the download page: "This is an unsupported technology preview of Logical Domains software, providing early access to those interested in the technology". In other words, there are known bugs, limitations & rough edges (all of which we're working hard to eliminate for our 1.0 General Availability release later this spring).

As the LDoms Manager team lead, I'm very excited to help make this technology available for folks to try out. Everyone on the LDoms team is thrilled to have reached this major milestone, and we all look forward to hearing about the experiences of all our early adopters. We welcome your feedback (ldoms-feedback AT sun DOT com) to help us improve the product, and you can participate in the discussion at the LDoms SysAdmin Hub on BigAdmin.

Saturday Jan 20, 2007

I've been Tagged

So I've joined the ranks of those tagged. Thanks Jason. The rules are simple: I post five things most people don't know about me, and then I tag others to do the same. So here goes:

1) I am a licensed (but currently inactive) pilot. I passed my checkride in 1989. Unfortunately, lacking sufficient time or finances to maintain my proficiency, I stopped and have not flown in the past 15 or so years. I still yearn to get back up in the air.

2) I have played several instruments. I played French horn in my junior high school band. It didn't last long; the difficulty lugging the thing around meant I didn't do much practicing at home. I also took piano lessons as a kid, but never got very far with those either. More recently, I took an adult ed class to learn to play harmonica. Now there's a nice portable instrument! I still pick it up every once in a while and play a blues lick or two...

3) I saw Queen in concert in Buffalo, NY in 1979. My college roommate camped out for tickets, and we wound up with 8th row seats (plus the stage was jutting out four rows where we were sitting). It was one of the best concerts I can remember (there were a few I barely remember, but that's another story); Freddie Mercury was an amazing performer. RIP.

4) My nickname in college was Flounder. From the movie Animal House. Because my roommate decided to give me a nickname that bore no resemblance to my personality. This is the same roommate who got me 8th row Queen tickets, so I cut him enough slack.

5) I am a member of the Eric Conspiracy Secret Laboratories.

Okay, Ash, Narayan & Josh, consider yourselves tagged.

Thursday Jan 04, 2007

Superhero Test

Which Superhero am I?

Tuesday Jan 02, 2007

LDoms features in Solaris

LDoms features are showing up in Solaris 10 11/06 (aka Update 3); this does not mean LDoms is now available.

Wednesday Dec 20, 2006

First six weeks with my new MacBook Pro

I've now had my shiny new Intel Core 2 Duo based, 15" MacBook Pro for about six weeks. Here are my impressions, good & bad.

First the bad:

NOTHING. After reading all the reports of various problems with the MacBook Pros, mine has been operating flawlessly. No strange noises, no overheating, no optical drive issues (though I've yet to burn a DVD), no wireless issues.

Okay, one thing: memory. I wish it supported more. I know this isn't Apple's fault, but the 2GB I have is clearly not enough when running Parallels VMs, and I suspect upgrading to the limit of 3GB would only help marginally.

Oh yeah, one more thing: disk speed. The 120GB drive only rotates at 5400RPM, and on I/O bound work, it feels a tad sluggish. The hard drive upgrade option from Apple was even slower, so I stuck with the 120GB.

Now the good:

It's FAST. I'm upgrading from an 867MHz TiBook, and the speed difference is amazing. The migration assistant made the transition painless, and since I'm good about keeping my software up to date, almost everything was already a universal binary and ran at full speed from day one.

One key CPU benchmark for me, SETI@Home, processes each job approximately 5x faster than my TiBook. But with two CPUs, it now processes two jobs at once, for almost a 10x throughput increase!

I opted for the glossy screen, and haven't regretted the decision. This thing is so crisp! I almost never have an issue with glare, and when I do, a very slight adjustment usually resolves that problem.

Other features that are a nice upgrade from the TiBook: built-in Bluetooth & iSight, better WiFi reception, backlit keyboard, and two-finger scrolling. I have to say that this last feature is one of the most productivity-enhancing I've come across in a long time. I now can't stand to sit in front of a laptop whose trackpad doesn't support it.

I also splurged for a Bluetooth Mighty Mouse (I already own & love a wired one), and losing the wire is such a pleasure for mousing!

Luckily, Cisco upgraded their VPN client just in time to resolve an issue with the Core 2 Duo chips, so I was able to VPN into Sun's network with no issues.

I love seeing those _two_ CPU load meters in my menu bar, thanks to MenuMeters. At first, I thought it was malfunctioning! Took me a second or two to realize...

There were a couple of quirks with X11, but those were quickly resolved with an update from Apple. I've got Emacs and Open Office working just fine. I haven't been able to get the X11 version of VNC to compile & run successfully, so I'm using Chicken of the VNC.

Finally, Parallels rules! I have VMs for Windows XP (Yuck), Ubuntu Linux, and Solaris currently installed. As someone who's currently working on virtualization in my day job (see my posts on LDoms), it's so sweet to have this option on my laptop.

The increasing availability of virtual appliances for Parallels on the web is not only totally cool; it also represents an important evolution for virtualization technology in general. Being able to download, install, configure and run complete software stacks in a matter of minutes really brought home that point to me. It's given me some ideas that I want to see us incorporate into LDoms.

All in all, I'm one very happy customer.

Wednesday Oct 25, 2006

New Macbook Pro

[Caveat: I'm an avid fan of Apple and their Macintosh computers. There are lots of us at Sun. If you don't care to read posts about my experiences with my Macs, then skip anything I write under the "Mac" category.]

I've been waiting for months for Apple to announce Merom (aka Intel Core 2 Duo) processor support in their MacBook Pro laptops. My wait ended last night, and I immediately placed my order:

-----------------------------------
Product Name: MBPRO 15/2.33 CTO

With the following configuration:
 - Processor	0656642	2.33GHz Intel Core 2 Duo
 - Memory	0656619	2GB 667 DDR2 SDRAM2x1GB
 - Hard Drive	0656623	120GB Serial ATA Drive@5400rpm
 - Optical Drive	0656625	SuperDrive 6X
 - Display	0656632	15 Glossy Widescreen Display
 - Modem	0656645	None
 - Apple Software Solutions	0656200	None
 - Keyboard/Mac OS Language	0656627	BkLit Keyboard/Mac OS
 - Country Kit/AEX	0656628	Country Kit


Estimated Shipped By: OCT 30, 2006
Estimated Delivered By: NOV 06, 2006

This is to replace my four year old, slightly dented but still working, Titanium Powerbook 867MHz workhorse. It's gonna be a long two weeks!

What I do at Sun

Whenever I'm asked by family & friends to describe what I do at Sun, unless I'm talking to other engineers, my description of my job & current project usually leaves my audience with that glazed look in their eyes. My hope is that with this blog, I can describe my work, and the immense satisfaction I get out of it, in a way that's approachable to all interested. So I'll start here with a high-level, not too technical description of what I do, then dive into more technical details in future posts.

I lead a small team of developers working on software that configures and manages a set of computer systems. But instead of each of these "systems" being a separate physical computer, they're all running on the same machine! This is possible partly through a technique called virtualization, which allows one computer to run multiple operating system (OS) images simultaneously. Each OS thinks it's running on its own physical computer system, with CPU(s), memory, disk, console terminal, and network connection(s). This illusion of a complete computer system is sometimes called a "virtual machine". We call it a "Logical Domain" (LDom for short). The OS image running in an LDom is termed a "guest" OS. To someone logging in to an LDom, it is indistinguishable from logging in to a separate machine running the same OS.

We actually accomplish this through a combination of partitioning and virtualization. Since the computers our project utilizes contain lots of CPUs and memory, we can split these up among the running guest OS's. This is the partitioning piece. But in the case of, say, network connections, there are generally not enough to go around for each guest to have its own. So we utilize virtualization to allow a small number of limited physical resources to appear as a larger number of virtual resources, allowing each guest to think it has its own private resource (e.g. network connection), when in reality, it's sharing that resource with other guests.
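For the programmers in the audience, the difference between the two techniques can be sketched in a few lines of Python. This is a toy model only, not how LDoms is implemented: the function names, guest names, and the device name e1000g0 are all made up for illustration.

```python
def partition(cpus, guests):
    """Partitioning: each guest gets its own disjoint slice of real CPUs.

    Any CPUs left over when the split is uneven simply stay unassigned
    in this toy model.
    """
    share = len(cpus) // len(guests)
    return {g: cpus[i * share:(i + 1) * share] for i, g in enumerate(guests)}


def virtualize(physical_device, guests):
    """Virtualization: each guest sees a private virtual device, but all
    of them are backed by the same physical device."""
    return {g: {"virtual": f"vnet-{g}", "backed_by": physical_device}
            for g in guests}


guests = ["guest-a", "guest-b"]
cpu_map = partition(list(range(8)), guests)   # no CPU appears in two guests
net_map = virtualize("e1000g0", guests)       # one real port, two virtual ones
```

In the first case no resource is shared between guests; in the second, every guest believes it owns a network connection while they all share one underneath.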

The management software our team develops is just one component of the overall "Logical Domains" (LDoms) technology. Correspondingly, our small team is part of a larger team implementing all the other necessary components to produce the LDoms software "stack".

News first started leaking out about our LDoms effort when David Yen mentioned the technology in an interview he did back in January. More recently, LDoms was announced by John Fowler, the Vice President of the recently formed Systems Group.

Although LDoms has not been released yet, some of the code is already available as part of the OpenSolaris project. In addition, other folks have been blogging about it, so it's time that some of the developers on the LDoms team join in the discussion.

The initial LDoms release will be targeted at our T1000 and T2000 platforms. These systems utilize our UltraSPARC T1 processor and Hypervisor virtualization technology. LDoms leverages the virtualization infrastructure already built in to the hardware and hypervisor of these platforms, and takes it one step further to provide support for dynamically creating & managing multiple logical domains, each able to host its own guest OS. Some customers without highly threaded workloads have asked us how these CMT systems can best work for them. Logical Domains is one important answer.

My role is the team lead for the LDoms Manager software component. The LDoms Manager configures, monitors and controls logical domains, and is itself managed by system administrators (either directly or through other management software). I'll talk more about how the LDoms Manager accomplishes these tasks in subsequent posts. For now, I hope this introduction helps both non-technical and technical folks get a better understanding of what I'm doing in my current project, as well as a little taste of what virtualization technology and our Logical Domains product are all about.

Sunday Oct 01, 2006

Hello

Hi! I've been planning to introduce myself to the blogosphere for a while now. I've been tweaking a draft of my initial posting, describing both what I do and what the Logical Domains (LDoms) project I'm involved in is all about, for a couple months now, but the pace of work on our virtualization stack for Sun's SPARC sun4v based platforms (LDoms in eight words) has been preventing me from finishing and posting it.

So why am I posting now? And not at all with what I had initially planned to use to introduce myself? What got me off my duff was Josh Simons' Why Blog? entry. I already understand the benefits of blogging to Sun, its customers, and our own engineering community. But what compelled me to respond was his comment: "...one of you growing up had Howard Stern for a summer camp counsellor". Well, I couldn't just let that comment lie there. Yes, Josh is referring to me.

The year was 1974, the camp was (the now defunct) Wel-Met, and this was no ordinary summer camp experience: Imagine 40 15-16 year old kids, three counselors, and one chartered bus, traveling cross-country for six weeks, camping out every night. That was the Wel-Met Western Trip. One of those three counselors was a relatively quiet, unassuming guy named Howard Stern (we knew him as Howie). Look here for some photos from our trip.

Okay, enough of that. My goal with this blog is to talk about what we're doing with LDoms, as well as my role in the project. I'll tend to stick mostly to technical topics, ranging from a high enough level for my family to understand, to nitty-gritty details only a true geek would enjoy.

I will post my previously planned introductory entry next. But first, thank you Josh, for your none too subtle attempt at goading me into joining the conversation.