After an undesirably long time, I'm happy to say that another drop of Solaris on Xen is
available here.
Sources and other sundry parts are here.
Documentation can be found at our community site, and you can read Chris Beal's description of how to
get started with the new bits.
As you might expect, there's been a massive amount of change
since the last OpenSolaris release.
This time round, we are based on Xen 3.0.4 and build 66 of Nevada. As always, we'd love to hear about
your experiences if you try it out, either on the mailing list or the IRC channel.
In many ways, the most significant change is the huge effort we've put into stabilizing our codebase; a
significant number of potential hangs, crashes, and core dumps have been resolved, and we hope we're
converging on a good-quality release. We've also started looking seriously at performance issues and at filling
in the implementation gaps. Since the last drop, notable improvements include:
- PAE support: By default, we now use PAE mode on 32-bit, aiding compatibility with other domain 0 implementations;
  we can also boot under either PAE or non-PAE if the Xen version has 'bi-modal' support. This has probably been
  the most-requested feature missing from our last release.
- HVM support: If you have the right CPU, you can now run fully-virtualized domains such as Windows using a Solaris
  dom0! Whilst more work is needed here, this does seem to work pretty well already. Mark Johnson has some useful
  tips on using HVM domains.
- New management tools: We have integrated the virt- suite of management tools. virt-manager provides
  a simple GUI for controlling guest domains on a single host. virt-install and virsh are simple CLIs
  for installing and managing guest domains respectively. Note that parts of these tools are pre-alpha, and we still
  have a significant amount of work to do on them. Nonetheless, we appreciate any comments...
- PV framebuffer: Solaris dom0 now supports the SDL-based paravirt framebuffer backend, which can be used with domUs
  that have PV framebuffer support.
- Virtual NIC support: The Ethernet bridge used in the previous release has been replaced with virtual NICs from the
  Crossbow project. This enables future work around smart NICs, resource controls, and more.
- Simplified Solaris guest domain install: It's now easy to install a new Solaris guest domain using the DVD ISO.
  As a result, vbdcfg, the temporary tool in the last release, has been removed. William Kucharski has a walk-through.
- Better SMF usage: Several of the xend configuration properties are now controlled using SMF.
- Managed domain support: We now support xend-managed domain configurations instead of using .py configuration files.
  Certain parts of this don't work too well yet (unfortunately all versions of Xen have similar problems), but we are
  plugging the gaps here one by one.
- Memory ballooning support: Otherwise known as support for dynamic xm mem-set, this allows much greater flexibility
  in partitioning the physical memory on a host amongst the guest domains. Ryan Scott has more details.
- Vastly improved debugging support: Crash dump analysis and debugging tools have always been a critical feature for
  Solaris developers. With this release, we can use Solaris tools to debug both hypervisor crashes and problems with
  guest domains. I talk a little bit about the latter feature below.
- xvbdb has been renamed: It is now simply xdb. This was a very exciting change for certain members of our team.
We're still working hard on finishing things up for our phase 2 putback into Nevada (where "phase 1"
was the separate dboot putback). As well as
finishing this work, we're starting to look at further enhancements, in particular some features that are available
in other vendors' implementations, such as a hypervisor-copy based networking device, blktap support,
para-virtualized drivers for HVM domains (a huge performance fix), and more.
Debugging guest domains
Here I'll talk a little about one of the more minor new features that has nonetheless proven very useful.
The xm dump-core command generates an image file of a running domain. This file is a dump of all
memory owned by the running domain, so it's somewhat similar to the standard Solaris crash dump files.
However, dump-core does not require any interaction with the domain itself, so we can grab
such dumps even if the domain is unable to create a crash dump via the normal method (typically, it hangs
and can't be interacted with), or something else prevents use of the standard Solaris kernel debugging facilities
such as kmdb (an in-kernel debugger isn't very useful if the console is broken).
However, this also means that we have no control over the format used by the image file. With Xen 3.0.4,
it's rather basic and difficult to work with. This is much improved in Xen 3.1, but I haven't yet written
the support for the new format.
To add support for debugging such image files of a Solaris domain, I modified mdb(1) to understand the format
of the image file (the alternative, providing a conversion step, seemed unnecessarily awkward, and would have had to
throw away information!). As you can see if you look around usr/src/cmd/mdb in the source drop,
mdb(1) loads a module called mdb_kb when debugging such image files. This provides simple methods for
reading data from the image file. For example, to read a particular virtual address, we need to use the contents of
the domain's page tables in the image file to resolve it to a physical page, then look up the location of that page
in the file. This differs considerably from how libkvm works with Solaris crash dumps: there, we have a
big array of address translations, which is used directly, instead of the page table contents.
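To make that lookup a little more concrete, here is a rough Python sketch of the idea. It is not the mdb_kb code and does not use the real Xen image format: the ToyImage class, its frame-to-offset index, and build_demo are all hypothetical names invented for illustration. It assumes plain 4-level x86-64 page tables with 4K pages only, and ignores large pages and reads that cross a page boundary.

```python
import struct

PAGE_SIZE = 4096
ENTRIES = 512                      # 9 index bits per level on x86-64
PTE_PRESENT = 0x1
ADDR_MASK = 0x000FFFFFFFFFF000     # frame address bits in a page table entry


class ToyImage:
    """Hypothetical image: page contents plus a frame -> file offset index."""

    def __init__(self):
        self.data = bytearray()
        self.frame_offset = {}

    def add_frame(self, frame):
        # Pages are stored in arrival order, so offset != frame * PAGE_SIZE.
        self.frame_offset[frame] = len(self.data)
        self.data.extend(bytes(PAGE_SIZE))

    def read_phys(self, paddr, size):
        frame, off = divmod(paddr, PAGE_SIZE)
        base = self.frame_offset[frame] + off
        return bytes(self.data[base:base + size])

    def write_phys(self, paddr, payload):
        frame, off = divmod(paddr, PAGE_SIZE)
        base = self.frame_offset[frame] + off
        self.data[base:base + len(payload)] = payload


def vtop(image, root_frame, vaddr):
    """Walk the page tables stored in the image to resolve vaddr."""
    table = root_frame
    for shift in (39, 30, 21, 12):
        index = (vaddr >> shift) & (ENTRIES - 1)
        raw = image.read_phys(table * PAGE_SIZE + index * 8, 8)
        (entry,) = struct.unpack("<Q", raw)
        if not entry & PTE_PRESENT:
            raise LookupError("no mapping for %#x" % vaddr)
        table = (entry & ADDR_MASK) >> 12
    return table * PAGE_SIZE + (vaddr & (PAGE_SIZE - 1))


def read_virt(image, root_frame, vaddr, size):
    """Resolve vaddr to a physical address, then find that page in the file."""
    return image.read_phys(vtop(image, root_frame, vaddr), size)


def build_demo():
    """Build a tiny image that maps one virtual page (made-up frame numbers)."""
    image = ToyImage()
    for frame in (42, 7, 9, 3, 5):
        image.add_frame(frame)
    vaddr = 0xfffffffffbc4b7f0
    chain = [7, 3, 9, 5, 42]       # root table, three lower tables, data page
    for level, shift in enumerate((39, 30, 21, 12)):
        index = (vaddr >> shift) & (ENTRIES - 1)
        entry = (chain[level + 1] << 12) | PTE_PRESENT
        image.write_phys(chain[level] * PAGE_SIZE + index * 8,
                         struct.pack("<Q", entry))
    image.write_phys(42 * PAGE_SIZE + (vaddr & (PAGE_SIZE - 1)), b"solaris!")
    return image, 7, vaddr


if __name__ == "__main__":
    image, root, vaddr = build_demo()
    print(hex(vtop(image, root, vaddr)))     # 0x2a7f0
    print(read_virt(image, root, vaddr, 8))  # b'solaris!'
```

Every read, including the reads of the page tables themselves, goes through the frame-to-offset index, which is why the page table walk and the file lookup are two separate steps.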
In most other respects, debugging a domain's kernel image is much the same as debugging a crash dump:
# xm dump-core solaris-domu core.domu
# mdb core.domu
mdb: warning: dump is from SunOS 5.11 onnv-johnlev; dcmds and macros may not match kernel implementation
Loading modules: [ unix genunix specfs dtrace xpv_psm scsi_vhci ufs ... sppp ptm crypto md fcip logindmux nfs ]
> ::status
debugging domain crash dump core.domu (64-bit) from sxc16
operating system: 5.11 onnv-johnlev (i86pc)
> ::cpuinfo
ID ADDR FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD PROC
0 fffffffffbc4b7f0 1b 40 9 169 yes yes t-1408926 ffffff00010bfc80 sched
> ::evtchns
Type Evtchn IRQ IPL CPU ISR(s)
evtchn 1 257 1 0 xenbus_intr
evtchn 2 260 9 0 xenconsintr
virq:debug 3 256 15 0 xen_debug_handler
virq:timer 4 258 14 0 cbe_fire
evtchn 5 259 5 0 xdf_intr
evtchn 6 261 6 0 xnf_intr
evtchn 7 262 6 0 xnf_intr
> ::cpustack -c 0
cbe_fire+0x5c()
av_dispatch_autovect+0x8c(102)
dispatch_hilevel+0x1f(102, 0)
switch_sp_and_call+0x13()
do_interrupt+0x11d(ffffff00010bfaf0, fffffffffbc86f98)
xen_callback_handler+0x42b(ffffff00010bfaf0, fffffffffbc86f98)
xen_callback+0x194()
av_dispatch_softvect+0x79(a)
dispatch_softint+0x38(9, 0)
switch_sp_and_call+0x13()
dosoftint+0x59(ffffff0001593520)
do_interrupt+0x140(ffffff0001593520, fffffffffbc86048)
xen_callback_handler+0x42b(ffffff0001593520, fffffffffbc86048)
xen_callback+0x194()
sti+0x86()
_sys_rtt_ints_disabled+8()
intr_restore+0xf1()
disp_lock_exit+0x78(fffffffffbd1b358)
turnstile_wakeup+0x16e(fffffffec33a64d8, 0, 1, 0)
mutex_vector_exit+0x6a(fffffffec13b7ad0)
xenconswput+0x64(fffffffec42cb658, fffffffecd6935a0)
putnext+0x2f1(fffffffec42cb3b0, fffffffecd6935a0)
ldtermrmsg+0x235(fffffffec42cb2b8, fffffffec3480300)
ldtermrput+0x43c(fffffffec42cb2b8, fffffffec3480300)
putnext+0x2f1(fffffffec42cb560, fffffffec3480300)
xenconsrsrv+0x32(fffffffec42cb560)
runservice+0x59(fffffffec42cb560)
queue_service+0x57(fffffffec42cb560)
stream_service+0xdc(fffffffec42d87b0)
taskq_d_thread+0xc6(fffffffec46ac8d0)
thread_start+8()
Note that both ::cpustack and ::cpuregs are capable of using the actual register set at
the time of the dump (since the hypervisor needs to store this for scheduling purposes). You can also
see the ::evtchns dcmd in action here; this is invaluable for debugging interrupt problems (and
we've fixed a lot of those over the past year or so!).
Currently, mdb_kb only has support for image files of para-virtualized Solaris domains. However,
that's not the only interesting target: in particular, we could support mdb in live
crash dump mode against a running Solaris domain, which opens up all sorts of interesting debugging
possibilities. With a small tweak to Solaris, we can support debugging of fully-virtualized Solaris instances.
It's not even impossible to imagine adding Linux kernel support to mdb(1), though it's hard to imagine there
would be a large audience for such a feature...