Bringing up Solaris domain 0 (dom0) on Xen was
surprisingly easy, mostly because all of the hard
work was already done by other people. The
hard work that remained was also done by other
people.
I apologize in advance for giving credit to the
wrong folks or for taking credit for something I
didn't do. This was such a blur, it all tends to
blend together...
Obviously, this won't cover everything. I tried
to talk about some of the more interesting parts.
Well, interesting is relative, of course.
To start with, you need to be able to build Xen
on Solaris. You could actually cheat and start with
a xen image and skip all the user apps to manage
domUs. But that seems kind of pointless unless you
have tons of bodies to throw at the effort, which we
don't, thankfully.
John L and Dave already had Xen building, so all I had
to do was ask them what I needed to do to build it.
The first thing you need is changes to the gcc and
binutils that are shipped in /usr/sfw, which is why you
need to download unofficially updated SUNWgcc,
SUNWgccruntime, and SUNWbinutils packages in order to
build the xen sources on Solaris (they will be
officially updated at some point in the future).
There were two things that John L fixed. The first
one was a bug in how we build gcc (it can't find its
own ld scripts).
See this bug.
The second fix was to add a -divide option to the binutils gas
so that it does not treat / as a comment character. John got this
change back into the binutils CVS repository, but it hasn't made it
out in a release yet (as far as I know).
Of course, Dave and John L had to change stuff in the
xen.hg gate to get it to compile too. If you look at
the source, you'll notice there are a few things
we don't currently try to compile, e.g. HVM-related
support. Then, of course, you need to test it to make
sure the xen binary works (user apps would have to
wait until Solaris dom0 was up). I'm not sure if it just
worked or they had to debug it, but it was working by
the time I got to it.
So I built my xen gate, put xen.gz in /boot (starting with
32-bit dom0), and tried to boot an i86xen (vs
i86pc) version of the kernel debugger (kmdb). Again,
I was following in others' footsteps here. John L had done a ton of
work getting kmdb to work in domU (since we already had
Solaris domU running on a Linux dom0). And Todd and/or
John L had already debugged kmdb on a Solaris dom0.
So I was at the kmdb prompt, ready to venture into unknown
territory.
So before I could boot my Solaris dom0, I had to build
one. Up to this point, we only had the driver changes
we needed for domU. Before xen, we only had one x86
"platform", i86pc.
This is unlike SPARC, which usually gets a new "platform"
for every major architecture change (e.g. sun4m, sun4u,
sun4v). On SPARC, you'll also see machine-specific
platmods and platform directories to provide additional
functionality and modules which are specific to a given
machine (e.g. /platform/SUNW,Sun-Fire-880).
For xen (on x86), we have a new "platform", i86xen.
For Solaris dom0, we were missing all of the drivers
which were in i86pc (i.e. they did not show up in i86xen).
The vast majority of these drivers aren't platform
specific and can go into intel, i.e. they don't have
any platform-specific code (the platforms today being i86pc
and i86xen). So I had to try to move each driver over to
intel and see if it had platform specific code or not.
Since there was only one intel "platform" in the past,
the lines were a little gray at times. But I finally got
through it and ended up moving around 40 drivers in src/uts
and a little over 15 in closed/uts, to intel from i86pc.
For the rest, I needed to create makefiles in i86xen to
build a platform-specific version of these drivers.
Now I had a Solaris dom0 kernel to boot. I set up my
cap-eye install kernel, rebooted into kmdb, and :c'd
into a new world. The majority of the hard work was
already done bringing up domU. The CPU and VM code for
domU, done by Tim, Todd, and Joe, just worked for domain 0.
That made life very simple.
The first problem I ran into was the internal pci config
access setup in mlsetup. It was initially shut off for domU;
I had added it back in for dom0. However, this requires a
call to the BIOS, which xen doesn't allow. So I changed
the code to default to PCI_MECHANISM_1 for i86xen dom0.
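For anyone who hasn't looked at it, PCI configuration mechanism #1 needs no BIOS call: you write an address to I/O port 0xCF8 and then read or write the data at 0xCFC. Here is a minimal sketch of the address encoding only. The helper name and macros are made up for illustration; this is not the Solaris code.

```c
#include <assert.h>
#include <stdint.h>

/* Standard I/O ports used by PCI configuration mechanism #1. */
#define PCI_CFG1_ADDR_PORT 0xCF8u
#define PCI_CFG1_DATA_PORT 0xCFCu

/*
 * Hypothetical helper (not from the Solaris source): build the 32-bit
 * value written to the address port to select a config-space register.
 */
static uint32_t
pci_cfg1_addr(unsigned bus, unsigned dev, unsigned func, unsigned reg)
{
    assert(bus < 256 && dev < 32 && func < 8 && reg < 256);

    return (0x80000000u |           /* bit 31: enable config cycle */
        ((uint32_t)bus << 16) |     /* bits 23..16: bus number */
        ((uint32_t)dev << 11) |     /* bits 15..11: device number */
        ((uint32_t)func << 8) |     /* bits 10..8: function number */
        ((uint32_t)reg & 0xFCu));   /* bits 7..2: dword-aligned register */
}
```

The actual port accesses (outl to the address port, inl from the data port) are exactly the ins/outs that need I/O privilege in dom0.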
From there, the next problem I ran into was that ins/outs
weren't working. That was fixed with a
HYPERVISOR_physdev_op (PHYSDEVOP_SET_IOPL), which
ended up being slightly wrong and was fixed by Todd before
we released.
Now I was at the point where we are attaching drivers
and the drivers are trying to map in their registers.
Joe had done a bunch of work in the VM getting the
infrastructure ready for foreign PFNs, which are basically
PFNs which are tagged to mark them as containing the real
MFN, instead of being present in the mfn_list. Since this
was the first time trying that code out, I ran into a couple
of minor bugs. The more interesting problem was that Xen was
using one of the software PTE bits in a debug version of
Xen which conflicted with the bit we were using to mark
the page as foreign. I commented out that feature,
rebuilt Xen, and continued on while Joe worked on changing
the PTE software bits to be encoded instead of individual
flags, to avoid bit 2 in the PTE software field.
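To make "encoded instead of individual flags" a bit more concrete, here is a rough sketch of the idea. x86 PTEs leave bits 9 through 11 available to software; treating the low two of them as a small enumerated value gives four states without ever touching bit 2 of the software field (PTE bit 11). The state names and values below are purely my own illustration, not what Joe actually implemented.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t pte_t;     /* illustrative type, not the kernel's */

/* x86 PTEs leave bits 9..11 available to software. */
#define PTE_SOFT_SHIFT 9
/* Use only bits 0 and 1 of the software field; bit 2 (PTE bit 11) is untouched. */
#define PTE_SOFT_MASK ((pte_t)0x3 << PTE_SOFT_SHIFT)

/* Hypothetical states; names and values are assumptions for this sketch. */
enum pte_soft_state {
    PTE_SOFT_NONE = 0,
    PTE_SOFT_STATE_A = 1,
    PTE_SOFT_STATE_B = 2,
    PTE_SOFT_FOREIGN = 3
};

/* Store an encoded state in the software field, preserving all other bits. */
static pte_t
pte_set_soft(pte_t pte, enum pte_soft_state s)
{
    assert((unsigned)s <= 3);
    return ((pte & ~PTE_SOFT_MASK) | ((pte_t)s << PTE_SOFT_SHIFT));
}

/* Read the encoded state back out of the software field. */
static enum pte_soft_state
pte_get_soft(pte_t pte)
{
    return ((enum pte_soft_state)((pte & PTE_SOFT_MASK) >> PTE_SOFT_SHIFT));
}
```

The tradeoff is that two bits used as independent flags can describe two properties at once, while two bits used as an encoded value describe one of four mutually exclusive states, which is fine as long as the states never need to combine.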
I had already changed the code in rootnex to convert the
MFN (device register access) to a foreign PFN during
ddi_regs_map_setup(). So once the PTE software bits were
cleared we were sailing through the driver reading its
device registers and on to mapping memory for device DMA.
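The foreign PFN idea is easier to see in code. This is a minimal sketch under my own assumptions: the tag bit position, the type names, and the helper names are all made up for illustration and are not the real Solaris definitions. A normal PFN is translated through the mfn_list; a foreign PFN carries the MFN directly and just needs the tag stripped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pfn_t;     /* illustrative types, not the kernel's */
typedef uint64_t mfn_t;

/* Assumed tag bit; the bit Solaris actually uses may differ. */
#define PFN_FOREIGN_TAG ((pfn_t)1 << 62)

/* Wrap a machine frame number (e.g. device registers) as a foreign PFN. */
static pfn_t
mfn_to_foreign_pfn(mfn_t mfn)
{
    assert((mfn & PFN_FOREIGN_TAG) == 0);
    return (mfn | PFN_FOREIGN_TAG);
}

static int
pfn_is_foreign(pfn_t pfn)
{
    return ((pfn & PFN_FOREIGN_TAG) != 0);
}

/* Translate a PFN to an MFN: strip the tag if foreign, else use mfn_list. */
static mfn_t
pfn_to_mfn(pfn_t pfn, const mfn_t *mfn_list, size_t nents)
{
    if (pfn_is_foreign(pfn))
        return (pfn & ~PFN_FOREIGN_TAG);
    assert(pfn < nents);
    return (mfn_list[pfn]);
}
```

In this picture, something like ddi_regs_map_setup() would hand the VM a tagged PFN for the device's register page, and the VM would never look it up in the mfn_list.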
I had also modified the rootnex dma bind routine. When
we're building dma cookies, we need to put MFNs in the
cookies instead of PFNs. I had a couple of bugs in that
code, fixed that up, then ran into the contig alloc code
path. I hadn't coded up the contig alloc code path changes
yet (where we want to allocate physically contiguous memory).
So I cheated and temporarily took out all the drivers which
required contig alloc, and did the contig alloc code at a
later time (my boot device didn't need it).
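Here is a small sketch of what "MFNs in the cookies instead of PFNs" means. The struct is a simplified stand-in for a DMA cookie and the function name is hypothetical; the real rootnex bind code is considerably more involved. It assumes 4K pages and a cookie that stays within a single page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_OFFSET (SKETCH_PAGE_SIZE - 1)

/* Simplified stand-in for a DMA cookie: a device-visible address and a length. */
struct dma_cookie_sketch {
    uint64_t addr;
    size_t size;
};

/*
 * Hypothetical helper: build one cookie for a buffer that lies within a
 * single page. The device must be given the machine address, so the PFN
 * is translated through mfn_list before the page offset is added back.
 */
static struct dma_cookie_sketch
make_cookie(uint64_t paddr, size_t len, const uint64_t *mfn_list, size_t nents)
{
    uint64_t pfn = paddr >> SKETCH_PAGE_SHIFT;
    uint64_t off = paddr & SKETCH_PAGE_OFFSET;
    struct dma_cookie_sketch c;

    assert(pfn < nents);
    assert(off + len <= SKETCH_PAGE_SIZE);  /* must not cross a page */

    c.addr = (mfn_list[pfn] << SKETCH_PAGE_SHIFT) | off;
    c.size = len;
    return (c);
}
```

This also shows why contig alloc needs its own work: consecutive PFNs generally do not map to consecutive MFNs, so a buffer that looks contiguous to the kernel can be scattered in machine memory unless the allocation specifically asked for contiguous machine pages.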
Now I was up to vfs_mountroot(). This is where the
Solaris drivers start taking over disk activity and
stop using the BIOS to load blocks. This is also where
we first start noticing problems if interrupts don't work.
This is where I handed off to Stu. This was the last
of the hard problems. Stu had been busily working on Solaris
dom0 interrupt support: a mix of event channels, pcplusmp,
ACPI, and APICs. Something I would never wish on anyone.
Stu got it up and working remarkably fast (something he should
talk about) and I was back and running up to the console
handover.
The console config code is a little bit messy in Solaris.
I waded through that for a little bit. All of the code was
originally in the common intel part of the code. I moved
the platform-specific code to i86pc and i86xen, and then wrote
a different implementation in i86xen which basically always
sends the Solaris console to the Xen console. I'm not
sure if it will stay that way in the end, but that makes
the most sense IMO.
And from there, I was at the multi-user prompt.
I ran into some other interesting problems during the
bringup. I had to have isa fail to attach on a Solaris
domU. The ISA leaf drivers assume the device is present and
bad things happen. There were a couple of places in the
kernel with hard-coded physical addresses which
they try to map in (e.g. psm_map_phys_new; the lower
1M of memory, used for BIOS tables, etc.; and xsvc used
by Xorg/Xsun). And we found out the hard way that Xen's low
mem alloc implementation is Linux specific: it only allocates
memory < 4G && > 2G. We need to redo our first pass at
implementing memory constrained allocs.
As far as booting 64-bit Solaris dom0, it booted up the
first time.
Well, that's enough for now. I'll save the bringup of domUs
on a Solaris dom0 for the next post. That was a little more
challenging...