Wednesday September 30, 2009 | Ghost Busting Hunting down the Ghosts in our machines. Chris Beal's Weblog |
Just a quick blog post, so that hopefully when you google for this you will find something and not spend hours debugging it.
I just upgraded to OpenSolaris B123 and everything worked fine. However, when I logged out and back in again (something I rarely do, so the problem might have been present before), the machine was on its knees. As usual I started the debugging process.
# prstat
showed svc.configd at the top.
# truss -p `pgrep svc.configd`
just showed it running door_return() a lot. This implies that something is talking to svc.configd. On reflection I could have dtraced the door_call and door_return calls and gathered the execname of the process it was talking to, but before I got that far I spotted that my pid values were going up quickly. So clearly there was a process starting up, talking to svc.configd, exiting, and repeating. After a few minutes digging around I found desktop-print-management-applet calling ospm (OpenSolaris Print Manager), which was running svcprop. A colleague then pointed me to http://defect.opensolaris.org/bz/show_bug.cgi?id=10863 which describes the problem and gives a workaround.
Fortunately this should be fixed soon.
This is really just a note to myself as I keep forgetting the options. I'm developing some new plugins for fmd. When they don't work, there's loads of additional data you can get out which isn't there by default, and you can add debug print statements to your code like
But these won't be visible unless you run fmd in a debug mode.
First you need to disable fmd
If you want to see what fmd itself is doing, add the -o debug=all flag
Then you see those lovely debug messages.
Posted by cwb ( Sep 15 2009, 01:08:48 PM BST ) Permalink
Virtualization Landing Page
I don't normally just post a link to a page of links, but in conversation with one of our doc writers today, she mentioned they'd put together a landing page for all our V12N products:
http://docs.sun.com/source/821-0057/
I thought it had some quite interesting links, so I thought I'd share.
Chris
Posted by cwb ( Jul 09 2009, 04:29:42 PM BST ) Permalink
Comparing dtrace output using meld
Comparing dtrace and other debug logs using meld
meld is a powerful OpenSource graphical "diff" viewer. It is available from the OpenSolaris IPS repositories, so it can be installed from the Package Manager in OpenSolaris or simply by typing
$ pfexec pkg install SUNWmeld
It is very clever at identifying the real changes within files and highlighting where the differences start and end.
First off the dtrace script is
$ cat rx.d
#!/usr/sbin/dtrace -Fs

fbt::nge_receive:entry
{
	self->trace = 1;
}

fbt::nge_receive:return
/self->trace == 1/
{
	self->trace = 0;
}

fbt:::entry
/self->trace == 1/
{
	printf("%x", arg0);
}

fbt:::return
/self->trace == 1/
{
	printf("%x", arg0);
}
This very simply traces all function calls from the nge_receive() function.
$ meld rx.out rx.works.out
This throws up a GUI as seen here
You can see it essentially does two things
I've decided I really need to get back to writing a blog occasionally, and what better day to choose than June 1 2009. Why? Well, today we release OpenSolaris 2009.06, the latest OpenSource release of our operating system Solaris. I know this all sounds a bit marketing, but actually there are some really good reasons for running OpenSolaris on your own machine.
First off, it is the most secure OS I know of. No need for virus protection.
Second, it just works (mostly). I've just got a new MacBook Pro, and I always find it easier to do development work on Solaris than on any other platform, so I like to run OpenSolaris. It installed pretty much seamlessly (I just had to change the EFI disk label using the macOS fdisk utility as described here). The only thing that doesn't work out of the box is the Wifi, which is a pain. It's a Broadcom chipset, so I've got hold of a PCI3/4 Atheros card which works well.
Third, all the development tools I need (and indeed anyone developing for or on Solaris needs) are available within the standard repositories. I found this page, which is how I set up my laptop as a build machine.
From a day to day computing perspective it does everything I need. Mail, web, and chat are all included, and OpenOffice is in the repositories for free (and simple) download. There is a new media player in Elisa (in the repo), though unfortunately you have to buy the codecs for many common video formats.
So the next question is, is it any different from 2008.11? Well, it's hard for me to say as I've been upgrading every few weeks to the latest development builds (by using the opensolaris.com/dev repository). But I did install it fresh inside a VirtualBox VM and was impressed with the speed of the install. The auto installer is now more complete and can install SPARC machines (necessary for a good proportion of our customers). There are networking improvements, but generally the speed and usability are what you'll notice. Oh, and Fast Reboot, which makes it much quicker to shut down or reboot a machine.
Today I'm attending CommunityOne (or C1 as we call it) and much more will be discussed about OpenSolaris and all our other OpenSource development efforts. I'll try to remember to write a blog about it (though don't hold your breath on recent form).
In praise of the blogosphere - Upgrading OpenSolaris past build 91
I can't believe it's 4 months since I last blogged. Oh well, it's been a hectic time. And this is just a quick note for my own memory jogging purposes really. I have been having a pain of a time upgrading from OpenSolaris build 91. I've been away a lot (fantastic holiday, thanks very much) and had been working on my laptop under VirtualBox. Anyway, I would do my usual
$ pfexec pkg refresh --full
$ pfexec pkg install SUNWipkg
$ pfexec pkg image-update
However, at the end of downloading all the packages it would give me an error like
pkg: attempt to mount opensola���2 failed.
pkg: image-update cannot be done on live image
I hadn't really given it much thought for a while, but when I did google the message it took me straight to this page
http://louisbotterill.blogspot.com/2008/07/open-solaris-to-b93.html
which pointed out the missing information (I can't work out how I missed it; maybe it was while I was away)
http://defect.opensolaris.org/bz/show_bug.cgi?id=2387
So anyway, upgrading it on a copy of the BE really helped and I am now on the latest bits. Thanks to Louis Botterill for the tips.
Interesting view on making money in business
I had the chance to be a keynote speaker at PROMISE (an ICSE workshop) last week. The group uses data mining and AI techniques to look for patterns and make predictions on a variety of things, like where in code defects might occur, or how much effort a project might entail. I was there because a few years ago we ran a project to predict which bugs might cause customer escalations, using similar techniques. I was responsible for implementing the proactive fixing of these bugs. My talk was geared around how to put a business case together and run such a project, and ultimately why in this case it was wound up before major benefits were realised.
There were loads of other great talks and papers, but the other keynote speaker, Murray Cantor from IBM, had some interesting points, one of which I wanted to pull out here. He said that there are three things you can monetize: innovation, customer relationships, and cost structure. For example, you can make money by having the first product to market, by having a good close relationship with the customer, or by doing it cheaper than anyone else. He drew this in a triangle like this
So this got me thinking as to where Sun fits in to the picture. First off I'd say it's in a different place from IBM, who put huge resources in to having a close relationship with the customer (Murray indicated he felt IBM was somewhere on the Innovation/Customer line). However, it isn't purely at the Innovation point either. We provide innovative technologies to help lower costs for our customers (hey, Free Software anyone? Check out http://opensolaris.com), and we also automate things like system management, thus removing cost and complexity (take a look at our xVM strategy to merge virtualization and system management at http://openxvm.org, thus removing some of the headaches of running a virtual data center). Oh, and did I mention our CoolThreads hardware?
So I think we're probably somewhere between the Cost and Innovation points. I'm not saying this is a full theory of business, but I found it a useful thought experiment for seeing the different value propositions of various companies' business models.
Debugging sparc really (and I do mean really) early boot problems
For some work I've been doing I've had to work out how to debug the sparc boot process before you can get to kmdb. And yes, you can do it, it's just not that easy. So I thought I'd put it on my blog, in case I lose the notes I made in a mail to myself, and it might be of interest to some of you.
First off get as much of the diagnostics available from the OBP as possible
The reset-all is important as it saves the options to the NVRAM. Now we try to boot it up, before anything is loaded. Note this requires a debug kernel, but if you're playing in this space and you're on sparc then you probably know that already
You will see the boot fail like this
This is expected, and it is how we get to start playing with breakpoints really early on. Note the unix module is not yet loaded, so we now have to load it. To do this we load the boot forth code and copy what it does
So by copying what do-boot does we can intercept the boot process
Now we can start some more magic. A DEBUG kernel will check the stop-me property in kobj_start(). This is something we have to populate in the boot properties, which is why we've done all this messing around to get to this point
We can now start the boot process using exec-file. It will stop immediately because of the stop-me property (ctrace gives me the stacktrace)
From this point we have access to the unix symbols and can start setting break points. For example
I'm interested in getting some more module loading debug info out, so let's set moddebug to 0xf
(displays current value of a long)
(set the long to be F then display it again) Now let's see what additional info I get
OK, that doesn't tell me much more, but you get the idea. You can access the symbols, set breakpoints, and set variables. In addition you can dump out memory with dump, single step with step, and do loads of other things that you might want to do, but this at least will act as a memory jogger for me. Let me know if you found this useful. Chris
Posted by cwb ( Feb 14 2008, 01:23:20 PM GMT ) Permalink
Installing Indiana/Opensolaris
For a few days recently I have been looking at the future of packaging, pkg(5) or IPS. IPS looks really powerful and quite simple. It will allow us to generate fixes and deliver them much more simply. What I've been thinking about is how and when we will generate fixes using this mechanism. Anyway, as a result I've signed up for pkg-discuss-AT-opensolaris-DOT-org and indiana-discuss-AT-opensolaris.org. Both of these are very active and full of interesting discussions (and arguments) and ideas.
It's not surprising there has been so much activity recently. Today indiana-discuss announced the launch of the developer preview of the OpenSolaris binary distribution. So I tried it out on a couple of machines.
My laptop first, an Acer Ferrari 4005. Everything just worked. The LiveCD booted up really quickly, so well done to the team for getting the performance up so well. Even wireless worked, though that's probably because I've already swapped the Broadcom wireless miniPCI card for an Atheros one. Unfortunately I have no spare slices available on the laptop, so I moved on to my next machine.
This is my home PC, usually running Windows XP for the kids. It has never successfully run Solaris, for reasons that will become apparent. I have just upgraded the hard drive so there's a 60GB partition free for me to do some damage. Booting the LiveCD failed, or rather Xorg failed to display anything. My machine is an old Athlon XP2600 with an AGP Radeon X1600 Pro graphics card. Great for games, but unfortunately the Solaris/OpenSolaris Radeon driver doesn't support it. Fortunately Stephen Hahn blogged about how to get Xorg to use the vesa driver from the LiveCD. With that in place I got the GNOME GUI up and gave the install a go. The installer uses dwarf-caiman, a cut-down, slimline installer which is nice and easy to navigate. The install itself was really quick, as there's only a CD's worth installed. The rest should be added later over the web from the IPS repository.
Unfortunately that is where my old machine creaked too much. The onboard ethernet is an nforce2 gigabit ethernet. It should work with the nge driver but I think it's just too old. I tried adding an alias for it using
# add_drv '"pci10de,66"' nge
The install claimed it failed, but it did come up fine after a reboot, though I had to add a user again at single user because the useradd hadn't worked. A warning here: root is just a role that users can take on, so you can't log in as root as you might expect from a "normal" Solaris system.
I'm pretty impressed. Nice installer, lightweight LiveCD to get you started, ZFS root, and pkg(5) to add new stuff (or it will when I get a new ethernet adapter; I wonder if I can get one of my old USB wireless sticks to work). Do give it a go, it is one vision of the future of OpenSolaris.
Chris
So I'm on the road again. It's the Sun Tech Days this time: I'm in Rome, and Milan is later in the week. I've just talked about "What is OpenSolaris" and "OpenSolaris Virtualization". It's great to connect with real people who use or want to use OpenSolaris and are interested in our xVM and Zones based technologies. Anyway, it's great to be in Rome, I just wish I was closer to the center. I made the trek in to see the Colosseum. Some say it's not as impressive as they were expecting. I have to say I had no expectations and was mightily impressed. I will post a link to some photos when I've uploaded and checked them.
Posted by cwb ( Sep 24 2007, 03:17:02 PM BST ) Permalink
Starting out with Solaris on Xen
As you may have seen from the announcement and John's blog, we have a new set of Solaris on Xen bits available for download. A lot has changed in the (almost) year since the last drop. Certainly things are a lot easier to set up than they were back then.
The first big difference I notice is that you can install these bits straight from the DVD, which means no mucking around with bfu. Once it is installed you also have the joys of much newer Solaris builds, including improvements to networking and removable media (but that isn't the point of this post).
Of course the thing you really want to do is run multiple operating systems, so (while there are documents here) I always think it's nice to see people's use cases and find out how they got things working.
I'm going to use zfs for storage, so I made sure I had a large amount of space available for a zpool
# zpool create guests c2d0s7
First gotcha. After install, the default boot entry in the grub menu.lst is for Solaris on metal (i.e. not booting under Xen). You can change that before rebooting or select Solaris dom0 from the grub menu. Check you are running under Xen by looking at uname -i
dominion# uname -i
i86xpv
(dominion is the name of my host.) If that says i86pc then you're not booted under Xen; i86xpv is the new platform modified to run on Xen.
I found that I accidentally booted on metal the first time, and when I then booted under Xen the services weren't enabled. I had to enable them manually. (If you boot straight in to Dom 0 they start.)
dominion# svcs -a | grep xctl
online 10:51:04 svc:/system/xctl/store:default
online 10:51:11 svc:/system/xctl/xend:default
online 10:51:11 svc:/system/xctl/console:default
online 10:51:16 svc:/system/xctl/domains:default
If it says anything other than online, enable them with
# svcadm enable "service name"
I use a zpool to create my disk devices for my domains. This has huge advantages, such as the ability to quickly snapshot a domain (say after install) so you can always return to that state. Also you can clone a snapshot, so if you want to have many similar domains (say multiple Solaris development environments) you can clone an install and then only the changes between the domains are stored (zfs being copy on write). To set this up you need to create a zvol on your zpool
# zfs create -V 10G guests/solaris-pv
This creates a zvol of up to 10G in size. Unused space is still free for other users of the pool to allocate. You can access the device for this zvol using
/dev/zvol/dsk/guests/solaris-pv
So that's simple. How do we install a Solaris domain? First off I create an install python config file. (Soon there will be a tool to manage the install for you, but that's not really ready yet.) This python file describes some simple things about the domain, like where the disk and cdrom are.
dominion# cat /guests/configs/solaris-pv-install.py
name = "solaris-pv-install"
memory = "1024"
disk = [ 'file:/guests/isos/66-0613-nd.iso,6:cdrom,r',
         'phy:/dev/zvol/dsk/guests/solaris-pv,0,w' ]
vif = [ '' ]
on_shutdown = 'destroy'
on_reboot = 'destroy'
on_crash = 'destroy'
Name is obvious, and I've copied the iso image to a file to speed up the install. You can kick off the install just by starting the domain
dominion# xm create -c /guests/configs/solaris-pv-install.py
This says start the domain and give me serial console access to it. You then do a normal Solaris install. Once complete you should create a second python file to boot off the zvol, but first I'm going to snapshot it so I can quickly duplicate it (though I really should sys-unconfig it first to make me input the hostname and IP info again).
dominion# zfs snapshot guests/solaris-pv@install
dominion# cat solaris-pv.py
name = "solaris-pv"
memory = "1024"
root = "/dev/dsk/c0d0s0"
disk = [ 'phy:/dev/zvol/dsk/guests/solaris-pv,0,w' ]
vif = [ '' ]
on_shutdown = 'destroy'
on_reboot = 'destroy'
on_crash = 'destroy'
and create it with
# xm create -c solaris-pv.py
This then comes up as per a normal Solaris boot. If you've given it an IP address during the install or set it to use dhcp, you should be able to log in to it using ssh. The networking is effectively bridged, that is to say, you need a real IP address for each domain on the same network as the Dom0.
So the next question I always get is "Can I run Windows as a domU?" And the answer is "maybe". What we have done up till now is use a paravirtualised domU, that is, one that has been modified to run on Xen. Anything that would trigger a privileged operation (interrupt, privileged instruction etc) is modified to be a call to the hypervisor. This is nice and fast, but some operating systems haven't had this treatment. However, with the advent of the Intel Core 2 Duo and Rev F Opteron/Athlon64 (socket AM2) processors, some hardware support for virtualisation has been built in to the chip. This detects these privileged operations and redirects control back to the hypervisor to do "the right thing". With Xen these are referred to as HVM domains. Russ is going to be blogging more about these so I won't go in to too much detail, but if you want to know if your system is HVM capable, I wrote this simple program to tell you
dominion# cat hvm-capable.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <stdio.h>

/* The pread() offset on this device selects the CPUID function to query. */
static const char devname[] = "/dev/cpu/self/cpuid";

/*ARGSUSED*/
int
main(int argc, char *argv[])
{
	struct {
		uint32_t r_eax, r_ebx, r_ecx, r_edx;
	} _r, *rp = &_r;
	int d;
	char *s;

	if ((d = open(devname, O_RDONLY)) == -1) {
		perror(devname);
		return (1);
	}

	/* Function 0 returns the vendor string in EBX, EDX, ECX. */
	if (pread(d, rp, sizeof (*rp), 0) != sizeof (*rp)) {
		perror(devname);
		goto fail;
	}

	/*
	 * The struct stores EBX, ECX, EDX consecutively, so the strings
	 * below are the vendor IDs with the last two words swapped.
	 */
	s = (char *)&rp->r_ebx;
	if (strncmp(s, "Auth" "cAMD" "enti", 12) == 0) {
		if (pread(d, rp, sizeof (*rp), 0x80000001) == sizeof (*rp)) {
			(void) printf("processor is AMD ");
			/*
			 * Read secure virtual machine bit
			 * (bit 2 of ECX feature ID)
			 */
			(void) close(d);
			if ((rp->r_ecx >> 2) & 1) {
				(void) printf("and processor supports SVM\n");
				return (0);
			}
			(void) printf("and does not support SVM\n");
			/* d is already closed, so don't fall through to fail */
			return (1);
		} else {
			(void) printf("error reading features register\n");
			(void) close(d);
			return (1);
		}
	} else if (strncmp(s, "Genu" "ntel" "ineI", 12) == 0) {
		if (pread(d, rp, sizeof (*rp), 0x00000001) == sizeof (*rp)) {
			(void) printf("processor is Intel ");
			/*
			 * Read VMXE feature bit
			 * (bit 5 of ECX feature ID)
			 */
			(void) close(d);
			if ((rp->r_ecx >> 5) & 1) {
				(void) printf("and processor supports VMX\n");
				return (0);
			}
			(void) printf("and does not support VMX\n");
			/* d is already closed, so don't fall through to fail */
			return (1);
		} else {
			(void) printf("error reading features register\n");
			(void) close(d);
			return (1);
		}
	}

	/* Read failure or a vendor that is neither AMD nor Intel. */
fail:
	(void) close(d);
	return (1);
}
SVM is AMD's implementation of HVM while VMX is Intel's. And just a teaser of what you can expect. (right click - view image to see it full size)
Here you see a Solaris paravirtualized VM being installed and a Windows Vista HVM domain. In the top left corner you can see the virtual machine manager, a new management GUI that will help manage domains.
Sorry, this is going to be pretty hard to see unless you view the image at its original size (1600x1200; yes, virtualisation helps you use up those wasted resources, including screen real estate).
Posted by cwb ( Jul 19 2007, 10:15:00 AM BST ) Permalink
Surfing kernow
I had a fantastic weekend down in Cornwall. Saturday's highlight was the boys up to their middles in the sea before they realised it was cold, followed by my own vague attempts at surfing.
I decided to take my camera to the beach to see if I could become a famous surf scene photographer. I don't think that'll happen but I'm pretty pleased with these.
Two years of prevarication and 5 days' hard work and finally .... The wall is built. A couple of years ago I levelled the area outside our back door and laid a patio. This meant I had to dig in to the slope that rises up away from our house, and that meant I was going to have to build a wall and some steps up to the lawn. For some reason I've found excuses not to do it ever since, until two weeks ago I went in to Travis Perkins looking for fence panels (there is a fencing shortage in the UK at the moment) and came out with 400 bricks (well, not literally, they did deliver them). So I've spent most of Easter and a long weekend building it, but it is finally done. I think it looks pretty good. What do you think?
Snow
A little late I know but we had some snow in the South of England this week. The usual 1 or 2 inches causing the rail and road infrastructure to break down, schools to be closed and food rationing to be enforced.
Using the OpenSolaris Mercurial repository