Saturday July 26, 2008 |
To implement a Sun Ray demo with a Windows Terminal Server back-end, we normally use two separate servers. But now, with virtualization all around us, why not put those two components on a single box? Of course, there are many ways to skin the cat. I decided to take the X4100 that I had in the lab, put Solaris 10u4 on it, then install Sun Ray Server Software (SRSS) and finally install VirtualBox with a Windows 2003 Enterprise Edition guest server for the back-end.

However, this entry is not about Sun Ray or Windows, but about VirtualBox. When selecting the correct download, it was a bit unclear whether there were separate versions for OpenSolaris and normal Solaris, or whether it was all "one and the same". To jump a couple of steps ahead: there is only one version, but it is clear that the developers of VirtualBox do their work on OpenSolaris and then expect it to work on regular Solaris as well. Well, as I discovered, and many others with me, that doesn't always work out.

The pkgadd is straightforward and without problems. At the end you will have VirtualBox in your path and you're ready to fire it up. But then I got the following error:

bash-3.00# VirtualBox
ld.so.1: VirtualBox: fatal: libGL.so: open failed: No such file or directory
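An error like this means the runtime linker could not find the named library anywhere in its search path. As a minimal sketch (the function name find_lib is made up for illustration, and the directories are whatever the caller passes in), the following lists every copy of a library found under a set of candidate directories:

```shell
#!/bin/sh
# find_lib NAME DIR...
# Print every file called NAME found under the given directories, one per
# line. find_lib is a hypothetical helper, not part of Solaris or VirtualBox.
find_lib() {
    name=$1
    shift
    for dir in "$@"; do
        # Skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        find "$dir" -name "$name" -print 2>/dev/null
    done
}
```

A call such as "find_lib libGL.so /usr/lib /usr/X11/lib" shows which copies exist at all; running "file" on each result then tells whether it is a 32 or 64 bit object.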
Things like this have happened before, and my usual solution is "let's Google". In this case that was the wrong thing to do :) because I got soooo many wrong suggestions. In some forum I read that something is wrong with the service "svc:/application/opengl/ogl-select:default" or that you have to install the package "sunwcslr". In my case, none of this was true.

I'm sure that most of the problem is related to the difference between 32 and 64 bit systems. Not only the CPU, but also the OS and therefore the libraries must match. With my X4100, I was running a 64 bit AMD Opteron. I had installed Solaris running in 64 bit mode and I hadn't made the mistake of downloading the 32 bit version of VirtualBox. Here are some commands you can use to verify your own configuration regarding these issues.

bash-3.00# uname -a
SunOS java3 5.10 Generic_120012-14 i86pc i386 i86pc
bash-3.00# ls VirtualBox*
VirtualBox-1.6.2-SunOS-amd64-r31466.pkg VirtualBox-1.6.2-SunOS_amd64.tar.gz VirtualBoxKern-1.6.2-SunOS-r31466.pkg
bash-3.00# isainfo -k
amd64
bash-3.00# which VirtualBox
/usr/bin/VirtualBox

This looks good: 64 bits everywhere it matters. So why are things still going wrong when you start VirtualBox? No matter what all the other forum messages are saying, in my case it was in the end simply a matter of the linker not finding the 64 bit version of libGL.so. I tried many other things first, but what solved it was setting LD_LIBRARY_PATH to include "/usr/X11/lib/mesa/64".

bash-3.00# VirtualBox
ld.so.1: VirtualBox: fatal: libGL.so: open failed: No such file or directory
Killed
bash-3.00# echo $LD_LIBRARY_PATH
bash-3.00# export LD_LIBRARY_PATH=/lib:/usr/lib/64:/usr/X11/lib/mesa/64
bash-3.00# echo $LD_LIBRARY_PATH
/lib:/usr/lib/64:/usr/X11/lib/mesa/64
bash-3.00# VirtualBox
^C
bash-3.00#

This shows what went wrong and how it can be fixed. (2008-07-26 08:56:19.0)

It's already dark when I leave the hotel, dragging my carry-on behind me.
I can turn left, to the pub where the rest of the troops are probably already behind their second beer, but I decide to make a little detour to the right. I walk to the front of the 18-wheeler to see if Dan, the driver of our Project Blackbox rig, is still around. I find him, with big gloves on, between the power generator and the water chiller. He is working hard to make the Blackbox ready for transport again. We shake hands to say goodbye. I met Dan for the first time in Calgary a month or so ago. Great guy: not only the driver of our Blackbox demo roadshow unit, but also the one who took the most fabulous pictures of the box between the high-rises of Calgary's downtown core. Today we are in Vancouver. Different business drivers, but the same crowd that gets inspired by Project Blackbox and sees how it can open new avenues for datacenter expansion, consolidation and "going green".
After my goodbye to Dan, who's now off to Mexico City, I join my colleagues and then it's off to the airport. Those of you who are "frequent flying" as well know the drill. Empty your pockets, get all your keys and stuff into the grey plastic bin, your laptop in the second bin, your coat in the third, etc. But now it comes.... One of the security folks, I would guess around 60 years old, sees my Sun badge in the bin next to my coins, my keys and phone. He asks me out of the blue, "but is Solaris free?" .... and it is clear he means it in the "free as in beer" sense. It catches me a little off guard, but my reply is "you can just download it, no problem". His counter: "yeah, but do I get source code and am I then able to change it?" I try to assure him with "of course, that is what open source is all about". Next question: "but do I need assembler code to do this?" (now you understand why he was at least 50+ :-). I hope I was correct, but my answer was "no problem, it's all C code, you will be fine".

And this all happened within 20 seconds, five times the speed of an elevator pitch, while at the same time I was emptying my backpack to get my laptop into the grey plastic bin, etc. Time was flying way too fast!! I would have loved to talk with this guy about what project he was working on. He was a really interesting person. His day job is checking our bags for stupid things like bottles of shampoo, but what really interests him is how he could modify and improve Solaris. That's special !! (2007-11-21 00:20:47.0)

A little while back I was preparing for a big benchmark project where our customer wanted to compare a single large system using many zones with a more horizontally scaled infrastructure consisting of a number of smaller servers, like the V490 and V890. I immediately thought that replacing a number of servers that are chatty over the network with a single server carved up into zones would give a big benefit in network performance.
Zone-to-zone network traffic should be faster than server-to-server. So I fired off some emails to people that I thought would give me the final answer, but the responses were very mixed. Therefore it was time to do some of my own experiments. Doing a big benchmark in one of the Sun Solution Centers, I had the availability of some serious hardware for these tests. On the other hand, as is usual with these types of projects, there was a lot going on at the same time, therefore in the end time was limited for this little exercise. This was my test platform:
This is the environment I built:
This provided us with three test scenarios: a) network traffic from one virtual interface to another, both on the same physical interface, b) two zones talking with each other, each with their own physical interface and c) two independent servers, or in this case domains. I used ftp to send files of three different sizes: 1M, 3M and 1G bytes. All files were created in /tmp and sent to /tmp. I repeated each test three times. Here are the results (all times in secs):
So, from this we can see clearly that zone-to-zone traffic doesn't "hit the copper" and probably gets short-circuited somewhere in the IP layer of the TCP/IP stack. I would think that with slower interfaces, like 100 Mbps, the speed advantage will be even higher than the 1.5-2x we see here. (2007-05-11 01:11:03.0)

Last night I finally found the time to upgrade my laptop from a "too much patched" Solaris 10 5/03 to the latest-greatest "Solaris eXpress Developer Edition". Before we dig into wireless, I have to put in a little plug for SXDE. I think it's a great idea. Many users want something on their desktop or laptop that is up-to-date and decently stable, but doesn't have to be as rock-solid as a normal Solaris release. The problem with using standard S10 on a laptop is that drivers can be "way behind". That is normally already fixed in Nevada, but running Nevada on the system that my email depends on is not my cup of tea. Nevada is great, but please on my second system.

SXDE is the sweet spot in-between. Once every 3 months a snapshot is taken of the Nevada code (the bi-weekly release of that is now also called Solaris eXpress, Community Edition, SXCE). The snapshot then goes through a couple of "fixes only, no new features" debug cycles, is bundled with Studio 11 and released to us, the Solaris end-users and developer crowd. I think this is great: it's more stable than S11 Nevada, which is really beta code, and still you get all the latest bug-fixes and drivers.

So, I moved over, and everything went very, very smoothly. I also rebooted a couple of times, started to customize the system, configured NTP, noticed that SXDE knows about my Atheros WiFi chipset, configured that, and all was great. One of the biggest features for me was that it decided NOT to overwrite my MBR. So, even though my system is running RH and XP in parallel to Solaris, I didn't have to do any of that 'grub' stuff to reinstall the Master Boot Record. Cool.....
For whatever reason, I did a reboot and my system hung with an absolutely black screen. I rebooted in FailSafe mode, but couldn't see anything wrong. So I reinstalled from scratch. And again the first half hour all was OK, but then it would hang like hell. I couldn't even ping the box. I guess that in total I reinstalled 4 times over the weekend, got quite a routine for it :-), but finally I figured out what went wrong. As usual it was a combination of a mistake by me and a system that's not foolproof enough.

In this case, my mistake was that when I configured Wireless I told it to "Activate on Boot". Made sense. But I don't have an access point, and was simply testing on what my neighbours provided on the 2.4 GHz band. :-) The problem is that if you click "Activate on Boot" and then, when booting, you don't have a proper access point, the system does not time out properly. At least that is my theory. It simply waits and waits and waits, with the result that the system hangs and you have to reinstall from DVD. I guess that alternatively you can figure out how to reverse that "Activate on Boot" while in FailSafe mode. I kept life simpler and from then on didn't touch that checkbox anymore. Which, so far, works pretty fine. (2007-03-25 23:01:26.0)

The last time I had to set up a printer in Solaris, it was an experience straight out of hell. It was 2-3 years ago, on a system running Solaris 9, and I finally got it working using the Common Unix Printing System (CUPS), but my experience was bad enough that since then I have avoided, as well as I could, getting involved with printer setup ever again. But recently, setting up a Solaris 10 based system to be used as a home PC, I faced the topic again. I read some man pages and did some Googling. After some erroneous first attempts, I checked out docs.sun.com and was pointed to "printmgr", which has improved hugely since two years ago. The printer I had to set up was a cheap HP Deskjet 812C.
And to my surprise, the list of printers preconfigured in printmgr is biiiiggg, also including my little Deskjet. Because this is a printer connected to the parallel port, the device it resides at is "/dev/printers/0". So far so good! Here is what I had to do to set it up.

After this I tested with "lp -d deskjet /etc/nodename" and the textual printout was fine. Then it was time to start Mozilla and print a page with graphics and color. This also worked out-of-the-box.

The last thing to do was to configure the printer in StarOffice. Because StarOffice runs on Windows, Linux, Solaris, OS-X and a couple of other systems, it doesn't make use of the underlying printer subsystem, but has its own. Which is a hassle, but from a software development point of view, I understand why they did it like that. To configure the new printer in StarOffice 8, go to Launch -> Applications -> Office -> Printer Administration. And then I ran out of luck. StarOffice knows about only one HP Deskjet printer and that was of course not the model I had. I still configured it using that driver, and I got printouts, but there were white bands every inch and a couple of other formatting issues. So, that was not the way to go.

Time to pull a golden oldie out of my bag of tricks, one I've used for years with success. When setting up a PC, I always configure an HP LaserJet III and an Apple LaserWriter II printer. The first driver can be used for any printer that uses PCL, while the latter is the lowest common denominator for PostScript based printers. OK, you won't get the use of features like two-sided printing or other paper bins, but for basic printing these two configs are good enough. Back to StarOffice: I selected the driver for the "HP LaserJet III PostScript Plus" and printed a test page. All was fine, including color. Which was a bonus, knowing that the LJ III was a B&W laser printer. (2006-12-30 13:48:31.0)

Solaris install + USB ... a no-no

Last night ...
mmm, more like early this morning :-) ... I was installing OpenSolaris SDX beta (Nevada build #55) and forgot that my external USB drive was still plugged in. This happened because I had downloaded the 4 Gig ISO image onto that drive and then burnt it to a DVD. The install went well, with the only muddy thing being that my bootdisk had become c2d0 and not c0d0, which is what I'm used to. But still: so far, so good. After login I noticed the USB partition being auto-mounted, which is good, and I suddenly understood what had happened. The USB device somehow has a higher priority than the ATA harddisk and therefore the bootdisk becomes c2d0. Which is of course not what you want to happen. You can imagine that when I unmounted the USB disk and rebooted, the system needed some deep hard thinking (read "long timeouts") before it understood where to find its MBR. In short: don't do this!! I took the easy way out and reinstalled everything from scratch, which was not too bad but could have been avoided. Lesson to learn: unplug every USB stick or device before you install an OS. (2006-12-28 20:25:07.0)

adding a network card with Solaris X86

It's the kind of thing you don't have to do very often, because the Operating System install takes care of it so well. Even to the extent that you are tempted to just reinstall the OS when adding some new hardware to your system. In this case I needed to add two 3Com network cards to an Ultra-20 that was already configured for the onboard Ethernet. I know how to do it under Linux: just start the GUI config tool. With Solaris, it's a bit more of a manual process. But in the end it's not too tough, and when you get stuck, Google is your friend. I first checked the Solaris FAQ at www.sun.drydog.com. It was not 100% accurate (probably based on an older Solaris version), but a very good starting point. Manually configuring a network with ifconfig is something I've done often enough.
But the issue for me is that I don't know which device/driver name to use. In Linux this is simple, it's always "eth0", but in Solaris it depends on the driver. After adding the network cards and rebooting I did a PCI scan:

bash-3.00# /usr/X11/bin/scanpci
pci bus 0x0000 cardnum 0x0a function 0x00: vendor 0x10de device 0x0057
 nVidia Corporation CK804 Ethernet Controller
pci bus 0x0001 cardnum 0x09 function 0x00: vendor 0x10b7 device 0x9050
 3Com Corporation 3c905 100BaseTX [Boomerang]
pci bus 0x0001 cardnum 0x0a function 0x00: vendor 0x10b7 device 0x9050
 3Com Corporation 3c905 100BaseTX [Boomerang]

You see the onboard Ethernet Controller and then the two 3Com cards. The important part is the vendor and device numbers. With these two, we now have a look at:

bash-3.00# grep 9050 /etc/driver_aliases
elxl "pci10b7,9050"

This gives us the "elxl" driver name I was looking for. Alternatively, you can have a look at:
bash-3.00# grep 9050 /boot/solaris/devicedb/master
pci10b7,9050 pci10b7,9050 net pci elxl.bef "3Com 3C905-TX Fast Etherlink XL 10/100"

To make sure that Solaris "picks up" the card, you need to do a "touch /reconfigure" and then restart your system with "reboot" or "init 6". The FAQ says that you then have to press 'Esc' during the driver configuration, but that's not the case (anymore). After rebooting, it's time to configure the network interface. First by hand:

bash-3.00# ifconfig elxl0 plumb
bash-3.00# ifconfig elxl0 netmask 255.255.255.0 192.168.1.2
bash-3.00# ifconfig elxl0 up
bash-3.00# ifconfig elxl0
bash-3.00# ping 192.168.1.1
bash-3.00# ifconfig elxl0 down
bash-3.00# ifconfig elxl0 unplumb

And when that works fine, make it permanent (assuming "moon" is the hostname) with:

bash-3.00# echo "moon" > /etc/hostname.elxl0
bash-3.00# echo "192.168.1.2 moon" >> /etc/hosts
bash-3.00# svcadm restart network/physical

(2006-07-24 15:06:18.0)

Having enough Unix blood in my veins (be it Solaris, BSD, Linux, AIX, Ultrix, Xenix, SCO, whatever), when I install Linux on a PC I always configure the CMOS clock to run in GMT. When that system runs Windows too, it's maybe not too wise a choice, but the purist in me likes the Unix method of having a UTC based real-time clock.

Here I have to sidestep for a bit and give air to a personal rant: why on earth (very literally in this case) are calendar systems, especially on PDAs, not able to schedule meetings in a different timezone than the one you are in? And in addition keep track of the timezone you're in? Even if you're not a road warrior, con-calls are attended from many zones. It gets worse: let's say you are in EST when you schedule the con-call, and you attend after travelling from EST to CST. By now your calendar has become a complete mess. That's why PDAs should use the Unix method for storing dates and times. And the applications of course need to change.
Back to our notebook with that combo of Windows, Linux and Solaris. With two-out-of-three winning here :-), I always configure the real-time clock to run in UTC. Of course you have to be careful that Windows isn't syncing with an NTP server. Still I had problems with the time on my system and I must admit that I blamed that fully on Windows. Which is not true ... but also true.
Investigating this a bit more systematically and reading the appropriate man pages, I discovered that Solaris X86 assumes your real-time clock runs in local time. This is because Solaris also assumes that it has to co-exist with Windows. If you read the manpage for /usr/sbin/rtc and look in the file /etc/rtc_config, you will see that it stores the number of seconds between the local clock and UTC. That way, while the CMOS clock runs in local time, the Unix kernel can run in UTC, and the normal Unix TZ mechanism is then used to display date/time info in local time again. In my opinion this is really Taking The Long Way Home, but that's how it is.
#
# This file (/etc/rtc_config) contains information used to manage the
# x86 real time clock hardware. The hardware is kept in the machine's
# local time for compatibility with other x86 operating systems. This
# file is read by the kernel at boot time. It is set and updated by
# the /usr/sbin/rtc command. The 'zone_info' field designates the local
# time zone. The 'zone_lag' field indicates the number of seconds
# between local time and Greenwich Mean Time.
#
zone_info=Canada/Mountain
zone_lag=21600
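To make the zone_lag value a bit more tangible, here is a small sketch that reads it from a file in the format shown above and prints it in hours. The function name rtc_lag_hours is made up for this example, and the only thing assumed about the file is a line of the form zone_lag=<seconds>.

```shell
#!/bin/bash
# rtc_lag_hours FILE
# Read the zone_lag value (in seconds) from an rtc_config-style file and
# print it as whole hours. rtc_lag_hours is a hypothetical helper; zones
# with a half-hour offset would be truncated by the integer division.
rtc_lag_hours() {
    lag=$(sed -n 's/^zone_lag=//p' "$1")
    # Fail if the file has no zone_lag line at all.
    [ -n "$lag" ] || return 1
    echo $(( lag / 3600 ))
}
```

For the file above, "rtc_lag_hours /etc/rtc_config" prints 6, since 21600 / 3600 = 6.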
I didn't test it, but I presume this gets updated on-the-fly at the start and end of Daylight Saving Time. The bottom line is that, whether you like it or not, the CMOS clock must run in local time and you must configure your Linux installs accordingly. (2006-05-11 16:41:55.0)
At home, I'm using those little black Shuttle SN85G4 systems. One as my desktop and one for a MythTV project under way, which has to be ready by the time the soccer championship hits us in June. These Shuttles are running AMD64 (socket 754, no Opterons :-). They're small, pretty silent and, because Shuttle is coming up with new stuff, you get them pretty cheap, if you're still able to find them. This weekend I was installing Solaris 10 and stumbled on the nForce 3 chipset not being supported. Also no download available from NVIDIA (for Linux yes, not for Solaris). That gives you a system with no working NIC, and that's no good. Checking the HCL, they clearly took the "easy way out" by installing a separate PCI network card, by-passing the onboard NIC.

A little browsing brought me to Masa Murayama's website, which offers a whole set of unusual and therefore very useful NIC drivers. Among those is one called "nfo" for the nForce chipset. Masa labels the driver alpha-code, but I've had no problems. If you're going to use this driver, here are some tips. The tarball contains the source, but also ready-to-go modules for i386 and amd64. You can try to build, but if possible, better don't. Forget about "make", jump straight to "make install". I'm saying this because building with gcc needed a little hacking and compiling with Studio 11 wasn't successful at all. Next, follow his README carefully. It includes an "install", then an "uninstall" and then again an "install". This sounds weird, but if you follow it strictly, it all works. If you try to take shortcuts, like I did :-), you better know what you're doing. I didn't :-).

I stumbled on one problem. Part of the install is running a script called "adddrv.sh". In my case that gave a message about a library already being installed. I took that as a warning, but it really was an error. You have to solve that (by commenting out a line in the script, in my case two lines) before you can continue.
If you ignore it, you're in trouble: it won't work. When I later emailed Masa Murayama about this, he mentioned that the likely cause was an nForce GbE driver recently added to Solaris. I guess he is right about that, but for me he is "1 - 0" ahead of the game (we're back to soccer here :), because the standard install didn't give me any networking at all and his "nfo" driver is doing fine. (2006-05-09 11:52:35.0)

For quite a while I've played with Solaris 10 Zones. I've even made very good use of them for isolating development projects in a sandbox environment. And so far, all I needed could be done with a simple 'zonecfg', 'zoneadm install', 'zoneadm boot' and then 'zlogin'. Recently I started to dig a little deeper and ran into problems with networking between local zones. These were partly caused by not enough RTFM :-) and partly by real issues. The real problems mainly came from trying this on a system running Nevada with the BrandZ stuff BFU-ed on top of it. Some piece in ZFS broke a dependency in my SMF.

This article is about where I went "off the right track" while setting up my zones. What was obvious and what wasn't. Often because somewhere I read something that was dead wrong (yes, also on blogs :-) or at least incomplete. Because I'm a Letterman fan, I've decided to do this in a Top Ten format. As with the Late Show, don't pay too much attention to the ranking.
That's it for now. Happy hacking and have fun with your zones ... (2006-04-28 14:01:36.0)

Unix shell programming appears to be an art, not a science. That's probably also why I'm not so much of a Perl fan. Take note, I do like both scripting and art a lot, but when it comes to Computer Science (mind that last word :) I think computer languages should be very unambiguous and easy to read. If that makes a compiler's life a little harder or means a bit more typing for the programmer, so be it.... But this story is not about bashing Perl, it's about unexpected problems with simple shell scripting.

I was installing W3C's libwww package to get going with some XML-RPC development. It was the typical process: "./configure" followed by "gmake", nothing fancy. The configure step went fine, but while gmaking, it hiccuped severely. I traced it back to some "test" commands in the libtool script, where the condition of 'test' had an empty variable. Something like:
#!/bin/sh
something=""
if test $something = "yes"
then
echo "something is yes"
fi
The same happens when the something="" statement is absent altogether. In both cases, the result is an error at the "if test" line. The cause of all this trouble is that for empty or undefined variables there is a difference between

if test $something = "yes"; then

and

if test "$something" = "yes"; then

Mind the additional quotes. Without them, an empty variable expands to nothing at all and 'test' is left with a missing argument. The libwww package is missing those quotes in a couple of places, which appears to be fine when the variable has a value, but not when it is empty or undefined.

Back to art vs science. That things like this are illegal syntax I'm OK with. But it should be made clear up front. Preferably as a compile error, although I realize that for scripting languages that is not too applicable. I had hoped for an error like "line 13: illegal syntax" or something similar. But that's not what I got:

./test.sh: test: argument expected

So it took me 10 mins and many more echo statements to find out which of all those test commands in the 5000+ line script was the culprit. After finding it, of course the problem repeated itself, quite a few times. It took a while to get this script fixed, but don't worry, I did ... :-)

Later, when redoing the install on RedHat, I discovered that RH8 has the same issue with the test command, but it states:

./test.sh: line 13: test: =: unary operator expected

The message is even more cryptic, but it has the big advantage of giving you a line number, which would have saved me many echo statements and a couple of hours. It's the small things that matter :-).

With all these problems out of the way, I could dig into XMLRPC-C to build an interface between a Windows client and a Solaris backend. So far, I find XML-RPC to be an elegant protocol, which results in quick implementations and simple solutions. (2006-04-16 20:58:13.0)
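As a footnote to the quoting problem in this entry, here is a minimal sketch of the safe form wrapped in a small function. The name is_yes is made up for illustration; it is not something from libwww or libtool.

```shell
#!/bin/sh
# is_yes VALUE
# Succeed only when VALUE is exactly "yes". The quotes around "$1" make
# sure 'test' always receives three arguments, even when the value is
# empty. is_yes is a hypothetical helper used only for this example.
is_yes() {
    test "$1" = "yes"
}

something=""
if is_yes "$something"; then
    echo "something is yes"
else
    echo "something is not yes"
fi
```

With the empty variable this prints "something is not yes" instead of failing with an error from 'test'.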