Thursday Apr 28, 2005

There's been a bit of discussion (ok, heaps!) recently about whether OpenSolaris exists, whether Sun really is going to open-source Solaris, and so on. One member of the OpenSolaris pilot program saw a tendency for those who only ever post negative comments about Sun, Solaris and OpenSolaris to fall back to the "perception is reality" crutch.

How about a short example of the absurdity of this: one poster's perception was that Solaris 10 was not available, for free, from Sun's website. Another poster wanted Solaris 10 to be like RedHat AS. (Although why that was desired I don't know.) Unifying these two theories we get:

Proposition 1: Perception is Reality
Lemma 1: Solaris 10 does not exist
Proposition 2: Solaris 10 is like RedHat AS
Conclusion: RedHat AS does not exist.

Those of you who remember watching the BBC series 'Yes Minister' and 'Yes Prime Minister' will no doubt recall the Politician's Fallacy:

We must do something.
This is something.
We must do this.

So I'm really sorry but I'm going to have to invoke the Monty Python and the Holy Grail "is she a witch?" reference at this point so you know how idiotic it all is.

The OpenSolaris pilot is getting very close to bursting out into the open (I don't know when, but it's coming soon), and the activity is increasing day by day. The official pronouncements talk about Q2 of CY2005 (ie, between the start of April and the end of June 2005), and there is no doubt in my mind that we're going to hit that target. I'll keep you posted....

Friday Apr 22, 2005

I also picked up Ben Folds' latest album (including NTSC dvd ;< ) called Songs for Silverman and it too is on high rotation right now. It's a disc that builds on his previous solo work more so (in my opinion) than when he was part of Ben Folds Five.

One thing about this album that annoys me somewhat is the inclusion of an NTSC dvd rather than a PAL dvd. There's a wealth of good reasons why NTSC (aka Never The Same Colour --- try googling for "ntsc never the same color" and hit "I'm feeling lucky") should not be used, starting with the fact that Australia (where I live) just doesn't use NTSC. Makes me think that this is either a straight import from the US (unlikely given the price), or that the local distributor was unwilling to do any thinking about what was being packaged up. Much more likely...

Still, the music displays Ben Folds' expansion and development into what might be called larger works. There are a variety of other instruments, some other people providing backing vocals (including his wife Frally), and some complex rhythms and harmonies. And (I've just noticed after listening to it 10 times) there are little interlude bits too --- quite soft background-style music with muted keyboard, drums and (I think) string bass. Interesting!

I really like this album. Now I just need a few spare cycles to sit down and look at the DVD as well.

Actually, there isn't any background-style music on this album --- just The Cat Empire's website hijacking my /dev/audio. Hmmmm not very nice!

I don't really know how to describe The Cat Empire - are they hip-hop, folk, funk, jazz, disco (!) or what? They seem to be a mix of all sorts except possibly thrash, metal and hard rock. So listening to them is quite enjoyable.

I picked up their latest album Two Shoes a few days ago and as is my habit it's now on high rotation. They've got a nasty habit though with their website of hijacking one's /dev/audio and playing a track from their album.

This is rude and obnoxious behaviour for a website.

If I want to listen to their music, I will do so when I choose to. I do not want to have it forced on me.

Anyway, the group is touring Australia at the moment. I didn't go (and haven't gone) to their shows at the Enmore Theatre - I figured that their sort of music is better suited to a place where you can get up and dance (maybe even mosh a bit). The Enmore Theatre is very much a sit down sort of place. Now maybe if they were appearing at The Metro instead....

Thursday Apr 21, 2005

On Tuesday I bfu'd to the latest nightly build of Solaris next so I could take advantage of the boot re-architecture project integration. This went quite well except that I managed to corrupt my boot-archive through not paying attention at the right time and forgetting a step.... grrr. Once I'd fixed that problem (boot cdrom -s, mount -F ufs -o rw,logging /dev/dsk/c0d0s0 /mnt ; /mnt/sbin/bootadm update-archive -R /mnt ; sync ; umount /mnt ; reboot) I felt confident enough to go to the next stage, booting the competition's OS on my laptop.

I figured I should boot it to see what it thought was going on. That was ok, but running partition magic was when things went downhill fast. PM decided that my partition table had errors, and would I like it to fix them? I was really stupid at this point, and clicked yes.

BAD mistake.

Not only could I not boot back to MS-Windows, but I was unable to boot Solaris either...

Fortunately my desktop Solaris box was unaffected, so with a bit of digging I was able to find the System Rescue CD iso, pull it down, burn it and boot from it. That was great, but sfdisk and cfdisk both told me I had a bodgy partition table (duh! I knew that already!) and refused to help. By this point I was getting quite frantic, and googled again and again, eventually coming up with a hit on gpart.

I am very pleased to say that gpart saved my laptop. It was included on the linux System Rescue CD as /usr/bin/gpart.

Gpart has a scan option where it looks at where your partition table should be, and tries to interpret the data which it finds. I used this first, and wrote down exactly what it produced. Fortunately for me it matched what I remembered of my disk layout, so I re-ran it with the "-W" option to write the corrected partition table to disk.

Then deep breaths, sync, sync, sync, reboot..... grub menu.... YAY!!! I'm back to life!

Of course MS-Windows still won't boot properly -- gets to a certain point and hard-hangs, or just reboots the laptop entirely.... but that's a topic for another day.

Now I'm doing another backup of my data to a workstation in the office..... because you never know.

I'm also emailing the author of gpart to thank him for his utility, and request that he enhance the list of known partition types to include Solaris2 (== 0xbf by the way) which is what Solaris10 installations use now.

Monday Apr 18, 2005

Just checked in to see what Slashdot has on the front page, and what should I find but a discussion about how embedding a GPL'd font in your document could make your document covered by the GPL. How crazy is that?!! I know that there is a specific special case section for using fonts in a GPL addendum whereby you can add the relevant license notice to each file in the package. There's also a fairly good GPL Licensing FAQ and there is even a GNU Free Documentation License. What I do not like is that for such important issues as this there is no clear cut, well-defined black and white reference to point to which dispels all the myths and bogies about the license. Imagine if I wrote some document containing a serious trade secret or potentially libellous allegation and used a GPL'd font to do so. If I believed what I read in the slashdot header then I would have to make that document available to whoever requested a copy..... not a good situation to be in! But after I re-read the GPL FAQ I was reassured --- once again slashdot was indulging in a licensing beat-up. It was interesting to see a beat-up on an aspect of the GPL though. Not sure when that happened last..... And by the way, I only view slashdot for the links to real articles.... ;)
Every now and then we get asked to provide continuous effort on an escalation, where the desired output is a determination of Root Cause. Bearing in mind that Continuous Effort (CE) requires follow-the-sun handoff, and that Root Cause can require a large period of time to determine (ever tried tracking a Heisenbug?), there is only one case that I can think of that was truly justified as Continuous Effort Root Cause Analysis.

Did you watch the movie Apollo 13? The engineers and rocket scientists at NASA used Kepner-Tregoe's Analytical Troubleshooting methodology to determine the root cause of why they had suffered power loss and a consequent air leak from the spacecraft. They also worked out how to work around the problem (providing "customer relief").

Next time your boss is pushing to work around the clock to provide root cause on a problem, remember to keep it all in perspective --- is this going to save lives? The problem may certainly be mission- or business-critical --- I've worked on plenty of those! --- but if you ain't gonna save a life with it....

You can read more about Kepner-Tregoe's Analytical Troubleshooting methodology in their book The New Rational Manager. Kepner-Tregoe run frequent courses on rational processes, and at Sun we make use of those rational processes on a daily basis all over the company.

Saturday Apr 16, 2005

Firstly let me emphasize that these comments are totally personal, in no way reflect company policy or positions, and are not endorsed by Sun Microsystems. I am not an officer of the company and I have no access to financial statements --- I depend on Sun's official announcements just like the rest of the world.

Right, with that out of the way, I've got two bones to pick about the earnings report and the stock market commentary about it. The first thing that we see from stock market analysts is "Sun should retrench more employees to return to profitability faster." We see this sort of commentary from Forbes, First Albany and Goldman Sachs time and time again. Do analysts ever stop to think about who or what actually produces the products, services and solutions that a company like Sun (or IBM for that matter) sells? Sun's headcount is down about 25% since the height of the tech bubble. Sun still has a product line which doesn't include Microsoft Windows (yay!), and Sun is still producing basic research, new hardware and software products, and making increasing money from providing services. IBM Global Services has shown the world that one way of providing services is to throw people at a problem. Sun doesn't have nearly as many people as IBM GS and yet still manages to get increasing profit from providing services. So why should headcount be cut again? Does it worry any analyst out there that maybe if Sun kept cutting there wouldn't be any sales force to sell products which the R&D teams create? And that by cutting services employees there would be nobody to provide services which the now non-existent sales force had sold? When will the analysts start to apply some logic to their pronouncements? Probably never, unfortunately.

The other bone is that Sun is in the midst of a transition --- we're re-inventing who we are and what we do. I for one am whole-heartedly in favour of this transition. We used to be a company which (in my opinion at least) was focused on sparc hardware to the detriment of everything else. Now we're not. In fact, we haven't been that for about 2 years now; it just seems like the analysts out there have pigeon-holed Sun and don't want to think about changing their minds. Sure, we haven't sold many LX50s or v60x/v65x boxes but so what? They filled a gap in the product line while we got up to speed with Opteron-based systems. I wonder whether any of the analysts have bothered to do any research on just how good these Opteron-based boxes are, or what customers are doing with them. Probably not, because that would mean that the existing pigeon-hole needs re-evaluating. We see comments like this from Rob Enderle where he claims that
Sun still looks like a company in search of a meaningful strategy
Perhaps Rob Enderle hasn't been able to attend any of the analyst con-calls or conferences, or even manage to read the investor information page. Perhaps he (and all the others accusing Sun of being incoherent) doesn't understand what sort of effort it takes to turn a company from being hardware-focused to systems-focused.

What I'd really, really, really like to know is what the analysts think will be a meaningful strategy for Sun. We've never done what the market thought was the right thing: still don't have MS-Windows on the pricelist, still haven't ditched sparc, still haven't ditched Solaris in favour of linux, still haven't cut 50% of the workforce and still haven't cut R&D down to Dell's levels. And despite not doing what the analysts tell us we should, we've managed to break even, improve our result by nearly a quarter of a billion USD over a year ago, and become the market leader in systems based on AMD's Opteron processors.

So please, stop bagging Sun over strategy, and over only breaking even on a GAAP basis. Do some actual research into the company rather than just bagging Sun because that's what you always do. It gets really boring to get blasted with the same old hot air all the time. We're at a tipping point in the technology cycle and you'll just have to take it from us --- Sun has a viable short- and long-term strategy, it works and it will bring in the dollars.

Friday Apr 15, 2005

El Reg has found a posting from Linus on the BitKeeper vs Andrew Tridgell bunfight. It makes for interesting reading. I was particularly interested to read this paragraph:
...But that's not what Tridge did. He didn't write a "better SCM than BK". He didn't even try - it wasn't his goal. He just wanted to see what the protocols and data was, without actually producing any replacement for the (inevitable) problems he caused and knew about...
I find these interesting for two reasons. Firstly, Tridge seems to have been engaging in some research. This is something which most other people would find quite laudable. Research is also what drives innovation --- you can't improve upon something unless you know at least a bit (or a lot!) about that something. Secondly, Linus' email leaves a lot open to interpretation. On my first reading of that paragraph I was left with the impression that Linus is accusing Tridge of creating data/metadata problems (what other people might call data corruption) within the linux kernel repository. Clearly a bad thing to be implying. I had to read the paragraph a few times before it occurred to me that Linus was probably only talking about Larry pulling the license. Then later in the email is this tidbit:
I'll write my own kernel source tracking tool because I can't use the best any more.
If we take Donald E. Knuth as any sort of reliable guideline on diversions like this, then there won't be any more innovations coming from Linus involving the linux kernel, because he'll be spending all his time and effort designing, debugging and generally re-inventing the source code management wheel. And finally, this bunfight has made it into the mainstream media. I like Sam Varghese's final paragraph:
All that this incident has done is to bring to the fore the fact that free software and open source software are definitely not one and the same thing and that compromises made at one point could well come back to bite those who make them.

Thursday Apr 14, 2005

A colleague recently pointed me to Hitachi Global Storage Technologies' lame-as flash animation (hey, we're all geeks, right?) entitled Get Perpendicular. It's all about how disk drive manufacturers can provide us with a phenomenal increase in data capacity by making the bits stand up rather than lie down. According to El Reg we should start to see perpendicular storage available in 2.5in form-factor drives by the end of this year. If you think about the claimed 40Gb per platter capacity then it's no great stretch to imagine having a 4Tb drive in your laptop. I thought I was doing well when I got my new laptop which came with a 60Gb 7200rpm disk. Now imagine a rack full of something similar to these or even these beasties and you start to see why we need something like ZFS to manage it all.
Alan pointed me to this article --- by SJVN no less --- entitled "SCO Gives Sun Blessings to Open-Source Solaris". Isn't this a bit like getting a blessing from the anti-Christ? It's well-known that SCO and Linux don't get along, and there's certainly no love from the various Linux communities towards Sun on a number of fronts. No doubt the usual people will froth at the mouth at how this allegedly "proves" that Sun is out to kill Linux. Friends, foes, citizens of the world, if you think that then you are sadly mistaken. A quick check of Sun's Operating Systems link under Products and Solutions/Software shows that Linux is even the top link in the table, with the comment: "Sun brings a comprehensive systems approach to Linux. Sun provides Java technology, x86-based hardware, Red Hat Enterprise Linux, and SUSE Linux Enterprise Server along with Sun's Java Enterprise System and Sun Java Desktop System -- all supported by Sun services."

Side note: the PTS engineer on the other side of my cubicle wall runs linux (debian if I recall) and most definitely supports linux. A lot of my colleagues just here in this office have a box running linux somewhere in their system menageries too. You can't get better than that for putting your money where your mouth is.

Wednesday Apr 13, 2005

In the lecture today we covered arrays (and pointers) in C. About time too I thought. By way of an example the prepackaged lecture slides from Hanly and Koffman's C Program Design for Engineers, 2nd Ed showed how to use a character array as a string:
char b[] = ``Ned Flanders";
... is stored in memory as ...
`N' `e' `d' ` ` `F` `l' `a' `n' `d' `e' `r' `s' `\0'
How many of you would expect that an example provided in a textbook on the C language would be legal code in C? My lecturer didn't appear to think so, which I think is outrageous and depressing at the same time. How are my fellow students supposed to learn the language when the examples presented are wrong? I've learnt a few things over the years when presenting information for others or teaching an SGR class. One of those things is that your examples must be correct --- in a language class, you must be able to compile the example! Five slides further on was a multi-dimensional char array example with not a single correct element. Folks, if you're going to display character constants in C, use the single quote or apostrophe ('). Using a "backtick" or ` character will not work. At all. Ever. If you are a unix sysadmin or perl programmer, you'll know the importance of the backtick. Back when I was a sysadmin (before I joined Sun) I used to read the book reviews on www.perl.org. I remember quite vividly a reviewer shredding what was otherwise a decent book because the font used for printing did not have a correct backtick glyph: at least half of the example code looked wrong on the page and was useless as a teaching or reference example.
I was intrigued to see that there was a referrer from google to my blog: "bitbucket in LINUX driver". I've been meaning to write a bitbucket driver for some time now, along with a showstopper module. When I started work in what was then called CPRE, I helped some of our Bay Area colleagues with beta testing various backup products. One day I was staggered to find out that a bug I had logged was considered a showstopper, and was preventing the other vendor from releasing the product for any OS at all. Another member of our group slightly misheard me, and asked me how my fixes to the showstopper module were going.... It's been a running gag ever since.

Thursday Apr 07, 2005

One thing that working on this first programming assignment has made me realise is that nowhere in the course structure for my Computer Systems Engineering degree @ UTS is there a subject on algorithms. Now Chris, one of PTS' senior senior engineering staff who happens to be based in the Sydney office, mentioned the other day that when he did his degree (cue the Four Yorkshiremen skit) all the programming courses were about algorithms first and languages second. So of course when we've all been sitting around at lunch time challenging each other to find better ways to find perfect numbers, Chris has looked at what Nathan and I did, and then ripped the guts out and replaced them with better and better algorithms. And I do mean, better and better --- both in terms of cpu usage and memory usage. I remember being fairly good at working out good algorithms in my first degree but being terrible at coding them. Now I think the balance has turned somewhat --- my coding is much better and I'm re-learning how to develop algorithms. A few months ago I had a chance to fix a bug in the mpt(7D) driver (this is the on-board scsi controller in the v20z, v40z and v440). The problem manifested itself on a v20z with Solaris 9: handling an interrupt while handling an interrupt blew the kernel thread stack away because of the way we were doing scatter-gather list allocation for each scsi command structure. I re-designed the way we do this allocation to use an as-needed allocation of exactly as much kernel memory as is needed for that particular scsi command. Now while it took me a few goes to get all the necessary pieces re-written, it was tremendously satisfying to know that I personally had figured out how to do this, and it was efficient, it was elegant and it was mine. 
Now if I'd been tasked with fixing that bug straight out of uni first time around, it probably would have taken me twice as long and not been nearly as elegant --- because I didn't have the exposure to good code and good algorithms. But with a few years (ok, 10+) behind me, it was actually quite easy. Now what I need to do is try to pass this experience on to my colleagues and the young-uns who I'm studying with. That is going to be the challenge!
When I wrote earlier this week about amd64 compiler performance, the testing was all done with unsigned ints. I figured for a laugh this evening that I'd see what I got by using unsigned longs instead.

It's interesting. In both cases below (testing the first million natural numbers) the compiler options differ only by -xarch=amd64. The consistent options were -fast -xlibmil -xlibmopt.

Data Model    Wallclock time
32bit         12.81sec
64bit         21.74sec

This I find rather strange. Why is it that longs in 64bit mode make execution nearly 9 seconds longer?

Wednesday Apr 06, 2005

This is something which I stumbled over last year: with the move to SMF (Service Management Facility), just adding a line to /etc/inet/inetd.conf for a new service such as CVS doesn't work any more. This manifest will let you run CVS as a service on your Solaris 10/Express machine. Save it as /var/svc/manifest/network/cvspserver-tcp.xml and ensure you've got the line

cvspserver 2401/tcp #cvs pserver process

in your /etc/services, edit the exec_method to suit your site, then run

# svccfg import /var/svc/manifest/network/cvspserver-tcp.xml
# svcadm disable svc:/network/cvspserver/tcp:default
# svcadm enable svc:/network/cvspserver/tcp:default

and be on your merry way. (Note that svccfg handles the import, while enabling and disabling a service is done with svcadm.) You should check the manpages for inetconv(1M) and smf(5), and the docs.sun.com entries in the Solaris 10 System Administrator Collection for more information.

If you really want to get stuck into SMF, then check out these resources at the bigadmin site:

SMF hits on bigadmin
Sun Microsystems - BigAdmin: Solaris Service Management Facility - Service Developer Introduction
Sun Microsystems - BigAdmin: Solaris Service Management Facility - Quickstart Guide
BigAdmin Feature Article: Solaris 10 OS Feature Spotlight: Predictive Self-Healing

This blog copyright 2009 by jmcp