Thursday June 21, 2007 Just noticed an interview with Greg Papadopoulos, Sun's CTO, in the Technology section of the BBC News website. The interview is about Greg's views of future technology trends. He makes some interesting points about mobile phones, PCs, the inexorable rise of the network, and kitchen utensils :-)
Thursday June 14, 2007 I had to attend a meeting in Manchester on Tuesday afternoon, so while I was there I took the chance to drop in on Baby, or as it is more properly known, the Small-Scale Experimental Machine, or SSEM for short. Baby was the world's first stored-program electronic digital computer, running its first program on June 21 1948. Now before all you US citizens start getting all uppity and start telling me that ENIAC first ran two years earlier in 1946, the unique and ground-breaking feature of Baby is that the program which it ran was stored electronically and in the same store as the data, making it the first machine with a von Neumann architecture - Baby is therefore the ancestor of every modern computer. ENIAC used decimal arithmetic and was hard-wired - changing the program required a lady with a pair of pliers. And anyway, ENIAC was beaten by the British Colossus (1943), which was both binary and electronic, but not Turing-complete, and in turn Colossus was beaten by the German Z3 machine (1941) which was Turing complete and used binary arithmetic, but was electromechanical.
Right, now we've got the historical arguments sorted, on to the machine itself. The machine was constructed at the University of Manchester by Frederic Williams, Tom Kilburn and Geoff Tootill. During WWII, Williams and Kilburn had both worked at the Telecommunications Research Establishment, which was a cover organisation set up to do work on radar.
The ground-breaking feature of Baby was the Williams-Kilburn tube. This was a cathode ray tube which formed the memory of the machine. Williams and Kilburn had extensive experience of CRTs from their work developing WWII radar equipment, and in fact the tubes they used in Baby were from radar equipment. The glass surface of the tube was used as the store - the electron beam was used to write a grid of charged spots on the glass, with different charge levels representing 1 and 0. The values were read off via a metal plate on the outside of the tube. The charges leaked away over time, so the contents had to be refreshed on a regular basis, in a very similar way to modern DRAM. Baby used a 32 x 32 pattern of dots, giving a memory capacity of 32 words each of 32 bits. Williams-Kilburn tubes provided random access to the stored data, unlike the alternative technology of the time, the mercury delay line, which only provided serial access. Baby actually used four tubes, one as the main store (memory), one as the Accumulator, another to hold the address of the current Instruction (CI - Control Instruction) and the instruction itself (PI - Present Instruction). The final tube was used to mirror the contents of the other three, the tube displayed being switchable. The display tube was necessary because the other three tubes had a pick-up plate on the front, and they were also heavily shielded to prevent electrical interference.
The instruction set used bits 0-12 of a 32-bit word to hold the target address, and bits 13-15 to define the operation. The instruction set was very simple with just 7 instructions:
Note there is no addition operation - you can implement addition using subtraction and negation, but you can't do the inverse. The reason for the limited memory and instruction set is that Baby was intended purely as a technology test-bed. The work carried out fed directly into later machines such as the Manchester Mark 1 and the Ferranti Mark 1. In 1995 the decision was taken to build a complete replica of Baby in time for the 50th anniversary celebrations in 1998 and this effort was successful, with the resulting replica being housed in the Museum of Science and Industry in Manchester. The machine is powered up every Tuesday, hence my visit.
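To make the instruction layout and the addition trick described above a bit more concrete, here is a rough JavaScript sketch. It assumes that bit 0 is the least significant bit of the word (the description above doesn't say), and the function names are made up for illustration rather than taken from any SSEM documentation or simulator.

```javascript
// Split a 32-bit instruction word into its two fields.
// Assumption: bit 0 is the least significant bit of the word.
function decodeInstruction(word) {
  return {
    address: word & 0x1fff,      // bits 0-12: target address
    opcode: (word >>> 13) & 0x7, // bits 13-15: operation
  };
}

// Addition built only from negation and subtraction: a + b = -((-a) - b).
// "| 0" wraps each intermediate result to a signed 32-bit value,
// mimicking a 32-bit accumulator.
function addUsingSubtract(a, b) {
  const negatedA = -a | 0;         // negate a
  const diff = (negatedA - b) | 0; // subtract b, giving -(a + b)
  return -diff | 0;                // negate again to recover a + b
}
```

The reverse doesn't work: with only addition available there is no way to manufacture a negative value from positive inputs, which is why subtraction plus negation is the more useful minimal pair.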
The picture on the left shows the entire machine. The rack on the far left contains the power supplies, the rack on the far right contains the storage CRTs. In the centre is the panel containing the controls, switches and display. The remaining racks contain the machine's logic circuits. The photo on the right shows a close-up view of the control rack. At the top is the display CRT - if you look carefully you can see that a program is running: an animation of a ship is visible at the bottom of the tube, sailing from left to right. The knobs to the right of the tube allow you to select the tube which is being displayed. Immediately below is the 'typewriter' which is used to input the program; each switch corresponds to one bit. There are 40 switches, so 8 of them are unused. Below that is a panel containing switches to select the line number in the store to modify (top) and switches to select the Function (opcode) to be executed (bottom). Finally, at the bottom of the picture is a row of control switches used to clear, load, run and stop the program. Click on either picture for a bigger view.
Great pains have been taken with historical accuracy - the people who worked on the original machine were consulted wherever possible, and many hours were spent poring over original notebooks and old photos, identifying the exact placement of components and labelling. The same components as were used in the original machine were used throughout. Even the racks are authentic, even though some of them had to be rescued from someone's garden! The racks are standard Post Office ones with 19 inch mountings - anyone recognise that dimension? ;-) They even went as far as replicating the numbering on the bottom of the racks, although nobody knew what the numbers were, and the chances were that the racks were second-hand in the original machine. My favourite touch is that in the original machine the display CRT (top of the right-hand picture) was propped up on a cardboard valve box as the cutout in the panel was too big - that's been replicated too!
The first program to run on the machine was written by Tom Kilburn, and calculated the highest factor of 2^18. One of the other early programs written for the machine was a long division routine written by Alan Turing who was working at the National Physical Laboratory at the time, but who was shortly to move to Manchester University.
I asked how reliable the replica was, and remarkably it seems to be very reliable, unlike the original version. It seems that the main reason is that the valves in the replica are mostly 1950s and 1960s vintage, and manufacturing techniques improved rapidly in the period after the war. The machine is run every week, and in the last 10 years only 3-4 valves have failed. More problematic are the old wire-wound resistors, which tend to fail more often.
To coincide with the 50th anniversary there was a programming competition. As a result several simulators are available, along with a programmer's reference manual and example programs. My favourite emulator is David Sharp's, which has a graphical interface that is similar to the real machine. It is written in Java, and I've been working with David to produce a Webstart version (not available yet) as well as trying to improve the visual accuracy of the simulator.
It was fascinating to read some of the documents and papers that have been written about the machine, and to see the machine actually running. Many of the concepts and techniques we still use today such as dynamically refreshed memory and relative jumps first appeared in this machine and it is amazing when you realise just how architecturally similar the SSEM is to modern machines, despite its appearance. I knew that Manchester had played a part in the early history of computing, but until I spent some time reading about the SSEM I hadn't realised just how pivotal that contribution had been.
Tuesday May 29, 2007 A rare chance to watch (and even contribute to!) the birth of a new programming language - lolcode.com. 1337!
Posted by alanbur ( May 29 2007, 06:15:31 PM BST )
Tuesday January 16, 2007 Just seen this link posted internally: the rather windily-named Study on the Economic impact of open source software on innovation and the competitiveness of the Information and Communication Technologies (ICT) sector in the EU. To cut to the chase, Sun is acknowledged as being the number one contributor to Open Source, outstripping the second contributor (IBM) by nearly 3½ times. I've reproduced the relevant table below:
Table 5: Cost estimate for FLOSS code contributed by firms
Total contribution from firms
| Number of firms | 986 |
| Source lines of code | 31.2 million |
| Estimated effort | 16444 person years |
| Estimated cost | 1.2 billion Euro |
Top contributors
| Rank | Name | Person-months | Cost (mil euro) |
| 1 | sun microsystems inc. | 51372 | 312 |
| 2 | ibm corp. | 14865 | 90 |
| 3 | red hat corp. | 9748 | 59 |
| 4 | silicon graphics corp. | 7736 | 47 |
| 5 | sap ag | 7493 | 46 |
| 6 | mysql ab | 5747 | 35 |
| 7 | netscape communications corp. | 5249 | 32 |
| 8 | ximian inc. | 4985 | 30 |
| 9 | realnetworks inc. | 4412 | 27 |
| 10 | at&t | 4286 | 26 |
And that's before the recent OpenJDK announcement!
Posted by alanbur ( Jan 16 2007, 08:31:23 PM GMT )
Friday December 01, 2006 One of the things that was on my to-do list after setting up the Meninos do Morumbi Oldham website was to do something on the reporting front with the server log files. I'd already set up Tomcat to generate combined log format files by putting this in the server.xml file:
<Valve className="org.apache.catalina.valves.AccessLogValve"
directory="logs"
prefix="access."
suffix=".log"
pattern="combined"
resolveHosts="false" />
so I had the raw data I needed, I just needed to do something with it. In the past I've used AWStats to do log file reporting, but it is written in perl and therefore needs a CGI-bin setup. This is easily done if you are running Apache, but I'm running stand-alone Tomcat, and although you can run CGI stuff under Tomcat, it isn't really recommended.
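For anyone who hasn't looked inside one of these files, each line of a combined-format log holds the client host, identity, user, timestamp, request line, status, byte count, referer and user agent. Below is a minimal JavaScript sketch of pulling those fields apart. The sample line is made up for illustration, the function name is hypothetical, and the regular expression doesn't try to handle escaped quotes inside the quoted fields, so treat it as a starting point rather than a robust parser.

```javascript
// Hypothetical sample line in combined log format (not taken from a real log).
const sampleLine =
  '192.0.2.1 - - [01/Dec/2006:10:15:32 +0000] "GET /index.html HTTP/1.1" ' +
  '200 5120 "http://example.com/start" "Mozilla/5.0 (X11; U; SunOS i86pc)"';

// host ident user [time] "request" status bytes "referer" "user-agent"
const COMBINED =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-) "([^"]*)" "([^"]*)"$/;

// Returns an object describing the line, or null if it doesn't match.
function parseCombined(line) {
  const m = COMBINED.exec(line);
  if (m === null) return null;
  return {
    host: m[1],
    user: m[3] === "-" ? null : m[3],
    time: m[4],
    request: m[5],
    status: Number(m[6]),
    bytes: m[7] === "-" ? 0 : Number(m[7]), // "-" means no body was sent
    referer: m[8],
    userAgent: m[9],
  };
}
```

Calling parseCombined(sampleLine) gives back an object whose host, status and bytes fields can then be fed into whatever reporting you like.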
As is often the way, I was looking for something else entirely when I came across Visitors, a stand-alone log file analyser written in C. It writes its report as either a single HTML or text file, and the report looked fine, so I could just run it from cron once an hour and put the generated report somewhere in the tree managed by MeshCMS.
One small additional wrinkle: as you can see from the server.xml entry above I've turned off DNS lookups for the access logs. The reason for this is that DNS lookups can take some time, and I don't want logging to slow down the web server. However it's useful to have resolved names for reporting purposes, so I pipe the log files through the apache logresolve utility before feeding them into Visitors. At the moment I'm doing this each time I build the reports - I should really just do this once and cache the result, but that's a job for another day :-)
Thursday November 02, 2006 Greg Luck has done an insightful writeup of Sun's Open Source Strategy:
and:
Although Greg is primarily discussing our Application Server and Java strategies, it's really heartening to see someone
Thursday August 17, 2006 I saw this article on The Inquirer about the updates to our x86 line, but I had to read this sentence in the last paragraph a couple of times to make sure I wasn't seeing things:
How times have changed indeed...
Posted by alanbur
( Aug 17 2006, 12:58:02 PM BST )
Sunday June 11, 2006 As I said in my last post, I like O'Reilly's Safari Online, but I find some of the presentation a little annoying. Other than the grotty font choices, my other bugbear is the awful "Additional reading" section that appears at the bottom of lots of the pages. The blurb says:
but high-quality it isn't, for example whilst viewing a page about SQL queries I get a suggestion that I go read some chapter in a C# book! Whilst I could fix the font problem with Firefox's userContent.css file, I couldn't use that method to excise the "Additional reading" cruft, as although it was in a div the div in question had neither a class nor an id. This looked like a job for Greasemonkey. Greasemonkey is a Firefox plugin that allows you to run user-defined JavaScript over each page as it is loaded by the browser, so you can modify page content before it is displayed. Earlier versions of Greasemonkey had some serious security flaws, but these have been fixed in the current version. With Greasemonkey installed, removing the cruft was a snap - each section I wanted to chop out was enclosed in a div as I said, and had an h4 heading containing the string Additional reading as its content. As I was going to have to use Greasemonkey, I reimplemented the CSS hack I described in my last post in the Greasemonkey script - the advantage being that individual Greasemonkey scripts can be enabled and disabled, unlike userContent.css, which is permanent. The following script did the trick:
// ==UserScript==
// @name Safari Books Online cleanup
// @namespace http://bleaklow.com/greasemonkey
// @description Fix fonts and remove 'Additional reading' section from Safari pages
// @include http://*.safaribooksonline.com/*
// ==/UserScript==
/* Make the fonts readable. */
GM_addStyle(' \
.docText, .docList { \
font-family: sans-serif !important; \
font-size: medium !important; \
} \
.docFootnote, .docItemizedlist { \
font-family: sans-serif !important; \
font-size: small !important; \
} \
tt, pre, code, .docMonoFont { \
font-family: monospace !important; \
} \
');
/* Remove cruft from the top of the page. */
var node;
var nodes = document.evaluate(
'//tr[@class="toplogo"]/../..',
document, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
if (nodes.snapshotLength == 2) {
node = nodes.snapshotItem(0).parentNode;
node.parentNode.removeChild(node);
node = nodes.snapshotItem(1);
node.parentNode.removeChild(node);
}
/* Remove the stupid "Additional reading" section. */
node = document.evaluate(
'//h4[.="Additional reading"]/..',
document, null, XPathResult.ANY_UNORDERED_NODE_TYPE, null).singleNodeValue;
if (node != null) {
node.parentNode.removeChild(node);
}
/* Remove the book cover and details and replace with a simple heading. */
node = document.evaluate(
'(//a[@title="Book Cover"])[1]/../..',
document, null, XPathResult.ANY_UNORDERED_NODE_TYPE, null).singleNodeValue;
if (node != null) {
var td_cover = node.childNodes[0];
var tbody_info = node.childNodes[1].childNodes[0].childNodes[0];
node.removeChild(td_cover);
var title =
tbody_info.childNodes[0].childNodes[0].childNodes[0].innerHTML;
while (tbody_info.childNodes.length > 1) {
tbody_info.removeChild(tbody_info.lastChild);
}
tbody_info.childNodes[0].innerHTML = "<h2>" + title + "</h2>";
}
The important bits to note are the @include, which restricts this script to just the Safari website, and the use of document.evaluate with an XPath expression to find the appropriate part of the DOM to tweak.
Posted by alanbur
( Jun 11 2006, 08:22:46 PM BST )
Thursday June 08, 2006 While I love the content available from O'Reilly's Safari Books Online website, I'm less than enamoured with the presentation. Specifically it insists on using small Times Roman as the font for displaying the contents of books. You can increase the size, but it doesn't 'stick' between visits, and my aging eyesight really doesn't cope well with small serif fonts, especially the rather grotty Times Roman I have available - rather than specifying a serif font, the CSS explicitly asks for Times Roman.
The fix is pretty straightforward - firstly I found the CSS style used by the text in question, using the most excellent Web Developer extension for Firefox - install the extension, press <Control-Shift-Y> and click on the offending text to find the style information.
Next step: override the CSS by using Firefox's userContent.css file to provide a replacement. This file is found in your Firefox preferences directory, the easiest way to locate it is with find $HOME/.mozilla -name userContent-example.css - it should live in the same directory as that file. Here's what I used:
@-moz-document domain(safaribooksonline.com) {
.docText, .docList {
font-family: sans-serif !important;
font-size: medium !important;
}
.docFootnote, .docItemizedlist {
font-family: sans-serif !important;
font-size: small !important;
}
}
Note that after you've created this file you need to restart Firefox, as it only gets read when Firefox starts up. The neat bit is the use of the @-moz-document directive to restrict the custom CSS to just the Safari Books Online website, as .docText and .docList aren't exactly unusual names and could conceivably be used by other websites. The full syntax of the @-moz-document directive is as follows:
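url(http://www.w3.org/) - matches only the page at exactly http://www.w3.org/
url-prefix(http://www.w3.org/Style/) - matches any page whose URL starts with http://www.w3.org/Style/
domain(mozilla.org) - matches any page whose host is mozilla.org or ends with .mozilla.org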
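If the difference between the three forms isn't obvious, the matching rules can be sketched in a few lines of JavaScript. This is only an illustration of the semantics, not how Firefox implements them, and the function name and the shape of the rule object are assumptions made up for the example.

```javascript
// Decide whether a page URL is covered by one @-moz-document style rule.
// rule is a hypothetical object such as { type: "domain", value: "mozilla.org" }.
function matchesRule(rule, pageUrl) {
  switch (rule.type) {
    case "url": // exact match on the whole URL
      return pageUrl === rule.value;
    case "url-prefix": // URL starts with the given string
      return pageUrl.startsWith(rule.value);
    case "domain": {
      // Host equals the domain, or is a subdomain of it.
      const host = new URL(pageUrl).hostname;
      return host === rule.value || host.endsWith("." + rule.value);
    }
    default:
      return false;
  }
}
```

Note the leading dot in the domain check: it's what stops a rule for safaribooksonline.com from accidentally matching some unrelated host that merely ends with the same letters.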
Posted by alanbur
( Jun 08 2006, 03:22:37 PM BST )
Monday February 27, 2006 I was driving to band practice tonight and an item came on Radio Four about Jenny Pope, a lady from Mossley who has gone missing in Ecuador. She was last heard from while she was in Banos, Quito on 9th January and shortly afterwards her bank account was systematically emptied. I don't know her personally but a lot of my friends in the Manchester drumming community do, and we had a big busk in Manchester a couple of weeks ago to try to help raise awareness (and money!). We managed to bring in £300 in a couple of hours, thanks to the kindness of the people of Manchester. The Manchester police were on the radio this evening saying that they were sending a couple of officers out to Ecuador to help in the search for her. Fingers crossed that they manage to make some progress, everyone who knows the family is obviously very concerned about her.
Posted by alanbur
( Feb 27 2006, 10:21:13 PM GMT )
Saturday February 18, 2006 Picked this tip up off slashdot - some websites (e.g. the Washington Post) insist on making you register to read their content, even if the content is actually free - the assumption is that they only want the registration so they can harvest your email address and use it for marketing purposes, and as someone who gets huge volumes of spam already (despite 500+ a day caught by the corporate filters, 40-50 still get through) I'm very loath to give out my email address when I don't have to. Although there are services such as BugMeNot which allow you to create a throw-away email address for registration and to look up already registered addresses, the webmasters of the sites that require registration cotton on to these pretty fast and block them.
However, these selfsame sites obviously want google to index them, so they allow the google search robot in without requiring registration. Enter the User Agent Switcher plugin for Firefox. I've used this in the past to get access to sites that insist you use Internet Explorer to view them, but you can add an entry to make your browser look like it is the google bot. In the Description field put whatever you want to identify the entry, and in the User Agent field put the string googlebot/2.1 (+http://www.google.com/bot.html), you can leave the rest of the fields empty. Switch to the new User Agent string and hey presto, all that content you couldn't previously access is visible. Unlike the disposable email address arms race, I don't see how this can be circumvented without making the content invisible to Google as well.
Posted by alanbur
( Feb 18 2006, 05:58:18 PM GMT )
Friday January 27, 2006
I use thunderbird 1.5 as my mail reader, and it has always bugged me that clicks on file:// links in emails are ignored - these are frequently used internally within Sun, for example code reviews are usually done with a tool called webrev which builds a tree of HTML pages containing the diffs, and the file:// links are usually mailed out to reviewers. A spot of light googling revealed this page on the topic for firefox, and although the page implied the same configuration settings were also used by thunderbird, it wasn't quite clear how to apply the technique to mails read from an IMAP server. Hmm.
To try to figure out what was going on I opened up the thunderbird JavaScript console (on the Tools menu) and clicked on a file:// link in an email. The following error message appeared:
So I surmised that what I needed to do was to tell thunderbird to trust links from my IMAP server, so I added the following to my user.js file:
// Allow file:// links
user_pref("capability.policy.policynames", "localfilelinks");
user_pref("capability.policy.localfilelinks.sites", "imap://mailserver.uk.sun.com:143");
user_pref("capability.policy.localfilelinks.checkloaduri.enabled", "allAccess");
restarted thunderbird, and clicking on file:// links now worked. Note that there are a number of good security-related reasons as to why this isn't enabled by default, so use it at your own risk!
Posted by alanbur
( Jan 27 2006, 01:02:25 PM GMT )
Friday January 06, 2006 The Register have just published their yearly IT supplier perceptions report, and Sun come out on top.
Yay for us, and thanks to all the readers of El Reg!
Posted by alanbur
( Jan 06 2006, 01:18:10 PM GMT )
Thursday December 22, 2005 As I said in my last post, I wanted to use one TWiki installation to support all the virtual hosts I was setting up. Although TWiki provides its own authentication and access control, it doesn't really provide a workable way of restricting read access. The other problem is that as I already needed to protect the site anyway using SSL and HTTP Basic Authentication, if I used the TWiki mechanisms I'd end up having to duplicate and manage the access control information in both Apache and in TWiki, which really wasn't acceptable.
Note however that you really can't get away from using TWiki authentication if you want to track page changes by user. The TWiki documentation on how to do this is pretty abysmal, but if you are already using HTTP Basic Authentication you can get away with controlling access to all the TWiki CGI scripts with Basic Authentication and requiring that people register themselves with TWiki before editing any pages. As all of the virtual hosts I was setting up were already access controlled I had that part of the problem covered.
My requirements were that I should be able to share some webs between all the virtual hosts, other webs between a particular subset of the virtual hosts, and yet other webs would be restricted to just a single virtual host. If I could do this it meant I could share a single Main and TWiki web amongst all the vhosts, so for example once people had registered in TWiki they could be given access to other sites without requiring re-registration. It would also allow me to share common information such as site documentation across all the sites whilst still maintaining secure access to the non-shared webs.
The first step was to create a twiki directory under the htdocs directory for each vhost, and create symlinks from there to the common bin, lib and templates directories. Subdirectories were created for data and pub as these wouldn't be shared between vhosts, and within those directories a further set of symlinks to the appropriate Web subdirectories of the master twiki install were made. Separate Trash subdirectories were made - if we shared Trash subdirs then pages deleted in one Web would be visible to all the others.
The next step was to figure out how to get each vhost to use the appropriate one of the TWiki trees I'd just created. TWiki stores its configuration in lib/TWiki.cfg - the file is a set of perl global variable assignments that specify the environment TWiki is running under and each twiki CGI script loads it in at startup. Obviously these would need to be different for each vhost, but rather than having a separate TWiki.cfg for each site I wanted to have just one copy. My last post included the following segments from perl.conf:
$env = [
[ APP_ROOT => '/approot' ],
[ TWIKI_ROOT => "$sr/htdocs_$vh/twiki" ],
];
SetEnv => $env,
PerlSetEnv => $env,
This was so I could grab those values from the environment and then use them to configure TWiki to point to the files and directories appropriate to the current vhost. I put the necessary changes to TWiki.cfg in place, fired up Apache and pointed my browser at the TWiki homepage of one of the vhosts - Yay! it all worked. I then fired up a separate browser window, pointed it at one of the other vhosts and bounced on the reload key a few times. It all looked OK at first, but after a few refreshes the page started showing Webs from the other site. I switched back to the original browser window, hit reload a few times and that started showing info for the other site as well - Urk!
I had a pretty good idea what was wrong. The Apache architecture consists of a pool of httpd processes that serve requests, and in a virtual hosting setup such as mine a given httpd process will potentially serve pages for multiple virtual hosts. I was also running TWiki under mod_perl, and in that setup the perl interpreter and any code loaded into it is persistent - each httpd process has an interpreter embedded inside it, unlike the normal CGI environment where each CGI script invocation results in a separate fork/exec of the perl interpreter. Under this environment, global variables are toxic, and TWiki is riddled with them (the standard of some of the code in TWiki is less than excellent, this is just one example). I guessed that the problem was being caused by one or more global variables that weren't being reinitialised appropriately, but which ones? I really didn't want to have to make drastic changes to the TWiki codebase to fix the problem if I could avoid it. A bit of trawling through the code revealed the root cause - although TWiki.cfg was correctly initialising everything based on the environment variables that were being passed in, on subsequent invocations the globals weren't being modified to point to the current vhost.
The fix needed two parts - I needed to separate out the bits of TWiki.cfg that were dynamic and needed to be reinitialised on each request, and I needed to make sure that the reinitialisation was actually performed on each request. The first step was to go through TWiki.cfg and separate out all the dynamic bits into a subroutine:
#
# Initialise the dynamic bits of TWiki's configuration.
#
sub doDynamicConfig
{
# Fetch config from environment and untaint.
my ($twiki_root) = $ENV{TWIKI_ROOT} =~ m{^([\w/._-]+)$};
my ($twiki_url) = substr($ENV{SCRIPT_URI}, 0, -length($ENV{SCRIPT_URL}))
=~ m{^([\w/:._-]+)$};
# Set up dependent TWiki globals.
$defaultUrlHost = $twiki_url;
$pubDir = "$twiki_root/pub";
$templateDir = "$twiki_root/templates";
$dataDir = "$twiki_root/data";
$logDir = $dataDir;
#
# XXX NASTY HACK.
# Depends on the ordering of elements in @storeSettings.
#
$storeSettings[1] = $dataDir;
$storeSettings[3] = $pubDir;
$wikiHomeUrl = "$twiki_url/twiki";
$debugFilename = "$logDir/debug.txt";
$warningFilename = "$logDir/warning.txt";
$htpasswdFilename = "$dataDir/.htpasswd";
$logFilename = "$logDir/log%DATE%.txt";
$remoteUserFilename = "$dataDir/remoteusers.txt";
$userListFilename = "$dataDir/$mainWebname/$wikiUsersTopicname.txt";
}
I also separated out all the static bits into a second subroutine, which I've left out as it's not particularly interesting. The next bit was to figure out how to make sure doDynamicConfig was run on every request. Fortunately all the TWiki scripts call a common routine on startup (TWiki::initialize in lib/TWiki.pm) so all that was required was the addition of a call to doDynamicConfig and now everything worked as it should.
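The underlying problem isn't specific to perl - any long-lived process that keeps configuration in globals and serves more than one site will trip over it. Here's a stripped-down JavaScript sketch of the same failure and the same fix. Everything in it is hypothetical: the function names, the env object standing in for the per-request environment, and the "initialise only if unset" check, which is just a simple way of modelling a config file that is loaded once per process.

```javascript
// Module-level state survives between requests in a persistent process,
// in the same way that perl globals do under mod_perl.
let dataDir;

// Broken: the directory is only worked out the first time through, so
// later requests for a different vhost still see the first vhost's value.
function handleRequestStale(env) {
  if (dataDir === undefined) {
    dataDir = `${env.TWIKI_ROOT}/data`;
  }
  return dataDir;
}

// Fixed: recompute the dynamic configuration at the start of every request.
function doDynamicConfig(env) {
  dataDir = `${env.TWIKI_ROOT}/data`;
}

function handleRequest(env) {
  doDynamicConfig(env);
  return dataDir;
}
```

Call handleRequestStale first for one vhost and then for another and the second call still returns the first vhost's directory, which is exactly the cross-site leakage I was seeing. handleRequest always returns the directory for the vhost actually being served.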
That's the end (for the moment, at least) of my series of posts on this topic, I hope that someone out there who is trying to deploy TWiki across multiple, secure virtual hosts might find that the information therein saves them some grief!
Posted by alanbur
( Dec 22 2005, 05:57:24 PM GMT )