Tuesday August 26, 2008
It Must Be Time for Tea
Mike Kupfer's Weblog

Over the weekend I upgraded an Ubuntu box to 8.04 LTS (Hardy Heron). The default background image is a way cool rendering of a heron. If you go walking along the bayshore here you can sometimes see a heron, so I was delighted to see this new background. It's gorgeous.

(2008-08-26 12:10:14.0)

Converting Projects to Mercurial

One of the things that we consider when deprecating components
of (Open)Solaris is how users move from the old software to the
new software. We've applied that principle to the SCM Migration
project, so we've been working on documentation (e.g., a Mercurial
cheat sheet for TeamWare users), and the updated tools work with
both TeamWare and Mercurial. Also, we don't want to tie the
schedules of large projects to the SCM Migration schedule or vice
versa. So we need to support projects that are begun under
TeamWare, but which are still under development when we're ready
to move the gate from TeamWare to Mercurial. That support is
provided by a new script called wx2hg.

In general, it's hard to convert a TeamWare workspace to
Mercurial, at least if you want to maintain history. But ON
already has a policy that putbacks should (usually) add a single
delta. That is, any project-specific history will be lost
anyway. That makes the job of

Suppose you have a project gate--call it

It turns out that it is pretty easy for

Let's walk through an example. Suppose I have a workspace that deletes all the SCCS helper scripts in usr/src/tools. And to demonstrate renames, it renames the scripts directory makefile to Makefile.new.
$ pwd
/export/kupfer/tonic/wx2hg-tests/tw.no-sccs-tools.demo
$ putback -n
...
Would put back name changes: 10
rename from: usr/src/tools/scripts/Makefile
to: usr/src/tools/scripts/Makefile.new
rename from: usr/src/tools/scripts/sccscheck.1
to: deleted_files/usr/src/tools/scripts/sccscheck.1
rename from: usr/src/tools/scripts/sccscheck.sh
to: deleted_files/usr/src/tools/scripts/sccscheck.sh
rename from: usr/src/tools/scripts/sccscp.1
to: deleted_files/usr/src/tools/scripts/sccscp.1
rename from: usr/src/tools/scripts/sccscp.sh
to: deleted_files/usr/src/tools/scripts/sccscp.sh
rename from: usr/src/tools/scripts/sccshist.sh
to: deleted_files/usr/src/tools/scripts/sccshist.sh
rename from: usr/src/tools/scripts/sccsmv.1
to: deleted_files/usr/src/tools/scripts/sccsmv.1
rename from: usr/src/tools/scripts/sccsmv.sh
to: deleted_files/usr/src/tools/scripts/sccsmv.sh
rename from: usr/src/tools/scripts/sccsrm.1
to: deleted_files/usr/src/tools/scripts/sccsrm.1
rename from: usr/src/tools/scripts/sccsrm.sh
to: deleted_files/usr/src/tools/scripts/sccsrm.sh
The following files are currently checked out and have been edited in workspace
"/export/kupfer/tonic/wx2hg-tests/tw.no-sccs-tools.demo":
usr/src/tools/scripts/Makefile.new
...
No changes were put back
$
Note that although Makefile.new is checked out, it need not be.

Converting this to Mercurial is simple. If your TeamWare
workspace is in a directory that you have write access to, just
point
$ pwd
/export/kupfer/tonic/wx2hg-tests
$ /opt/onbld/bin/wx2hg tw.no-sccs-tools.demo
requesting all changes
adding changesets
adding manifests
adding file changes
added 6349 changesets with 91335 changes to 49774 files
44994 files updated, 0 files merged, 0 files removed, 0 files unresolved
Initializing wx...
...
New renamed file list:
...
New active file list:
...
Will backup wx and active files if necessary
...
wx initialization complete
usr/src/tools/scripts/Makefile.new already checked out
rename usr/src/tools/scripts/Makefile -> usr/src/tools/scripts/Makefile.new
rename usr/src/tools/scripts/sccscheck.1 -> deleted_files/usr/src/tools/scripts/sccscheck.1
rename usr/src/tools/scripts/sccscheck.sh -> deleted_files/usr/src/tools/scripts/sccscheck.sh
rename usr/src/tools/scripts/sccscp.1 -> deleted_files/usr/src/tools/scripts/sccscp.1
rename usr/src/tools/scripts/sccscp.sh -> deleted_files/usr/src/tools/scripts/sccscp.sh
rename usr/src/tools/scripts/sccshist.sh -> deleted_files/usr/src/tools/scripts/sccshist.sh
rename usr/src/tools/scripts/sccsmv.1 -> deleted_files/usr/src/tools/scripts/sccsmv.1
rename usr/src/tools/scripts/sccsmv.sh -> deleted_files/usr/src/tools/scripts/sccsmv.sh
rename usr/src/tools/scripts/sccsrm.1 -> deleted_files/usr/src/tools/scripts/sccsrm.1
rename usr/src/tools/scripts/sccsrm.sh -> deleted_files/usr/src/tools/scripts/sccsrm.sh
After the renames, it applies a patch for each modified file...
patching file usr/src/tools/scripts/Makefile.new
...and then you're done.
$ ls -dF *demo*
tw.no-sccs-tools.demo/ tw.no-sccs-tools.demo-hg/
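The rename step in the transcript above can be illustrated with a short sketch. This is not how wx2hg is actually implemented; it only shows, under the assumption that the "rename from:" / "to:" pairs printed by "putback -n" have been captured as text, how such pairs could be turned into the equivalent "hg rename" commands. The function name renames_to_hg is made up for this example, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch only (not wx2hg's real code): read "rename from:" / "to:" pairs
# in the format printed by `putback -n` and print one `hg rename`
# command per pair. The commands are printed, not executed.
renames_to_hg() {
    awk '
        # Remember the source path of the pending rename.
        $1 == "rename" && $2 == "from:" { from = $3; next }
        # A "to:" line completes the pair.
        $1 == "to:" && from != "" {
            printf "hg rename %s %s\n", from, $2
            from = ""
        }
    '
}
```

For example, feeding it the saved output of "putback -n" (renames_to_hg < putback.out, where putback.out is a hypothetical file name) would list the renames that need to be replayed in the Mercurial child before any patches are applied.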
You can verify that
$ cd tw.no-sccs-tools.demo-hg
$ hg diff -g
diff --git a/usr/src/tools/scripts/sccscheck.1 b/deleted_files/usr/src/tools/scripts/sccscheck.1
rename from usr/src/tools/scripts/sccscheck.1
rename to deleted_files/usr/src/tools/scripts/sccscheck.1
diff --git a/usr/src/tools/scripts/sccscheck.sh b/deleted_files/usr/src/tools/scripts/sccscheck.sh
rename from usr/src/tools/scripts/sccscheck.sh
rename to deleted_files/usr/src/tools/scripts/sccscheck.sh
diff --git a/usr/src/tools/scripts/sccscp.1 b/deleted_files/usr/src/tools/scripts/sccscp.1
rename from usr/src/tools/scripts/sccscp.1
rename to deleted_files/usr/src/tools/scripts/sccscp.1
diff --git a/usr/src/tools/scripts/sccscp.sh b/deleted_files/usr/src/tools/scripts/sccscp.sh
rename from usr/src/tools/scripts/sccscp.sh
rename to deleted_files/usr/src/tools/scripts/sccscp.sh
diff --git a/usr/src/tools/scripts/sccshist.sh b/deleted_files/usr/src/tools/scripts/sccshist.sh
rename from usr/src/tools/scripts/sccshist.sh
rename to deleted_files/usr/src/tools/scripts/sccshist.sh
diff --git a/usr/src/tools/scripts/sccsmv.1 b/deleted_files/usr/src/tools/scripts/sccsmv.1
rename from usr/src/tools/scripts/sccsmv.1
rename to deleted_files/usr/src/tools/scripts/sccsmv.1
diff --git a/usr/src/tools/scripts/sccsmv.sh b/deleted_files/usr/src/tools/scripts/sccsmv.sh
rename from usr/src/tools/scripts/sccsmv.sh
rename to deleted_files/usr/src/tools/scripts/sccsmv.sh
diff --git a/usr/src/tools/scripts/sccsrm.1 b/deleted_files/usr/src/tools/scripts/sccsrm.1
rename from usr/src/tools/scripts/sccsrm.1
rename to deleted_files/usr/src/tools/scripts/sccsrm.1
diff --git a/usr/src/tools/scripts/sccsrm.sh b/deleted_files/usr/src/tools/scripts/sccsrm.sh
rename from usr/src/tools/scripts/sccsrm.sh
rename to deleted_files/usr/src/tools/scripts/sccsrm.sh
diff --git a/usr/src/tools/scripts/Makefile b/usr/src/tools/scripts/Makefile.new
rename from usr/src/tools/scripts/Makefile
rename to usr/src/tools/scripts/Makefile.new
--- a/usr/src/tools/scripts/Makefile.new
+++ b/usr/src/tools/scripts/Makefile.new
@@ -50,11 +50,6 @@ SHFILES= \
nightly \
onblddrop \
protocmp.terse \
- sccscheck \
- sccscp \
- sccshist \
- sccsmv \
- sccsrm \
sdrop \
webrev \
ws \
$
Note that you still need to do "hg commit" to check in your new version.

All this assumes that your workspace is in sync with /ws/onnv-clone. If it isn't, you may get messages like
wx2hg: can't rename: usr/src/tools/scripts/sccscheck.1 doesn't exist.
or
wx2hg: usr/src/tools/scripts/Makefile.new: parent mismatch;
resync with /ws/onnv-clone or specify branch point with -r hg_rev.
Doing a bringover from /ws/onnv-clone, and resolving any conflicts, should fix things up.

You may also see a message like
Please run
hg --cwd /export/kupfer/tonic/wx2hg-tests/tw.no-sccs-tools.demo-hg update -C
before retrying.
This is telling you that you can reuse the Mercurial child, but you
need to reset it first. Once you've resynched with /ws/onnv-clone
and run the "hg ... update..." command, you use the
/opt/onbld/bin/wx2hg -t tw.no-sccs-tools.demo-hg tw.no-sccs-tools.demo
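As a convenience, the two retry commands shown above can be generated from the workspace name. The helper below is only a sketch: the name retry_steps is invented here, it prints the commands rather than running them, and it assumes the Mercurial child is named "<workspace>-hg" as in the example (pass a second argument if yours differs). The resync with /ws/onnv-clone still has to be done by hand first.

```shell
#!/bin/sh
# Sketch: print (do not run) the retry steps for a TeamWare workspace
# and its Mercurial child, after the workspace has been resynced.
retry_steps() {
    tw=$1
    # Default child name follows the "<workspace>-hg" pattern seen above.
    hgchild=${2:-$tw-hg}
    echo "hg --cwd $hgchild update -C"
    echo "/opt/onbld/bin/wx2hg -t $hgchild $tw"
}
```

Running retry_steps tw.no-sccs-tools.demo prints the same two commands used in this example.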
There's more that

SCM Migration: The Big Picture

When Steve Lau left Sun at the end of last September, I became the go-to guy inside Sun for the migration to Mercurial. I had thought that I had a good high-level grasp of the project. But after getting blindsided a couple times by dependencies I hadn't considered, I drew up a diagram to help me get oriented, identify stakeholders, and maybe anticipate future issues. Here's a slightly simplified version of the original diagram from the whiteboard in my office:
Blue parallelograms indicate repositories, tan boxes are software modules, solid lines indicate data flow, and dashed lines tie users with the modules that they're using. The three red-rimmed boxes (gk tools, gate hooks, and onbld tools) are where most of the development effort is going. The primary simplifications in this diagram are
Even so, this is a moderately busy diagram. There are several components to keep track of and make sure they all fit together. Most of the work so far has been in the area of the ON build (onbld) tools, pieces of which are used by other consolidations and by the Solaris Companion project. Many of the changes are related to making the tools work with Mercurial as well as with TeamWare/SCCS. We've also had to consider the implications of moving everything outside the Sun firewall, which has meant rethinking interfaces to things like the bug database and our RTI (Request To Integrate) system. We haven't done as much work on the gatekeeper (gk) tools, although we've started to think about design issues. Many of the design decisions boil down to this question: do we make the minimal set of changes needed to work with Mercurial, or do we make more extensive changes so that the tools can make better use of the features provided by Mercurial? In some cases we are staying with the current approach. For example, we are using separate repositories for build snapshots, rather than using branches and tags in the main gate repository. In other cases we will be changing the tools to use Mercurial features. For example, any automated post-putback processing will be driven directly by Mercurial hooks, rather than the email-based hook system that is needed with TeamWare. Another set of interesting design decisions has centered around the use of gate hooks to enforce various style and bookkeeping rules. With the current TeamWare setup, we enforce these rules after a putback (at least for ON). The putback triggers various checks, and if your putback violates a rule, you get notified of the problem and given a short window to fix it or your putback is reverted. The gate is normally configured so that anyone (inside Sun) can putback. 
While this approach worked when Solaris was closed source, we expect it not to scale for OpenSolaris, where the repository is accessible from anywhere on the Internet and both Sun employees and non-employees can have commit rights. Certain Mercurial hooks can abort a putback ("push" in Mercurial terms), so we could move all the post-putback checks to pre-transaction checks. But moving more checks means more work (e.g., testing), which means a longer time before we can move to Mercurial. So the question becomes which checks really need to happen before putback, and which ones can happen after putback. The check to ensure that a putback has an approved RTI probably needs to happen prior to the putback. The check for adherence to the C style rules can happen after the putback, at least for now. The opensolaris.org webapp has various bits of functionality for source code management. A project leader or gatekeeper can use the webapp to create, destroy, and lock repositories, as well as to manage commit rights for the project's repositories. Unfortunately, the current set of operations is limited. For example, a gatekeeper might want to lock a repository for most users, but allow access for a specific large project. Alas, this lock granularity is not currently supported. Furthermore, all the controls are currently through a web-based interface, with no scripting hooks. Although there is currently work to improve the webapp and make it easier to change, this work is unlikely to be finished in time for us to make any changes that we expect gatekeepers to want. So we will need to think about other ways to provide the needed functionality, such as giving gatekeepers shell access to the server that hosts the repositories. The SCM front-end gives a user access to repositories by creating a chroot environment which contains only the repositories that the user has commit privileges for. (Access to other repositories is done via the "anon" user.) 
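To make the idea of a pre-putback check (discussed earlier in this post) a little more concrete, here is a minimal sketch of a Mercurial hook script. It is not one of the actual gate hooks. It assumes a made-up rule that every changeset's first comment line starts with a 7-digit bug ID followed by a synopsis, and the function name check_summaries is invented for the example. What is standard Mercurial behavior: a pretxnchangegroup hook is run with HG_NODE set to the first incoming changeset, and a nonzero exit status aborts the push. The script would be enabled with an entry in the [hooks] section of the repository's hgrc, for example "pretxnchangegroup.summary = /path/to/this-script" (path hypothetical).

```shell
#!/bin/sh
# Hypothetical gate hook; the required comment format is an assumption.
# Reads one changeset summary per input line and fails if any line does
# not start with a 7-digit bug ID followed by a synopsis.
check_summaries() {
    status=0
    while IFS= read -r line; do
        if ! printf '%s\n' "$line" | grep -Eq '^[0-9]{7} .+'; then
            echo "rejected: bad summary line: $line" >&2
            status=1
        fi
    done
    return $status
}

# Mercurial sets HG_NODE for pretxnchangegroup hooks. Check the first
# comment line of every incoming changeset; a nonzero exit aborts the push.
if [ -n "${HG_NODE:-}" ]; then
    hg log -r "$HG_NODE:tip" --template '{desc|firstline}\n' |
        check_summaries || exit 1
fi
```

A check like the RTI approval would follow the same shape, with the pattern match replaced by a query to whatever system tracks approvals.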
If the user reports being unable to pull from, or push to, a repository, the problem could be with the SCM program itself, the SCM front-end, or some other general system service. This diagnosis typically requires shell access to the servers.

We are using Nagios to monitor the health of the servers and services on opensolaris.org. We have written a couple simple Nagios plugins to monitor the Mercurial and Subversion services. As we gain experience with the system, we could update the probes to check for specific failure scenarios.

OpenGrok makes it into this diagram because it makes a private snapshot of each repository that it indexes, so as to provide a consistent view of the tree. We once managed to break the OpenGrok indexing of ON by trying to undo (rollback) a particular putback, so that it would vanish completely from the repository. We didn't know to roll back OpenGrok's snapshot repository as well. So the next time OpenGrok tried to pull from the Mercurial onnv-gate, it created a branch that had to be merged. This was not something OpenGrok was prepared for, so the snapshot tree was not updated. After several days, we started getting complaints from ON teams who couldn't find their recent putbacks in OpenGrok. We figured out the problem, replaced OpenGrok's snapshot repositories, and vowed not to undo/rollback any future putbacks.

So that's the "big picture" of what the SCM Migration project is working on. If you've been frustrated by how long things are taking, well, we're not happy about it, either. Our hope is that by keeping the entire picture in mind, we will not have any serious problems when we finally do move.

(2008-02-15 14:20:46.0)

A Different Type of Temporary E-Mail Address

I was reading an update from Bay Area Consumers' Checkbook, and they mentioned a new spin on temporary email addresses. Make up a name at mailinator.com. When email arrives there, it is available to read for a few hours and then deleted.
No sign-up, no password. Anyone who guesses the account name that you used can read it. I don't know how much I'd actually use this, but it's a neat idea. And their FAQ is a hoot.

(2007-11-25 15:47:13.0)

April Chin put back

I upgraded my desktop to snv_66 (build 66 of Solaris Nevada) earlier in the week and played around some with the new GNOME bits (2.18). The new Disk Analyzer GUI has a much-improved format for showing where you're using disk space. In the example below, about 25% of my home directory is email, and about 23 MB is email for NFS.
I still prefer the equivalent view in Konqueror, because it can identify individual files, whereas the GNOME tool only tells you about directories. But the radial format in the GNOME tool is pretty cool.
(In case anyone is wondering, the KDE screenshot is from back in March, so this picture is not directly comparable with the one above.)

(2007-06-24 21:10:26.0)

Mark Shuttleworth and a few Ubuntu developers stopped by the Sun Menlo Park campus on Friday May 4th. I'm not working with Ubuntu, but since I'm involved with the Solaris Companion and with general OpenSolaris issues, I wanted to see what they had to say about third-party packages and about how they do their releases.

You can organize Ubuntu packages along two dimensions. The first dimension is whether the package is free (libre). The second dimension is whether Canonical (Ubuntu's corporate sponsor) provides support (e.g., security fixes). This gives us the following table:
Notice that Canonical only supports 10% of the packages in the distro.

There are two levels of access to the third-party packages. The first level is an engineering repository which bypasses Canonical. That is, people can update the repository at any time, without regard to the Ubuntu release schedule. The second level is the actual distro, which has tighter controls. Some of the packages are available on the Ubuntu CD, but many are only available via network download.

Canonical does not track the downloads. This would be heresy inside Sun, where there's a big emphasis on measuring things. But Mark said that Canonical doesn't really care about the download numbers, and it would be difficult to get accurate numbers anyway (e.g., because of mirroring).

Someone asked Mark how they deal with packages that potentially infringe on a patent. Mark said that there's no such thing as a global patent, so those packages are allowed in the distro, but they're only available via network download. The user self-certifies that it's okay for him or her to use the package.

Another issue that comes up with third-party packages is how to track bugs. Mark talked about this a bit, and it's something we're facing with OpenSolaris, too. The basic problem is that for a given package, there may be two bug databases: one deployed by the upstream project and one deployed by the distro. So far, the industry best practice seems to be to push distro-independent information to the upstream database, leaving distro-specific details in the distro's database. This approach is less than ideal, because it requires a fair amount of manual effort to track the bug status and to keep the right information in the right database. Canonical developed a tracking application called Launchpad to help deal with this, but Mark mentioned that it's still not quite what they want, and that Canonical might be revisiting the issue in a couple years.
It'd be nice if the Ubuntu and OpenSolaris communities could somehow work together on that. Mark spent a little time describing Launchpad, and it does have some nice bug-tracking features. For example, you can create hyperlinks to the upstream database entry, and Launchpad can automatically query the upstream database to get the bug's status. Launchpad also has more general collaboration support, such as mailing lists, project web space, and a code repository. Launchpad includes features that would be useful on opensolaris.org, like a translation tracker and an application for proposing and tracking project ideas. The other major topic that I was interested in was how Ubuntu releases are done. Ubuntu releases follow a train model, with releases appearing every 6 months. There is support for 18 months, except for Long Term Support (LTS) releases, where servers are supported for 5 years. For those who are not familiar with the train model, the basic idea is that if your code is not ready in time, it is bumped to the next release, rather than delaying the current release. Sun tried a train model for Solaris in the 1990s, with releases every 6 months[1]. It didn't work for us, and we eventually gave it up. I wasn't involved with Solaris release management, so I probably have a limited perspective on what all the issues were. But as a developer I could see a couple things that contributed to abandoning 6-month trains. The first problem that I saw was that we didn't stick to the cutoff dates. There was often some new feature that just couldn't wait for the next train, so we would bend the rules and let changes integrate after the nominal cutoff[2]. I suppose that having a late binding mechanism makes sense for exceptional circumstances, but I think it got overused. These days, it seems like late binding isn't just a safety net to keep the release from falling apart, it's a regular phase in the release cycle. 
I suppose the net effect isn't too horrible--it's effectively a gradual freezing of the code, rather than a hard freeze. But it does push back the real, final freeze date, which then reduces the time that is available for later parts of the release cycle.

This ties in to the other problem that I saw, which was that the Beta Test period was too short. I forget how long the Beta periods were, but they were short enough that by the time customers had actually deployed the code, identified and reported issues, and we had worked out a fix, it was too late to get the fix into that release.

Of course, this raises the question of why Canonical doesn't have the same problems with Ubuntu. One explanation is that much of what goes into Ubuntu comes from an upstream source and is already (more or less) stable. There is some original work done for Ubuntu, but it's not the "deep R&D"[3] of things like SMF, DTrace, or ZFS. It's hard to predict the schedule for cutting-edge projects, particularly ones that affect large parts of the system.

That's not an entirely satisfactory answer, though, because according to the train model, if a project is late, you just bump it to the next release. So there must be more going on than that.

One thing that could mess up a train model is technical dependencies. Suppose Project A depends on Project B. If you integrate parts of A under the assumption that B will integrate later in the release, there will be a strong temptation to delay the release if B is late. The Ubuntu folks try to avoid this problem by avoiding dependencies on upstream code that's scheduled to be released near the feature freeze. How strict they are about this depends in part on how much they trust the upstream provider to meet its schedule. And in a pinch, they might take beta code if it's deemed to be stable enough.

I don't know if technical dependencies were a factor in moving away from the train model for Solaris releases.
It shouldn't have been an issue for the OS/Net consolidation ("FCS Quality All the Time"), but I don't know about Solaris as a whole. I suppose there could have also been a sort of "marketing and PR" dependency problem, where we feared a loss of face if Feature X didn't make its target release. I don't know if this was actually an issue, but Sun does seem to like big, flashy announcements, and there are quite a few analyst briefings that happen under embargo[4] prior to these events. Another explanation for why Canonical can make 6-month trains work is that the 6-month releases serve a different target market than the one Solaris has been in. A noticeable chunk of the Solaris user base would go nuts with a 6-month release cycle and 18-month support tail. As soon as they got one release qualified and deployed, they'd have to do it all over again. So one thing we might want to look at for Solaris is to have two release vehicles, similar to the 6-month and LTS releases that Canonical is doing with Ubuntu. But there are still some issues with that model that we'd want to figure out. For example, the Ubuntu folks said that most of the Ubuntu LTS customers just want security fixes, whereas Solaris customers often demand patches for non-security bugs. Another thing that distinguishes Ubuntu releases from the 6-month Solaris trains is when customers actually get the bits to play with. There are only 3 weeks between the Beta release and final release for Gutsy, but there will be six snapshots that are available sooner, with the first (fairly unstable) one appearing 16 weeks before the Beta release. This gives users a larger window than we had with the 6-month Solaris trains in which to try out the release and give feedback. So, to sum it all up: I learned that distros can successfully deal with issues that OpenSolaris and Sun are facing, like how to provide the many third-party packages that users want, and how to keep them current. 
What we need to do now is figure out how to make it work for OpenSolaris, without sacrificing the stability that attracted many Solaris users in the first place.

[1] The internal code names for SunOS 5.2, 5.3, and 5.4 were on493, on1093, and on494, respectively.
[2] At some point we came up with a formalized "late binding" process, but I don't remember just when that was introduced.
[3] That's the term Mark used.
[4] That is, the analyst isn't allowed to publish anything about it before a certain date and time.

(2007-05-30 16:24:42.0)

Defeating the OpenSolaris Address Mangler

The opensolaris.org webapp includes an automatic email address mangler to make it harder for spammers to harvest email addresses. But it's not very smart, and it mangles things that aren't email addresses, like device paths and repository URLs. If you're editing an HTML page on the web site and you want to bypass the email mangler, replace "@" with "@", as in

ssh://anon@hg.opensolaris.org/hg/onnv/onnv-gate

(2007-05-18 15:41:59.0)

Thirty years ago this month I wrote my first program. I'd first seen the IBM 1130 demoed at my high school's open house a couple months earlier. This guy put in a deck of cards, typed a month and year at the console, and out popped the corresponding calendar on the line printer. I thought "Cool! I want to learn how to do that!". But it wasn't until soccer season was over that I had time to follow up.

My first program wasn't much--a few lines of Basic. But I could see how the principles could be applied to write more interesting programs, like the calendar generator. I was hooked.

It didn't take long to bump into the limits of that Basic implementation, like 2-character variable names. So I moved to Fortran. It was pretty neat, too, but there were some limitations imposed by the Fortran runtime system. For one thing, all the I/O was synchronous.
And since the operating system didn't do any spooling, you weren't just waiting for the OS, you were waiting for the device. So it was slow. For another thing, it wouldn't let you generate arbitrary patterns with the card punch. So that led to assembly language. You could do asynchronous I/O. And you could compose 80x12 bitmap images and punch them on cards (the 12 punch locations in each card column mapped to 12 bits in each 16-bit word). By the time I was a senior, we had a 6800-based[1] microcomputer, which a couple of the other students had built from a kit. It had a keyboard, a CRT, a simple ROM monitor program, a speaker, and an I/O port that you could hook up to an audio cassette recorder. The audio cassette was the one storage device. You stored your source code--6800 assembly language--on a cassette. After the assembler read it in, you popped out that cassette and popped in a new one, and the assembler wrote the binary image to it. You then loaded your binary back into the system using the ROM monitor. It made the IBM 1130 Fortran look blazing fast. So one of the first things I did was write a binary patch for the assembler. The patch added an option to write your binary directly to memory. Once we had that working, I had great fun coming up with algorithms for making different sounds on the speaker. I wish I'd saved some of those programs; I remember that some of the sounds were pretty interesting. I hope I'm still writing code 30 years from now. Software is such a great blend of functionality and plasticity. [1] No, that's not a typo. (2006-12-22 14:13:01.0) Permalink Comments [1]The UC Center for Information Technology Research in the Interest of Society (CITRIS) had a half-day symposium on December 14th, with a panel discussion, a couple talks, and a poster session. The Cal alumni magazine had done an article on some of the CITRIS work not too long ago; I wanted to learn more, so I signed up. I'm glad I went. 
Some interesting points were made during the panel discussion and talks, and some of the student projects were fascinating.

During the panel discussion, Beth Burnside (Vice Chancellor for Research) pointed out that universities can look at technology transfer offices in a couple different ways. One way to look at them is as an income source for the university, e.g., via patent licenses. But it's not a given that the office will actually produce any income worth noting. Another way to look at technology transfer is that it's a way to get research results applied in the outside world. That is, it's a way for the university to make a real difference in people's lives.

One of the talks was about energy and global warming. Paul Wright (CITRIS Chief Scientist) talked about things like energy conservation and energy sources other than fossil fuels. It turns out that British Petroleum is soliciting proposals for a university research center into biofuels. I was pleased to hear that BP is doing this. If companies think of themselves as fuel or energy companies, not oil companies, they and we will find the transition away from fossil fuels a lot easier.

After another talk, someone from the audience asked if the speaker had any thoughts on how to deal with the combination of too much information and illiteracy. If you look at the typical results from a Google query, there are pages and pages of hits. People who can read can skim over them to find the ones that are most likely to be interesting. What do people do who can't read? The speaker answered that there is ongoing research into voice recognition and artificial speech, but that misses the point: how does someone quickly pick out the interesting results from a long list that is being read aloud, one entry at a time?

I was most interested in Eric Brewer's talk about technology and infrastructure for emerging regions. Some of what he talked about was how to be successful when working in developing regions.
For example, they spent a fair amount of time up-front talking with various non-governmental organizations (NGOs). Some of the organizations were actively looking at how to use new technology, others weren't really interested in new technology. Since new technology was what CITRIS had to offer, it made sense to partner with NGOs that were clearly interested in that area. Another thing they did to be successful was to do small deployments every 6 months. There was some local skepticism about CITRIS due to past interactions with other groups that had made big promises but didn't actually produce any results. These smaller but frequent deliveries helped counter that skepticism and build trust.

Eric also talked about more technical issues, and I was able to learn more from talking to students at the poster session. One of the projects is a telemedicine project in India. Eric said that 70% of the blindness in India is treatable, but only 7% of the rural patients are able to get to a clinic for treatment. Even if the clinic is within walking distance, there's no guarantee you'll be seen when you get there, so many people don't bother.

Telemedicine is an obvious remedy to this problem, but bandwidth is a challenge. Satellite links are too expensive, and wireless bandwidth drops off sharply when there are long distances between links. The bandwidth problem is due to the protocols that are normally used, which rely on collision detection (like Ethernet). As the link distance increases, the propagation delays lead to more and more collisions, which kills throughput. So they implemented a synchronous (time-sliced) protocol, which doesn't rely on collision detection. The change in low-level protocol is entirely transparent to higher-level protocols.

And it turns out, they don't need fancy telemedicine facilities to be effective. Just having a video conference link means that patients can get an initial screening.
The ones who need more extensive examination or treatment can then set up an appointment at the central clinic.

I'm very pleased that the University of California is doing this sort of work, and I'm looking forward to hearing more about it in the future.

(2006-12-21 10:32:31.0)

I've been making a lot of changes to

For a while, I was making

So now I manually do

$ cp usr/src/tools/proto/opt/onbld/bin/nightly ~/bin/nightly.new

It's more typing, but I've had to rerun fewer tests than I used to.

Technorati tags: OpenSolaris

(2006-11-03 16:10:08.0)

O'Reilly publishing hosts OSCON, which is a convention dedicated to open source. OSCON 2006 was my second OSCON. My first OSCON was in 2004, just after I started working on Sun's OpenSolaris team. Apologies for the delay in posting the trip report--life's been a bit hectic since July.

general impressions

At OSCON 2004 I tried to hit as many "experiences" and "how-to" talks as I could. This year I have a better understanding of the tools, so I skipped the open source how-to talks. I did go to a few "experiences" talks, in the hopes that I'd learn something that I was overlooking in my work with OpenSolaris. While there was good information in those talks, they weren't the learning experience I was hoping for. I did have good luck with other talks that I went to just because they seemed interesting. More on that below.

I also spent a few hours helping staff Sun's booth at OSCON.
This was quite a contrast from JavaOne. At the JavaOne booth, I
spent a lot of time talking about OpenSolaris and why Sun is
doing it. At OSCON, pretty much everyone knew about
OpenSolaris. I did get a question about the status of the

Wednesday talks

The first talk I went to was about the use of open source by the US government, particularly the Department of Defense (DoD). Open source software is already used in government systems, including the military. Despite that, some people in government find "open source" to be scary[1]. Also, the DoD is interested in more than just software. So they tend to talk about "open technology development", rather than "open source". The emphasis is on open standards and interfaces, not implementations. The benefits that the DoD hopes to get from open technology include support for dispersed teams, technological agility (e.g., avoid vendor lock-in), and efficient use of money (avoid duplicate work).

The DoD has several interesting issues that it has to deal with. One issue is how to handle security concerns, e.g., how to participate without revealing classified information. Another issue is that the US government is not allowed to hold copyright on anything, so what happens when someone in the DoD wants to contribute code back to a project? A third issue is regulatory requirements. For example, there are regulations that bound the profit that a company can make on a government contract. So suppose there are two bids, one based on open source and one based on proprietary software that was developed from scratch. It's conceivable that the open source bid would cost less but would be ruled out because it gives the vendor too much of a profit.

The second talk was an experience talk about open sourcing the MySQL Clusters code, which had been developed at Ericsson and then sold to MySQL. The talk was structured as a series of "shocks" that the development team had to deal with. Shock 1 was that the code needed to install in less than 15 minutes. Prior to this, the team was proud of the fact that they had gotten the install time down from 1-2 days to 3-4 hours.
But people can be impatient--if it doesn't install quickly enough, they'll give up and move on to something else that looks cool. And the database that gets included in a final product is often the database that was used for the prototype. Ease of installation means increased likelihood of being used for the prototype, which means increased likelihood of being used in someone's final product. Shock 2 was what "easy to understand" means. At Ericsson the documentation could assume that the reader understood the basic concepts, because there were people whose job was to help the customer understand those issues. As an open source project, the documentation had to stand by itself. Also, the documentation (and code) got a lot more exposure as open source, so the weak spots showed up more clearly. Since going open source, they've put more documentation in the code and have less design documentation. In the future, they'd like to have more design documentation, which they plan to publish for early community feedback. Shock 5 was that all their bug reports must be published on the web. Even security bugs. The reason they can get away with this for security bugs is that they don't have many, and they're usually fixed quickly.[2] Shock 6 was adapting to distributed teams. One change was that they had to write more things down than they used to. They also use plain text more than they used to. They do have annual meetings for the whole developer organization, plus individual teams can get together more frequently if it seems necessary. Shock 7 was the increase in email load. They also use IRC, but
they're starting to move towards more use of the telephone. The
advantage of asynchronous communication is that it encourages
self-sufficiency, but it also makes it easier for people to
proceed along the wrong track. They have been talking about
using distributed whiteboards, but that hasn't happened yet,
though they do sometimes use

Shock 9 was the use of agile development techniques, such as monthly sprints. That is, they pick the goals for the month and then focus on them. They take fewer interruptions than they used to; those issues are instead deferred to the next month's sprint. Shock 10 was the constant stream of feedback from the community.

The third talk was another experiences talk about opening closed code that BEA had acquired. This talk focused more on business issues. For example, the speaker (Neelan Choksi) talked about how guerrilla marketing does not mean there is no place for more traditional marketing. He mentioned that BEA is out-sourcing their professional and training services. He said BEA isn't really set up to do it themselves, and that out-sourcing these services helps grow the community.

The fourth talk was about the best and worst of open source tactics. This talk was a grab-bag of things that Cliff Schmidt had found to work well, plus a few things that don't work so well. Phased delivery seems to be useful. One slide was about the "maturity sweet spot": the code works well enough that people can play with it, but it could be even better with some help. Another slide talked about a "series of film shorts" model; he used OpenSolaris, and how Sun is delivering it in phases, as an example of this. Modularity is important, of course ("modularity or death!"). It's what lets random people go off and hack on things and be able to easily integrate their changes later. Some things to think about when implementing to a standard:
Related to that was a caution about how hard it is to create a de-facto standard yourself (the "ubiquity play" model of open source). If there are competing standards, consider jumping on your competitor's bandwagon. I suppose this could include some sort of migration functionality, as well as finding ways to interoperate. If you're trying to establish a standard platform, it's important that the platform be able to evolve gracefully. Focus on interfaces, and lay down the backward compatibility rules early on. Marketing mistakes to avoid include marketing vaporware, tunnel vision, promoting your company over the community, ignorance of the "live web", and shooting yourself in the foot when selling your support services.

There was a Solaris BOF Wednesday evening, but I missed it. There was a reception that I had been planning to graze at before the BOF, but it turned out to have mostly food I couldn't eat. So I went off in search of a restaurant. By the time I got back, the BOF was pretty much over.[3]

Thursday talks

The first talk that I attended on Thursday was a fascinating talk about the history of copyright by Karl Fogel. Briefly, the introduction of the printing press made it easier for people to produce anti-government leaflets. The English government responded by granting monopoly powers over printing presses and distribution of printed works to a "stationers guild". In return, the guild had to run everything past government censors. Eventually the makeup of the government changed, and in the late 1600s Parliament decided to revoke this monopoly. The guild proposed copyright in response. It helped them retain some of the control and income that they had as a monopoly. Also, by basing copyright in property law, they made it harder for the government to take away, compared to how easily Parliament had dissolved the original monopoly setup. Karl's point is that copyright is designed primarily to benefit the distributor, not the artist or author.
So now that digital technology has made copying and distribution even easier than before, what should we do with current copyright law? The second talk was Guido van Rossum's talk about Python 3000, especially about how he is approaching it and what some of the changes are likely to be. The actual release will probably be called Python 3.0. The "3000" name was a dig at "Windows 2000". One theme for 3.0 is to take the opportunity to fix some bugs from the early design of Python. But it's not a redesign from the ground up. No major changes to the look-and-feel are on the table (e.g., no macros). Nor will the changes be decided by a community vote. Guido will make the final decision(s), with lots of community input. Some of the things that will go away in Python 3000 are classic
classes, string exceptions, differences between To go
along with the strings changes there will be a new
"bytes" data type for byte arrays, which will have
some string-like methods, e.g., The time frame for Python 3.0 is still unclear. Guido was thinking of maybe doing an alpha release in early 2007, with the final release around a year later. The migration from 2.x to 3.0 still needs to be worked out. Issues include the time frame, what 3.0 features to back-port to 2.x, and what migration tools to provide. The challenge for migration tools is that there's a lot of information that's only available at runtime. The current plan is to have static analysis tools that will do around 80% of the job, and to provide an instrumented 2.x runtime that will warn about doomed code. People who would like to keep current on Guido's plans for Python 3000 can follow his blog at artima.com/weblogs/. The third talk I went to on Thursday was Simon Phipps' talk on Sun's Open Source Strategy. Part of the talk was explaining why Sun has not open sourced Java until now. Another part of the talk was about recent work, such as making the JDK redistributable. Someone asked if the compatibility test suite (TCK) will be open sourced. The answer was that folks were still trying to work that out. The fourth Thursday talk that I went to was Jeff Waugh's talk on Building the Ubuntu Community. Some of the things that Jeff said are important for a community are shared values, shared vision, and governance. He broke down governance into 3 areas: code of conduct[4], technical policies, and governance policies. He also said that it's crucial to have people who help build the community and who keep it healthy. Jeff talked a bit about authority and responsibility of community members. He said that communities who lack a "benevolent dictator" don't have a central person for making decisions and resolving conflicts, so it's easier for gridlock to set in. Jeff went on to say that if you give someone responsibility, they'll usually step up to it. But it's important to be clear who has the responsibility and authority for something. 
First, it helps other people figure out who to talk to. Second, it encourages the person to step up to the role. Someone asked for the justification of including NVidia drivers in what's otherwise 100% free software. Jeff answered that the end-user visual experience is very important. Ubuntu has a limited number of non-free modules, all of which are drivers. Of course, they are pressuring the relevant hardware vendors to do what's needed to support open drivers. I went to two BOFs on Thursday evening. One was the ZFS and Zones BOF; the other was the BOF on Sun's Open Source Strategy. I didn't take any notes from the ZFS and Zones BOF. I do remember that it was mostly attended by Sun employees. The second BOF was run by Simon Phipps. He kicked off the BOF by asking the non-Sun employees to say what Sun is doing wrong. Most of the responses were familiar:
I was surprised by a remark that Sun has an "asymmetric" relationship with the community. The copyright assignment requirement in the contributor agreement was pointed at as an example of this. So perhaps one of the things Sun is doing wrong is not explaining the contributor agreement well enough. Later in the BOF, Simon mentioned that all Sun open source projects (JDK 6, OpenOffice, OpenSolaris, etc.) use the same joint copyright assignment.

At one point in the BOF there was a description of where Sun expects to find new customers: companies who want to put together a solution from Sun products, perhaps in combination with others' products. Sun's value-add would be the ability to put the solution together more cheaply than the customer could. Someone pointed out that even with this business plan, Sun still has to provide things like a good desktop, in order to attract developers.

Friday talks

The Friday talks were fun. In the first one, Jonathan Oxer
talked about using scripting languages to control hardware. He
started by talking about the different ports that are available
on a typical computer, with parallel ports being the easiest to
work with, and IR ports not being as useful as the others. The
reason that parallel ports are easy is that you can set or read
bits directly--there are no protocols that you have to deal
with. Most scripting languages do require a helper program to
access the port. With Linux the parallel port is available to C
programs (using

Jon also talked a bit about safety. Parallel ports are safe in that the signal is only 5 volts. On the other hand, if your application controls power to appliances or other things that you might plug into a wall outlet, Jon recommended using switchable power boxes, rather than messing with 110V (or higher) directly.

Jonathan then demoed several applications. One application would send his cell phone a text message when his mailbox at home was opened and closed (i.e., when he had mail). Another application was a magnetic lock that uses RFID tags as keys.

The last talk I went to was by Michael Sparks, who works in a research group at the BBC. The BBC generates a lot of audio and video data[5], and they want easy ways to manage and manipulate it. Michael talked about Kamaelia, which is a Python application that lets them do that. Kamaelia provides a toolbox of simple components that can be pipelined together using Python generators. Developers don't have to deal with concurrency issues thanks to the pipeline structure. Nor do they have to deal with low-level details related to multimedia data, because that's all managed by the components in the toolbox. Kamaelia currently only runs on Linux because drivers for some of their hardware are not available on other flavors of Unix.

Wrap-up

All in all, it was a good week: lots of people doing interesting stuff, and Portland is always a fun city to visit. Next time I hope I finally make it to Powell's.

[1] This appears to be generational, with most of the concern coming from people who are older than 45.
[2] And they presumably don't have to coordinate the announcement of the fix with other vendors.
[3] It turns out that there's a perfectly fine sandwich shop a couple blocks from the convention center. But I didn't find out about it until Thursday.
[4] One of the rules in the code of conduct is that the code of conduct is not to be used as a weapon.
[5] One channel of video for a month is around 200 GB.
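A postscript on the Kamaelia talk: the idea of pipelining simple components with Python generators can be sketched roughly as below. This is plain Python, not Kamaelia's API; the names `source`, `upper`, `tag`, and `pipeline` are made up for illustration, and strings stand in for chunks of media data.

```python
def source(items):
    """Yield items one at a time (stands in for a media source)."""
    for item in items:
        yield item


def upper(stream):
    """A simple component: transform each chunk as it passes through."""
    for chunk in stream:
        yield chunk.upper()


def tag(stream, prefix):
    """Another component: label each chunk with a running index."""
    for n, chunk in enumerate(stream):
        yield f"{prefix}{n}:{chunk}"


def pipeline(src, *stages):
    """Chain the stages so each one consumes the previous one's output."""
    stream = src
    for stage in stages:
        stream = stage(stream)
    return stream


# Nothing runs until the result is consumed; chunks then flow through
# the stages one at a time.
result = list(pipeline(source(["a", "b"]), upper, lambda s: tag(s, "frame")))
```

Each stage only sees the stream handed to it, which is the property that lets components be written without thinking about the rest of the pipeline.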
Technorati tags: OSCON OSCON06 (2006-10-17 15:03:21.0) Permalink Comments [2]

As one might expect, working with the external community on OpenSolaris is a bit of a learning experience, for both the people who work for Sun and the people who don't. Differences in goals, assumptions, and so on lead to different ways of doing things. One difference that I've seen pop up recently has to do with ON's policy of "FCS Quality All the Time" (aka Production Ready All the Time). Ideally, we'd like the ON master gate to be good enough that if someone told us to cut a release tomorrow, we'd be happy to do it. In reality, there are usually showstopper bugs that need to be fixed before we could cut a release.

So what does "FCS Quality All the Time" mean for developers? One thing that it means is that bugs that break the build, or which make the code sufficiently unusable, must either be fixed quickly (within hours), or the gatekeeper will back out the change that introduced the problem. Another thing that it means--and this is what I want to focus on here--is that it's not acceptable to put something into the gate with known showstopper bugs. You can't say "yeah I know it's broken, but I promise to fix it before the release closes".

There are a couple reasons for this. One reason is to avoid a Quality Death Spiral. The other reason is to avoid a firedrill at the end of the release, where you have a bunch of deferred stoppers to fix. What happens in this situation, almost always, is that everyone has to put in extra time and the release is delayed. It's stressful, and because a lot of fixes are going in at the last minute, the final quality is probably worse than it would have been if the fixes had gone in earlier[1].

Notice I said a "bunch" of bugs. If the policy is that it's okay to putback with known stoppers, then it's okay for everyone, not just your project. And with 100 putbacks going into each build, that just doesn't scale.
A common reason for wanting to putback early is to give the code more exposure. While more exposure will help quality, it's better for project teams to make binaries available on their web site. These can be packages, BFU archives, or tarballs. It's potentially a little more work for the project team, but doing it this way is a win for the community as a whole.

So if you catch yourself thinking "it's okay, I'll just putback the fix later", be sure to ask yourself: is it okay enough to ship it as it is? If it is, great. If it's not, fix it before you putback. Everyone will thank you for it.

Notes:

[1] This is also why we require architectural approval prior to putback.

Technorati tags: OpenSolaris Solaris (2006-08-10 14:38:07.0) Permalink Comments [2]

As I mentioned in a previous entry,
the ON sources that are available via
opensolaris.org are a subset of the ON consolidation in the
Solaris product (90% as of build 42, measured in lines of
text). We've organized the ON source so that if part of a
component (library, command, kernel module) is closed, the
entire component is treated as closed. Closed components are in
their own subtree ( Since the OpenSolaris launch, I've gotten a few requests to support partial-source components. This is where the component is mostly open source, but it contains some code that can't be delivered as source for one reason or another. The usual proposal is to modify the OpenSolaris build to support a mix of .o and source files, similar to the way Sun and other vendors delivered their Unix kernels in the 1980s.
The appeal of supporting closed .o files is that once the
infrastructure is in place, the open source for these components
can be easily exposed outside Sun. And there's some precedent
for this practice, such as the closed-source files that were
split out from libc to The most obvious problem with this approach is that it complicates the OpenSolaris build infrastructure. Mechanisms have to be implemented to identify which .o files need to be included with the closed binaries, and to restore them after a "make clean" (but only for external developers, mind you). By itself, this extra complexity wouldn't be sufficient to kill the .o approach, but it means we don't want to spend energy on it unless it's an approach we want to keep.
A second problem is that this approach blurs the separation
between open and closed source. If we no longer have a system
where everything in
A third problem with this approach is that it makes it harder to
work in the open source tree. Imagine you're working on a
component that contains Worse, suppose you want to change a struct that's defined in
A fourth, more strategic, problem with this approach is that by reducing the barriers to having closed source in the Solaris product, it reduces the incentive to eliminate (or open up) the closed source. A fifth and final problem with this approach is that it assumes a delivery model where the master workspace is internal to Sun, and the external workspace is just a mirror that is produced by filtering the main workspace. That's not the model we're after in the long term. Rather, we eventually want the external workspace to be the master. So what about the precedents that I mentioned earlier? Let's
look at libc and The other precedent that I mentioned was the ath driver, which has some uuencoded .o files in the source tree. This approach reflects the regulatory requirements for wireless devices in (at least) the USA. Government certification is required to deploy these devices, and given the flexibility of the Atheros chip set, the Hardware Abstraction Layer (HAL) software is part of what gets certified. Change the HAL binaries, and you invalidate the certification. So even in Sun's internal source tree, the HAL files are kept as uuencoded binaries, not as source. This means that many of the issues with a general .o mechanism don't apply here. There is no special-case makefile magic for "make clean", and external developers get exactly what internal developers get. In summary, the .o approach requires non-trivial work, it has legal risks, and it treats the external community as second-class citizens. For these reasons, we will not pursue it. Notes[1] We could set up the makefiles in
such a way that you could tell if

Technorati tags: OpenSolaris Solaris (2006-06-14 09:00:00.0) Permalink

My wife and I were in Rochester, New York, for a couple days last week to visit one of my aunts. One night we had dinner with her and one of my cousins. At one point the question came up of where Rochester gets its water from. My cousin mentioned that when he was younger, he had worked for the government office that has the surveying records for the water pipes, so he could say with some certainty where the water comes from. Some of these pipes are pretty old. One was recorded as being "30 paces from where the bear fell". I wonder if any Solaris code will still be around in 200 years. Will it look as quaint as this surveying record?

(2006-06-11 19:14:42.0) Permalink Comments [1]