Open desktop mechanic

cat /dev/random | grep "For being ignorant to whom it goes I writ at random, very doubtfully"

Anybody can reboot

Thursday Sep 23, 2004

My uncle is a shade-tree auto mechanic. Whenever my dad had a problem with his 1969 AMC Rambler, Uncle Dell would recommend, "Jack up the radiator cap and drive another car under it." From what I've seen, that's often how problems in the Microsoft desktop world are solved. Reboot or reinstall the O.S. Fear not, whatever demons were possessing your desktop will be exorcised...for a while anyway. But you don't learn much that way.

Last week I was running SunRay Server 3 beta on a Java Desktop System 2.0. A couple of users had problems logging in to their NIS home directories, which are mounted under /home from a local NFS server. I logged in to my own NIS home directory and hit the same problem. Nautilus and gconfd-2 weren't happy. df -k showed mounts for dozens of users under /home, even though only three of them were in use, and it showed multiple mounts for some users. Earlier I had found that restarting autofs fixed a similar problem so:

/etc/init.d/autofs restart
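
Before restarting autofs, it can help to see exactly which mount points are listed more than once. Here is a minimal sketch, assuming a Linux-style mount table (device, mount point, filesystem type on each line) such as /proc/mounts or /etc/mtab. The function name duplicate_mounts is my own invention, not part of autofs.

```shell
# Print each mount point that appears more than once, preceded by its count.
# Input on stdin: mount table lines (device mountpoint fstype options ...).
duplicate_mounts() {
    awk '{ count[$2]++ }
         END { for (m in count) if (count[m] > 1) print count[m], m }' |
        sort -k2
}

# Usage on a live system:
#   duplicate_mounts < /proc/mounts
```

Reading from stdin rather than hard-coding a path means the same function works on a saved copy of the mount table taken before the restart.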

Well, the extra automounts were gone, but my session was still messed up. I cleared out my .gnome*, .nautilus, .gconf, and .gconfd preference files, and the problem only got worse. It would have been easy to just reboot. The system had been working fine for a couple of weeks before that.

Aha, it turns out I had accidentally wiped out the global configuration database in /etc/gconf while experimenting with an optimization script. I decided to kill the APOC configuration daemon before I reinstalled the gconf RPMs, but I did it in the clumsiest way possible and killed all JVM instances running on the machine, including those responsible for core SunRay services! The SunRay clients displayed a box indicating that they couldn't find the server. I should just reboot, but let me look at the SunRay manual. Hmmm, /opt/SUNWut/sbin/utrestart. Like magic, my session reappeared along with the sessions of anyone else who had been sharing that box.
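
Killing every JVM was the clumsy part. A gentler approach is to list only the processes whose command line matches the daemon, look at the list, and then kill just those PIDs. Here is a rough sketch, assuming input in the style of ps -eo pid,args. The function name pids_matching and the 'apoc' pattern in the usage note are illustrative guesses, not the daemon's documented process name.

```shell
# Print the PIDs whose command line matches an extended regular expression.
# Input on stdin: lines of "PID COMMAND ARGS...", e.g. from `ps -eo pid,args`.
# The pattern is passed through the environment so it does not show up in
# awk's own command line.
pids_matching() {
    PATTERN="$1" awk '
        $1 ~ /^[0-9]+$/ {                        # skip the ps header line
            pid = $1
            cmd = $0
            sub(/^[ \t]*[0-9]+[ \t]+/, "", cmd)  # strip the PID column
            if (cmd ~ ENVIRON["PATTERN"]) print pid
        }'
}

# Usage (hypothetical pattern; check the real command line first):
#   ps -eo pid,args | pids_matching 'apoc'
#   ...review the PIDs, then kill only those.
```

Matching on the full command line instead of the bare process name is what separates one Java service from another, since they all show up as "java".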

Earlier versions of nautilus/gnome-vfs had a nasty habit of searching for trash and for a writable directory on any share where it could put trash. This was not nice on a SunRay server with hundreds of deep automounted NFS trees. But I thought I remembered that this problem had been solved by Sun engineers and other GNOME community members. So my next suspect was autofs. I found a Sun engineer's whitepaper on some autofs deficiencies. Further investigation showed these deficiencies to be unrelated to my immediate problem. Then I remembered that NFS home directories were being shared between Solaris 8/9 GNOME 2.0 and the newer GNOME in Java Desktop System 2.0. The file
 ~/.gnome/gnome-vfs/.trash_entry_cache
contained entries for nearly every user under /home. Apparently even the newer gnome-vfs reads this cache and stats everything it sees there. Autofs notices that someone is looking and mounts the shares. Sure enough, if I launch nautilus without gconfd-2 and with the trash cache in place, mtab immediately fills with extra junk. So now how do we solve the problem of forward and backward compatibility of GNOME configuration files? I think this will take agreement from the entire GNOME community. As configuration moves from flat files into LDAP backends, the problem may become irrelevant. In the meantime, I'm glad I didn't reboot.

10:32am  up 30 days, 13:58,  3 users,  load average: 0.13, 0.09, 0.02 
Yeah, this is a beta.
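
For anyone who wants to repeat the nautilus experiment, the "mtab fills with extra junk" observation can be made concrete by snapshotting the mount points under /home before and after launching nautilus and comparing the two. This is a sketch under the same assumption of a Linux-style mount table. The function names and the /tmp file names are made up for illustration.

```shell
# Print the mount points under a prefix, sorted and de-duplicated.
# Input on stdin: mount table lines (device mountpoint fstype options ...).
mounts_under() {
    awk -v prefix="$1" 'index($2, prefix "/") == 1 { print $2 }' | sort -u
}

# Print the mount points present in the second snapshot but not in the first.
# Both files must be sorted, which mounts_under already guarantees.
new_mounts() {
    comm -13 "$1" "$2"
}

# Usage:
#   mounts_under /home < /etc/mtab > /tmp/before
#   (launch nautilus and wait a few seconds)
#   mounts_under /home < /etc/mtab > /tmp/after
#   new_mounts /tmp/before /tmp/after
```
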
I once explained my reboot philosophy to my brother as:
  • Microsoft Windows: Reboot for minor configuration changes, even to change IP address or upgrade a library!
  • GNU/Linux: You should only reboot when installing new hardware.
  • Solaris: Why would you reboot just to install new hardware?
Apologies if Linux and Microsoft Windows have improved recently, but can you swap out a CPU without rebooting?
