alanc @ sun.com

Alan Coopersmith’s blog

Random thoughts of a disorganized mind...
(and though it should be obvious, while Sun pays me to think about things, they disclaim any responsibility for these thoughts, nor do I claim what I say matches in any way what Sun thinks)

http://blogs.sun.com/alanc/date/20060428 Friday April 28, 2006

X Changes in Nevada Build 39

The ChangeLog for the X Consolidation for Solaris Build 39 has now been posted. However, the source drop won't be available until next week, because some lab work made our file server and build machines unavailable today (and since I had to work on that, I couldn't prepare the source drop on other machines either).

For OpenSolaris source release, the most notable change in this build is that it contains the first replacement of existing Solaris X sources (from our not-yet-opened portion of the tree) with the equivalent sources from the X11R7 modular release, resulting in both a newer version of the sources being used in Solaris, and more sources available as part of our OpenSolaris release. It's nothing major - just the xproto-7.0.4 package which delivers the base X11 protocol headers and some headers used by the rest of the X stack, but it's the base of the modular dependency tree, and thus a necessary first step. (Functionality-wise, the most notable change in the headers is yet another batch of keysym name definitions.)

The full list of fixes is:

6406200 need trusted logo in xscreensaver lock program
When you lock the screen in a Solaris Trusted Extensions session, the logo will show you that, instead of showing the normal Solaris lock logo.
6385078 xlock is not passing PAM_CHANGE_EXPIRED_AUTHTOK to pam_chauthtok
A trivial fix noted by our PAM gurus after they found a similar problem in Solaris su - when a password has expired and you need to change it, you're supposed to call pam_chauthtok with the PAM_CHANGE_EXPIRED_AUTHTOK flag. The basic Solaris password PAM modules don't seem to have minded this omission, but others may need it.
6374699 FMRI application/x11/xfs should run as noaccess
For years on Solaris our inetd.conf entry to start the X Font Server listed it to run as the nobody user, even though that's really only supposed to be used on Solaris for NFS mounts when "squashing" root privileges. We copied that setting to the SMF manifest when we converted from inetd.conf to SMF, but have now updated that to the more appropriate noaccess account.
6411370 X sources should use FamilyInternet6 instead of FamilyInternetV6
When I first wrote the IPv6 changes for Solaris, I called the #define for the family name FamilyInternetV6, but when the X.Org standards committee reviewed it, they decided to drop the V to be consistent with other uses such as AF_INET6 in the BSD sockets API for IPv6. I finally updated the uses of this definition in the Solaris X sources to match.
6409332 infinite loop in XFlushInt() on x86/32-bit
See my previous blog entry on “The Compiler Bug that Wasn't”.
6411857 Xorg modularization: xproto-7.0.4
As noted above
6411989 makekeys needs to handle Unicode-mapped keysyms
Since the libX11 source hasn't been updated to the latest X.Org version yet, this change from the X.Org libX11 had to be pulled into the makekeys program used to generate the hash tables used in our existing libX11 to handle the new Unicode-mapped keysyms that are now in the keysymdef headers installed in the xproto-7.0.4 package.
6413255 xdm checks for username of "root" instead of uid 0 when doing non-console login check
The description pretty much says it all, and while we haven't released our xdm source yet, this change was given back to X.Org and is now included in the just-released xdm-1.0.4 module.
6303855 ATI driver performance is poor
As discussed in X.Org bug #5867, the ATI RageXL chips built into certain motherboards (including those of some Sun systems, like the Ultra 20) go faster if you tell Xorg not to use extra frame buffer memory to cache pixmaps. This is an updated fix for that which doesn't add a new configuration flag as was previously proposed.
6398094 default resolution too low on metropolis workstation
6406044 Screen off center with left margin on 24.1" monitor with analog input
Two more fixes from the team working to improve our Xorg autoconfiguration experience. These update the modeline selection code in Xorg and also incorporate the CVT code from X.org that Luc Verhaegen wrote for the soon-to-be-released Xorg 7.1.
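
To illustrate why the xdm fix above (6413255) matters, here is a minimal Python sketch of the idea. It is not the xdm code (which is C); the is_superuser helper and the sample account table are hypothetical names made up for this example. The point is only that any account with uid 0 has root privileges, whatever it happens to be called, so the check should look at the uid rather than the name.

```python
import pwd

def is_superuser(username, lookup_uid=None):
    """Return True if username maps to uid 0, whatever the account is called."""
    if lookup_uid is None:
        # Default: consult the system password database.
        lookup_uid = lambda name: pwd.getpwnam(name).pw_uid
    try:
        return lookup_uid(username) == 0
    except KeyError:
        # Unknown account: treat as not privileged.
        return False

# Hypothetical account table for illustration: "toor" is a second uid-0 account.
SAMPLE_ACCOUNTS = {"root": 0, "toor": 0, "alanc": 1001}

def sample_lookup(name):
    return SAMPLE_ACCOUNTS[name]
```

A name comparison against "root" would let the hypothetical "toor" account through the non-console login check, while is_superuser("toor", sample_lookup) catches it.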

http://blogs.sun.com/alanc/date/20060331 Friday March 31, 2006

X Changes in Nevada Build 37

Unfortunately, all the changes in this build happened to be in the bits we haven't released yet (though I'm hoping to get our xscreensaver sources out sometime soon), so I can't point you at the changes in our just released source drop, but when the Solaris Express build 37 images come out, you should see these changes in the binaries.

6377194 XST extension wrapping makes the Composite and Damage wrapping not work
In the X server, many extensions do their work by replacing entries in tables of function pointers with their own functions, that do some work, then call the previous functions. Our colleagues in the Project Looking Glass team found that the XST extension (from the STSF project) had installed several of these function wrappers in a way that broke the similar wrappers from other extensions. Since we're in the process of removing STSF from the system, these wrappers were disabled to allow these other extensions to work.
6255133 SunRay: Xinerama: memory leak in Xsun after calling XCreatePixmap(3X11)
The Xinerama extension allows combining multiple graphics devices into one large virtual screen. One of the things it does to allow this is to make a separate copy of every pixmap in the X server for each underlying device, so that different cards can operate on it in the most efficient way for them or store it in their on-board memory. For Sun Ray systems though, where all devices are always the same, and all pixmaps are just stored in the main system RAM, this duplication wastes RAM and CPU time (since all operations have to be repeated for each copy), so we allowed an Xsun ddx module to notify the system that it can share copies. A bug crept in though, where this wasn't registered correctly with the list of resources to be freed when the client exited, so clients that exited without releasing their pixmaps caused Xsun to leak memory. (This is also being patched for Solaris 9 and 10 Xsun.)
6232241 NSCM login takes username twice
The never ending struggle to get xscreensaver's PAM conversation to play nicely with Sun Ray's Non-SmartCard Session Mobility PAM modules goes on. Fortunately, Mahmood has been working on this, so I don't understand this enough to explain it.
6388473 xscreensaver needs to be modified for Trusted JDS
As part of the work to create a Trusted JDS desktop for the Solaris Trusted Extensions, xscreensaver had to be modified to allow admins to enforce system security policies, including deciding whether or not users can disable the screen lock or change the lock timeout, when running on a system with the Trusted Extensions installed and enabled.
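
For readers who haven't seen the function-wrapping pattern mentioned under 6377194, here is a rough Python sketch of how it is supposed to work. The real X server does this in C with tables of function pointers; the Screen class, the create_pixmap slot and the extension names used below are simplified stand-ins I've assumed for the example, not the actual server structures.

```python
def base_create_pixmap(log, width, height):
    """The function originally installed in the table."""
    log.append("base")
    return (width, height)

class Screen:
    """Stand-in for a server object holding a table of function pointers."""
    def __init__(self):
        self.create_pixmap = base_create_pixmap

def install_wrapper(screen, name):
    """Replace the table entry with a wrapper that chains to the previous one."""
    previous = screen.create_pixmap          # save whatever is installed now

    def wrapper(log, width, height):
        log.append(name)                     # the extension's own work
        return previous(log, width, height)  # then call the earlier function

    screen.create_pixmap = wrapper

def demo():
    """Install two wrappers and return the order in which they ran."""
    screen = Screen()
    install_wrapper(screen, "damage")
    install_wrapper(screen, "composite")
    log = []
    screen.create_pixmap(log, 4, 4)
    return log
```

Each wrapper only works if every other wrapper saves and calls the previous entry the same way; one that skips or overwrites the saved pointer cuts the rest of the chain off, which is the kind of breakage described above.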

X code released to OpenSolaris

For now just a quick copy of the announcement - I'll write more once I catch my breath...

The first code drop from the X Window System Consolidation has been posted to opensolaris.org. It's a snapshot of a subset of the Solaris X Consolidation code from partway through Nevada build 38.

Details on what's included and links to downloads & licenses can be found on the X Community Sources page.

Source is not yet available in the OpenSolaris Source Browser, but work is in progress on preparing that for availability sometime next week.

For more information, or to discuss the X Consolidation, join the X Community on OpenSolaris.Org.

http://blogs.sun.com/alanc/date/20060324 Friday March 24, 2006

Xorg bug on Ferrari 4000 laptops in Solaris Nevada build 35 and later

Since I know the Acer Ferrari 4000 laptop is popular in Solaris circles, a warning to users of it about a bug found by Sun's internal Ferrari user group: if you're using Nevada build 35 or later, rename /usr/X11/lib/modules/libvbe.so so the Xorg server doesn't find it. If it's there, the new monitor probing changes introduced in build 35 by Sun bug 6385111 (aka Xorg bug 5892) cause the Ferrari 4000 to attempt to use VESA BIOS Extensions (VBE) to get the monitor settings after the normal methods failed. Unfortunately, while this works the first time you do it, if you do it a second time without rebooting in between, it seems to cause the Ferrari 4000 BIOS to hang the entire machine, requiring you to manually power it down to recover. Since this is an optional module, if Xorg can't load it, it just skips it.

This failure is being tracked in Sun's bug database as 6402721: Restarting Xorg hard hangs the system (Acer Ferrari 4000, Ati Radeon X700) and is being worked on now by the engineer who introduced the fallback to VBE into our Xorg. So far it's only been reported on the Ferrari 4000 laptops, but could potentially be seen on other machines with similar BIOSes.

http://blogs.sun.com/alanc/date/20060320 Monday March 20, 2006

X Changes in Nevada Build 36

Another two weeks, another list of fixes checked in. The one with the biggest share of attention is also the one with the smallest code change - two missing pairs of parentheses - four simple characters that closed one big security hole.

6387822 Wrong library path in xft.pc file
Simple fix to the pkg-config data file we ship for libXft2 so it produces the right library path flags for linking so that GNOME 2.14 builds correctly.
6383556 Problem in allocating pixmap
The last security fix in X servers added checks to both Xsun & Xorg to prevent pixmap allocations from overflowing. Unfortunately one of the checks in Xsun clamped down too far - preventing pixmaps with dimensions larger than 8192 instead of the intended 32k limit.
6390864 nevada removal of ddxSUNWdials
We bow our heads for SunButtons and SunDials - faithful servants of almost two decades, now sent to permanent retirement. The hardware for these hasn't been sold for several years now and the kernel driver for them was removed, so we had to remove the Xsun support as well. (The official end of support notice should appear in the Solaris 10 Update 2 release notes, warning of removal in a future release - but we normally don't remove support in update releases, so users still attached to theirs can stay on Solaris 10 without fear.)

If you've never seen these, they were additional input devices - SunButtons offered a big pad of extra buttons, like a jumbo set of keyboard function keys, and SunDials offered a bunch of knobs you could twist. These were accessed via the X Input Extension by software such as CAD programs for more efficient interaction with their features.

6368334 common postscript-derived font names are no longer recognized
An updated set of font aliases to fix some problems reported with the ones added in build 34.
6390453 SUNWxorg-mesa has broken links in snv nightly build for 2/24/2006
The script integrated into build 34 to make symlinks to either the nVidia or Mesa OpenGL libraries was failing to create the right links to the Mesa libraries in certain cases.
6395871 integrate Solaris Trusted Extensions to X Windows (Xsun)
6395892 integrate Solaris Trusted Extensions to X Windows (X.org)
Sun's previous Trusted Solaris product is being replaced for Solaris 10 with the Trusted Extensions to Solaris. Instead of being a separate fork of the OS, it will run standard Solaris 10 with additional modules loaded to provide the multi-level security features. For X, this means shipping a new library (libXtsol) and putting hooks into the X server that the XTSOL extension loadable modules delivered in the Trusted Extensions for Xsun & Xorg can use to implement their own security checks as needed. We'll be offering this back to the open source X.Org community in the near future under the standard MIT/X11 license.
6396593 [Xorg Bug 6213] local user DoS and arbitrary code execution as root [CVE-2006-0745]
See previous blog post.
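
As a rough illustration of the pixmap check discussed under 6383556, here is a small Python sketch. It is not the Xsun code: the exact limit value (I've assumed 32767, the largest signed 16-bit number, for "32k"), the bytes-per-pixel default and the 32-bit size ceiling are all assumptions made for the example.

```python
MAX_PIXMAP_DIM = 32767      # assumed value of the intended "32k" limit
MAX_ALLOC_BYTES = 2**32 - 1  # assumed ceiling: size must fit in 32 bits

def pixmap_size_ok(width, height, bytes_per_pixel=4):
    """Accept a pixmap only if its dimensions and total byte size are sane."""
    if not (0 < width <= MAX_PIXMAP_DIM and 0 < height <= MAX_PIXMAP_DIM):
        return False
    # Python integers don't overflow, so compare against the ceiling directly;
    # C code would have to check before multiplying.
    return width * height * bytes_per_pixel <= MAX_ALLOC_BYTES
```

With the limit mistakenly set to 8192, a request such as pixmap_size_ok(10000, 100) would have been rejected even though it is nowhere near overflowing.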

http://blogs.sun.com/alanc/date/20060310 Friday March 10, 2006

X Changes in Nevada Build 35

Not as much in this build as there was in build 34, but then almost everyone who works on the X source trees in Solaris was at the X.Org Developer's Conference for half of the two-week build cycle for build 35.

6303855 ATI driver performance is poor
The ATI Rage XL chip in the Ultra 20 is not exactly a speed demon. Edward from our x86 team noticed that a big part of the slowness was due to reading back pixmaps across the PCI-33 bus from the VRAM and proposed a fix - you can see his fix in X.Org Bug Report #5867. (If you want a speed demon, the Ultra 20 is offered with a selection of nVidia Quadro cards.)
6358930 Japanese keyboard should work even if XkbModel/jp106 and XkbLayout/jp are specified in xorg.conf.
A simple fix to our Japanese XKB rules from Sun's Japanese localization group to make non-Sun and Sun Japanese keyboards work the same way.
6385111 Xorg auto-configuration Improvement
As Stuart talked about at the X Developer's Conference, one of our big projects in the X group at Sun is improving Xorg configuration, especially automatically detecting the correct hardware settings. This is one step in that process, improving monitor detection, as described in X.Org Bug Report #5892.

http://blogs.sun.com/alanc/date/20060222 Wednesday February 22, 2006

X Changes in Nevada Build 34

I always say I never have anything worth blogging about, and then I find the stuff I consider trivial gathers interest. For example, every two weeks I've been updating the Solaris Nevada X Consolidation ChangeLogs at OpenSolaris.org, but didn't post much about them here. As an experiment, this week I'll post the change list here as well, with some brief comments where I can (remember - I'm just one of a group of people working on this stuff, and I don't know all the details of what everyone else is doing, so apologies if I skimp on some details or get them wrong).

6365777 libfontconfig should not use fopen/stdio routines
Import to our fontconfig builds the fix from fontconfig CVS to use open() instead of fopen() to get around the Solaris 256-fd limit in stdio.
6376462 Xsun needs -br option
Port the Xorg option to change the default background from the grey root weave to a solid black screen, for less flashing when going from a graphical kernel boot into the X server startup.
6358266 Three xserver packages failed while upgrading to build 29 ( Sparc and intel )
For internal testing purposes, we have a script to pkgrm the old X packages and pkgadd the new ones. This script had to be modified to svcadm disable the X services before the pkgrm to work around bugs in some of the older X package revs so that the pkgrm could complete successfully.
6368334 common postscript-derived font names are no longer recognized
Xsun has long recognized an alternative syntax for font names that look like PostScript style font names. Xorg doesn't special case this in code but just adds aliases - but we didn't add those aliases when integrating Xorg so some software that used those names started to fail. This fix added the missing aliases to our font.aliases files.
5099951 Fonts (in particular Lucida Sans Typewriter) look terrible in JDS
Continuing progress in Jay Hobson's work to improve the quality of font display in the GNOME desktop. You can see some of the effects of this already in the current Solaris Express releases.
6308859 [s10u1] xscreensaver: password lock dialog is not localized
The GTK-based unlock dialog Solaris uses for xscreensaver was failing to call gettext() to get localized versions of various strings, so these calls were added.
6378204 Xsun splash screen graphics not in line with unified "coolstart" branding
Sun's User Experience Design team has been working with the various product groups to unify and update the look of various software pieces so it all looks like it's from the same place and fits with the "S curve" theme used in all of Sun's ads, web sites and products now. This is the Xsun splash screen update to the new graphics - you can already see the JDS changes in build 33, and more changes are coming to other parts of the system in future builds.
6380709 Xsun doesn't honor xhost +si:* on unix socket or named pipe connections
See my previous post on this one.
6380620 X11(5) lists sawfish(1) as GNOME window manager
Oops - forgot to update that when Sun's GNOME builds switched from sawfish to Metacity. While we were updating, also brought in the latest changes from X.Org, added gdm2 to the list of ways to start X on Solaris, and replaced xon with ssh -X as the recommended way of starting a remote X client.
6355580 fonts.conf need to be updated with Kacst fonts
New Arabic fonts being added to Solaris needed to be added to the fonts.conf Sans, Serif, & Monospace aliases
6377618 Need OpenGL vendor switching support
This clears one of the hurdles to getting the nVidia Accelerated Graphics Driver included in Solaris - providing a way to have the system use nVidia's libGL when it detects the nVidia kernel driver is loaded, and Mesa's libGL otherwise.
6245431 Xorg fails with no mouse connected
Previously if Xorg did not find a mouse at startup, it would start with none, and fail to recognize mice hotplugged later - this changes it to go ahead and open /dev/mouse anyway, so that hotplugging the mouse (or switching to it on your USB kvm) later will work as expected.
6376708 Update nv Xorg driver to Jan-2006 version
Mark Vojkovich added support for some new nVidia chipsets (follow the bug link for the list) to XFree86, and Aaron Plattner integrated that to X.Org CVS - this just pulls in those changes.
6379980 Xorg: fatal: theatre_drv.so: open failed: No such file or directory
Some of the modules for TV ports on ATI cards were linked incorrectly in Solaris and caused Xorg to exit when it detected the right hardware and tried to use that module.
6339635 mesa GLU/GLw locations should be consistent with SPARC
Symlinks were added so that the GL libraries could be found in the same places on SPARC & x86 to ease porting software.

(This was a much bigger batch of changes than usual for some reason - not sure why, though code freeze was just before the X.Org Developer's Conference so maybe we were just trying to get everything checked in early so we wouldn't have to worry about it during the conference.)

http://blogs.sun.com/alanc/date/20060216 Thursday February 16, 2006

Xorg for Solaris SPARC?

I got two comments quickly after my last post, and have been asked a few other times lately, so I'll post the answer here as a new entry so it's more visible. The question is “So when will we have Xorg on Solaris SPARC?” The answer unfortunately is simply “I don't know.” The Sun Ray team are working on porting the Sun Ray drivers to Xorg and the SPARC graphics group are porting the XVR-2500 drivers to Xorg. Our group provides assistance as needed, but most of the work is in the hands of those groups, and I don't know their plans for release schedules yet (and it wouldn't be my place to announce for them even if I did). It's being worked on, but that's about all I know or can say right now.

http://blogs.sun.com/alanc/date/20051206 Tuesday December 06, 2005

Solaris Desktop Summit: Performance Day 1

The Performance portion of the Solaris Desktop Summit got off to a rousing start today with Bryan's talk on the secrets of the DTrace gurus (or at least all the new bits they've added recently and are still working on documenting in the DTrace Guide).

In the afternoon, we broke into working groups, and I joined the team looking at Boot Time. We split our group further, so I worked with gdm maintainer Brian Cameron on trying to see if we could figure out where we could improve the time it takes to start gdm and the X server once the rest of the system is up. We pieced together enough dtrace to determine that about half of the time was spent in gdmlogin, the program that draws the actual login dialog, about a quarter in Xorg itself, and the rest spread across the other processes involved (including the many svcprop calls from the /usr/X11/bin/Xserver script to get the X preferences from the Solaris SMF registry - an area already noted by others as ripe for improvement).

So we started with the gdmlogin process and dug in - it wasn't long before we hit the limit of my DTrace skills, and we called in Bryan to pinch-hit. [To keep things less confusing in the next part, I'll switch to login names, and refer to Bryan as “bmc” and Brian as “yippi.”] He flew through it - finding things that looked strange and digging in, while yippi looked in the source code for explanations. We saved the scripts and DTrace one-liners on yippi's laptop, and will hopefully be able to post more later. There were three main things we found to investigate/try improving:

  • bmc noticed a lot of time being spent in the close() system call, which seemed strange. Tracking this down to the stack trace involved, yippi saw it was socket closes in the gdmlogin configuration calls. When gdmlogin needs to get a configuration value from the parent gdm daemon it opens a socket, checks the version, asks for the value it needs, then tears down the socket. Over and over again it does this, instead of simply caching the socket and reusing it for future calls, so yippi is looking into seeing if we can convert it to do so.
  • bmc also noticed a lot of stat() system calls, and found two big causes:
    • gdm has a list of possible supported locales and stats the locale directory for each one to see which are installed. We thought of two ways to reduce the cost of these. First, instead of having gdmlogin do this every time, having the parent do it and pass the list to the children could reduce the time on systems with multiple login screens, such as multiseat systems, including Sun Ray. That shouldn't make much difference on single-user systems though. The second idea is to change the scan to read the parent directory containing all the locales and use the list of its subdirs, instead of checking a huge list of all possible subdirs. That could greatly cut down the number of system calls on systems with only a few locales installed.
    • gdm was also scanning lots of directories to find the icons for the gtk theme, including stat'ing the same directory over and over again. Another group is looking into gtk performance and will be looking into this more.
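
The second locale idea above can be sketched in a few lines of Python. This is only an illustration of the two strategies, not gdm's code (gdm is written in C); the function names are made up, and the locale root and candidate list are passed in as parameters rather than taken from any real gdm configuration.

```python
import os

def installed_by_stat(locale_root, candidates):
    """Current approach: one stat() per possible locale name."""
    return [name for name in candidates
            if os.path.isdir(os.path.join(locale_root, name))]

def installed_by_listing(locale_root, candidates):
    """Proposed approach: read the parent directory once, then intersect."""
    with os.scandir(locale_root) as entries:
        present = {entry.name for entry in entries if entry.is_dir()}
    # Preserve the order of the candidate list.
    return [name for name in candidates if name in present]
```

Both return the same answer; the difference is that the first makes one system call per candidate name, while the second makes a number of calls that depends on how many locales are actually installed.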

We didn't get a chance to make all the changes and test the code - at least not before I left, though yippi was still debugging the first set of changes then. If he didn't finish today, we should be tackling it again tomorrow to see if we can measure improvements from the changes and find any more or start digging into Xorg.

P.S. John has written a lot more than I did about the Usability portion of the Summit we held last week, including some of the problems we identified and the ideas we kicked around to solve them.

http://blogs.sun.com/alanc/date/20050916 Friday September 16, 2005

Solaris patches for CAN-2005-2495

A security hole in processing XCreatePixmap requests in the Xserver (known as “CAN-2005-2495”) was announced this week. This affects most X servers based on the original X11R6 code from the X Consortium at MIT, so we've released preliminary patches for the Xsun & Xorg servers in Solaris. These haven't had time to go through the full patch regression test process yet, so aren't in the main patch site for now, but in the special Preliminary Security T-patches area on SunSolve.

Further details, including the list of which patches to use for each Solaris release, can be found in Security Sun Alert #101926. (And yes, there is a slight mistake in the current version since it references XPM files, which are not involved in this exploit - that was an accidental copy of the description from the previous libXpm security alerts. Unfortunately, I didn't notice that until after I told the Sun Alert team the draft alert was correct. I let them know it was wrong, so hopefully they can fix that. It should say something more like “A program that has access to the X server (via xhost or xauth authentication) can make calls that may allow it to execute arbitrary code with the privileges of the X server.” Which is of course, just another reason you should just say no to “xhost +”.)

http://blogs.sun.com/alanc/date/20050711 Monday July 11, 2005

Can GNOME startup time be improved via ld flags?

Bryan Cantrill, Master DTrace Guru, First Class, spent some time today looking at what exactly GNOME is doing when you login to a Java Desktop System session on Solaris, and posted his findings to his weblog. (The current JDS on Solaris is based on GNOME 2.6, since that's what was the stable release last year when Solaris 10 hit feature freeze. The JDS team is working on an update to GNOME 2.10 now.)

One of the things Bryan found was that a large part of the I/O time was spent loading shared object text. I took a quick look at some of the binaries and libraries using elfdump, and noticed that there were no signs of using flags that could reduce the time needed to load shared libraries at process startup. Some of these (like -z lazyload) defer work until later - others (like -z combreloc) reduce the work needed whenever it happens.

I sent some suggestions to the JDS team on using these flags and others to improve this and suggested especially reading the Performance Considerations chapter of the Solaris Libraries and Linkers Guide for more ideas. I also cc'ed the linker gurus, and Senior Linker Alien Rod Evans added a suggestion to try out the check_rtime perl script on the binaries to check for the recommended flags and whether any of the libraries linked against aren't really needed. It's currently set up for use in the build system of the OS/Networking consolidation (the portion of the Solaris sources already released via OpenSolaris), but should be adaptable to the JDS build system or in fact, any project that wants to try to optimize its library/linker use on Solaris.

Unfortunately, just tweaking the flags will mostly help Solaris, but the GNU binutils ld used on Linux and some other platforms offers some similar functionality - it recognizes many of the same -z options for instance, though I haven't tried them to see how they compare.

Something that may help more on both platforms is ensuring the libraries listed in the various .pc files for GNOME only list the direct requirements, not all the dependencies they depend on as well. For instance, look at what is linked into every program on Solaris that uses the gtk toolkit:

alanc@unknown:~ [2] pkg-config --libs gtk+-2.0
-lgtk-x11-2.0 -lgdk-x11-2.0 -latk-1.0 -lgdk_pixbuf-2.0 -lm -lmlib -lpangoxft-1.0 
-lpangox-1.0 -lpango-1.0 -lgobject-2.0 -lgmodule-2.0 -lglib-2.0
But if you run elfdump -d /usr/lib/libgtk-x11-2.0.so you'll see libgtk-x11-2.0.so already lists those dependencies, so duplicating them in the applications simply wastes time as the linker at runtime will load libgtk-x11-2.0.so and have to check the same list of libraries it already checked in the application (though it should find it's already taken care of them and doesn't duplicate all the work). Additionally it hardcodes in the applications knowledge of the internals and backends used that they shouldn't need to know about, and makes it harder to change or replace one of them. While all those libraries need to be listed when statically linking, or on older systems (mainly pre-ELF I think), the pkg-config entries should be streamlined when using ELF shared libraries on modern systems.
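
To make the "direct requirements only" idea a bit more concrete, here is a small Python sketch that trims a link line down to the libraries a program uses directly and reports which of the others are already pulled in by those. The function names are hypothetical, and the dependency map has to be supplied by the caller (for instance from elfdump -d output); nothing here parses real .pc files or ELF headers.

```python
def reachable(lib, needed, seen=None):
    """All libraries pulled in transitively through lib's own dependency list."""
    seen = set() if seen is None else seen
    for dep in needed.get(lib, ()):
        if dep not in seen:
            seen.add(dep)
            reachable(dep, needed, seen)
    return seen

def trim_link_line(listed, direct, needed):
    """Split listed libs into those to keep and those already covered."""
    covered = set()
    for lib in direct:
        covered |= reachable(lib, needed)
    keep = [lib for lib in listed if lib in direct]
    dropped = [lib for lib in listed if lib not in direct and lib in covered]
    return keep, dropped
```

For example, with a made-up map in which "gtk" needs "gdk" and "glib", calling trim_link_line(["gtk", "gdk", "glib"], {"gtk"}, {"gtk": ["gdk", "glib"]}) keeps only "gtk" and reports the other two as redundant on the application's link line.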

[Now Playing: Deep Space 9 series finale (recorded today off Spike TV by our TiVo)]

http://blogs.sun.com/alanc/date/20050708 Friday July 08, 2005

Xserver provider for DTrace

A few months ago I sent mail to Sun's dtrace and X11 internal mailing lists about something I'd been playing with:

After trying to absorb as much as possible at last week's dtrace classes, I sat down with the manual and tried things out for a bit to help it sink in before I forgot it all. One of the chapters I stumbled across was the one on adding your own probes to your own applications, which reminded me of some conversations I'd had with various people (Bart and Mahmood and others I've probably forgotten). Caffeine was consumed, and one thing led to another, and after a push in the right direction from the dtrace-interest list...

# dtrace -l -n 'Xserver*:::'
   ID   PROVIDER            MODULE                FUNCTION NAME
    4 Xserver848              Xsun                Dispatch request-start
    5 Xserver848              Xsun                Dispatch request-done
    6 Xserver848              Xsun              InitClient client-connect
    7 Xserver848              Xsun         CloseDownClient client-disconnect 

# dtrace -q -n 'Xserver*:::request-start { t = vtimestamp } \
  Xserver*:::request-done { \
   printf("Client: %3d Request: %20s Size: %5d Time: %10d\n", \
          arg3, copyinstr(arg0), arg2, vtimestamp -t)}'
Client:   4 Request:  X_SetClipRectangles Size:     5 Time:      46736
Client:   4 Request:           X_ChangeGC Size:     4 Time:      16307
Client:   4 Request:           X_CopyArea Size:     7 Time:      68328
Client:   4 Request:  X_SetClipRectangles Size:     5 Time:      26480
Client:   4 Request:           X_ChangeGC Size:     4 Time:       8833
Client:   4 Request:  X_PolyFillRectangle Size:     5 Time:      43680
Client:   4 Request:  X_SetClipRectangles Size:     5 Time:      28506
Client:   4 Request:           X_ChangeGC Size:     4 Time:      11920
Client:   4 Request:           X_CopyArea Size:     7 Time:      46566

[sitting at the dtlogin screen, watching the cursor blink - it could probably cache the GC's and clip lists and use a little less bandwidth and CPU to blink that cursor]

This script waits for the next client to connect, keeps a count of all the requests it makes, then prints the count and exits when that client does:

#!/usr/sbin/dtrace -s

string Xrequest[uintptr_t];

Xserver$1:::client-connect
/clientid == 0/
{
    clientid = arg0;
    printf("\nClient %d connected - tracing requests from it\n", clientid);
}

Xserver$1:::client-disconnect
/clientid == arg0/
{
    printf("\nClient %d disconnected - ending trace\n", clientid);
    exit(0);
}

Xserver$1:::request-start
/Xrequest[arg0] == ""/
{
    Xrequest[arg0] = copyinstr(arg0);
}

Xserver$1:::request-start
/arg3 == clientid/
{
    @counts[Xrequest[arg0]] = count();
}

This started out as a plaything to give me something to learn dtrace with, but it looks useful to me and could easily turn into something more if others see uses for it. Since some of these probe points are directly in the hottest code path of our benchmarks, we'd have to make sure that they don't affect our benchmark scores too much, but that shouldn't be much of a problem. Yes, you can do much of this with xscope, but this doesn't require tunnelling everything through the slow xscope proxy server and then finding some way to make sense of the huge output logs.

And I've just put probe points in the easiest and most obvious places - there's other interesting places we could put in probes - when a client does a grab for instance, or on outgoing events and/or errors.

So I guess this is also a request for comments - would others find this useful? What probe points would be useful to use in the X server and what data would you like to get out at those probe points?

For instance, this is what I picked to make available in the current probes:

request-start: request name/extension name, minor code (for extensions), request length, client id, client sequence
request-done: request name/extension name, minor code (for extensions), request length, client id, result code
client-connect & disconnect: client id

After some feedback from people, I made some refinements to the existing probes, and added some more probes, then made it available for internal use. Since then I've added a few more probes to help in tracking down the CDE window manager pixmap leak I blogged about earlier. And now, I've made it available outside Sun as well, so more people can try it out, see if the probes available are useful or if they should be modified or expanded, and give feedback on it. At some point I hope to integrate this directly into both the Xsun and Xorg servers delivered in Solaris, as well as into the open source Xorg server code, but I'd like to get some more experience with it from more people first.

After you install it, you should see this set of probes available: (the number after "Xserver" in the provider name is the process id of the currently running Xorg server process)

# dtrace -l -n 'Xserver*:::'
   ID   PROVIDER            MODULE                          FUNCTION NAME
    4 Xserver1335              Xorg    FreeClientNeverRetainResources resource-free
    5 Xserver1335              Xorg                FreeResourceByType resource-free
    6 Xserver1335              Xorg                      FreeResource resource-free
    7 Xserver1335              Xorg                          Dispatch request-done
    8 Xserver1335              Xorg           EstablishNewConnections client-connect
    9 Xserver1335              Xorg          AllocLbxClientConnection client-connect
   10 Xserver1335              Xorg        CloseDownRetainedResources client-disconnect
   11 Xserver1335              Xorg                   CloseDownClient client-disconnect
   12 Xserver1335              Xorg                    ProcKillClient client-disconnect
33598 Xserver1335              Xorg                          Dispatch client-disconnect
33667 Xserver1335              Xorg                       AddResource resource-alloc
33668 Xserver1335              Xorg                          Dispatch request-start
33669 Xserver1335              Xorg                  ClientAuthorized client-auth
33670 Xserver1335              Xorg                  FreeAllResources resource-free
33671 Xserver1335              Xorg               FreeClientResources resource-free

One of the example scripts I've also made available, client-watch.d, reports every client that connects and, when it disconnects, a count of how many X requests it made and how much time the X server spent processing them. For example, I captured this from logging into and then out of a Java Desktop System (GNOME 2.6) session on Solaris 10:

   connect -> id:    8
 client id -> id:    8 is from local process 1706 (/usr/bin/gnome-session)
1706:   /usr/bin/gnome-session
[...]
   connect -> id:   14
 client id -> id:   14 is from local process 1831 (/usr/bin/nautilus)
1831:   nautilus --no-default-window --sm-client-id default3
   connect -> id:   15
 client id -> id:   15 is from local process 1833 (/usr/bin/gnome-volcheck)
1833:   gnome-volcheck -i 30 -z 3 -m cdrom,floppy,zip,jaz,dvdrom --sm-client-id default
   connect -> id:   16
 client id -> id:   16 is from local process 1857 (/usr/lib/clock-applet)
1857:   /usr/lib/clock-applet --oaf-activate-iid=OAFIID:GNOME_ClockApplet_Factory --oaf
   connect -> id:   17
 client id -> id:   17 is from local process 1831 ()
1831:   nautilus --no-default-window --sm-client-id default3
   connect -> id:   18
 client id -> id:   18 is from local process 1859 (/usr/lib/wnck-applet)
1859:   /usr/lib/wnck-applet --oaf-activate-iid=OAFIID:GNOME_Wncklet_Factory --oaf-ior-
   connect -> id:   19
 client id -> id:   19 is from local process 1867 (/usr/lib/gnome-netstatus-applet)
1867:   /usr/lib/gnome-netstatus-applet --oaf-activate-iid=OAFIID:GNOME_NetstatusApplet
   connect -> id:   20
 client id -> id:   20 is from local process 1875 (/usr/lib/mixer_applet2)
1875:   /usr/lib/mixer_applet2 --oaf-activate-iid=OAFIID:GNOME_MixerApplet_Factory --oa
   connect -> id:   21
 client id -> id:   21 is from local process 1877 (/usr/lib/notification-area-applet)
1877:   /usr/lib/notification-area-applet --oaf-activate-iid=OAFIID:GNOME_NotificationA
[... logout of JDS ...]
disconnect -> id:   16, lifetime: 8508 ms, requests: 364 (55 ms of CPU time)
disconnect -> id:   12, lifetime: 9597 ms, requests: 979 (50 ms of CPU time)
disconnect -> id:   20, lifetime: 6908 ms, requests: 286 (2 ms of CPU time)
disconnect -> id:   19, lifetime: 7157 ms, requests: 424 (53 ms of CPU time)
disconnect -> id:   14, lifetime: 9466 ms, requests: 940 (1532 ms of CPU time)
disconnect -> id:   15, lifetime: 9389 ms, requests: 119 (0 ms of CPU time)
disconnect -> id:   13, lifetime: 9542 ms, requests: 4544 (209 ms of CPU time)
disconnect -> id:   21, lifetime: 6718 ms, requests: 308 (1 ms of CPU time)
disconnect -> id:    8, lifetime: 15691 ms, requests: 1607 (1530 ms of CPU tim
disconnect -> id:    7, lifetime: 15851 ms, requests: 16 (0 ms of CPU time)
disconnect -> id:   10, lifetime: 11327 ms, requests: 130 (0 ms of CPU time)
disconnect -> id:   17, lifetime: 8483 ms, requests: 16 (0 ms of CPU time)
disconnect -> id:   18, lifetime: 7402 ms, requests: 789 (14 ms of CPU time)
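A summary like this is easy to rank after the fact. The sketch below parses the disconnect lines and sorts clients by server CPU time. The regular expression is an assumption based on the sample output, not taken from client-watch.d itself, and the helper names `parse_disconnects` and `busiest` are hypothetical. The pattern stops before the closing "time)" so that a line cut off at the terminal width still matches.

```python
import re

# Pattern assumed from the sample client-watch.d output above. The closing
# "time)" is left out on purpose because one sample line is cut off.
DISCONNECT_RE = re.compile(
    r"disconnect -> id:\s*(\d+), lifetime:\s*(\d+) ms, "
    r"requests:\s*(\d+) \((\d+) ms of CPU"
)

def parse_disconnects(lines):
    """Yield (client_id, lifetime_ms, requests, cpu_ms) per disconnect line."""
    for line in lines:
        m = DISCONNECT_RE.search(line)
        if m:
            yield tuple(int(g) for g in m.groups())

def busiest(lines, top=3):
    """Return the `top` clients with the most server CPU time, highest first."""
    return sorted(parse_disconnects(lines), key=lambda r: r[3], reverse=True)[:top]

sample = [
    "disconnect -> id:   14, lifetime: 9466 ms, requests: 940 (1532 ms of CPU time)",
    "disconnect -> id:   13, lifetime: 9542 ms, requests: 4544 (209 ms of CPU time)",
    "disconnect -> id:    8, lifetime: 15691 ms, requests: 1607 (1530 ms of CPU tim",
    "disconnect -> id:   15, lifetime: 9389 ms, requests: 119 (0 ms of CPU time)",
]

for client_id, lifetime, requests, cpu in busiest(sample):
    print(f"client {client_id}: {cpu} ms CPU over {requests} requests")
```

Matching the ids back to the "client id" lines earlier in the capture then tells you which program each heavy client was.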

I've also made the request itself available to copy in so you can get any part of it you want. An example of tracing the CreatePixmap and FreePixmap calls from a single client during a JDS session:

# ./client-pixmaps.d 1306 \"/usr/bin/nautilus\"
Creating pixmap: id: 0x1c00004 size: 1,1
Creating pixmap: id: 0x1c0001e size: 48,35
Creating pixmap: id: 0x1c0002a size: 48,48
Creating pixmap: id: 0x1c0002c size: 48,48
Freeing pixmap: id: 0x1c0001e
Freeing pixmap: id: 0x1c00025
Creating pixmap: id: 0x1c0002d size: 48,48
Creating pixmap: id: 0x1c0002f size: 48,48
Freeing pixmap: id: 0x1c0002a
Freeing pixmap: id: 0x1c0002c
Creating pixmap: id: 0x1c00030 size: 48,48
Creating pixmap: id: 0x1c00032 size: 48,48
Freeing pixmap: id: 0x1c0002d
Freeing pixmap: id: 0x1c0002f
Creating pixmap: id: 0x1c0003f size: 2,2
Creating pixmap: id: 0x1c00041 size: 1024,768
Creating pixmap: id: 0x1c00025 size: 48,35
Creating pixmap: id: 0x1c00047 size: 48,48
Creating pixmap: id: 0x1c00049 size: 48,48
Freeing pixmap: id: 0x1c00030
Freeing pixmap: id: 0x1c00032
Creating pixmap: id: 0x1c0004a size: 1024,768
Creating pixmap: id: 0x1c0004e size: 1,1
Freeing pixmap: id: 0x1c0004e
Creating pixmap: id: 0x1c00051 size: 1,1
Freeing pixmap: id: 0x1c00051
Freeing pixmap: id: 0x1c0004a
Creating pixmap: id: 0x1c00059 size: 1,1
Freeing pixmap: id: 0x1c00059
Creating pixmap: id: 0x1c0005b size: 109,496
Freeing pixmap: id: 0x1c0005b
Freeing pixmap: id: 0x1c00064
Creating pixmap: id: 0x1c00075 size: 1024,768
Freeing pixmap: id: 0x1c00075
Creating pixmap: id: 0x1c00064 size: 1024,768
Creating pixmap: id: 0x1c00086 size: 1024,768
Freeing pixmap: id: 0x1c00086
Creating pixmap: id: 0x1c00097 size: 199,384
Freeing pixmap: id: 0x1c00097
Creating pixmap: id: 0x1c000a0 size: 199,384
Freeing pixmap: id: 0x1c000a0
Creating pixmap: id: 0x1c000a9 size: 1024,768
Freeing pixmap: id: 0x1c000a9
Freeing pixmap: id: 0x1c00047
Freeing pixmap: id: 0x1c00049
Freeing pixmap: id: 0x1c00041

(This simple example just traces, but I'm sure you could enhance it, or write a perl script to post-process the output, to find pixmap leaks without much trouble. Even from just the trace, I note nautilus is creating pixmaps the size of the entire root window (1024 x 768 due to the video card currently in this test machine) more often than I expected it to - all I did for this test was log in and choose the menu item to log out.)
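The post-processing idea can be sketched in Python just as easily as in perl. The version below pairs each "Creating pixmap" line with a later "Freeing pixmap" line for the same id and reports what is left over. The line patterns are assumptions taken from the sample trace, and `find_unfreed` is a made-up name. Anything it reports is only a candidate leak: a pixmap still allocated at the end of the trace may simply be in use, and the server frees everything when the client exits anyway.

```python
import re

# Line shapes assumed from the client-pixmaps.d sample output above.
CREATE_RE = re.compile(r"Creating pixmap: id: (0x[0-9a-fA-F]+) size: (\d+),(\d+)")
FREE_RE = re.compile(r"Freeing pixmap: id: (0x[0-9a-fA-F]+)")

def find_unfreed(lines):
    """Return (live, unknown_frees).

    live maps pixmap id -> (width, height) for pixmaps created but not freed;
    unknown_frees lists ids that were freed without a matching create.
    """
    live = {}
    unknown_frees = []
    for line in lines:
        m = CREATE_RE.search(line)
        if m:
            live[m.group(1)] = (int(m.group(2)), int(m.group(3)))
            continue
        m = FREE_RE.search(line)
        if m:
            if m.group(1) in live:
                del live[m.group(1)]
            else:
                # No create was seen for this id in the trace.
                unknown_frees.append(m.group(1))
    return live, unknown_frees

sample = [
    "Creating pixmap: id: 0x1c00004 size: 1,1",
    "Creating pixmap: id: 0x1c0001e size: 48,35",
    "Freeing pixmap: id: 0x1c0001e",
    "Freeing pixmap: id: 0x1c00025",
    "Creating pixmap: id: 0x1c0003f size: 2,2",
]

live, unknown_frees = find_unfreed(sample)
for pixmap_id, (width, height) in live.items():
    print(f"still allocated: {pixmap_id} ({width}x{height})")
for pixmap_id in unknown_frees:
    print(f"freed without a traced create: {pixmap_id}")
```

The second list matters because the trace above does contain frees that come before any create for the same id (0x1c00025 and 0x1c00064, for instance), so a post-processor has to tolerate them rather than treat them as errors.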

Please let me know if you find this useful, find more places that probes would be useful, or have other suggestions. You can reach me via e-mail or comments on this blog, or share with me and other interested people in the OpenSolaris discussion forums/mailing lists for dtrace and/or the X Window System.

Meanwhile, if you've got your own applications you'd like to add probes to like this, see Bart's blog on putting developer-defined DTrace probe points in an application, Alan H's blog on statically defined dtrace probes, and the Statically Defined Tracing for User Applications chapter of the DTrace manual.


http://blogs.sun.com/alanc/date/20050706 Wednesday July 06, 2005

Xorg 6.9 and 7.0 for Solaris

Two comments on my last blog entry asked questions that I thought should be answered in a new entry so that it would be seen by more people, since I expect many people will want to know the answers.

Laurent asked:

And, in an area that concerns you more directly, what is the current plan at Sun with those new versions? Will 6.9 or 7.0 make it in Solaris 10 at some point, or will it wait for Solaris 11?
One of the reasons we (X.Org) are doing 6.9 and 7.0 from the same codebase is to allow distributors to move quickly to 6.9, dropping it in their existing build and package setups as a replacement for 6.8.2. They can thus get the new hardware support, bug fixes, and features out to their users with 6.9, and work on migrating their build systems and packaging to 7.0 at their own pace, without worrying about holding their users behind. Hopefully, by the time 7.1 comes out approximately 6 months after 7.0, most distributors will be ready to adopt it.

So for Solaris, we (Sun) will probably take advantage of this, integrating 6.9 into our builds for Solaris Nevada (the development branch for the next full release of Solaris), and then once it's had some "soak time" there to shake out any issues, will backport that into the next available Solaris 10 Update Release (which will also involve building patches for earlier Solaris 10 releases, as you can see today with the patches to upgrade Solaris 10 from Xorg 6.8.0 to 6.8.2 which we produced in order to include 6.8.2 in Solaris 10 Update 1). It's too early to say exactly which update release that will happen in, as it depends on various schedules and priorities that aren't fully set yet.

Once that's in place, then we'll look at moving to the 7.0 build system. I'm tempted to use this migration as an opportunity to also move our customized build and packaging systems for X in Solaris to one more like those used in other Solaris consolidations such as the OS/Networking sources already released via OpenSolaris, or possibly to the RPM-inspired pkgbuild used by Sun's Java Desktop System teams, to reduce the number of build/packaging systems that people need to learn.

Andrew Watkins posted:

I am wondering what Sun will do about X.org on Sparc. I have not played with x86, but I believe that X.org is better than Xsun and may even solve the memory leaks (500M currently). Any thoughts or is it X.org for x86 and Xsun for Sparc!
To answer this, it may help to understand some history. The group I work in at Sun is responsible for the core X technologies - the server, libraries, and clients. Before we shipped Xorg, we didn't ship any of the video card "driver" modules (except for really old cards - the only one left in our Xsun tree today is cg6, the last of the Sbus graphics). The SPARC hardware organization that produces the graphics cards for SPARC workstations makes the drivers for them, the Sun Ray group provides their modules as part of the Sun Ray Server Software, and the x86 platform driver team delivered the x86 modules for Xsun. With Xorg, we've worked with the x86 group so that we just ship the x86 drivers out of the open source tree builds we do. Unfortunately, the open source tree doesn't contain any drivers for modern SPARC graphics cards (I haven't dug out an old enough system, but think it could work on Solaris SPARC with the suncg6 driver in the Xorg source), nor for Sun Ray, so Xorg isn't useful on SPARC yet. Our group is working with the Sun Ray & SPARC graphics groups to try to change this in the future, but I can't say how long it will take for those groups to be ready to ship.

We would like to see Xorg available on all Solaris desktops, and believe it will bring both improved performance and new features - some already here, like dynamic desktop resizing, some still in the future, like Project Looking Glass - but Xsun is still important to us and many of our users. At this point, I'm not aware of any sizable memory leaks in Xsun that haven't been fixed (though the patches are still being tested for a recent fix, for pixmap leaks in Xinerama mode on Sun Rays, so you probably haven't seen those yet). If there are still leaks of a large size in Xsun, we want to know so we can fix them - but be warned that most reports of X server memory leaks turn out to be from one of two misunderstandings of X. The first is that, due to the way video card memory is mapped in X, X servers often appear in 'ps' and the like as using much more memory than the actual RAM or swap space used. On Solaris, you can use pmap to look at the various memory mappings to see which are from video cards and which are actual allocations. The other issue is that clients allocate space in the server for things like pixmaps, and the server can't free it up until the client either releases it or exits, so the "leak" could be in a long-running client, such as your web browser. The xrestop program can be used with Xsun on Solaris 9 or 10 (or on recent XFree86 and Xorg releases on other OSes) to see how much each client has allocated.

And while I'm on that topic, in just over a week I'll be giving a talk on how to use tools like xrestop, xscope, dtrace, etc. to help track down issues with X clients at the 2005 Desktop Developer's Conference. Since I'm running on my usual schedule (a bit late, but I am on vacation this week), I haven't gotten it all put together yet, so if anyone wants to suggest additional tools to cover, or things you want to know how to observe in X server/client interactions, now would be an excellent time to drop me an e-mail or leave a comment here.


http://blogs.sun.com/alanc/date/20050614 Tuesday June 14, 2005

Solaris Express changes for desktop users

Along with the rest of the OpenSolaris downloads made available today came a new Solaris Express: Community Edition, which, like the current Solaris Express, is based on the "Nevada" development branch of Solaris. While the bits released are the same, the community edition comes out more often, and with less testing first, to allow building, testing and debugging OpenSolaris with the same bits the engineers have internally. Once testing is done, one out of every two or three releases should be promoted to the main Solaris Express program, for those who like living near the cutting edge, but not right on it.

So the first community edition release today is build 16 of the "Nevada" branch of Solaris. The latest Solaris Express release was Solaris Express 4/05, which was build 10 of the "Nevada" branch. I imagine Dan or the Alan from Down Under will put together their usual lists soon covering the entire feature set, but I've already got a list gathered of changes you'll see in the Solaris desktop software consolidations (X, CDE, JDS, etc.) when you move from build 10 to build 16, so I figured I'd go ahead and post it. Some, but not all, of these changes should also be in Solaris Express 6/05, with the rest coming in the Solaris Express release after that.

Unfortunately, since these desktop components are not yet part of OpenSolaris, their bugs aren't available on the bugs.opensolaris.org portal either, so I can't link to the bug database for more info on them.

  • Once again, new login screen graphics. (The "squiggles" didn't compress well for remote display or for low-bandwidth Sun Ray users, like the people in Sun's Sun-Ray-at-home internal pilot program.) Also, a 1400x1050 graphic size is added for all the laptop users.
  • Java Desktop System theme changes to match the upcoming JDS3 for Linux release. The Java coffee cup is back in the default background and the curved lines in the background don't clash with the icons as much. The Launch menu button is changed to look like a single button instead of a Java button next to a launch button.
  • Sun Update Connection icon in the JDS toolbar.
  • Updated versions of Xorg's nv (open source nVidia) & i810 (Intel 8xx & 9xx series chipsets) drivers to support newer hardware. [These are both updated to the X.Org CVS versions from around March 2005.]
  • Added libraries and clients for extensions created by XFree86:
    • programs (in /usr/X11/bin): xgamma, xrandr, xvidtune, xvinfo
    • libraries (in /usr/X11/lib): libXxf86misc.so.1, libXxf86vm.so.1
  • SPARC OpenGL upgraded to Sun's OpenGL 1.5 alpha release
  • Mesa open-source OpenGL-workalike for Xorg included on Solaris x86/x64.
  • Xscreensaver OpenGL modules now shipped on Solaris x86/x64 too. (Were previously SPARC-only since we had no x86 OpenGL support.)
  • agpgart kernel driver and Xorg server support (only used by the Xorg driver for Intel 8xx & 9xx series chipsets though) [See the OpenSolaris X Window System community announcements page for links to the sources integrated into X.Org CVS for this.]
  • Virtual mouse and keyboard drivers in the kernel make all USB & PS/2 keyboards and mice available via /dev/mouse and /dev/kbd, so you no longer need to manually configure your X config files to support multiple devices, and hotplugging additional devices simply works. If you open one of the other mouse or keyboard device files directly, it splits that device out of the coalesced virtual devices and lets you access it directly, in case you need special configuration for specific devices.
  • Bugs in 3-button mouse emulation in both Xsun & Xorg have been fixed.
  • xorgcfg GUI tool provided for configuring xorg.conf files for Xorg.
  • libXext is now compatible with the XFree86/Xorg flavor of Xinerama. (See my June 2nd blog post for details.)
  • gdm2 better configured to work on Solaris (reboot & shutdown commands properly set, etc.)
  • JDS no longer locks up the keyboard when displayed to foreign X servers like Xvnc or Exceed.

Though, as I mentioned, these bits are from a development release, not fully debugged yet, so there are a couple of things to watch out for as well:

  • If you've installed the nVidia accelerated driver and upgrade to build 16, you need to reinstall that driver after upgrading since it clashes with the Mesa OpenGL in this build. That will be fixed in the future as we work on integrating the nVidia driver directly into Solaris.
  • If you're using an Xorg configuration file with multiple mouse devices, make sure the /dev/mouse device comes before the /dev/kdmouse or your system may hang on Xorg startup if you don't have a USB mouse plugged in. (Alternatively, delete the kdmouse device and just use the virtual mouse via /dev/mouse.) This is bug 6275666, fixed in Nevada build 17.


Opening Day

As you may have noticed, the release of the first big batch of code from the OpenSolaris project today is being accompanied by blogs from many of the Solaris engineers. When you read these blogs you'll see most of the engineers are writing about the parts of the code they know best. Unfortunately, for me, those are the parts not yet released. (There are a couple of lines of code here and there from me that I submitted as suggested fixes in bugs I filed, but I've never checked in any code directly to the source tree the released code comes from.)

Why is that? Because of the way Solaris is built - it's not one single monolithic source tree, but instead organized into chunks called "consolidations." Today's first phase is a single consolidation, called "OS/Networking" or ON for short, which contains most of the Solaris kernel, drivers and core utilities. The consolidation most of my work goes into is simply called "X", and includes both the Xsun & Xorg source trees for Solaris, but not the Xsun device modules. Other consolidations include CDE/Motif, JDS, Graphics and Networking drivers, Developer Tools, Administration/Installation, and so on. Each consolidation manages its own source trees and builds, and delivers the built packages to a central dock where all the consolidations are combined into the Solaris WOS ("Wad of Stuff") which is the CD or DVD set delivered to the users.

Since Solaris is huge, instead of waiting until everything was ready for release, a staged plan was developed for OpenSolaris, starting with the consolidation at the core of the OS and then working out to the higher layers. Now that ON is out and the infrastructure used is set up, we're starting to work out how to release code from other consolidations. Once everything is in place for additional consolidations to be added to OpenSolaris, the next phase will probably be the source trees that come from open source releases and are thus not needing as much work to separate out the bits that can be released from those that can't, and that is expected to include our Xorg server source tree.

Of course, if you want to build Xorg for Solaris, you don't need our changes - there's not actually much you need that's not currently in the X.Org CVS repository on freedesktop.org. There's even stuff there that's not yet in our builds, like the experimental support for building 64-bit Xorg servers for AMD64 machines. Our Solaris builds are based on the Xorg 6.8.2 source tree, with various changes backported from the CVS head, such as the newer ATI, nVidia, and i810 drivers, and a few small customizations. Some of those can be enabled in the open source builds by using the BuildLikeSun Imake option recently added to CVS to document some of our customizations. The rest we'll be working on getting out there later this year.

We've also set up an "X Window System" community on the new OpenSolaris.org website where you can find out more information about the status of our X source releases, and talk to the engineering teams and other users. Come join us over there if this part of OpenSolaris interests you!
