RoboGeek
RoboGeek's (David Herron) Weblog: co-developer of Robot and several other things related to Java testing.

Wednesday March 30, 2005
Open Office & Java
Last week there was a big to-do over the expanded use of Java in Open
Office. It seemed to be riling people that a free product, OOo,
was being tied to a "non-free" one, Java. Hurm.
What's always bothered me is looking at the software Sun ships, and
witnessing how many GUI toolkits Sun is supporting in the various
packages. It used to be Motif/CDE, Open Look, GNOME, Java and the
toolkit in Open Office. er.. that's 5 GUI toolkits being
supported by different teams at Sun. As a Sun employee (and stockholder) I'm sure
there's better use for Sun's cash than paying for 5 GUI toolkit teams.
Fortunately Open Look and Motif/CDE have gone away, and that leaves us
with three teams. But I have to wonder if even that is too
much. Okay, I'm biased from working in the Java team, but today's
Java is not the Java of old. Java GUIs today are responsive and,
depending on your choice of L&F module, of which there are many
available, the Java GUI can look very pretty. Want proof?
Take a look at the transformation of Netbeans from molasses-slow to
really good. All it took was dedicated effort in the Netbeans
team, and the public nagging on Netbeans for being molasses-slow, and
they profiled the thing seven ways from Sunday and got it to perform
really well.
This obviously isn't going to be a popular opinion among the Open
Office crowd, but my druthers would be to replace the entire OOo GUI
with Java. That needn't mean replacing the modules that
read/write different file formats, since part of Java is JNI for
plugging in native code. Another native code hook is JAWT which
could allow reuse of some GUI modules while allowing the bulk of the
GUI to be in Java. For example the part of OOo that handles
document editing is probably rather complex and would be a big
investment to rewrite, but is a great candidate for encapsulating with
JAWT.
Why? Well, witness the difficulty with getting OOo to run on Mac
OS X, that's why. You have two choices ... one choice is to run
the "current" version as X11 under Apple's X server, which results in a
very foreign looking application that doesn't know much about native OS
X features like the ubiquitous print-to-PDF button. The other
choice is NeoOffice which is good, and acts pretty well as a native OS
X application, but it's based on the old OOo sourcebase, not the
current sourcebase.
My personal favorite story about Java's portability was an MP3 player I
saw announced once. I think the name was JLgui, and it mimicked
the WinAMP player popular at the time. It claimed to be written
in pure Java, so I downloaded the application and ran it on my
Mac. Ran perfectly fine. I sent them an email and they said
something like "woah, we haven't even tested there". A similar
story happened with Moneydance, a personal finance application written
in Java, where the author announced Mac support saying his porting
effort was simply to copy the class files over and they ran.
In other words, what I see with today's Java is that performance and
responsiveness have become very good, and the cross platform
compatibility is strong.
(2005-03-30 10:54:51.0)

Monday March 14, 2005
Re: 'xkill' approach to identifying GUI components on X11 ...
A comment on my previous posting asked
why not use the "xkill" method and have the tool use XQueryTree in
combination with a manual "user points at the button" approach? As you
remember, xkill lets you kill an application by clicking on the window
that a process owns. Perhaps the combination of XQueryTree-derived
information and a user pointing out the various buttons, just once,
will give you enough data on each button to determine programmatically
what can be pressed?
I had to dredge my memory banks for "xkill". The approach you describe
is to do a mouse grab, change the cursor, and follow it around until
there's a click. At that point you can do XQueryTree to determine
the GUI hierarchy under the mouse.
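The lookup that follows the click can be sketched in Java terms. This is only an illustration of the idea, not Xlib code: the WinNode class and its fields are hypothetical stand-ins for what XQueryTree plus a geometry query would report, with each rectangle positioned relative to its parent window and children assumed to be listed bottom-to-top.

```java
import java.awt.Point;
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one X11 Window: a rectangle (relative to its
// parent) plus child windows. There is deliberately no name or widget
// class here, mirroring how little the window tree tells you.
class WinNode {
    final Rectangle bounds;
    final List<WinNode> children = new ArrayList<>();

    WinNode(int x, int y, int width, int height) {
        bounds = new Rectangle(x, y, width, height);
    }

    WinNode add(WinNode child) {
        children.add(child);
        return this;
    }
}

class HitTest {
    // Returns the chain of windows under the point, outermost first.
    // The point is given in the coordinate space of root's parent.
    static List<WinNode> pathAt(WinNode root, Point p) {
        List<WinNode> path = new ArrayList<>();
        WinNode current = root;
        int x = p.x;
        int y = p.y;
        while (current != null && current.bounds.contains(x, y)) {
            path.add(current);
            // Translate into the current window's own coordinate space.
            x -= current.bounds.x;
            y -= current.bounds.y;
            WinNode next = null;
            // Assumption: later children are stacked on top, so scan backwards.
            for (int i = current.children.size() - 1; i >= 0; i--) {
                WinNode child = current.children.get(i);
                if (child.bounds.contains(x, y)) {
                    next = child;
                    break;
                }
            }
            current = next;
        }
        return path;
    }

    public static void main(String[] args) {
        WinNode button = new WinNode(10, 10, 80, 30);
        WinNode panel = new WinNode(20, 50, 200, 100).add(button);
        WinNode top = new WinNode(100, 100, 400, 300).add(panel);

        List<WinNode> path = pathAt(top, new Point(140, 170));
        System.out.println("windows under the pointer: " + path.size());
    }
}
```

The result is a chain of anonymous rectangles, which is exactly the limitation: the tool learns where things are, but not what they are.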
I had considered something like that ... We could obviously examine the
rectangles we get from XQueryTree and even if we don't know their GUI
Component class, we could manually examine the tree and learn something
that way. I vaguely remember writing an application that would do
screen grabs of all these rectangles that could help with the analysis.
The problem is applications written with GUI toolkits that don't show their GUI tree through XQueryTree. (e.g. Java/Swing)
In the original Xlib approach to GUI applications, each component would live within a
Window. The word "Window" here doesn't mean a top-level window, but any rectangular area.
They're arranged in a parent-child tree-like hierarchy, and this is the
hierarchy XQueryTree shows you. The Xt-based
toolkits (Motif) religiously used these Window objects to contain
their Widgets. If you want to read about this in more depth get
the O'Reilly books on X11, volumes 2 and 4 (or else it's 1 and 3).
The easiest way to see what XQueryTree returns is with
xwininfo -tree
It does the mouse-grab, letting you select an application and dump
its GUI component tree. Notice how nothing has a 'name' attached
to it, just coordinates and sizes.
Unfortunately I'm not seeing today what I remembered from last
year. Namely, I remember Mozilla based applications not returning
much through xwininfo, but checking today (with Firefox 1.0.1, Mozilla
1.7.5 and Thunderbird 1.0, all compiled with GTK2) I see a rich tree
shown in xwininfo.
The other issue with this approach is what happens when the GUI
layout changes. GUI layouts change in a couple of ways: with
each revision of the application, or simply when the font
characteristics change. The first means you'd have to regenerate
all the mappings to the GUI component tree you're interested in.
The second may be okay, as the hierarchy ought to stay the same and just be
rearranged a little.
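A small sketch of why the font case may be survivable: if the tool records a component as a path of child indices rather than as screen coordinates, the same path still resolves after the rectangles move. The Win class and the recorded path below are hypothetical, purely for illustration; a real tool would fill the tree from XQueryTree.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of one window rectangle as a tool might record it.
class Win {
    final Rectangle bounds;
    final List<Win> children = new ArrayList<>();

    Win(int x, int y, int w, int h, Win... kids) {
        bounds = new Rectangle(x, y, w, h);
        children.addAll(Arrays.asList(kids));
    }
}

class TreePath {
    // Follows child indices down from the root. Returns null if the
    // hierarchy no longer has that shape, e.g. after a new revision of
    // the application rearranged its components.
    static Win resolve(Win root, int... indices) {
        Win current = root;
        for (int index : indices) {
            if (index < 0 || index >= current.children.size()) {
                return null;
            }
            current = current.children.get(index);
        }
        return current;
    }

    public static void main(String[] args) {
        // The same hierarchy captured twice: the second time with a larger
        // font, so rectangles moved and grew but the tree shape is unchanged.
        Win before = new Win(0, 0, 300, 200,
                new Win(10, 10, 280, 40),
                new Win(10, 60, 280, 130, new Win(5, 5, 80, 25)));
        Win after = new Win(0, 0, 360, 240,
                new Win(10, 10, 340, 50),
                new Win(10, 70, 340, 160, new Win(5, 5, 100, 32)));

        int[] recordedPath = {1, 0}; // captured once, when the user pointed at the button
        System.out.println(resolve(before, recordedPath).bounds);
        System.out.println(resolve(after, recordedPath).bounds);
    }
}
```

When resolve returns null, the mapping is stale and has to be regenerated, which is the first case described above.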
(2005-03-14 11:53:05.0)

Friday March 11, 2005
Test tools across platforms
I must apologize for getting distracted with PHP. Let me get back to the Automation mission I started with.
The nastiest problem we've been facing for years in J2SE SQE is
automating Plugin and Java Web Start (JWS) testing. I am close to finishing a
tool which handles this, though I don't know if I'll ever be able to
release it. Knock on wood and pray to the gods that be, let's
hope this can be released through the Jemmy project.
Testing a Swing or AWT application is simple. This is because we
can browse through the AWT Component tree, discover Components, and get
their location. All of the Java GUI test tools use that same basic approach.
This approach works great when the GUI under test is contained within
one JVM. But as soon as you need to interact with a window
outside the JVM, you quickly become blind as a bat. To interact
with a GUI component using Robot, you need to know its location so you
can click on it. BTW, this holds true for some GUI features
inside the current JVM, e.g. menu components don't export their
location, and if you want to click on the arrow button of a scrollbar
you're out of luck.
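As a rough sketch of the in-JVM approach described above: walk the AWT Component tree, find a component by name, and compute a point to click. The class and method names here are made up for illustration. The sketch computes coordinates relative to a root container so that it runs without a display; against a live window one would call getLocationOnScreen() instead and hand the result to Robot.

```java
import java.awt.Component;
import java.awt.Container;
import java.awt.Point;
import javax.swing.JButton;
import javax.swing.JPanel;

class ComponentFinder {
    // Depth-first search of the component hierarchy by component name.
    static Component findByName(Container root, String name) {
        for (Component child : root.getComponents()) {
            if (name.equals(child.getName())) {
                return child;
            }
            if (child instanceof Container) {
                Component hit = findByName((Container) child, name);
                if (hit != null) {
                    return hit;
                }
            }
        }
        return null;
    }

    // Center of the component, expressed in root's coordinate space, found
    // by adding up the offsets of the component and its ancestors below root.
    static Point centerWithin(Component c, Container root) {
        int x = c.getWidth() / 2;
        int y = c.getHeight() / 2;
        for (Component p = c; p != null && p != root; p = p.getParent()) {
            x += p.getX();
            y += p.getY();
        }
        return new Point(x, y);
    }

    public static void main(String[] args) {
        // Null layouts and explicit bounds keep the geometry predictable.
        JPanel root = new JPanel(null);
        JPanel inner = new JPanel(null);
        inner.setBounds(20, 30, 200, 100);
        JButton ok = new JButton("OK");
        ok.setName("okButton");
        ok.setBounds(10, 15, 80, 24);
        inner.add(ok);
        root.add(inner);

        Component found = findByName(root, "okButton");
        System.out.println("click target: " + centerWithin(found, root));
    }
}
```

None of this helps once the target lives in another process, since there is no Component tree to walk.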
In what J2SE SQE tests, the main area where this issue shows up is, as I said,
Plugin and JWS. It also affects drag&drop testing, as some
test scenarios there involve D&D between applications.
Robot could click on anything, so long as it knows the location.
So the key clincher is to learn the location of the GUI components in
other applications.
We tried using native windowing system APIs (on Windows, EnumWindows
and EnumChildWindows, while on X11 it's XQueryTree). This worked
fairly well, but has limitations, especially on X11 where XQueryTree
only returns rectangles and no higher level information about the
rectangles. Further, XQueryTree can only find Window objects,
meaning that toolkits that don't use Window objects (such as Mozilla's
GUI toolkit) won't have their internal structure visible to XQueryTree.
An approach I looked at approximately a year ago, and discarded at the
time, was to become an Accessibility Tool. The Accessibility
support helps with the devices that disabled people use to facilitate
their computer use. Accessibility support means that some
software tool is querying the system for nitty-gritty details about the
components in all applications on the screen. Among the data
returned to an Accessibility tool is the screen location, GUI class,
etc. This is the perfect sort of information needed to solve the
problem I outlined above.
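To give a feel for the kind of data involved, here is a minimal sketch using the javax.accessibility API that Swing components implement. This is an in-JVM analogy only: the native Accessibility interfaces for inspecting other applications are different APIs, and the AccessibleDump class is a made-up name. It prints the role, name and bounds (relative to the parent) of each accessible node.

```java
import java.awt.Rectangle;
import javax.accessibility.Accessible;
import javax.accessibility.AccessibleComponent;
import javax.accessibility.AccessibleContext;
import javax.swing.JButton;
import javax.swing.JLabel;
import javax.swing.JPanel;

class AccessibleDump {
    // Recursively appends one line per accessible node: role, name, bounds.
    static void dump(Accessible node, int depth, StringBuilder out) {
        AccessibleContext ctx = node.getAccessibleContext();
        if (ctx == null) {
            return;
        }
        AccessibleComponent comp = ctx.getAccessibleComponent();
        Rectangle bounds = (comp != null) ? comp.getBounds() : null;
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(ctx.getAccessibleRole())
           .append(" name=").append(ctx.getAccessibleName())
           .append(" bounds=").append(bounds)
           .append('\n');
        for (int i = 0; i < ctx.getAccessibleChildrenCount(); i++) {
            Accessible child = ctx.getAccessibleChild(i);
            if (child != null) {
                dump(child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) {
        JPanel panel = new JPanel(null);
        panel.setBounds(0, 0, 300, 200);
        JLabel label = new JLabel("Proceed?");
        label.setBounds(10, 10, 100, 20);
        JButton ok = new JButton("OK");
        ok.setBounds(10, 40, 80, 24);
        panel.add(label);
        panel.add(ok);

        StringBuilder out = new StringBuilder();
        dump(panel, 0, out);
        System.out.print(out);
    }
}
```

Compared with a bare rectangle, each node now says what it is and what it is called, which is what a test tool needs in order to choose where to click.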
Unfortunately it appeared the presence of Accessibility support was
spotty. It was available for Windows, but not for X11. On
the X11 side some Sun engineers are working with both the GNOME and
Mozilla projects to implement Accessibility in both tools. I'd
also seen an announcement that KDE would be absorbing the Accessibility
implementation from GNOME. The latter is great news for
compatibility, no doubt. Unfortunately this Accessibility support
was still in development, and wasn't shipping as part of the
default/base installation on any system, but maybe that's changed since
then.
(2005-03-11 16:44:32.0)

Wednesday March 09, 2005
Re: PHP versus JSP/etc
[UPDATE: I want to make clear that what I'm writing here
is purely my opinion. I can't pretend to be making any
claim/statement about Sun's position in regards to PHP. So, any
of y'all who might think what I'm saying means that Sun hates PHP,
please don't use this as any "evidence" of such a slant. Heck, I
don't even hate PHP, I am just witnessing and writing about some flaws.]
Apparently I struck a chord yesterday with my posting about PHP.
I thank the nice and friendly commenters for setting me straight on a
couple points. I have a couple clarifications to make, and I
still think JSP is superior in some ways and I want to discuss that.
First about my comment on MVC, or how PHP encourages the intermingling
of layout with application code. Obviously earlier versions of
JSP also had the same problem, which is why STRUTS and JSTL came
along. In the current JSTL and JSP specifications you can define
new tags purely through writing some HTML code, which even further
separates the layout from application code.
In any case, the commenters mentioned MVC modules being available for PHP. In particular they mentioned Smarty.
Judging from the documentation Smarty looks to be interesting, though I
don't quite get why there's such a fervor about it. I found
myself impressed with its features, but turned off by two things.
First is that the Smarty markup is not in any kind of HTML or XML
format, which means that you're unlikely to ever see a WYSIWYG editor
for Smarty markup. I suspect that designers,
for whom Smarty is targeted, will prefer a WYSIWYG approach rather than
the write-some-code-and-preview-it-in-a-browser you'd have to use with
Smarty. But I think that's a workstyle issue, as some designers
prefer to work in straight HTML, rather than a WYSIWYG editor like
GoLive or Dreamweaver, and would be quite at home with what I see in
Smarty.
Speaking for myself, when I write a web page I prefer a WYSIWYG
approach even though I'm comfortable with writing HTML code. I
also like the current iteration of GoLive where you can interactively
design CSS styling and immediately see the results in the editor.
The second thing which turned me off about Smarty is its use. It
seems a little strange to make the PHP page have a magical incantation
to require the Smarty code, instantiate a Smarty object, load up some
data, and set that data into Smarty, all before you can invoke the
template. I'm sorry, but to me that looks backwards. I'd
rather have it just be "Smarty" code (or whatever template system
you're using), and have the infrastructure take care of the
details of interpreting its way through the template file.
The other comment was about availability of extra programming
libraries. I must not have been clear enough in what I said,
because the comments missed the mark.
What I meant to say is:
- Java, upon which JSP is implemented, is an existing language with a wide variety of programming libraries available.
- Those libraries can be used for any application, not just server side.
- Those libraries are very available to be used in JSP applications.
PHP is a relatively new language. While quite a few PHP modules have been written and the
pear.php.net site does a great job of collecting them into one place, if PHP had
been implemented on top of Java then those libraries would not have
needed to be written. My point is that JSP gets immediate
use of all the existing Java libraries, whereas if equivalent
functionality is needed for PHP it has to be written anew.
Okay, but I see that PHP dates back to 1995.
JSP hadn't been invented yet in 1995. I remember 1995, and was
writing server side scripts by 1997 (or so) myself. At that time
I wasn't happy with the state of server side scripting languages (my
hosting provider supported Perl, yuck) and
wrote my own template engine built atop TCL. Again, I appreciated
the capability to reuse an existing language and theoretically be able
to tap upon an existing base of libraries. The application I
built with my self-written templating engine is still running today at
The Reiki Page Practitioner Directory.
(2005-03-09 16:18:08.0)

Tuesday March 08, 2005
PHP versus Java/JSP/J2EE/...
Last weekend I decided to try a bit of PHP programming. I wanted to see what the language was like, whether it was any easier to create pages, etc. I'm a little experienced with JSP programming, and in the long distant past I once wrote my own page template engine around TCL because at the time there was nothing else available.
To compare PHP versus J2EE is utter nonsense, because they serve completely different purposes. J2EE is very rich in capabilities, ones that appeal to large enterprises or very busy web sites. These are capabilities the PHP programmer can only dream of, because PHP doesn't provide anything like it.
It seems fair to keep the comparison to the closest analog to PHP, namely JSP.
Another note is that obviously I have a Java bias, given that my job involves working on Java.
With that out of the way ... what do I think?
Learning curve
Well, PHP is pretty easy to learn. I simply stumbled my way over to php.net, found the online manual, and read a few things, and said "hey! that's easy" and started coding. Within a few hours I had a website whose pages are dynamically built from a database. I'd previously written some JSP pages to use that same data, displaying it in the same format, so this is a fair comparison.
The JSP pages I'd written also went very quickly. Except, by the time I wrote the JSP pages I'd already gone through the steepish learning curve to learn JSP and JSTL programming.
Here's the rub with JSP, that somehow JSP programming is somewhat difficult to pick up. I think it stems from a couple places: a) the servlet/JSP specifications are written in a difficult legalistic style, and b) there's baggage from the early servlet history.
If one were to stick purely with JSP and JSTL then programming JSPs is pretty simple. Even writing new JSP tags is very simple, if you use the new XML-tag file format. But unfortunately the books that teach JSP and JSTL have to talk you around all the old baggage.
In comparison I found the PHP model and documentation to be very clear and easy to understand. Like I said, after a couple hours I had some pages going, having never done anything with PHP other than glance at a few source files. That's a pretty easy learning curve so far as I can see.
Capabilities
I find great fault with PHP in its capabilities.
First PHP ignores the MVC model. The "business logic" is hopelessly intertwined with the presentation. In the pages I've been writing, the core of the presentation is embedded in strings in echo statements. This in turn makes it very difficult for a WYSIWYG style web editor to do much to help you lay out your pages.
By ignoring the MVC model, PHP relegates its programmers to coding the page layout indirectly, rather than designing the layout through direct manipulation in a WYSIWYG editor. By comparison, JSP (if you use the modern JSTL or STRUTS approaches) can show quite a lot through a WYSIWYG editor, and dynamic pages can be designed through direct manipulation rather than indirect coding.
Second, PHP as a language isn't very scalable. Since JSP resides atop Java, you have a tremendous, modern, object oriented programming language with tons of software modules ready to be tapped upon. PHP doesn't have this backdrop, hence this world needs to be reinvented for PHP in order for a PHP programmer to use it. Also object oriented programming is a new feature to PHP, which I've not studied yet, but I expect since it's a new feature layered on top of a non-OO language that it's likely going to be clumsy.
Market realities
On the other hand, it's clear that PHP is very popular with some. For example I've spent a lot of time looking at hosting providers. I know it's trivial to find a hosting provider that offers PHP service, and that it's very difficult to find one offering JSP service.
That distinction has given me a lot to ponder. The conclusion I've reached is that the hosting providers have razor thin margins and need to keep costs pared to the bone. This means open source solutions are great for them, because the cost is minimal. Now, the Linux+Apache+MySQL+PHP (LAMP) stack is very easy to install and relies completely on open source software packages. Comparatively Linux+Apache+MySQL+Tomcat+Java is pretty darned difficult to install, if only because the mod_jk (or is it mod_jk2) installation is a mess. The installation process is very poorly documented, and it is completely unclear whether one is supposed to use mod_jk or mod_jk2.
Putting myself in the hosting provider's shoes, it looks like Java/JSP is too difficult and possibly with a low payback. Even if one were to use a different appserver than Tomcat it's almost certainly going to involve money, which the hosting provider can't afford (razor thin margins).
So in summary I think JSP is capable of a lot more than PHP, but hampered by a steepish learning curve and a difficult installation process.
(2005-03-08 11:05:34.0)

Monday March 07, 2005
The bottom level of GUI automation tools
My previous posting was probably too long and circuitous. Just to prove that I'm overly obsessed with GUI automation, I want to discuss the bottom layer. With the foundational principles of GUI automation covered, I ought to be able to discuss the higher level issue more freely. That's my plan anyway...
Seven years ago when I was hired by Sun, my manager said "Your job will be to test AWT. Oh, and if you want to avoid manual test execution, you'll find a way to automate the tests." Since the last thing I wanted to do was shift my career to being a tester, I spent more time looking for automation technology than writing tests. Please don't tell my manager that!
At the time there was nothing suitable. In the commercial arena we had WinRunner and X/Runner but I'd had a previous bad experience with X/Runner and didn't want to use it. Basically, X/Runner is specialized to testing Motif applications and the scuttlebutt was it wasn't able to deal well with Java applications (even though AWT was based on Motif). My previous experience had been at Mainsoft where I'd spent two years working on a WIN32 implementation for X11, and we tried to use X/Runner for test automation. It didn't work terribly well because X/Runner had no clue where any of our components were, and could only work on absolute coordinates.
Sun had a product named JavaStar that I thought was peculiarly useless while being almost useful. It was unable to test AWT applications because it only sent fake events through the AWT/Swing EventQueue, and in my testing of that idea the AWT components would not react to those events. The AWT components would only react to events that arrived through the normal operating system channel. I did have a meeting with that team and was unable to get their attention on this flaw.
Now, I promised a bit of discussion of the fundamental level of GUI automation. Since I hadn't been able to find an existing tool, I had to study this fundamental level. This study eventually led to the java.awt.Robot class.
I created, on X11, a class I called NativeEvent which had around five methods.
- Mouse-Button press
- Mouse-Button release
- Key press
- Key release
- Mouse move (set-location)
I knew these were the fundamental operations of any GUI interaction, and that if I could perform those fundamental operations from Java, that they could be used to create any GUI interaction. It was an Alan Turing sort of moment, where I knew the fundamental state machine for any GUI interaction.
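Those five operations can be sketched as a small Java interface. The names below are illustrative and not the original NativeEvent API; they happen to match the method names on java.awt.Robot, which the RobotEvents class simply delegates to. RecordingEvents is a hypothetical stand-in that logs calls instead of generating real input, so the composition of gestures can be seen without a display.

```java
import java.awt.AWTException;
import java.awt.Robot;
import java.awt.event.InputEvent;
import java.awt.event.KeyEvent;
import java.util.ArrayList;
import java.util.List;

// The five primitive operations (illustrative names).
interface NativeEvents {
    void mouseMove(int x, int y);
    void mousePress(int buttons);
    void mouseRelease(int buttons);
    void keyPress(int keyCode);
    void keyRelease(int keyCode);
}

// Generates real input events by delegating to java.awt.Robot.
class RobotEvents implements NativeEvents {
    private final Robot robot;

    RobotEvents() throws AWTException {
        robot = new Robot(); // throws AWTException on a headless system
    }

    public void mouseMove(int x, int y) { robot.mouseMove(x, y); }
    public void mousePress(int buttons) { robot.mousePress(buttons); }
    public void mouseRelease(int buttons) { robot.mouseRelease(buttons); }
    public void keyPress(int keyCode) { robot.keyPress(keyCode); }
    public void keyRelease(int keyCode) { robot.keyRelease(keyCode); }
}

// Records the calls instead of performing them.
class RecordingEvents implements NativeEvents {
    final List<String> log = new ArrayList<>();

    public void mouseMove(int x, int y) { log.add("move " + x + "," + y); }
    public void mousePress(int buttons) { log.add("mousePress " + buttons); }
    public void mouseRelease(int buttons) { log.add("mouseRelease " + buttons); }
    public void keyPress(int keyCode) { log.add("keyPress " + keyCode); }
    public void keyRelease(int keyCode) { log.add("keyRelease " + keyCode); }
}

// Higher-level gestures built only from the five primitives.
class Gestures {
    static final int BUTTON1 = InputEvent.BUTTON1_DOWN_MASK;

    static void click(NativeEvents ev, int x, int y) {
        ev.mouseMove(x, y);
        ev.mousePress(BUTTON1);
        ev.mouseRelease(BUTTON1);
    }

    static void typeKey(NativeEvents ev, int keyCode) {
        ev.keyPress(keyCode);
        ev.keyRelease(keyCode);
    }

    static void drag(NativeEvents ev, int fromX, int fromY, int toX, int toY) {
        ev.mouseMove(fromX, fromY);
        ev.mousePress(BUTTON1);
        ev.mouseMove(toX, toY);
        ev.mouseRelease(BUTTON1);
    }

    public static void main(String[] args) {
        RecordingEvents rec = new RecordingEvents();
        click(rec, 70, 57);
        typeKey(rec, KeyEvent.VK_ENTER);
        drag(rec, 10, 10, 200, 120);
        rec.log.forEach(System.out::println);
    }
}
```

Swapping RecordingEvents for RobotEvents would drive the real desktop with the same gesture code.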
So with this implementation in hand, I presented it to my immediate team. They liked it enough, but wanted to see some help from the AWT developers. So I wrote a summary, and went to meet with the developers. In the hallway on the way there they met me and said "David, we have this idea for GUI automation and wonder what you think of it". It was one of those great-minds-think-alike moments, because they had the same five methods written on a sheet of paper.
After a moment of "hey, we've got the same idea" ...well... We turned to collaboration, with Robi Khan doing the Windows implementation and me doing the Solaris implementation.
I've written about this before, but we purposely kept Robot to those minimal methods. We didn't want to embed a GUI test framework in Java (due to size), we didn't want to force the development community to a specific method of testing, etc. We left it up to the community to develop this idea further, and they did.
I like to think of those five methods as the GUI automation equivalent of assembly code. You probably don't want to code your automation with those methods, but you can, and you can certainly build amazing contraptions with only those five methods. But you'll be more efficient if you use a higher level tool.
(2005-03-07 10:26:33.0)

Thursday March 03, 2005
Testing Java GUI applications & multiple platforms
As I said in earlier blog entries, I work in the Java SQE (J2SE) team. It is the SQE team that develops functional tests of the Java functionality. These tests are outside the scope of the JCK tests the Java customers use to validate their Java implementations.
My specialty is with tools for automating GUI oriented testing. For example I am partially responsible for the java.awt.Robot class.
This background has given me an interesting conclusion to share. This conclusion is likely different from what our marketers would want said. So I should say that this is purely my opinion.
What is this conclusion?
It's very simple - the presence of (commercial) GUI test automation tools has a large impact on the potential success for an operating system.
Why? It has to do with the relative testability of applications built for that operating system.
Why? The rule of thumb for GUI application testing is "Testing is never finished, only abandoned". This means that any application development team has a certain budget for testing, and once that budget is exhausted they won't test further. The effect of automated GUI tests is to amplify the amount of testing that can be done with a given budget.
Here is the core of my contention. Windows is the only platform with commercial GUI automation tools. This makes for an advantage to Windows since application developers can automate their tests, and thereby do more testing with their budget. Similarly, on a platform lacking commercial GUI automation tools the application developers won't be able to automate their tests, and hence their application won't be quite as well tested.
Java, however, is in a different situation.
Java has the java.awt.Robot class, and there are several GUI automation tools for Java. I have a javapedia page here which goes over those tools. This makes a Java application testable across platforms, even when there is the general lack of GUI automation tools.
That's a good thing, yes? Mostly. Turns out there is a limitation in that the Java GUI automation tools rely on being able to use java.awt.Component.getLocationOnScreen(). But what if you want to interact with another application besides the Java application?
Like, what if you want to test an applet, and some of your scenarios require interacting with the web browser containing the applet? Bzzzzt!! No luck, because that browser is a foreign application. The Java GUI automation tool is unable to "see" the browser components.
What if you went to the native operating system calls and, through the magic of JNI, brought information about the GUI components into Java?
On an X11 system you can write some native code using XQueryTree, and thereby find locations of components. However XQueryTree only gives you a set of rectangles, and doesn't give you any clue of what those rectangles are. Does anybody have a time machine so we can go back to the mid-80's and beat someone up? If you don't know what those rectangles are, you don't have any ability to know which rectangle to click on.
On a Win32/64 system native code (EnumWindows and EnumChildWindows) is able to give you more information. It's not perfect, but it's better than nothing.
But the cruelest twist is that on every platform there are several GUI toolkits which do not form their components in the traditional way. The Mozilla Application Object Model is a great example of this, where inside their toolkit they draw their own widgets purely as graphics. This means not even the Windowing System is going to know where Mozilla application components are. Even if one were to write a JNI module to get native windowing system data into Java, that module would also be unable to see the native components.
The bottom line for Java application developers is that ... so long as your application exists within a single Java VM, and doesn't interact (much) with native applications, you'll be able to automate tests against it. But if your application interacts with native applications you're stuck with manual testing for those scenarios.
(2005-03-03 13:42:46.0)