RoboGeek

RoboGeek's (David Herron) Weblog: co-developer of Robot and several other things related to Java testing.


20050330 Wednesday March 30, 2005

Open Office & Java

Last week there was a big to-do over the expanded use of Java in Open Office.  It seemed to be riling people that a free product, OOo, was being tied to a "non-free" one, Java.  Hurm.

What's always bothered me is looking at the software Sun ships, and witnessing how many GUI toolkits Sun is supporting in the various packages.  It used to be Motif/CDE, Open Look, GNOME, Java and the toolkit in Open Office.  er.. that's 5 GUI toolkits being supported by different teams at Sun.  As a Sun employee (and stockholder) I'm sure there's better use for Sun's cash than paying for 5 GUI toolkit teams.

Fortunately Open Look and Motif/CDE have gone away, and that leaves us with three teams.  But I have to wonder if even that is too much.  Okay, I'm biased from working on the Java team, but today's Java is not the Java of old.  Java GUIs today are responsive, and depending on your choice of L&F module, of which there are many available, a Java GUI can look very pretty.  Want proof?  Take a look at the transformation of NetBeans from molasses-slow to really good.  All it took was dedicated effort in the NetBeans team, plus the public nagging on NetBeans for being molasses-slow, and they profiled the thing seven ways from Sunday and got it to perform really well.

This obviously isn't going to be a popular opinion among the Open Office crowd, but my druthers would be to replace the entire OOo GUI with Java.  That needn't mean replacing the modules that read/write different file formats, since part of Java is JNI for plugging in native code.  Another native code hook is JAWT which could allow reuse of some GUI modules while allowing the bulk of the GUI to be in Java.  For example the part of OOo that handles document editing is probably rather complex and would be a big investment to rewrite, but is a great candidate for encapsulating with JAWT.

Why?  Well, witness the difficulty with getting OOo to run on Mac OS X, that's why.  You have two choices ... one choice is to run the "current" version as X11 under Apple's X server, which results in a very foreign looking application that doesn't know much about native OS X features like the ubiquitous print-to-PDF button.  The other choice is NeoOffice which is good, and acts pretty well as a native OS X application, but it's based on the old OOo sourcebase, not the current sourcebase.

My personal favorite story about Java's portability was an MP3 player I saw announced once.  I think the name was JLgui, and it mimicked the Winamp player popular at the time.  It claimed to be written in pure Java, so I downloaded the application and ran it on my Mac.  Ran perfectly fine.  I sent them an email and they said something like "woah, we haven't even tested there".  A similar story happened with Moneydance, a personal finance application written in Java, where the author announced Mac support saying his porting effort was simply to copy the class files over, and they ran.

In other words, what I see with today's Java is that performance and responsiveness have become very good, and the cross platform compatibility is strong.


(2005-03-30 10:54:51.0) Permalink Comments [6]

20050314 Monday March 14, 2005

Re: 'xkill' approach to identifying GUI components on X11 ...

A comment on my previous posting asked:

why not use the "xkill" method and have the tool use XQueryTree in combination with a manual "user points at the button" approach? As you remember, xkill lets you kill an application by clicking on the window that a process owns. Perhaps the combination of XQueryTree-derived information and a user pointing out the various buttons, just once, will give you enough data on each button to determine programmatically what can be pressed?

I had to dredge my memory banks for "xkill".  The approach you describe is to do a mouse grab, change the cursor, and follow it around until there's a click.  At that point you can do XQueryTree to determine the GUI hierarchy under the mouse.

I had considered something like that ... We could obviously examine the rectangles we get from XQueryTree and even if we don't know their GUI Component class, we could manually examine the tree and learn something that way.  I vaguely remember writing an application that would do screen grabs of all these rectangles that could help with the analysis.
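The "user points at the button" step boils down to a hit test over the tree of rectangles.  Here is a minimal sketch of that idea in Java, not the actual tool.  WindowNode and its labels are hypothetical names invented for illustration, the rectangles are assumed to be in absolute screen coordinates, and the tree is hand-built rather than fetched from XQueryTree through native code.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one rectangle reported by the window system.
final class WindowNode {
    final String label;      // a label we attach ourselves; XQueryTree supplies none
    final Rectangle bounds;  // assumed to be absolute screen coordinates
    final List<WindowNode> children = new ArrayList<>();

    WindowNode(String label, int x, int y, int width, int height) {
        this.label = label;
        this.bounds = new Rectangle(x, y, width, height);
    }

    // Appends a child and returns this node so calls can be chained.
    WindowNode add(WindowNode child) {
        children.add(child);
        return this;
    }

    // Returns the deepest node containing the point, or null if the point is outside.
    WindowNode deepestAt(int x, int y) {
        if (!bounds.contains(x, y)) {
            return null;
        }
        // Later children are assumed to be stacked on top, so check them first.
        for (int i = children.size() - 1; i >= 0; i--) {
            WindowNode hit = children.get(i).deepestAt(x, y);
            if (hit != null) {
                return hit;
            }
        }
        return this;
    }

    // A small hand-built tree: a top-level window with a toolbar, a button and a content area.
    static WindowNode sample() {
        WindowNode button = new WindowNode("button", 10, 5, 60, 30);
        WindowNode toolbar = new WindowNode("toolbar", 0, 0, 400, 40).add(button);
        WindowNode content = new WindowNode("content", 0, 40, 400, 260);
        return new WindowNode("top", 0, 0, 400, 300).add(toolbar).add(content);
    }
}
```

Calling WindowNode.sample().deepestAt(20, 10) lands on the node labelled "button".  The label only exists because we typed it in, which is exactly the information the user would have to supply by pointing.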

The problem is applications written with GUI toolkits that don't show their GUI tree through XQueryTree.  (e.g. Java/Swing)

In the original Xlib approach to GUI applications, each component would live within a Window.  The word "Window" here doesn't mean a top-level window, but any rectangular area.  They're arranged in a parent-child tree-like hierarchy, and this is the hierarchy XQueryTree shows you.  The Xt based toolkits (Motif) religiously used these Window objects to contain their Widgets.  If you want to read about this in more depth get the O'Reilly books on X11, volumes 2 and 4 (or else it's 1 and 3).

The easiest way to see what XQueryTree returns is with

   xwininfo -tree

It does the mouse grab, letting you select an application and dump its GUI component tree.  Notice how nothing has a 'name' attached to it, just coordinates and sizes.

Unfortunately I'm not seeing today what I remembered from last year.  Namely, I remember Mozilla based applications not returning much through xwininfo, but checking today (with Firefox 1.0.1, Mozilla 1.7.5 and Thunderbird 1.0, all compiled with GTK2) I see a rich tree shown in xwininfo.

The other issue with this approach is what happens when the GUI layout changes.  GUI layouts change in a couple of ways: with each revision of the application, or simply when the font characteristics change.  The first means you'd have to regenerate all the mappings to the GUI component tree you're interested in.  The second may be okay, as the hierarchy ought to stay the same and just be rearranged a little.
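One way to picture why a font change is the milder case: if a component is remembered by its position in the hierarchy rather than by its coordinates, the mapping survives as long as the tree keeps its shape.  The sketch below is only an illustration under that assumption.  RectNode, resolve and the two sample layouts are hypothetical and hand-built, not output from any real tool.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

// Hypothetical node in a tree of unnamed rectangles.
final class RectNode {
    final Rectangle bounds;
    final List<RectNode> children = new ArrayList<>();

    RectNode(int x, int y, int width, int height) {
        this.bounds = new Rectangle(x, y, width, height);
    }

    // Appends a child and returns this node so calls can be chained.
    RectNode add(RectNode child) {
        children.add(child);
        return this;
    }

    // Follows child indices down from this node; returns null if the path no longer exists.
    RectNode resolve(int... path) {
        RectNode node = this;
        for (int index : path) {
            if (index < 0 || index >= node.children.size()) {
                return null;
            }
            node = node.children.get(index);
        }
        return node;
    }

    // The same made-up window at a small font size...
    static RectNode smallFontLayout() {
        RectNode toolbar = new RectNode(0, 0, 400, 40)
                .add(new RectNode(10, 5, 60, 30))
                .add(new RectNode(80, 5, 60, 30));
        return new RectNode(0, 0, 400, 300).add(toolbar);
    }

    // ...and at a larger one: every rectangle moved or grew, but the hierarchy is unchanged.
    static RectNode largeFontLayout() {
        RectNode toolbar = new RectNode(0, 0, 480, 50)
                .add(new RectNode(10, 5, 80, 40))
                .add(new RectNode(100, 5, 80, 40));
        return new RectNode(0, 0, 480, 360).add(toolbar);
    }
}
```

The path {0, 1} (first child of the top window, then its second child) finds the second toolbar button in both layouts, even though its coordinates differ.  A new revision of the application that adds or reorders children would break the path, which is the regenerate-everything case.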


(2005-03-14 11:53:05.0) Permalink Comments [0]

20050311 Friday March 11, 2005

Test tools across platforms

I must apologize for getting distracted with PHP.  Let me get back to the Automation mission I started with.

The nastiest problem we've been facing for years in J2SE SQE is automating Plugin and JaWS testing.  I am close to finishing a tool which handles this, though I don't know if I'll ever be able to release it.  Knock on wood and pray to the gods that be, let's hope this can be released through the Jemmy project.

Testing a Swing or AWT application is simple.  This is because we can browse through the AWT Component tree, discover Components, and get their location.  All of the Java GUI test tools use that same basic approach.
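A minimal sketch of that basic approach, assuming the tool runs in the same JVM as the GUI under test.  ComponentTreeWalker and the name "okButton" are made up for illustration; the only library calls used are the standard Container.getComponents() and Component.getName().

```java
import java.awt.Component;
import java.awt.Container;
import java.util.ArrayList;
import java.util.List;
import javax.swing.JButton;
import javax.swing.JLabel;
import javax.swing.JPanel;

// Hypothetical helper that browses the AWT Component tree.
final class ComponentTreeWalker {

    // Collects the root and every descendant, depth first.
    static List<Component> collect(Component root) {
        List<Component> found = new ArrayList<>();
        walk(root, found);
        return found;
    }

    private static void walk(Component c, List<Component> found) {
        found.add(c);
        if (c instanceof Container) {
            for (Component child : ((Container) c).getComponents()) {
                walk(child, found);
            }
        }
    }

    // Returns the first component whose getName() equals the given name, or null.
    static Component findByName(Component root, String name) {
        for (Component c : collect(root)) {
            if (name.equals(c.getName())) {
                return c;
            }
        }
        return null;
    }

    // Builds a small panel that is never shown, so the sketch needs no display.
    static JPanel samplePanel() {
        JPanel outer = new JPanel();
        JPanel inner = new JPanel();
        JButton ok = new JButton("OK");
        ok.setName("okButton");
        inner.add(ok);
        outer.add(new JLabel("Press the button"));
        outer.add(inner);
        return outer;
    }
}
```

On a component that is actually showing, a tool would then call getLocationOnScreen() on the result to learn where to click.  Calling it on an unshown component throws IllegalComponentStateException, so the sketch stops at the lookup.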

This approach works great when the GUI under test is contained within one JVM.  But as soon as you need to interact with a window outside the JVM, you quickly become blind as a bat.  To interact with a GUI component using Robot, you need to know its location so you can click on it.  BTW, this holds true for some GUI features inside the current JVM, e.g. menu components don't export their location, and if you want to click on the arrow button of a scrollbar you're out of luck.

Within what J2SE SQE tests, the main area where this issue shows up is, as I said, Plugin and JWS.  It also affects drag&drop testing, as some test scenarios there involve D&D between applications.
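To make the dependence on location concrete, here is a rough sketch of a Robot click, assuming the component's on-screen bounds are already known from somewhere.  RobotClicker is a hypothetical helper; mouseMove, mousePress and mouseRelease are the real java.awt.Robot methods.

```java
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.event.InputEvent;

// Hypothetical helper showing why Robot is useless without a location.
final class RobotClicker {

    // The point a test would click: the middle of the component's on-screen bounds.
    static Point center(Rectangle boundsOnScreen) {
        return new Point(boundsOnScreen.x + boundsOnScreen.width / 2,
                         boundsOnScreen.y + boundsOnScreen.height / 2);
    }

    // Moves the pointer to the center of the bounds and clicks the first mouse button.
    // Needs a real display; creating a Robot fails in a headless environment.
    static void click(Robot robot, Rectangle boundsOnScreen) {
        Point p = center(boundsOnScreen);
        robot.mouseMove(p.x, p.y);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }
}
```

Everything here hinges on the Rectangle passed in.  Inside the JVM it can come from the Component tree; for a window owned by another process there is nothing to ask.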

Robot could click on anything, so long as it knows the location.  So the key clincher is to learn the location of the GUI components in other applications.

We tried using native windowing system APIs (on Windows, EnumWindows and EnumChildWindows, while on X11 it's XQueryTree).  This worked fairly well, but has limitations, especially on X11 where XQueryTree only returns rectangles and no higher level information about the rectangles.  Further, XQueryTree can only find Window objects, meaning that toolkits that don't use Window objects (such as Mozilla's GUI toolkit) won't have their internal structure visible to XQueryTree.

An approach I looked at approximately a year ago, and discarded at the time, was to become an Accessibility Tool.  The Accessibility support helps with the devices that disabled people use to facilitate their computer use.  Accessibility support means that some software tool is querying the system for nitty-gritty details about the components in all applications on the screen.  Among the data returned to an Accessibility tool is the screen location, GUI class, etc.  This is the perfect sort of information needed to solve the problem I outlined above.

Unfortunately it appeared the presence of Accessibility support was spotty.  It was available for Windows, but not for X11.  On the X11 side some Sun engineers are working with both the GNOME and Mozilla projects to implement Accessibility in both tools.  I'd also seen an announcement that KDE would be absorbing the Accessibility implementation from GNOME.  The latter is great news for compatibility, no doubt.  Unfortunately this Accessibility support was still in development, and wasn't shipping as part of the default/base installation on any system, but maybe that's changed since then.


(2005-03-11 16:44:32.0) Permalink Comments [1]

20050309 Wednesday March 09, 2005

Re: PHP versus JSP/etc

[UPDATE: I think I want to make clear, that with what I'm writing here, that this is purely my opinion.  I can't pretend to be making any claim/statement about Sun's position in regards to PHP.  So, any of y'all who might think what I'm saying means that Sun hates PHP, please don't use this as any "evidence" of such a slant.  Heck, I don't even hate PHP, I am just witnessing and writing about some flaws.]


Apparently I struck a chord yesterday with my posting about PHP.  I thank the nice and friendly commenters for setting me straight on a couple points.  I have a couple clarifications to make, and I still think JSP is superior in some ways and I want to discuss that.

First about my comment on MVC, or how PHP encourages the intermingling of layout with application code.  Obviously earlier versions of JSP also had the same problem, which is why Struts and JSTL came along.  In the current JSTL and JSP specifications you can define new tags purely through writing some HTML code, which even further separates the layout from application code.

In any case, the commenters mentioned MVC modules being available for PHP.  In particular they mentioned Smarty.  Judging from the documentation Smarty looks to be interesting, though I don't quite get why there's such a fervor about it.  I found myself impressed with its features, but turned off by two things.  First is that the Smarty markup is not in any kind of HTML or XML format, which means that you're unlikely to ever see a WYSIWYG editor for Smarty markup.  I suspect that designers, for whom Smarty is targeted, will prefer a WYSIWYG approach rather than the write-some-code-and-preview-it-in-a-browser you'd have to use with Smarty.  But I think that's a workstyle issue, as some designers prefer to work in straight HTML, rather than a WYSIWYG editor like GoLive or Dreamweaver, and would be quite at home with what I see in Smarty.

Speaking for myself, when I write a web page I prefer a WYSIWYG approach even though I'm comfortable with writing HTML code.  I also like the current iteration of GoLive where you can interactively design CSS styling and immediately see the results in the editor.

The second thing which turned me off about Smarty is its use.  It seems a little strange to make the PHP page have a magical incantation to require the Smarty code, instantiate a Smarty object, load up some data, and set that data into Smarty, all before you can invoke the template.  I'm sorry, but to me that looks backwards.  I'd rather have it just be "Smarty" code (or whatever template system you're using), and have the infrastructure take care of the details of interpreting its way through the template file.

The other comment was about availability of extra programming libraries.  I must not have been clear enough in what I said, because the comments missed the mark.

What I meant to say is:

  1. Java, upon which JSP is implemented, is an existing language with a wide variety of programming libraries available.
  2. Those libraries can be used for any application, not just server side.
  3. Those libraries are readily available for use in JSP applications.

PHP is a relatively new language.  While quite a few PHP modules have been written and the pear.php.net site does a great job of collecting them into one place, if PHP had been implemented on top of Java then those libraries would not have needed to be written.  My point is that JSP gets immediate use of all the existing Java libraries, whereas if equivalent functionality is needed for PHP it has to be written anew.

Okay, but I see that PHP dates back to 1995.  JSP hadn't been invented yet in 1995.  I remember 1995, and was writing server side scripts by 1997 (or so) myself.  At that time I wasn't happy with the state of server side scripting languages (my hosting provider supported Perl, yuck) and wrote my own template engine built atop TCL.  Again, I appreciated the capability to reuse an existing language and theoretically be able to tap upon an existing base of libraries.  The application I built with my self-written templating engine is still running today at The Reiki Page Practitioner Directory.


(2005-03-09 16:18:08.0) Permalink Comments [2]

20050308 Tuesday March 08, 2005

PHP versus Java/JSP/J2EE/...

Last weekend I decided to try a bit of PHP programming. I wanted to see what the language was like, whether it was any easier to create pages, etc. I'm a little experienced with JSP programming, and in the long distant past I once wrote my own page template engine around TCL because at the time there was nothing else available.

To compare PHP versus J2EE is utter nonsense, because they serve completely different purposes. J2EE is very rich in capabilities, ones that appeal to large enterprises or very busy web sites. These are capabilities the PHP programmer can only dream of, because PHP doesn't provide anything like it.

It seems fair to keep the comparison to the closest analog to PHP, namely JSP.

Another note is that obviously I have a Java bias, given that my job involves working on Java.

With that out of the way ... what do I think?

Learning curve

Well, PHP is pretty easy to learn. I simply stumbled my way over to php.net, found the online manual, and read a few things, and said "hey! that's easy" and started coding. Within a few hours I had a website whose pages are dynamically built from a database. I'd previously written some JSP pages to use that same data, displaying it in the same format, so this is a fair comparison.

The JSP pages I'd written also went very quickly. Except, by the time I wrote the JSP pages I'd already gone through the steepish learning curve to learn JSP and JSTL programming.

Here's the rub with JSP: somehow JSP programming is somewhat difficult to pick up. I think it stems from a couple places: a) the servlet/JSP specifications are written in a difficult legalistic style, and b) there's baggage from the early servlet history.

If one were to stick purely with JSP and JSTL then programming JSPs is pretty simple. Even writing new JSP tags is very simple, if you use the new XML-tag file format. But unfortunately the books that teach JSP and JSTL have to talk you around all the old baggage.

In comparison I found the PHP model and documentation to be very clear and easy to understand. Like I said, after a couple hours I had some pages going, having never done anything with PHP other than glance at a few source files. That's a pretty easy learning curve so far as I can see.

Capabilities

I find great fault with PHP in its capabilities.

First PHP ignores the MVC model. The "business logic" is hopelessly intertwined with the presentation. In the pages I've been writing, the core of the presentation is embedded in strings in echo statements. This in turn makes it very difficult for a WYSIWYG style web editor to do much to help you lay out your pages.

Ignoring the MVC model relegates PHP programmers to indirectly coding the page layout, rather than designing the layout through direct manipulation in a WYSIWYG editor. By comparison, JSP (if you use the modern JSTL or Struts approaches) can show quite a lot through a WYSIWYG editor, and dynamic pages can be designed through direct manipulation rather than indirect coding.

Second, PHP as a language isn't very scalable. Since JSP resides atop Java, you have a tremendous, modern, object oriented programming language with tons of software modules ready to be tapped upon. PHP doesn't have this backdrop, hence this world needs to be reinvented for PHP in order for a PHP programmer to use it. Also object oriented programming is a new feature to PHP, which I've not studied yet, but I expect since it's a new feature layered on top of a non-OO language that it's likely going to be clumsy.

Market realities

On the other hand, it's clear that PHP is very popular with some. For example I've spent a lot of time looking at hosting providers. I know it's trivial to find a hosting provider that offers PHP service, and that it's very difficult to find one offering JSP service.

That distinction has given me a lot to ponder. The conclusion I've reached is that the hosting providers have razor thin margins and need to keep costs pared to the bone. This means open source solutions are great for them, because the cost is minimal. Now, the Linux+Apache+MySQL+PHP (LAMP) stack is very easy to install and relies completely on open source software packages. Comparatively, Linux+Apache+MySQL+Tomcat+Java is pretty darned difficult to install, if only because the mod_jk (or is it mod_jk2) installation is a mess. The installation process is very poorly documented, and it is completely unclear whether one is supposed to use mod_jk or mod_jk2.

Putting myself in the hosting provider's shoes, it looks like Java/JSP is too difficult and possibly with a low payback. Even if one were to use a different appserver than Tomcat it's almost certainly going to involve money, which the hosting provider can't afford (razor thin margins).

So in summary I think JSP is capable of a lot more than PHP, but hampered by a steepish learning curve and a difficult installation process.

(2005-03-08 11:05:34.0) Permalink Comments [7]

20050307 Monday March 07, 2005

The bottom level of GUI automation tools

My previous posting was probably too long and circuitous. Just to prove that I'm overly obsessed with GUI automation, I want to discuss the bottom layer. With the foundational principles of GUI automation covered, I ought to be able to discuss the higher level issue more freely. That's my plan anyway...

Seven years ago when I was hired by Sun, my manager said "Your job will be to test AWT. Oh, and if you want to avoid manual test execution, you'll find a way to automate the tests." Since the last thing I wanted to do was shift my career to being a tester, I spent more time looking for automation technology than writing tests. Please don't tell my manager that!

At the time there was nothing suitable. In the commercial arena we had WinRunner and X/Runner, but I'd had a previous bad experience with X/Runner and didn't want to use it. Basically, X/Runner is specialized to testing Motif applications and the scuttlebutt was it wasn't able to deal well with Java applications (even though AWT was based on Motif). My previous experience had been at Mainsoft where I'd spent two years working on a WIN32 implementation for X11, and we tried to use X/Runner for test automation. It didn't work terribly well because X/Runner had no clue where any of our components were, and could only work on absolute coordinates.

Sun had a product named JavaStar that I thought was peculiarly useless while being almost useful. It was unable to test AWT applications because it only sent fake events through the AWT/Swing EventQueue, and in my testing of that idea the AWT components would not react to those events. The AWT components would only react to events that arrived through the normal operating system channel. I did have a meeting with that team and was unable to get their attention on this flaw.

Now, I promised a bit of discussion of the fundamental level of GUI automation. Since I hadn't been able to find an existing tool, I had to study this fundamental level. This study eventually led to the java.awt.Robot class.

I created, on X11, a class I called NativeEvent which had around five methods.

  1. Mouse-Button press
  2. Mouse-Button release
  3. Key press
  4. Key release
  5. Mouse move (set-location)

I knew these were the fundamental operations of any GUI interaction, and that if I could perform those fundamental operations from Java, that they could be used to create any GUI interaction. It was an Alan Turing sort of moment, where I knew the fundamental state machine for any GUI interaction.
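As a rough illustration of how those five operations compose, here is a sketch in Java.  The interface name and signatures are assumptions made for this example, not the original NativeEvent class; they are loosely modelled on the methods java.awt.Robot ended up with.  RecordingEvents only logs calls, so the sketch runs without a display.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rendering of the five fundamental operations.
interface NativeEventSketch {
    void mousePress(int button);
    void mouseRelease(int button);
    void keyPress(int keyCode);
    void keyRelease(int keyCode);
    void mouseMove(int x, int y);
}

// Stand-in implementation that records each call instead of injecting real events.
final class RecordingEvents implements NativeEventSketch {
    final List<String> log = new ArrayList<>();

    public void mousePress(int button) { log.add("mousePress " + button); }
    public void mouseRelease(int button) { log.add("mouseRelease " + button); }
    public void keyPress(int keyCode) { log.add("keyPress " + keyCode); }
    public void keyRelease(int keyCode) { log.add("keyRelease " + keyCode); }
    public void mouseMove(int x, int y) { log.add("mouseMove " + x + " " + y); }
}

// Higher level interactions built from nothing but the five primitives.
final class Gestures {

    // A click is a move followed by a press and a release of the same button.
    static void click(NativeEventSketch events, int x, int y, int button) {
        events.mouseMove(x, y);
        events.mousePress(button);
        events.mouseRelease(button);
    }

    // Typing one key is a press followed by a release.
    static void typeKey(NativeEventSketch events, int keyCode) {
        events.keyPress(keyCode);
        events.keyRelease(keyCode);
    }

    // A drag holds the button down while the pointer moves to the target.
    static void drag(NativeEventSketch events, int fromX, int fromY, int toX, int toY, int button) {
        events.mouseMove(fromX, fromY);
        events.mousePress(button);
        events.mouseMove(toX, toY);
        events.mouseRelease(button);
    }
}
```

Gestures.click(new RecordingEvents(), 10, 20, 1) records a move, a press and a release, in that order.  A real implementation would forward each call to the windowing system instead of a list.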

So with this implementation in hand, I presented it to my immediate team. They liked it enough, but wanted to see some help from the AWT developers. So I wrote a summary, and went to meet with the developers. In the hallway on the way there they met me and said "David, we have this idea for GUI automation and wonder what you think of it". It was one of those great minds think alike moments, because they had the same five methods written on a sheet of paper.

After a moment of "hey, we've got the same idea" ...well... We turned to collaboration, with Robi Khan doing the Windows implementation and me doing the Solaris implementation.

I've written about this before, but we purposely kept Robot to those minimal methods. We didn't want to embed a GUI test framework in Java (due to size), we didn't want to force the development community to a specific method of testing, etc. We left it up to the community to develop this idea further, and they did.

I like to think of those five methods as the GUI automation equivalent of assembly code. You probably don't want to code your automation with those methods, but you can, and you can certainly build amazing contraptions with only those five methods. But you'll be more efficient if you use a higher level tool.

(2005-03-07 10:26:33.0) Permalink Comments [1]

20050303 Thursday March 03, 2005

Testing Java GUI applications & multiple platforms

As I said in earlier blog entries, I work in the Java SQE (J2SE) team. It is the SQE team that develops functional tests of the Java functionality. These tests are outside the scope of the JCK tests the Java customers use to validate their Java implementations.

My specialty is with tools for automating GUI oriented testing. For example I am partially responsible for the java.awt.Robot class.

This background has given me an interesting conclusion to share. This conclusion is likely different than our marketers want to have said. So I should say that this is purely my opinion.

What is this conclusion?

It's very simple - the presence of (commercial) GUI test automation tools has a large impact on the potential success of an operating system.

Why? It has to do with the relative testability of applications built for that operating system.

Why? The rule of thumb for GUI application testing is "Testing is never finished, only abandoned". This means that any application development team has a certain budget for testing, and once that budget is exhausted they won't test further. The effect of automated GUI tests is to amplify the amount of testing that can be done with a given budget.

Here is the core of my contention. Windows is the only platform with commercial GUI automation tools. This makes for an advantage to Windows since application developers can automate their tests, and thereby do more testing with their budget. Similarly, on a platform lacking commercial GUI automation tools the application developers won't be able to automate their tests, and hence their application won't be quite as well tested.

Java, however, is in a different situation.

Java has the java.awt.Robot class, and there are several GUI automation tools for Java. I have a javapedia page here which goes over those tools. This makes a Java application testable across platforms, even when there is the general lack of GUI automation tools.

That's a good thing, yes? Mostly. Turns out there is a limitation in that the Java GUI automation tools rely on being able to use java.awt.Component.getLocationOnScreen(). But what if you want to interact with another application besides the Java application?

Like, what if you want to test an applet, and some of your scenarios require interacting with the web browser containing the applet? Bzzzzt!! No luck, because that browser is a foreign application. The Java GUI automation tool is unable to "see" the browser components.

What if you went to the native operating system calls and, through the magic of JNI, brought information about the GUI components into Java?

On an X11 system you can write some native code using XQueryTree, and thereby find locations of components. However, XQueryTree only gives you a set of rectangles, and doesn't give you any clue of what those rectangles are. Does anybody have a time machine so we can go back to the mid-80's and beat someone up? If you don't know what those rectangles are, you don't have any ability to know which rectangle to click on.

On a Win32/64 system native code (EnumWindows and EnumChildWindows) is able to give you more information. It's not perfect, but it's better than nothing.

But the cruelest twist is that on every platform there are several GUI toolkits which do not form their components in the traditional way. The Mozilla Application Object Model is a great example of this, where inside their toolkit they draw their own widgets purely as graphics. This means not even the Windowing System is going to know where Mozilla application components are. Even if one were to write a JNI module to get native windowing system data into Java, that module would also be unable to see the native components.

The bottom line for Java application developers is that ... so long as your application exists within a single Java VM, and doesn't interact (much) with native applications, you'll be able to automate tests against it. But if your application interacts with native applications you're stuck with manual testing for those scenarios.

(2005-03-03 13:42:46.0) Permalink Comments [1]