RoboGeek
RoboGeek's (David Herron) Weblog: co-developer of Robot and several other things related to Java testing.

Tuesday April 26, 2005
Cell phones as PDAs
A few years ago, while Rich Green was still in charge of the Java
efforts at Sun, I remember him talking to us about Java in Cell Phones,
and the future of cell phones.
The gist of the message was: Stop thinking about them as telephones,
because they're really mobile computing devices with a built-in
telephony applet.
Flash forward to today ... We have cell phones with pretty color displays, and we have the Virtual Girlfriend
available to satisfy the, ahem, needs of (presumably) young men around
the world. You can use your cell phone to play games, look up
stock quotes, do email, send text messages, and more. Just
yesterday I was reading how the Chinese Government is worried about how
text messaging over cell phones is being used to organize protesters,
so they're planning a crackdown.
What's prompting this posting is an email on an internal mailing list,
where someone is seeking advice on choosing a new cellphone.
He wants to decrease the number of gadgets he carries (currently a PDA
and a cellphone), leaving him with just a cell phone, so he
wants one of the higher-end phones, but not one as big as the Treos.
Looking at the question, I'm seeing my personal dance around which cell
phone to use. I want features, but I want it to be
carryable. So....?
What occurred to me is this question.... why store any data on the
phone (portable-computing-device)? Why not make the
gadget-that-we-currently-name-"cell-phone" just an access device?
Why not store the data out on the network somewhere, and you access it
as-needed?
By storing the data "out there" rather than on your phone, the device
you carry with you can remain small yet offer large capabilities.
Further, the service that stores your data could offer more services
than your portable device can. For example, you could have multiple UIs,
such as from your work computer, your home computer, or web pages.
It could send you email reminders. It could integrate with
something like the Franklin/Covey planning system. All this would
be difficult if the data were to be stored only in your
portable-computing-device.
Hmmm....
(2005-04-26 11:07:55.0)

Monday April 25, 2005
A "JTable" clone in javascript
OS3Grid - Grid for Web Sites
I saw this announced on freshmeat and found it interesting. It's
written in pure javascript and runs across browsers. And, it causes me to write again about the rich client experience.
The look is similar to the JTable, but the functionality is nowhere near the
same. For example, while you can click on column titles to sort
the table by the column, you can't drag column titles to rearrange the
table, nor can you resize rows or columns by dragging. Still, it
looks to be a pretty convenient way to put a table in a web page and
have a little more interaction than the HTML TABLE element allows.
For example, on twiki sites if you build a table, clicking on a column
header lets you sort the table by that column. But to do so means
a round-trip between the browser and the wiki software so the wiki
software can reformat the whole page with the table sorted a different
way. This is very wasteful because it makes the user wait.
The rich client story is to do that sorting in the client. A rich
client knows a little more about what the data is, and can act directly
on the data in a rich way. For example a Swing rich client knows
it has a JTable, it has a TableModel full of data, and it has some nice
interactions it can offer the user depending on how the JTable is
configured. Similarly the OS3Grid thingy has a data model and can
offer some nice interactions.
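To make the "sort in the client" idea concrete, here is a minimal sketch of sorting a TableModel without any server round-trip. It assumes TableRowSorter, which only arrived in Java SE 6 (after this was written); the class name ClientSideSort and the fruit data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.swing.RowSorter;
import javax.swing.SortOrder;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableRowSorter;

// Hypothetical helper: sorts table data entirely on the client side.
final class ClientSideSort {

    // Made-up sample data; the Count column is declared as Integer so it sorts numerically.
    static DefaultTableModel sampleModel() {
        return new DefaultTableModel(
                new Object[][] {{"pear", 3}, {"apple", 7}, {"fig", 1}},
                new Object[] {"Fruit", "Count"}) {
            @Override
            public Class<?> getColumnClass(int column) {
                return column == 1 ? Integer.class : String.class;
            }
        };
    }

    // Returns the values of readColumn in the order produced by sorting on sortColumn.
    // The model itself is never reordered; the sorter only maps view rows to model rows.
    static List<Object> sortedColumn(DefaultTableModel model, int sortColumn, int readColumn) {
        TableRowSorter<DefaultTableModel> sorter = new TableRowSorter<>(model);
        sorter.setSortKeys(Collections.singletonList(
                new RowSorter.SortKey(sortColumn, SortOrder.ASCENDING)));
        List<Object> result = new ArrayList<>();
        for (int viewRow = 0; viewRow < sorter.getViewRowCount(); viewRow++) {
            int modelRow = sorter.convertRowIndexToModel(viewRow);
            result.add(model.getValueAt(modelRow, readColumn));
        }
        return result;
    }
}
```

In a real GUI you would hand the sorter to the table with JTable.setRowSorter, and clicking a column header would then re-sort locally with no trip to the server.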
But with the current incarnation this OS3Grid thingy is less capable than the Swing JTable.
The question, I suppose, is whether JTable's extra capability makes a
difference. A developer wanting to display a fancy table with
a good user experience could write a simple APPLET and use the
JTable. But is the extra capability in JTable important to the
end user? Or is the OS3Grid thingy good enough? And can it
improve over time to take on more and more capabilities? How
close can it get to JTable?
(2005-04-25 09:01:06.0)

Friday April 22, 2005
10 years, sometime around today
Well, Java has been known to the world for 10 years or so, now.
We had a party in the courtyard of the Santa Clara campus. There
were drinks, snacks, a dunk tank, James Gosling (and others) being
dunked in said tank, and Jonathan Schwartz refusing to get near the
tank. There was a bunch of people, music, and it being Northern
California in the Spring, there was some rain.
I was with the team for the 5 year anniversary, and that party was a
lot bigger. For whatever that's worth. Of course, 5 years
ago was a completely different Universe so far as these things are
reckoned (e.g. the bubble hadn't even burst yet).
Okay, so what happened? Other than schmoozing with co-workers and
former co-workers who'd been snuck onto campus for the party - we had
some speechifying, and a handing out of the official t-shirt and pin.
Hearing James Gosling recollect about Java's early days is always
interesting. I remember when we were discussing moving Swing into
"the core", there'd been a raging debate about whether to name the
package "java.swing", "java.awt.swing", "javax.swing" or something
else. One day James happened upon the Swing/AWT team in
the lunch room, and I heard him talking about "in the 10 minutes we had
to decide X", describing how quickly some of the decisions were made in the
early days.
Anyway, today he talked about how the "Project Green" people had holed
themselves up in a small office suite on Sand Hill Road and been
dreaming up science fiction fueled by sodas and Ding Dongs. He
claimed to be astonished at how we turned their science fiction dreams
into science reality.
And this week's most amazing thing he's seeing done with Java is
that in Brazil the whole health care system is run by Java.
They've got a zillion things running through cell phones with Java
MIDlets, including filing one's taxes that way. That's
interesting alright.
What I found amazing is the required raising of the hands to find the
old-timers. On most software projects people come and go, 2-3
years being the typical "stay". But with Java, there are many who
have been with the team for 10+ years.
It's not every day a software engineer gets to work on a product like
this. Speaking for myself, that's a big part of why I've been
here for 7 years when my original plan was to join for a few months,
learn Java, and then move on to somewhere as a Java developer. To
know that what I'm working on is used by millions of people around the
world is "juice" ...
(2005-04-22 17:04:16.0)
The "rich client"
Let's think about this question for a bit: Is AJAX worth adopting?
But let's first define AJAX since it's a relatively new model for
GUI applications. AJAX threatens not only the Java APPLET, but
also Flash, and perhaps some of the things Microsoft does. A
great example of AJAX is at http://maps.google.com/, the interactive local map service that Google recently launched.
The idea is that with modern HTML+CSS, modern browsers, modern
Javascript+DOM techniques, one can mimic a wide variety of GUI
applications. For example I'm typing this into a widget that
looks like a stripped down word-processor. It's creating HTML
that will eventually be uploaded to the blog server software. The
great thing is I don't have to code the HTML. If you search for
"WYSIWYG Javascript HTML editor" you'll find several similar
applications out there.
The goal remains the same - how do you deliver a "rich client" experience to your customers.
For many years the "web application" has been popular. At the
simplest you have HTML and the FORM elements, the application logic is
written on the web-server end and controlled by a sequence of
CGI/JSP/PHP/ASP/etc scripts. The advantage is the application
logic is centralized, making it easier to update, easier to deploy, and
perhaps more secure. The disadvantage is that the user experience
is very poor, because of the limited flexibility at the client.
At JavaOne last year, and for a couple of years now, the message from the
Java Client team has been "we got rich client experience".
Anytime you have a real GUI toolkit you can construct a proper rich
client experience. This involves good feedback, interactive data
checking in filling out forms, a wide variety of GUI components (e.g.
sliders or sortable tables or ...), rich presentation quality, etc.
Generally, to have a proper "rich client" experience one has had to
either develop a native application and be tied to a specific operating
system, or to use Java. There have been some exceptions of course,
but in the big picture that's it.
At the same time, the goal is still there: delivering applications to any client computer.
Some in the world are happy being limited to Microsoft's operating
systems. They do have the biggest market share for desktop
systems, and hence provide the biggest market to sell into. But
many in the world are searching for alternatives, and I think the
number is growing with the continuing virus and security problems
Microsoft consistently refuses to do anything serious about.
While I prefer people to use a Java solution, this AJAX thing is up-and-coming.
What it offers is a very simple deployment model. One simply
visits a web page, and the javascript loads and sets up the user
interface. That's it.
Of course both Applets and Flash offer the same model. Clearly
some in the world want to use the AJAX approach, by using javascript
instead of Applets or Flash.
The "is AJAX worth it" author makes some good points. The makers
of web browsers have done a terrible job on the compatibility story, so
any application developer using the AJAX approach is going to have to
work through the incompatibilities.
(2005-04-22 13:02:09.0)

Wednesday April 13, 2005
What problems are faced in Java GUI automation?
To help the Java Quality team understand the hurdles faced in
automating Java GUI tests, I wrote up a paper detailing the
problems. This was a couple years ago, and I want to summarize it
here. The purpose was to document the challenges we face in
automating the Java GUI tests, and guide us in choosing the tools we
now use in the automation work.
What I came up with was a list of attributes ... each attribute
contributing its own automation challenge. I kept the attributes
as orthogonal as possible, to let them be considered independently
where possible.
Has a GUI: While this might seem nonsensical, not all "client side" testing requires a GUI. Therefore, if a test application does have a GUI, then the challenge is to have an automation tool.
Graphics rendering destinations: In Java there are at least
three destinations upon which Java 2D can render. To test Java 2D,
one must test all those destinations. What are these
destinations? They are: a) on-screen via a Component, b)
off-screen to a BufferedImage, c) off-screen to a printer. Of
course, for each there are several alternatives depending on how you
configure the destination.
The automation challenge here is to capture the output for each rendering destination.
Keyboard and Mouse interaction with GUI: Having to interact with the GUI is a step up in complexity from Has a GUI,
since a test application could have a GUI, but you don't interact with
that GUI. For example, a test application could be a simple
container for a rendering test, and you simply run the test application
and do a screen capture.
The automation challenge is to have an automation tool capable of
sending keyboard and mouse events. I've listed several on this page.
One of the big concerns in this area is to ensure the events sent by
the automation tool are indistinguishable from keyboard or mouse events
sent when a human bangs around on their computer.
In Java (AWT, Swing) it is quite possible to inject Event objects into
the AWT Event queue. A Swing application will respond to those
events, and some test tools claim it as a big feature to be
able to test a Swing application this way. But this is an
incomplete test.
Why? Because the events haven't traversed the normal code path
used when humans interact with the application. There is a lot of
processing of incoming events done by AWT, and these events eventually
turn into the objects that go through the EventQueue. If you
inject events into the EventQueue, that code in AWT is not traversed,
and you are missing out on possible bugs inside AWT.
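The difference between the two approaches can be sketched roughly as follows. This is only an illustration: the class ClickStrategies and its method names are hypothetical, while the calls it makes (EventQueue.postEvent, Robot.mouseMove, Robot.mousePress, Robot.mouseRelease) are standard AWT. Both click methods need a real display and a showing component, so treat them as a sketch rather than a finished tool.

```java
import java.awt.Component;
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.Toolkit;
import java.awt.event.InputEvent;
import java.awt.event.MouseEvent;

// Hypothetical helper contrasting event injection with Robot-driven input.
final class ClickStrategies {

    // Center of a rectangle; the point we aim the click at.
    static Point centerOf(Rectangle bounds) {
        return new Point(bounds.x + bounds.width / 2, bounds.y + bounds.height / 2);
    }

    // Shortcut: post synthetic MouseEvents straight onto the AWT event queue.
    // The native-to-AWT translation code is never exercised this way.
    static void injectClick(Component target) {
        Point p = centerOf(new Rectangle(0, 0, target.getWidth(), target.getHeight()));
        long now = System.currentTimeMillis();
        int[] ids = {MouseEvent.MOUSE_PRESSED, MouseEvent.MOUSE_RELEASED, MouseEvent.MOUSE_CLICKED};
        for (int id : ids) {
            int modifiers = (id == MouseEvent.MOUSE_PRESSED) ? InputEvent.BUTTON1_DOWN_MASK : 0;
            Toolkit.getDefaultToolkit().getSystemEventQueue().postEvent(
                    new MouseEvent(target, id, now, modifiers, p.x, p.y, 1, false, MouseEvent.BUTTON1));
        }
    }

    // Fuller path: Robot generates native input, so the events travel through
    // the windowing system and AWT's own processing, as a human's click would.
    // Call this from a thread other than the event dispatch thread.
    static void robotClick(Robot robot, Component target) {
        Point origin = target.getLocationOnScreen();
        Point p = centerOf(new Rectangle(origin.x, origin.y, target.getWidth(), target.getHeight()));
        robot.mouseMove(p.x, p.y);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
        robot.waitForIdle();
    }
}
```

Note that injectClick works in component-relative coordinates while robotClick needs screen coordinates, which is exactly where the "where do you point the mouse?" problem comes from.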
Complexity of the GUI:
Obviously, the more complex a GUI is, the more difficult it will be to
automate it. A highly complex GUI presents automation challenges
all over the place.
However, complexity actually hides in the open. For example, if
you want to click on the arrow button in a scroll bar, how do you do
it? Every automation tool depends on being able to answer this
question:
Where do you point the mouse?
For things which appear in the AWT Component tree (traversed by calling
Container.getComponents) it is trivial to determine their location,
because you can call Component.getLocationOnScreen. However some
objects you want to interact with do not appear on the Component tree,
such as the scroll bars.
Another complexity comes when you parent something in a Viewport
(JScrollPane) component. How do you know which parts of the child
component are actually visible? How do you know how far to scroll
the viewport to make something visible?
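For the viewport case the questions come down to rectangle arithmetic, once you have the view rectangle (in Swing, JViewport.getViewRect gives you that) and the target's bounds in the child's coordinates. A minimal sketch, with ViewportMath and its method names being hypothetical:

```java
import java.awt.Point;
import java.awt.Rectangle;

// Hypothetical helper: pure geometry, no GUI objects required.
final class ViewportMath {

    // The part of the target that shows through the viewport,
    // or an empty rectangle if nothing of it is visible.
    static Rectangle visiblePart(Rectangle viewRect, Rectangle target) {
        Rectangle overlap = viewRect.intersection(target);
        return overlap.isEmpty() ? new Rectangle() : overlap;
    }

    // The smallest (dx, dy) scroll that brings the target into view.
    // If the target is larger than the viewport, its top-left corner wins.
    static Point scrollDelta(Rectangle viewRect, Rectangle target) {
        return new Point(
                axisDelta(viewRect.x, viewRect.width, target.x, target.width),
                axisDelta(viewRect.y, viewRect.height, target.y, target.height));
    }

    private static int axisDelta(int viewStart, int viewLength, int targetStart, int targetLength) {
        if (targetStart < viewStart) {
            return targetStart - viewStart;            // scroll backwards
        }
        int viewEnd = viewStart + viewLength;
        int targetEnd = targetStart + targetLength;
        if (targetEnd > viewEnd) {
            // scroll forwards, but never push the target's start out of view
            return Math.min(targetEnd - viewEnd, targetStart - viewStart);
        }
        return 0;                                      // already visible on this axis
    }
}
```

Swing can also do the scrolling for you through JComponent.scrollRectToVisible, but a test still has to work out where the target ends up on screen afterwards before it can point the mouse at it.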
The text components like JTextArea present an interesting
challenge. What if you want to use the mouse to select a specific
range of text? These components give you no clue as to the x,y
coordinates of the text they've drawn on-screen.
AWT MenuComponents do not offer a getLocationOnScreen method, meaning you don't know where they are.
In AWT there are a couple native dialogs, which are opaque
objects. If you want to automate interactions with those dialogs,
the only choice is to write some JNI native code to call the native
windowing system and discover the component location that way.
For testing Applets running in a browser, several types of tests
require interactions with the browser. One cannot know where the
browser is, unless you go to the trouble of writing a tool such as the
one I discussed yesterday. If you have more than one Applet you
want to test together, then that offers its own automation difficulty,
because while the Applets are all running in the same JVM, they are in
separate AppContexts. There are a few security walls one must
breach to discover component locations in another AppContext.
A multiple-window application is more difficult to test than a
single-window one. So long as the multiple windows are in the
same Java VM then it's relatively easy, but like I said for the
Applet/Browser case if the application involves native windows then you
do not know where the native windows are.
If you are testing several Java Applications together, each in their
own JVM, this is a challenge similar to the native window
challenge. A test can only "see" the Components in its own JVM,
and is just as blind to Components in other JVMs as it is to the
Components of native applications.
Drag & Drop is relatively simple to automate, but presents a few
challenges. One biggie is, again, that you often want to drag
something to a different application, for example dragging a URL into
the Location Bar of a browser.
Requires Visual Verification:
You can do a lot of testing without verifying that the application
looks right while running. There's a risk here, of course, that
you might miss rendering problems. However, visual verification
is tough.
The simplistic approach is to take screen shots, and then verify those
screen shots against later builds of your product. But ... how do
you organize the screen shots to make them easy to maintain
and easy to update when needed? The maintenance of a set of
golden images, as we call them in the Java team, is actually very
expensive.
Then, which platform do you take the screen shots on? All
platforms? The more screen shots you have, the more expensive is
the problem of maintaining the golden image repository. Also, you
may find subtle differences between graphics cards, even on the same
OS/CPU.
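The comparison step itself is the cheap part. Here is a minimal sketch of a golden image check, assuming both images are already loaded as BufferedImages (for example, one captured with Robot.createScreenCapture and one read from the repository). The class name GoldenImageCompare and the per-channel tolerance are assumptions for illustration; the tolerance is one way to absorb small graphics-card differences, not necessarily how any particular test suite does it.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: compares a captured image against a golden image.
final class GoldenImageCompare {

    // Counts pixels whose red, green, or blue value differs by more than
    // the given tolerance. Zero means the images match within tolerance.
    static int countDifferingPixels(BufferedImage golden, BufferedImage actual, int tolerance) {
        if (golden.getWidth() != actual.getWidth() || golden.getHeight() != actual.getHeight()) {
            throw new IllegalArgumentException("image sizes differ");
        }
        int differing = 0;
        for (int y = 0; y < golden.getHeight(); y++) {
            for (int x = 0; x < golden.getWidth(); x++) {
                if (!closeEnough(golden.getRGB(x, y), actual.getRGB(x, y), tolerance)) {
                    differing++;
                }
            }
        }
        return differing;
    }

    // Compares the blue, green, and red channels (bits 0-7, 8-15, 16-23).
    private static boolean closeEnough(int rgbA, int rgbB, int tolerance) {
        for (int shift = 0; shift <= 16; shift += 8) {
            int a = (rgbA >> shift) & 0xFF;
            int b = (rgbB >> shift) & 0xFF;
            if (Math.abs(a - b) > tolerance) {
                return false;
            }
        }
        return true;
    }
}
```

A tolerance of 0 demands an exact match; loosening it trades false alarms for the risk of missing a subtle rendering bug.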
What about when (not if) the application GUI changes? You have to
reshoot all the golden images, adding to the maintenance cost. If
the application is being actively developed, it might be changing
weekly, meaning you have to recapture all the images each week.
The more often you recapture the images, the more expensive, and the
harder it is to get a return on investment.
Testing your application under different look-and-feels is an obvious use for visual verification.
One thing you cannot test under visual verification is whether you have
the correct cursor. Robot cannot capture the cursor image.
Graphics Complexity: If you are doing visual verification, there are added complexities that can arise.
The most obvious are animations. The problem is, where in the
animation cycle do you do the screen capture, and how can you predict
which of the possible images you will capture? You can't, in the
general sense. Though in some specific cases you can change the
animation algorithm allowing you to single-step it, and then do screen
captures at each/most steps.
Text, fonts, international languages, oh my: Here be dragons (from China). This problem is of exponential proportions.
The naive approach is to take a given character string (e.g. "the
quick brown fox jumps over the lazy dog")
and render it in each font, in each font size, in each styling, with
each Java2D perturbation you can think of, on each operating system, in
each locale/language, in each color, and in each phase of the
moon. Or something like that. The point is that you quickly
get to huge numbers of tests, because of combinatorial mathematics.
What to do? Be smart, and test the most strategically significant combinations.
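To get a feel for how fast the numbers grow, here is a small sketch that multiplies out the test dimensions. Every count in it is made up purely for illustration (they are not real figures from any test plan), and the class name TestMatrixSize is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: sizes the full cross-product of test dimensions.
final class TestMatrixSize {

    // Made-up counts for each dimension of the text rendering test matrix.
    static Map<String, Integer> hypotheticalDimensions() {
        Map<String, Integer> dimensions = new LinkedHashMap<>();
        dimensions.put("fonts", 50);
        dimensions.put("font sizes", 10);
        dimensions.put("stylings", 4);
        dimensions.put("Java 2D perturbations", 5);
        dimensions.put("operating systems", 4);
        dimensions.put("locales", 20);
        dimensions.put("colors", 8);
        return dimensions;
    }

    // Number of test cases if every value of every dimension is combined
    // with every value of every other dimension.
    static long combinations(Map<String, Integer> dimensions) {
        long total = 1;
        for (int count : dimensions.values()) {
            total = Math.multiplyExact(total, (long) count);
        }
        return total;
    }
}
```

With these invented counts the full matrix comes to 6,400,000 renderings for a single string, which is why picking the strategically significant combinations matters.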
Sound rendering accuracy:
Some applications make bleeps and bloops as sound cues. For
example, Quicken makes a cash register sound every time you enter a
transaction. Other applications play audio as their main purpose
(e.g. an MP3 player).
If you want to test the accuracy of sound playback, well, good
luck. You have several possible inaccuracies in the mix.
The first is that the sound card is going to translate the digital
sound instructions into analog signals. The second is when the
analog signal is played on a speaker, the speakers have varying
accuracy. The third is when you go to digitize the sound, you're
doing it through a microphone and an audio digitizer, each of which
has its own inaccuracies. To verify the test, you'd have to
play the sound, digitize it, and "compare" the two files, but with all
the inaccuracies you've got a tough row to hoe in doing the comparison.
You can bypass some of the inaccuracies by connecting the line output
of the sound card to the line input of another computer. But
there are still inaccuracies.
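As one possible reading of "compare", here is a naive sketch that computes the root-mean-square error between the original samples and the re-digitized ones. It assumes both recordings are already decoded to 16-bit samples of equal length and perfectly aligned in time, which is a big assumption in practice; the class name SoundCompare is hypothetical, and a real test would still have to choose a threshold for "close enough".

```java
// Hypothetical helper: naive sample-by-sample comparison of two recordings.
final class SoundCompare {

    // Root-mean-square difference between two equally long sample arrays.
    // 0.0 means identical; larger values mean the recordings differ more.
    static double rmsError(short[] original, short[] recorded) {
        if (original.length != recorded.length || original.length == 0) {
            throw new IllegalArgumentException("need two non-empty arrays of equal length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < original.length; i++) {
            double difference = original[i] - recorded[i];
            sumOfSquares += difference * difference;
        }
        return Math.sqrt(sumOfSquares / original.length);
    }
}
```

Even a small timing offset between the two recordings will inflate this number, so it only hints at how much work a real comparison would need.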
(2005-04-13 16:08:00.0)

Tuesday April 12, 2005
Automation of web browsers and the java plugin
I've been working on a tool meant to help the java plugin SQE team
automate their testing. I've finally "finished" the tool, and
want to write a bit about it.
The first question is "why"?
Why do this? I've covered it before, so here's a summary:
- "Testing is never finished, only abandoned", which I take as "there's a given budget for testing, and automation will expand the testing you can do with that budget".
- There's no existing tool for linux or solaris which will automate
java applets and web browsers. Especially not for the modern
browsers (mozilla, firefox).
- Automating the browser/applet scenarios is the biggest pain
factor in the java quality team, hence we'd get the biggest payoff by
resolving their pain.
- The plugin has a different implementation on each platform, hence it has to be tested across platforms.
We considered several routes, and selected the one that had me
rummaging around the innards of Mozilla for a long time. The
routes were:
(1) Use javascript inside mozilla/firefox to find out the GUI
component tree. Send the GUI component tree to a Java
program. Use Robot from the Java program to send events at the
browser.
(2) Develop a C/C++ extension to mozilla that does the same as Robot, and write something like Jemmy in Javascript.
(3) Use the platform accessibility APIs to discover the GUI components, get them into a Java program, and use Robot to send events.
(4) Continue using the low level windowing system calls to get GUI
components, but this gives us limited information (especially on X11).
Route (4) has been very painful, mostly because of the limited information that
XQueryTree gives you on X11. When you automate an interaction
against some component, you need to know much more about the component
tree than the x,y,width,height rectangle (which is all XQueryTree gives
you).
Route (3) would be really cool, as it would work with any GUI
application. Unfortunately on the X11 side of the world,
accessibility is a very mixed story. The bright spot of X11
accessibility is that GNOME has an accessibility interface (developed
by some Sun engineers), and that the KDE folk have apparently agreed to
adopt the GNOME accessibility API. Since Windows and Mac OS X
already have good accessibility support, the flourishing of the GNOME
accessibility API will eventually mean that accessibility is available
"everywhere". But "eventually" wasn't good enough for us.
Route (2) would be a more complex way to hook into Mozilla's innards. So I chose route (1).
It took a lot of research, and along the way my eyes became opened to a
major flaw in the Mozilla project. Namely, DOCUMENTATION.
The source tree is millions upon millions of lines of code -- written
in around 8 different programming languages (C, C++, javascript, XUL,
XBL, HTML, CSS, IDL, ...?) -- there's somewhere around 0 internal
documentation -- and somewhere around 1/2 of mozilla.org is stale
out-of-date documentation.
All that goes together to make an extremely steep learning curve to
being productive in understanding Mozilla. Further, my project
was not the typical use of javascript or any of the APIs, and most of
the documentation I found had limited use even if it was
up-to-date. There were two resources of great use. First
was a Sun engineer in the Beijing office who answered dozens of emails
from me. The second was the netscape.*.mozilla.* newsgroups.
For years before I had been looking at the DOM Inspector tool, noticing
how it had screen x,y coordinates for everything in the Mozilla
browser, and desperately wanting to be able to use that. Well,
it's the DOM Inspector tool which provided me some of the
information. Unfortunately the DOM Inspector is also a very tough
piece of code to understand ... some of the crucial bits are buried in
the object model for the tree, and it's tough, even now, unraveling how
that object model works.
Would it be too much to ask for a few stinkin' comments????!?!?!?!
(2005-04-12 15:45:46.0)

Monday April 11, 2005
More on Jonathan S's GPL comments
Last week I posted about Jonathan S's strange comments on the GPL. Today he posted a clarification.
Basically he's got two things going on. The first is to show how
great the CDDL is, and the second is to continue slamming the
GPL. The things he says about the CDDL are very interesting, and
I like the ideas. I've only skimmed the CDDL so I won't say more
than it is rather readable.
But his posting continues to slam, needlessly, the GPL. And to top it off he offers this story as justification.
The story is that at the 2005 CeBIT conference, an Open Source
programmer was going to various companies to point out how they're
delivering Linux-based products, but not following the GPL licensing
requirements by disclosing their source.
Okay, this is very strange. First he puts up this strawman of
poor 3rd world countries wanting to develop software based on open
source, and then he proposes Motorola and 12 other companies as
concrete examples of the problem. Motorola???
Motorola is hardly an example of a poor 3rd world country. Heck,
Motorola probably has yearly sales exceeding the GDP of most 3rd world
countries. And clearly Motorola has enough smart lawyers to be
able to figure out the GPL and know how to comply with it.
Yet, apparently they did not. (I haven't checked further ...)
Let me offer a counterexample:
Hacking the Linksys NSLU2
Linksys is selling several devices that use an embedded Linux.
They are also complying with the GPL, by distributing source code and a
compiler toolchain. This allows end users to customize the
products by compiling and installing modules of their own choice.
The above product is a network attached file server box. It
lets you plug in a USB2.0 box containing a disk drive, and have the
disk appear on the network. A simple little box, easy to use,
etc. But, the geeks of the world saw that and wanted more.
e.g. out of the box it supports SMB and the Windows world, but what
about those of us who use other systems? Systems that prefer NFS
over SMB? Well, we'd be out of luck, but for these geeks who
worked out how to get into the box and compile NFS support. Now
getting the box to support NFS is as simple as downloading new firmware
off a website and installing it.
But wait, that's not all. The geeks also wanted more.
Some are using these as MP3 storage in a way that lets iTunes see the
files on the network attached disk. Others are using it for other
media devices. Others are running mail servers or web sites with
it.
It's the GPL that enables this flourishing of creativity and power to the people.
(2005-04-11 14:41:18.0)

Wednesday April 06, 2005
Jonathan S's nonsensical criticism of GPL
Sigh, it would be great to have a leader I could agree with more
often. He seems like such a smart guy, full of energy, but the
ideas that come out of his mouth so often seem strange.
Tuesday (April 5, 2005) he addressed the Open Source Business Conference
and slammed the GPL. The reasoning is very strange. There's
a provision requiring that people who use GPL'd code to create another
product must release that product under GPL as well. His claim is
this provision is an onerous burden imposing "a rather predatory obligation to disgorge all their IP back to the wealthiest nation in the world".
er....
Okay, first, nobody is requiring these people to use GPL'd code as the
base of their projects. NOBODY. They can use other code, or
write their own.
Second, the GPL is not owned by the United States. One of the
biggest GPL projects, Linux, was started by a fellow from
Finland. Thus, when someone follows the GPL and does share their
code with the world, it is the WORLD they are "disgorging" their code
to, not the United States.
I once heard a very smart person give a talk on open source
software. Who? Ken Arnold, a Sun employee, and one of the
Jini people. He advised using the license for your project that
fits with your purpose. If you don't want to use the GPL then
don't use that license. And given the nature of the GPL, that
means you must also eschew code based on the GPL. In other words,
the choice is up to the people running each project, and there's no
need to slam the GPL.
The GPL isn't for everyone, but it's obviously suiting the purposes of
a lot of people. And, you know what? That's a GOOD thing.
(2005-04-06 11:12:49.0)