RoboGeek

RoboGeek's (David Herron) Weblog: co-developer of Robot and several other things related to Java testing.


Thursday July 22, 2004

Mozilla Internals are MESSY

I don't think this will come as a surprise to those who work with Mozilla itself ... unfortunately it was a surprise to me:

Mozilla Internals are MESSY

I use Mozilla and Firefox all the time, even on my Mac, where Safari is freely available and a pretty decent web browser. However, as a user I never had to dig under the covers.

The last 4 months I've been digging deep under the covers of the Mozilla source code. The purpose is to create an automation tool for testing the Java plugin in Mozilla. The design we had in mind meant understanding deep internals of how Mozilla keeps track of the GUI components, so that we could know where they are, and then use the Java AWT Robot class to interact with the components.
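To make the Robot half of that design concrete, here is a minimal sketch, assuming the on-screen bounds of a component are already known. The class name RobotClicker and the sample rectangle are made up for illustration; only the java.awt.Robot calls are real API. Getting the bounds out of Mozilla is the hard part and is not shown.

```java
import java.awt.AWTException;
import java.awt.Point;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.event.InputEvent;

final class RobotClicker {
    /** Centre of a component's on-screen bounds: the point the Robot should click. */
    static Point centerOf(Rectangle screenBounds) {
        return new Point(screenBounds.x + screenBounds.width / 2,
                         screenBounds.y + screenBounds.height / 2);
    }

    /** Moves the real mouse pointer to the centre of the bounds and left-clicks. */
    static void click(Robot robot, Rectangle screenBounds) {
        Point p = centerOf(screenBounds);
        robot.mouseMove(p.x, p.y);
        robot.mousePress(InputEvent.BUTTON1_DOWN_MASK);
        robot.mouseRelease(InputEvent.BUTTON1_DOWN_MASK);
    }

    public static void main(String[] args) throws AWTException {
        // Hypothetical bounds; in the real tool these would have to come from Mozilla.
        Rectangle someButton = new Rectangle(100, 80, 24, 24);
        click(new Robot(), someButton); // needs a real display, fails when headless
    }
}
```

The Robot only knows about screen coordinates, which is why the tool needs the component positions in the first place.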

In most GUI toolkits there's a fairly simple way to work with the component tree. In Java the Container class provides a getComponents method that lets you easily traverse through all the Components instantiated in the application. In WIN32 you use EnumWindows and EnumChildWindows. In Motif/Xt there's a "get children" method of some kind, whose name has faded in the 10 years since I've programmed with Motif. So, I thought, there ought to be something simple like that in Mozilla.
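For comparison, this is roughly what the Java side of that looks like: a small sketch that walks a component tree with Container.getComponents(), the AWT call that returns a container's children. The class name ComponentTreeWalker and the sample hierarchy (window, toolbar, and so on) are invented for the example.

```java
import java.awt.Component;
import java.awt.Container;
import java.util.ArrayList;
import java.util.List;

final class ComponentTreeWalker {
    /** Depth-first, pre-order list of component names under root (root included). */
    static List<String> collectNames(Component root) {
        List<String> names = new ArrayList<>();
        walk(root, names);
        return names;
    }

    private static void walk(Component c, List<String> names) {
        names.add(c.getName());
        if (c instanceof Container) {
            // getComponents() returns the direct children of a container.
            for (Component child : ((Container) c).getComponents()) {
                walk(child, names);
            }
        }
    }

    /** Builds a tiny made-up hierarchy so the walker has something to traverse. */
    static Container sampleTree() {
        Container window = named("window");
        Container toolbar = named("toolbar");
        toolbar.add(named("back"));
        toolbar.add(named("reload"));
        window.add(toolbar);
        window.add(named("content"));
        return window;
    }

    private static Container named(String name) {
        Container c = new Container();
        c.setName(name);
        return c;
    }

    public static void main(String[] args) {
        for (String name : collectNames(sampleTree())) {
            System.out.println(name);
        }
    }
}
```

A dozen lines of recursion and you have every component in the application. That is the kind of thing I was hoping to find in Mozilla.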

Nope. Instead, and this took some digging to find out, the components are stored as an RDF data structure. Why RDF? HeckifIknow, maybe someone thought it would be cool. This is especially odd as the description of the Mozilla UI is done with XUL, but it has to be translated into RDF before the GUI toolkit can work with it. Uh? Why? HeckifIknow. That's just the way it is. Fortunately I eventually found a few methods that would let me traverse through the component tree, but it was bassackwardsly hard to discover this, and even then I cannot get all the components.

But, to find this all out, well, that meant becoming conversant with the Mozilla source code.

See, the problem is that the Mozilla source tree is millions of lines of code, written in several languages (at least C, C++, JavaScript, CSS, XUL, XBL, and DTD) and all of it poorly documented. There's some documentation on http://mozilla.org/, but it is all in varying states of being out of date. Plus it is rather scanty considering the size of the source tree. Next, the internal documentation is itself rather sketchy.

One of the core concepts of Mozilla is XPCOM, the Cross Platform Component Object Model. It's pretty cool, sorta, and sorta derived from Microsoft's COM model. You have a language-independent interface definition language (XPIDL) derived from CORBA's IDL. Any XPCOM object implements programming interfaces defined in IDL. XPCOM objects can be implemented in several of the programming languages mentioned above. XPCOM has a concept of "interfaces" similar to Java, where a specific IDL file defines one XPCOM interface, any object can implement several of these interfaces, etc.

While "cool" it's also a nightmare. This is because JavaScript doesn't have those interface concepts integrated with the language. In Java you have "reflection" built right in and you can easily query to determine what an object does, but in Mozilla's JavaScript this is tough. Instead, when you call an XPCOM method from JavaScript you get back an unknown object, with very little way of determining what that object can or cannot do. You're left ... get this ... with having to look into the Mozilla source code, discovering the actual code you're calling, looking at the actual object it's putting together for you, and that way learning which interfaces the object might actually implement. And, since the source code is so huge and cumbersome, and undocumented, and convoluted, and obtuse, this is very hard.
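To show what I mean by reflection being built right in, here is a short sketch that asks an arbitrary Java object which interfaces it implements, using only Class.getInterfaces() and Class.getSuperclass(). The class name InterfaceInspector is made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.Set;

final class InterfaceInspector {
    /** Names of all interfaces implemented by obj's class, including inherited ones. */
    static Set<String> interfacesOf(Object obj) {
        Set<String> names = new LinkedHashSet<>();
        // Walk up the superclass chain, collecting interfaces declared at each level.
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            for (Class<?> iface : c.getInterfaces()) {
                collect(iface, names);
            }
        }
        return names;
    }

    /** Adds an interface and, recursively, the interfaces it extends. */
    private static void collect(Class<?> iface, Set<String> names) {
        if (names.add(iface.getName())) {
            for (Class<?> parent : iface.getInterfaces()) {
                collect(parent, names);
            }
        }
    }

    public static void main(String[] args) {
        // Prints the interface names, one per line, for an ArrayList.
        for (String name : interfacesOf(new ArrayList<String>())) {
            System.out.println(name);
        }
    }
}
```

No source diving required: the object itself tells you what it can do. That is exactly the question I could not easily ask of the objects coming back from XPCOM calls.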

The seamonkey web site helps a bit - this lets you run queries over the Mozilla source tree and find related source files. It doesn't work as well as it could, as it's rather klunky around getting to the file you want, and cross-referencing between files. It also only lets you search the HEAD of the Mozilla source tree, but what if you want to search relative to some release in the past?

What does this all mean? It means that there's a hugely steep learning curve before you can become effective in working with Mozilla.

Looking back on it, I'm not at all sure I'd want to keep working with Mozilla. My project is going to be continuing for some months longer, however. Still, I'm reluctant to do much with it because the source is such a pain to work with. Give me the niceties of Java any day over this. Oh, and I'm not at all surprised, now, after having worked with Mozilla, that Apple's Safari developers chose to use the KDE-based web browser rather than the Mozilla web browser when they created Safari. I'm sure anything has got to look like a dream compared to Mozilla.

(2004-07-22 17:12:38.0)

Comments:

I'm not sure where you got that impression, but the UI components aren't stored in RDF. Instead, they are stored in a DOM tree and you can, and should, manipulate them using the DOM methods, which are quite well documented in various places. In the DOM, the property you seek is childNodes.
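As a rough illustration of that kind of traversal, here is a sketch using Java's standard org.w3c.dom binding (where childNodes is exposed as getChildNodes()) rather than Mozilla's JavaScript one. The class name DomTreeWalker and the XUL-like sample markup are invented for the example and are not taken from Mozilla.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class DomTreeWalker {
    /** Element names under node in document order (depth-first, pre-order). */
    static List<String> elementNames(Node node) {
        List<String> names = new ArrayList<>();
        walk(node, names);
        return names;
    }

    private static void walk(Node node, List<String> names) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            names.add(node.getNodeName());
        }
        // getChildNodes() is the Java spelling of the DOM childNodes property.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), names);
        }
    }

    /** Parses an XML string into a DOM Document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up, XUL-like markup purely for demonstration.
        String xml = "<window><toolbar><button/><button/></toolbar><browser/></window>";
        System.out.println(elementNames(parse(xml)));
    }
}
```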

Posted by Neil Deakin on July 23, 2004 at 08:28 AM PDT #

Neil,

Your site was one of the few bright spots that helped me through the hurdles of learning my way around Mozilla.

I didn't mean to claim that the UI components are "stored" as RDF. However, the documentation at mozilla.org says that at runtime the UI component tree is stored as an RDF data structure, for example in this document:

http://www.mozilla.org/xpfe/xulrdf.htm

Now, the date at the head of that document is 5 years old and maybe something has changed since then. If it has changed since then, that's part of the problem, isn't it? That the mozilla.org web site has a bunch of out-of-date articles leaving stale misinformation behind.

You claim that the UI component tree is well documented and stored in a DOM tree. I agree that's mostly correct -- but, for example, how do you obtain a list of top-level windows? By using the nsIWindowMediator object. But the objects you retrieve with the enumerator (getEnumerator method) are undocumented, so you have to grunt around in the source code to see what is being returned. That turns out to be an nsIDOMWindowInternal, and there are several important pieces of information that can only be retrieved through that interface.

Posted by David Herron on July 30, 2004 at 01:23 PM PDT #

Mozilla & Safari are Part of the Problem; Speaker Panel to Address the Issue

It's great to hear someone expose the unfortunate reality of Mozilla's internal code. I'm only familiar with it as a user of it, but that's been quite enough for me. One of the greatest ironies in business today seems only to be understood by those who have spent a significant amount of time trying to develop rich-GUI Web applications according to W3C (World Wide Web Consortium) standards. Ironically, Microsoft has done a far superior job of reliably implementing the core W3C standards than Mozilla, since about 1997. I have been working in this area for five years.

I believe that we sorely need competition in the Web browser market. But it must be finally said that the release of faulty Web browsers such as Mozilla and Safari as "ready for prime time" is damaging to the Web. I have spent a total of 23 months doing heavy client-side JavaScript almost all the time, including 15 in a team of 8 front-end developers. Additionally, I've maintained active involvement in this type of software development since 1998.

The sad fact that seems so hard for most supporters of "choice on the Web" and open-standards to face is that everyone I know who has had a similar experience to mine in working with client-side JavaScript/DOM finds Mozilla and especially Safari absolutely maddening. These Web browsers should be released as "experimental-only" until their major bugs are fixed. I've filed several of them with Mozilla's BugZilla, but to the best of my knowledge none have been permanently fixed.

To illustrate, one of my favorite bugs is how Mozilla sometimes "doubles" all the HTML written to deeply nested markup containers (such as 'div'). To better bring the important implications of these issues to light, I am holding a Carnegie-Mellon-West Speaker Panel on the topic. I'm looking for more panelists who can speak from experience (business or technical) to this "controversial" issue. Please contact me at cbalz@andrew.cmu.edu if you have suggestions for panelists. So far, I have some very good speakers signed up for the panel. Please find the description of the panel below.

Announcement about a Speaker Panel on Current Web Browsers

As we continue to see rich-GUI Web software slide into proprietary formats, and the pressure on Microsoft to play nicely from legal battles eases, the time is right to address the future of the Web. I would like to invite you to come to a speaker panel (http://west.cmu.edu/specialPrograms/speakers/) at Carnegie-Mellon University's West Coast Campus, at Moffett Field (45 minutes south of San Francisco), entitled "Back to Proprietary Client-Server, or Web Renaissance?", on November 10th, 2004, at 6:30pm.

Details:

In this panel, I would like to have the panelists speak to the following questions:
  • What role did open standards (specifically, HTTP and HTML) play in the initial adoption (from 1994 onward) of the Web and its development into a giant new business market and more?
  • How did these open standards come to take hold? What were the major obstacles? Was the driving force to adoption a mix of technological evangelism and market forces?
  • Compare and contrast HTML to DHTML (Dynamic HTML) with the JavaScript binding.
  • Give your perspective on client-side Web software development, and its importance today on the Web.
    • What is the importance of Dynamic HTML and the binding to JavaScript on the Web today?
    • Do you view the application of OO techniques to JavaScript, particularly simulation of Java-like class-based inheritance in JavaScript, as helpful in deploying Web software?
  • Is JavaScript a more realistic choice on the Web than Java for the client-side of consumer Web applications, due to JavaScript's current near-ubiquity, lack of need for installation, fast start-up time, and security (especially, its lack of a file API)?
  • Give your view of the pros and cons of server-centric Web development (i.e., JSP, ASP) as opposed to a Web based on the distributed application or client-server concept.
    • What are the pros and cons to the consumer Web user, who needs high usability, interactivity, and speed, on dial-up and on high-speed?
    • What are the pros and cons to the server center, from the perspectives of IT (i.e., scalability), engineering, and security?
  • Why is it that the "other" Web browsers -- the non-Microsoft Web browsers -- do not support the W3C standards correctly enough to support building "next generation" Web applications based on public Web standards?
  • What is the impact of this situation, as we see rich-GUI Web applications migrate to proprietary formats such as Flash and IE-specific extensions?
    • What are the security implications of a Web highly fragmented among Flash browsers, Microsoft Longhorn Client browsers, Web-standard browsers, and Internet Explorer-specific Web sites?
      • Would this amount to an unmanageable blizzard of security patches for consumers?
  • What can be done to improve the support of Web browsers for rich-GUI Web applications built solely on W3C standards?
    • Would an industry consortium be an appropriate vehicle for this task?
    • Can the market alone, in its current state, take care of this situation?
    • Is technological evangelism needed?
  • Imagining a supportive climate for distributed Web applications built on Web standards, what market spaces would this create?
    • What would be the role of current vendors of traditional Web application frameworks, such as BEA, IBM, and Sun/Netscape?

Posted by Christopher M. Balz on September 30, 2004 at 11:23 AM PDT #
