Wednesday Sep 28, 2005

Invokedynamic

We’re looking to improve support for dynamically typed languages on the Java platform.

  Tangent: that’s dynamically typed languages - not the increasingly common, horrible misnomer dynamic languages (as opposed to static languages, where nothing moves, like hieroglyphics, perhaps?).

A lot of hype has been generated over the .Net VM’s support for multiple programming languages.

Another tangent: I always refer to the .Net VM and not the CLR. It’s a VM, such things have always been called VMs, and the only reason I can think of for giving it a different name is to confuse people. Confusing people might be useful if you wanted to convince them that you’ve invented something when you actually copied it from somebody else.

More generally, jargon has been used throughout the ages to intimidate and exclude. This is a tradition in every profession: don’t say parameterized type when you can say parametric polymorphism.

Introducing non-standard terminology is popular in industry, as it prevents people from easily comparing your product to others, so that they aren’t tempted to defect. It’s a common trick used by many large companies throughout the history of computing. For extra credit - what’s a long, complicated term for a B-tree?

The fact is, the .Net VM provides support for all manner of statically-typed, single-inheritance imperative object-oriented languages. It could potentially host C#, or Java, or Oberon or Ada or Modula-3 just fine. In other words, you can program in whatever you want, as long as it is basically C#.

Despite the hype, languages with fundamentally different models won’t work all that well. For example, if you don’t have static type information, you cannot use the VM’s highly tuned dynamic dispatch for method calls, and end up doing your own emulation in software. This is tiresome for the implementor, and more importantly, really slow.
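To make that concrete, here is a minimal sketch of the kind of emulation an implementor ends up writing on the JVM, using plain reflection. The class SoftwareDispatch and its send method are hypothetical names invented for this illustration, and a real language runtime would add caching and proper overload handling. The point is only that the lookup happens in library code rather than in the VM's own dispatch machinery.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical helper: emulates a dynamically typed method call in software.
final class SoftwareDispatch {

    // Finds a public method by name and argument count on the receiver's
    // runtime class and invokes it. Overloads with the same arity are not
    // distinguished; the first match wins.
    static Object send(Object receiver, String name, Object... args) {
        for (Method m : receiver.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(receiver, args);
                } catch (IllegalAccessException | InvocationTargetException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("no method " + name + "/" + args.length);
    }

    public static void main(String[] unused) {
        Object receiver = "invokedynamic";                  // static type is just Object
        System.out.println(send(receiver, "length"));       // resolves String.length()
        System.out.println(send(receiver, "concat", "!"));  // resolves String.concat(String)
    }
}
```

Every call walks the method table by hand and boxes its arguments into an array, which is the sort of overhead the paragraph above is complaining about.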

The JVM is not really that different. In fact, a study done by some folks at Aarhus University, part of the culture that invented object-orientation, found that there was no significant difference in difficulty between porting Beta to the JVM or the .Net VM.

So what can we really do to make dynamically typed languages easy to port, and port so they run well, to the JVM?

Last winter we had a meeting with various people who work on such languages - things like Groovy, Perl, Python/Jython. Our conclusion was that the most practicable thing was to support dynamically typed method invocation at the byte code level.

The new byte code, invokedynamic, is coming to a JSR near you very soon. I’m forming an expert group which I will chair (because of my deep fondness for standards, committees, meetings and process). This group will get to argue over various fine details of how this instruction should work.

Basically, it will be a lot like invokevirtual (if you don’t know what that is, either open a JVM spec and find out, or stop reading). The big difference is that the verifier won’t insist that the type of the target of the method invocation (the receiver, in Smalltalk speak) be known to support the method being invoked, or that the types of the arguments be known to match the signature of that method. Instead, these checks will be done dynamically.

There will probably be a mechanism for trapping failures (a bit like messageNotUnderstood in Smalltalk).
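What such a trap might look like from the implementor's side can be sketched in ordinary Java. Everything here is an assumption made for illustration: the names TrapSketch, Trap and send are invented, and the reflective lookup merely stands in for whatever the instruction will actually do. Nothing below has been decided by the expert group.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Hypothetical sketch of a failure trap for dynamically checked calls.
final class TrapSketch {

    // Invoked when no method on the receiver accepts the call.
    interface Trap {
        Object messageNotUnderstood(Object receiver, String name, Object[] args);
    }

    static Object send(Object receiver, String name, Trap trap, Object... args) {
        for (Method m : receiver.getClass().getMethods()) {
            if (!m.getName().equals(name) || m.getParameterCount() != args.length) {
                continue;
            }
            try {
                return m.invoke(receiver, args);
            } catch (IllegalArgumentException | IllegalAccessException e) {
                // The arguments did not fit this candidate; try the next one.
            } catch (InvocationTargetException e) {
                // The method was found and ran, but threw: not a lookup failure.
                throw new IllegalStateException(e.getCause());
            }
        }
        // Lookup failed: hand control to the language's own handler.
        return trap.messageNotUnderstood(receiver, name, args);
    }

    public static void main(String[] unused) {
        Trap trap = (r, n, a) -> r.getClass().getSimpleName() + " does not understand " + n;
        System.out.println(send("abc", "length", trap));
        System.out.println(send("abc", "frobnicate", trap));
    }
}
```

The handler receives the receiver, the selector and the arguments, so a language runtime could use it to synthesize a method, delegate, or raise its own flavor of error.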

Does this do everything everyone wants? No, but that is not the point. It isn’t really feasible to accommodate the exact needs of a wide variety of disparate languages. Instead, one should provide a good general purpose primitive, that all these languages can build on.

Some might like the new byte code to support multiple inheritance - but each language has its own multiple inheritance rules, and supporting all of them is hopeless. In most cases, the lookup process for multiple inheritance can benefit from this primitive.

Dynamically typed languages with clean single-inheritance semantics, like E, will be able to use this primitive directly for most calls.

Another problem is calling methods written in Java from a dynamically typed language. Static overloading makes this tedious. Languages may have varying mechanisms for dealing with this, often reminiscent of multiple dispatch. This is way too complicated to put into the VM, but the trap mechanism mentioned above should help implementors deal with that problem relatively efficiently.
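To see why overloading is tedious, consider what a language runtime has to do when it only learns the argument types at run time. The following is a rough sketch under stated assumptions: OverloadSketch, Printer, show and call are all made-up names, only reference-typed parameters are handled, and ambiguous overloads are resolved arbitrarily. It is a simplification, not the rules any particular language uses.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: choosing among Java overloads using runtime argument types.
final class OverloadSketch {

    // A stand-in for an ordinary Java class with statically overloaded methods.
    public static final class Printer {
        public String show(Object o)  { return "Object"; }
        public String show(String s)  { return "String"; }
        public String show(Integer i) { return "Integer"; }
    }

    // True if each argument is null or an instance of the matching parameter type.
    // Primitive parameter types never match here; this sketch ignores them.
    static boolean applicable(Method m, Object[] args) {
        Class<?>[] params = m.getParameterTypes();
        if (params.length != args.length) return false;
        for (int i = 0; i < params.length; i++) {
            if (args[i] != null && !params[i].isInstance(args[i])) return false;
        }
        return true;
    }

    // True if every parameter type of a is assignable to the corresponding one of b.
    static boolean atLeastAsSpecific(Method a, Method b) {
        Class<?>[] pa = a.getParameterTypes();
        Class<?>[] pb = b.getParameterTypes();
        for (int i = 0; i < pa.length; i++) {
            if (!pb[i].isAssignableFrom(pa[i])) return false;
        }
        return true;
    }

    static Object call(Object receiver, String name, Object... args) {
        Method best = null;
        for (Method m : receiver.getClass().getMethods()) {
            if (!m.getName().equals(name) || !applicable(m, args)) continue;
            if (best == null || atLeastAsSpecific(m, best)) best = m;
        }
        if (best == null) {
            throw new IllegalArgumentException("no applicable overload of " + name);
        }
        try {
            return best.invoke(receiver, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] unused) {
        Printer p = new Printer();
        System.out.println(call(p, "show", "text"));
        System.out.println(call(p, "show", 42));
        System.out.println(call(p, "show", 2.5));
    }
}
```

The selection is driven by the runtime classes of all the arguments, which is why it starts to resemble multiple dispatch.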

All of this should eventually make the use of dynamically typed languages on the Java platform easy, efficient and commonplace. That’s a good thing. I’ve long been a fan of such languages (well, not the popular ones; rather, languages like APL, Scheme, Smalltalk and Self; face it, I’m a snob and proud of it).

Why does this matter? I’m convinced these languages have a growing role to play in the practice of computing in the coming years. The extra flexibility of dynamic typing will become more and more important as software evolves.

This is not to say that static type checking is to be avoided. As I indicated in my prior posting about pluggable types, the static-vs-dynamic typechecking wars are pointless; one can eat one’s cake and have (most of) it too. Invokedynamic is a modest, pragmatic yet very important step that helps the JVM become a hospitable environment for such cake-eating and having.

Saturday Sep 10, 2005

Pluggable Types

A few ideas on type systems have been proposed by people who commented on my last entry.

The way to handle these issues is through the notion of pluggable types. Briefly, the idea is that the language is dynamically typed, and various type systems/static analyses can be added as plug-ins.

I wrote a brief position paper on this for a workshop last year. My website contains a presentation on the topic as well.

I'll go over the basics here. Most people are familiar with two approaches to types in programming:

  • Dynamically typed languages, like Lisp, APL, Smalltalk and (much more popular these days) scripting languages like Perl, Python, Ruby (which is basically Smalltalk with a Perl-style syntax) and JavaScript.
  • Statically typed languages, like Java, C, C++, C#, Fortran etc. Note that statically typed doesn't imply that the type system is sound, or gives any guarantees of any sort. What it really means is that there is a mandatory type system. Your program isn't legal unless it passes the type checker (however broken that type checker may be).

There are endless religious arguments over the merits of one approach or the other. These debates are often pointless, because the split between the mandatory and dynamic type religions is a false dichotomy.

An alternative is to view typechecking as an optional tool, like lint. Now, I define an optional type system very strictly. There are two requirements:

1. The dynamic semantics must not depend on the type system.

2. Type annotations are syntactically optional.

The first requirement is the really important one. The second requirement is obvious to many people, but in fact it's not that significant. People often get hung up on things like type inference to address (2), when in fact that is exactly the wrong thing to focus on.

A few optional type systems have been built, but fewer than you might think. The definition above excludes quite a few efforts. I built such a system for Smalltalk. Phil Wadler did some work on Erlang.

If your language doesn't depend on the type system, you can in principle have multiple type systems that can check different properties; you can evolve these systems independently, as tools. The type systems can be viewed as plug-ins, hence the notion of pluggable types.
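A toy illustration of the two requirements and the plug-in idea, written in Java purely for convenience. All of it is hypothetical: the "language" is just a list of name/value bindings, and PluggableTypesSketch, Binding, TypePlugin and the two checkers are names invented for this sketch rather than anything taken from the position paper.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of pluggable types over a toy dynamically typed program.
final class PluggableTypesSketch {

    // One binding; the annotation may be null (requirement 2: annotations are optional).
    static final class Binding {
        final String name;
        final String annotation;
        final Object value;
        Binding(String name, String annotation, Object value) {
            this.name = name;
            this.annotation = annotation;
            this.value = value;
        }
    }

    // A type system is a plug-in: it reads the program and reports warnings.
    interface TypePlugin {
        List<String> check(List<Binding> program);
    }

    // Requirement 1: evaluation never consults annotations or plug-ins.
    static Map<String, Object> run(List<Binding> program) {
        Map<String, Object> env = new LinkedHashMap<>();
        for (Binding b : program) env.put(b.name, b.value);
        return env;
    }

    // Plug-in A: compares each annotation with the simple class name of the value.
    static final TypePlugin ANNOTATIONS = program -> {
        List<String> warnings = new ArrayList<>();
        for (Binding b : program) {
            if (b.annotation != null && b.value != null
                    && !b.value.getClass().getSimpleName().equals(b.annotation)) {
                warnings.add(b.name + ": expected " + b.annotation);
            }
        }
        return warnings;
    };

    // Plug-in B: an unrelated analysis that flags null bindings.
    static final TypePlugin NON_NULL = program -> {
        List<String> warnings = new ArrayList<>();
        for (Binding b : program) {
            if (b.value == null) warnings.add(b.name + ": is null");
        }
        return warnings;
    };

    public static void main(String[] unused) {
        List<Binding> program = Arrays.asList(
                new Binding("x", "Integer", 42),
                new Binding("y", "Integer", "oops"),
                new Binding("z", null, null));
        for (TypePlugin plugin : Arrays.asList(ANNOTATIONS, NON_NULL)) {
            System.out.println(plugin.check(program));
        }
        System.out.println(run(program));  // same result whichever plug-ins ran
    }
}
```

The two checkers know nothing about each other, and removing either one changes what gets reported but never what the program computes.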

At this point, all the good or bad type checking ideas anyone cares to come up with are up to them to implement. People would not need to appeal to the keepers of the language to consider their favorite idea.

How to actually do pluggable types in a clean way is still subject to some research. I'm sure it can be done with good IDE support.

Read the position paper if you're interested.

Friday Sep 02, 2005

The madness begins

Welcome to my weblog, said the spider to the fly ...

I have a long list of issues that I might expound upon here, on the egotistical and satisfying assumption that you, gentle reader, have nothing better to do with your time than read this (or, more realistically, that you find that reading blogs beats working).

Expect postings on closures, tuples, dynamic language support, generics, verification etc. However, I might as well start off with the sexiest issue on my list.

People that know me are aware that I don’t regard the Java programming language as the be-all and end-all of programming languages.

In fact, several people at Sun, past and present, would love to spend their time on a new language, free from the shackles of compatibility. So while much of this blog will be about the existing language (and VM), why it is the way it is, and ideas for where it is going, some of it will be about some semi-mythical future language. Graham Hamilton already gave it a name: Kenya.

We would like to hear ideas for the next-big-thing in programming languages. Maybe, just maybe, the company will even start giving grants to academics to study these ideas. I, for one, welcome brutal critiques of mainstream programming languages. Brutal critiques that make sense are even better.

Now, don’t expect me to actually respond to any comments on this blog. I almost certainly won’t; I may be too busy, or I may think you’re too stupid, crazy, evil or all of the above for me to spend time on your rantings. Nevertheless, do try to be articulate, polite (yes, brutal and polite - no contradiction there) and above all, intelligent. I don’t tolerate fools gladly, as that only creates a feedback loop that breeds more fools.