Thursday May 03, 2007
People talk about the "conceptual weight" of language features. Here's a way to make that a more precise concept, and more accurate too.
The JLS is structured bottom-up: universal elements like grammar, types, values and names; then Java artefacts both major (packages, classes, interfaces) and minor (arrays, exceptions), then back to universal concepts like statements and expressions. (The chapters on execution and binary compatibility are in the wrong place; they should be at the end with assignment analysis and the memory model.)
We can exploit this structure when adding a new language feature. By asking which JLS chapters would be affected by the feature, we can gain an idea of the feature's semantic impact, and thus its complexity. For example, adding a new statement form is done near the end, so it's a minor addition. Adding a new keyword is done up front, implying there are many artefacts the new syntax could interact with - such as the names of all your variables.
So if a new language feature L needs changes in a set of chapters S, the complexity of L is:
complexity(L) = (|S| / min(S)) × degree(S)
where degree(S) is the number of cross-references between chapters necessary to fully describe L. An approximation of this factor is P(|S|,2), the number of ordered pairs of chapters in S - note a permutation not a combination, because a forward reference and a backward reference each add complexity.
S should never include 1, the introduction, or 18, which is really an appendix. (And a syntactic grammar is already introduced in chapters 3-15.) You'd have to watch out for a couple of things: some chapters (notably 6) trivially recap or preview other chapters, and example code should not generate vacuous cross-references.
So, consider these JDK 1.5 changes:
- Hexadecimal floating-point literals: S={3}, degree(S)=1. Complexity: 0.33.
- Enumerations: S={3,8,9,13,14,15,16}, degree(S)=a. Complexity: 2.33a.
- Generics: S={4,5,8,9,10,13,15}, degree(S)=b. Complexity: 1.75b.
Since b is probably higher than a, 1.75b will be higher than 2.33a, reflecting the sense that the complexity of generics is higher than the complexity of enums. However, to really capture how much higher, you'd want to improve the granularity of S's elements from chapters to sections or subsections, so that |S| shoots up for generics. Also you'd want to force the measure's range into a reasonable value set by means of constant factors. (Different factors would be needed for different granularities of S.) Finally, since I'm claiming that a total order exists for the chapters, the breadth of the changes - i.e. max(S)-min(S) - could be used in the formula.
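The three figures above (0.33, 2.33a and 1.75b) are all consistent with complexity = |S| / min(S) × degree(S). Here is a minimal Java sketch under that assumption; the class and method names are made up for illustration, and the degree is passed in by hand because the posts above leave a and b unspecified.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class FeatureComplexity {

    // Assumed formula, inferred from the worked examples:
    // complexity = |S| / min(S) * degree(S)
    static double complexity(Set<Integer> chapters, double degree) {
        if (chapters.isEmpty()) {
            throw new IllegalArgumentException("S must contain at least one chapter");
        }
        return (double) chapters.size() / Collections.min(chapters) * degree;
    }

    // The suggested approximation of degree(S): P(|S|, 2), i.e. the number
    // of ordered pairs of distinct chapters.
    static int approximateDegree(Set<Integer> chapters) {
        int n = chapters.size();
        return n * (n - 1);
    }

    static Set<Integer> chapters(Integer... numbers) {
        return new TreeSet<>(Arrays.asList(numbers));
    }

    public static void main(String[] args) {
        Set<Integer> hexFloats = chapters(3);
        Set<Integer> enums = chapters(3, 8, 9, 13, 14, 15, 16);
        Set<Integer> generics = chapters(4, 5, 8, 9, 10, 13, 15);

        // With degree fixed at 1, these are the coefficients quoted in the list.
        System.out.println(String.format(Locale.ROOT, "%.2f", complexity(hexFloats, 1))); // 0.33
        System.out.println(String.format(Locale.ROOT, "%.2f", complexity(enums, 1)));     // 2.33
        System.out.println(String.format(Locale.ROOT, "%.2f", complexity(generics, 1)));  // 1.75
    }
}
```

Dividing by min(S) is what rewards changes that start late in the JLS: enums and generics touch seven chapters each, but enums reach back to chapter 3 while generics start at chapter 4.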
I don't use this measure in real life but it is fairly plausible. Comparing an abstract view of a single feature against the abstract view of the whole language might serve as a proxy for the effort needed to implement and test the feature in a compiler.
Monday Mar 19, 2007
Given version 1.0 of an interface, adding a method to produce version 1.1 is:
Migration compatibility is the least understood and the most interesting. It concerns how a type is used, and is discussed in JLS 4.7 in the context of generics. It essentially means that:
Tuesday Feb 06, 2007
With all the blogging about Java 7 language features, I thought I'd point out that many ideas are already represented by proposals in the Sun Developer Network database. The comments about each proposal go back years - to a time before blogging when people left their thoughts on a sun.com site.
Why have these proposals been hanging around for so long? Mostly because the process of evolving the language can only handle a relatively small number of features per release. There are hundreds and hundreds of possible features in the database. The tough part isn't designing any single one of them (and I would encourage you not to read too much into the exact designs contained at the links below) but choosing which ones to design. We have to Do The Right Thing as well as Do The Thing Right.
Suppose there are 200 proposals and we can implement 10 per major JDK release. Now calculate C(200,10) and you'll see that every Java developer on the planet can have their own favourite combination. Which combination should make it into the JLS? (Note I say the JLS rather than javac; people can play with javac to their hearts' content, but the Java™ Programming Language can only take so much.)
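If you'd rather not work out C(200,10) by hand, here is a small sketch that does it with BigInteger, since the result does not fit comfortably in everyday integer arithmetic. The class name ProposalCombinations is just for illustration.

```java
import java.math.BigInteger;

public class ProposalCombinations {

    // C(n, k): the number of ways to choose k proposals out of n,
    // ignoring order.
    static BigInteger binomial(int n, int k) {
        if (k < 0 || k > n) {
            return BigInteger.ZERO;
        }
        k = Math.min(k, n - k); // C(n, k) == C(n, n - k); use the shorter loop
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After this step result == C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("C(200,10) = " + binomial(200, 10));
    }
}
```

The result is in the tens of quadrillions, comfortably more than one combination per developer.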
So, while the proposals below may be excellent in and of themselves - and it does seem like some will be getting a new lease of life - please realize that in the past, there were more-excellent features which you don't see below because they made it into Java 1.3, 1.4 and 1.5! Now, without further ado:
Friday Feb 02, 2007
In recent years, many Java language features have been developed under JSRs. Notable examples are generics (JSR14), assertions (41), annotations (175, 308), and enums, autoboxing, foreach, varargs and static import (201). Language JSRs are folded into the core platform and their results get incorporated into the JLS.
As people adopt a new Java SE release, they explore these features for the first time and file Requests For Enhancement about them. RFEs in the scope of JSR201 are especially common. Unfortunately, it's really hard to approve any language request that once fell within the charter of a language JSR. A JSR Expert Group spends years considering every aspect of a feature, so if they design something a particular way, that's what the JLS will say. Determining an Expert Group's reasoning years after the event can be hard. However, I do try to discern it and include it in the Evaluation of any RFE that concerns a JSR-derived language feature.
But by default, and especially if history just isn't available, it's not appropriate to overturn or extend the scope of such a feature, no matter how reasonable the request. Only in an exceptional case will the JLS change in a non-trivial way.
Wednesday Jan 10, 2007
Hans writes an excellent post on the use of bound properties, and how a simple property keyword that simulates getX/setX methods wouldn't buy him much.
Properties also have a difficult interaction with access control. Declaring a property to be publicly readable but only package/protected/privately writeable would either be impossible or need some hacky syntax for the read vs. write access level. This is not an improvement over getX/setX.
Now, since no-one is talking about VM support, properties would be implemented through translation to methods and fields. Obviously this makes them less amenable to reflection, but my main concern is this. We have an increasing list of language constructs implemented through translation: instance initializers, bridge methods, inner classes (and the calling convention for their constructors), enums. Specifying such translations in the JLS is rare because they are implementation details. (Notable exceptions: 15.9.3 implies the calling convention for anonymous classes and 8.9 has some info about enums.) We don't want to restrict the classes emitted by compilers except when it's essential for source and binary compatibility. (The binary representation of a class in 13.1 is rather loosely specified for this reason.) Clearly, a cross-compiler convention for representing properties would be necessary in the JLS, so no-one would ever be able to implement properties in a more lightweight fashion.
I must admit I do like the increased safety available in Stephen Colebourne's property proposal, though maybe you could get that with method references: (borrowing from the Javapolis whiteboards)
binder.bind(user, User.getFirstName.method);
binder.onChange(user, User.setFirstName.method, closure);
But overall, like Peter von der Ahé, I am moving away from properties.
Wednesday Nov 08, 2006
I want to talk about the enhanced for ('foreach') loop, an immensely popular construct. Many people are surprised to find that it only accepts Iterable expressions (and arrays, which I'll ignore). Why not also Iterators?
The JSR201 Expert Group considered this issue at length. foreach is syntactic sugar; the compiler generates a call to iterator(), a loop variable and a basic for loop in its place. The primary reason against passing an Iterator to foreach is that the user-provided body could modify it during iteration, and so break the compiler's assumptions about its generated code:
Iterator i = myList.iterator();
for (Object o : i) { /* Maybe I'll just call i.remove() here, it'll be fun */ }
By requiring Iterables, JSR201 essentially placed safety above raw functionality. But even with Iterables, user code can still interfere with the compiler's code. A Collection passed to foreach can be modified concurrently and most Collection implementations aren't synchronized internally, so the compiler-generated iterator could break.
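To make the translation concrete, here is a rough sketch of what foreach over an Iterable amounts to, written out by hand. The class and method names are invented for illustration, and the iterator variable is named `it` only so that the code compiles; in the real translation the compiler's iterator has no name the body can refer to. The last method shows a single-threaded version of the interference described above: the body modifies an ArrayList that the hidden iterator is still walking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class ForEachDesugaring {

    // The enhanced for loop as the programmer writes it.
    static String viaForEach(Iterable<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String s : items) {
            sb.append(s);
        }
        return sb.toString();
    }

    // Roughly the basic for loop generated in its place.
    static String viaExplicitIterator(Iterable<String> items) {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> it = items.iterator(); it.hasNext(); ) {
            String s = it.next();
            sb.append(s);
        }
        return sb.toString();
    }

    // The body changes the list behind the iterator's back. ArrayList's
    // fail-fast iterator detects this on the following call to next().
    static boolean bodyInterferes() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                list.remove(s);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("x", "y", "z");
        System.out.println(viaForEach(list));          // xyz
        System.out.println(viaExplicitIterator(list)); // xyz
        System.out.println(bodyInterferes());          // true
    }
}
```

ArrayList's fail-fast check is documented as best-effort, so treat the exception as a debugging aid and not as a guarantee.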
So, given that interference is possible with Iterables, and given that using Iterators would be very convenient, maybe we should dial down the safety a little to add some functionality. I'm not proposing any changes now, but I am keeping an open mind on Iterators. I control for the fact that people who want Iterator support shout the loudest.
(The argument against interference from user code is also why the loop variable isn't visible. This decision is very sensible.)
Greetings one and all. You're probably here because Gilad kindly pointed the way. His contributions to Sun span two millennia and have been more immense than most people know; the Java Language and VM Specifications set the bar for modern platform documentation. It is an honour to follow in his footsteps and I wish him the best of luck with his new and dynamic endeavours.
I plan to blog about interpretation of the JLS and JVMS; design issues on JSRs that I'm involved with; and proposals for language features both old and new. I also hope to bring the JLS and JVMS into the blogging age by publishing clarifications and corrections as tagged entries.
So with Java SE 6 emerging and the lifecycle for SE 7 starting - and with the debate about Java's place in the world reaching fever pitch - it will surely be the most interesting of times.