Wednesday May 27, 2009
Joseph D. Darcy's Sun Weblog

Project Coin: For further consideration, round 2

The first group of proposals selected for further consideration were:
After due deliberation, the next set of proposals meeting the Project Coin criteria for further consideration are:
All the selected proposals were reviewed and judged to have favorable effort to reward ratios and to preserve the essential character of the language. Work should continue refining the selected proposals and producing prototypes. In particular, a unified proposal for integer literals should be produced.

Language change proposals not on the combined "for further consideration" list will not be included in JDK 7; there is no need for continued discussion about them on the Project Coin mailing list. Detailed rationales for why particular proposals were not selected will not be provided. Final selection of the five or so proposals to be included in the platform will occur within the next few months.

(2009-05-27 20:43:57.0) Permalink Comments [9]

Project Coin: The Call for Proposals Phase is Over!

Update: Added links to proposal for switch for all types and simple expressions and a link to a revised method chaining proposal.

Project Coin's call for proposals phase is now over! Thirty four days long, the proposal period included nearly 70 proposals being sent to the mailing list, 19 coming in over the last two days, and over 1100 messages on the list discussing those proposals and related topics. With the flurry of pre-deadline activity over, the more deliberative task of finishing reviewing and evaluating the proposals awaits. Including several sent in a few hours after deadline, the proposals received since week four are:
The figure below graphs when proposals were received; nothing like an impending deadline to focus the mind!
Sometime after the next "for further consideration" cut is made, I'll post some thoughts on and reaction to the call for proposals phase as a whole. This will not include a detailed analysis of why each proposal was or was not chosen; however, there will be discussion of common aspects of proposals that led them to be selected or not.

(2009-03-31 17:26:42.0) Permalink Comments [2]

Update: Corrected to include Stephen Colebourne's enhanced enhanced for loop proposal. Further update: Added links to updated large arrays and compile time access proposals.

Project Coin's fourth week saw continued lively traffic on the mailing list. As the submission deadline approaches, a flurry of new proposals was sent in:
The field of over two dozen proposals previously sent in over the first three weeks of Project Coin was narrowed to six proposals still in consideration for inclusion in JDK 7. The proposals submitted this week and until the end of the call for proposals period will be similarly evaluated for their appropriateness to be added to the language. Finally, the combined list of candidate changes will be produced. (2009-03-27 17:36:11.0) Permalink Comments [6]
For those interested in specifically following Project Coin related posts, my Project Coin entries are tagged with "projectcoin": Project Coin: For further consideration... In the first three weeks of Project Coin over two dozen proposals have been sent to the mailing list for evaluation. The proposals have ranged the gamut from new kinds of expressions, to new statements forms, to improved generics support. Thanks to all Java community members who have sent in interesting, thoughtful proposals and contributed to informative discussions on the list! While there is a bit less than a week left in the call for proposals period, there has been enough discussion on the list to winnow the slate of proposals sent in so far to those that merit further consideration for possible inclusion in the platform. First, Bruce Chapman's proposal to extend the scope of imports to include package annotations will be implemented under JLS maintenance so further action is unnecessary on this matter as part of Project Coin. Second, since the JSR 294 expert group is discussing adding a module level of accessibility to the language, the decision of whether or not to include Adrian Kuhn's proposal of letting "package" explicitly name the default accessibility level will be deferred to that body. Working with Alex, I reviewed the remaining proposals. Sun believes that the following proposals are small enough, have favorable estimated reward to effort ratios, and advance the stated criteria of making things programmers do everyday easier or supporting platform changes in JDK 7:
As this is just an initial cut and the proposals are not yet in a form suitable for direct inclusion in the JLS, work should continue to refine these proposed specifications and preferably also to produce prototype implementations to allow a more thorough evaluation of the utility and scope of the changes. The email list should focus on improving the selected proposals and on getting any remaining new proposals submitted; continued discussion of the other proposals is discouraged. The final list of small language changes will be determined after the call for proposals is over so proposals sent in this week are certainly still in the running! The final list will only have around five items so it is possible not all the changes above will be on the eventual final list.

(2009-03-24 15:21:00.0) Permalink Comments [8]

Project Coin's third week was another week of lively traffic on the mailing list. New proposals were sent in:
existing proposals were revised:
and discussion continued on ARM and other proposals. The scoping and utility of a few pre-proposals were discussed on the list too. Ten days remain to get language change proposals in! (Purely library changes will be handled by other JDK 7 processes.)

(2009-03-20 10:37:34.0) Permalink Comments [2]

After the vigorous start of week 1, the pace of new proposals being sent to the list slowed:
However, brisk discussion continued on refining and exploring ARM blocks and their variations.

(2009-03-13 09:30:00.0) Permalink Comments [1]

Expanding on a few slides from my JavaOne talk last year, here are a few tips to keep in mind when designing exception types.

First, all exceptions are serializable since Throwable implements Serializable; therefore, like all other serializable classes, exception types should declare a serialVersionUID field to ease evolving the type in the future. Using javac's -Xlint:serial option will warn about missing serialVersionUID fields on serializable classes, amongst other possible issues.

Second, when adding a new exception class, consider providing more information beyond just a distinct name, such as methods to return information about what specific situation triggered the exception and possibly how to recover from it. Providing this additional information can interact with being serializable; when the additional information is not logically serializable, the specification may need to allow the information to be unavailable after deserialization.

Third, when multiple related exception types are added, a common direct superclass allows a single catch block to handle the related exceptions uniformly. (Having a common super-exception would still be useful even if multi-catch is added to the language in JDK 7.)

Looking at the exceptions in the JSR 269 API, various methods note the possible impact of serialization-deserialization on the returned values. JSR 269 provided a trio of similar exceptions for the situation of encountering a kind of object unknown in an earlier version of the language, such as a JDK 6 era annotation processor coming across a module from JDK 7:

However, the original JSR 269 API does not have a common direct superclass to group these related conditions.
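The three tips above can be illustrated with a minimal Java sketch. Every class name here (UnknownThingException, UnknownWidgetException, UnknownGadgetException) is hypothetical and chosen only for illustration; none of them is part of the JSR 269 API. The sketch assumes the extra information carried by the exception is not logically serializable, so it is held in a transient field and documented as possibly unavailable after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class ExceptionDesignSketch {

    /** Hypothetical common direct superclass grouping the related exceptions (tip 3). */
    static class UnknownThingException extends RuntimeException {
        private static final long serialVersionUID = 1L; // tip 1: declare it explicitly

        UnknownThingException(String message) {
            super(message);
        }
    }

    /** Carries extra information about what triggered the exception (tip 2). */
    static class UnknownWidgetException extends UnknownThingException {
        private static final long serialVersionUID = 1L;

        // Assumed not logically serializable, hence transient.
        private final transient Object widget;

        UnknownWidgetException(Object widget) {
            super("Unknown widget: " + widget);
            this.widget = widget;
        }

        /** Returns the unknown widget, or null if unavailable, e.g. after deserialization. */
        Object getUnknownWidget() {
            return widget;
        }
    }

    static class UnknownGadgetException extends UnknownThingException {
        private static final long serialVersionUID = 1L;

        UnknownGadgetException(String name) {
            super("Unknown gadget: " + name);
        }
    }

    /** A single catch block handles the whole family through the common superclass. */
    static String describe(RuntimeException thrown) {
        try {
            throw thrown;
        } catch (UnknownThingException e) {
            return "handled: " + e.getMessage();
        }
    }

    /** Serializes and deserializes the exception in memory. */
    static UnknownWidgetException roundTrip(UnknownWidgetException original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (UnknownWidgetException) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new UnknownGadgetException("g1")));
        UnknownWidgetException copy = roundTrip(new UnknownWidgetException("w1"));
        System.out.println(copy.getMessage());
        System.out.println(copy.getUnknownWidget()); // the transient field is not restored
    }
}
```

The message survives the round trip because Throwable serializes it, while the transient widget does not, which is why the accessor's specification has to allow a null result.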
The lack of a common direct superclass was addressed in JDK 7 build 48 with the addition of javax.lang.model.UnknownEntityException as the direct superclass of these three exceptions (6794071). Retrofitting this change is binary compatible because UnknownEntityException directly extends RuntimeException, as did the old exceptions, and serialization compatibility is preserved by the existing serialVersionUID fields in the old exceptions.

(2009-03-12 12:00:07.0) Permalink

Those interested in following JDK 7 happenings from Sun engineers can track "jdk7" tagged entries on http://blogs.sun.com/main/tags/jdk7.

(2009-03-10 11:27:20.0) Permalink Comments [1]

The crested butte of Crested Butte

Last week I was off attending my first Java Posse Roundup in scenic Crested Butte, Colorado, pictured below. There were many good discussions related to JDK 7 and other programming, and non-programming, topics. Unfortunately, there were some difficulties with my flight back. Once again, the plane I was flying on had to be rebooted, but at least this time the passengers didn't need to be reinstalled! I missed my scheduled connection in Denver and caught the next flight to the Bay Area a few hours later. The wait in Denver was made more pleasant by the airport's free-after-a-short-ad wi-fi and recharging stations for electronic gear. More airports should have those amenities!

(2009-03-08 22:15:09.0) Permalink

In its first week, Project Coin enjoyed a vigorous start with well over a dozen proposals submitted:
Traffic on the list has been high, with lots of feedback and analysis leading to some revised proposals. Here are a few general comments on the proposals that have been sent in so far, to help refine those proposals and improve future proposals before they are sent in.

The proposals submitted to Project Coin should already be well thought-through. The goal is to have in short order specifications approaching JLS quality, preferably with a prototype to help validate the design. The feedback on the list should be much closer to finding and illuminating any remaining dark corners of a proposal rather than fleshing out its basic structure. If a proposal does not cite chapter and verse of the JLS for parts of its specification, that is a good indication the proposal is too vague. All affected sections of the JLS should be listed, including binary compatibility and the flow analysis in definite assignment. It is fine if someone posts to the list to solicit help writing a proposal for a given change.

Proposal writers should be aware of the size and scope parameters established for the project; for background see:

Also, proposal writers should search Sun's bug database for bugs related to the change. The URL for the database is http://bugs.sun.com; Java specification issues are in category Java SE and subcategory specification. Of course the database is also searchable with your favorite search engine restricted to that site. Besides the evaluation field from the bug database, the external comment can often also have valuable insight into and discussion of alternatives to solving the problem or reasons why the problem shouldn't be solved.

As has already been happening on the list, authors and advocates of a proposal are responsible for responding to feedback and incorporating changes into any subsequent iterations of the proposal. For now, I think it is adequate to just send the revised proposals to the list.
Only if there turns out to be frequent change would a more formal tracking system be warranted. Keeping such discussions on the list is important both to allow easy, centralized tracking of the proposal drafts and also for future language archaeologists who are curious about why a particular decision was made. After a few iterations of feedback and refinements, the specification and compilation strategy should be sufficiently detailed to provide high-confidence that the proposal is practical and can be reduced to practice. For example, I think the initial proposal for the admittedly simple strings in switch change provides adequate detail on these fronts. (2009-03-06 09:00:00.0) Permalink Comments [15]Project Coin: Proposal for Strings in switch Below is a Project Coin language proposal form I wrote for Strings in switch; send any comment to the Project Coin mailing list. PROJECT COIN SMALL LANGUAGE CHANGE PROPOSAL FORM v1.0 AUTHOR(S): Joseph D. Darcy OVERVIEW
MAJOR ADVANTAGE: What makes the proposal a favorable change?
MAJOR BENEFIT: Why is the platform better if the proposal is adopted?
MAJOR DISADVANTAGE: There is always a cost.
ALTERNATIVES: Can the benefits and advantages be had some way without a language change?
EXAMPLES
ADVANCED EXAMPLE: Show advanced usage(s) of the feature.
DETAILS
COMPILATION: How would the feature be compiled to class files? Show how the simple and advanced examples would be compiled. Compilation can be expressed as at least one of a desugaring to existing source constructs and a translation down to bytecode. If a new bytecode is used or the semantics of an existing bytecode are changed, describe those changes, including how they impact verification. Also discuss any new class file attributes that are introduced. Note that there are many downstream tools that consume class files and that they may need to be updated to support the proposal!
TESTING: How can the feature be tested?
LIBRARY SUPPORT: Are any supporting libraries needed for the feature?
REFLECTIVE APIS: Do any of the various and sundry reflection APIs need to be updated? This list of reflective APIs includes but is not limited to core reflection (java.lang.Class and java.lang.reflect.*), javax.lang.model.*, the doclet API, and JPDA.
OTHER CHANGES: Do any other parts of the platform need be updated too? Possibilities include but are not limited to JNI, serialization, and output of the javadoc tool.
MIGRATION: Sketch how a code base could be converted, manually or automatically, to use the new feature.
COMPATIBILITY
EXISTING PROGRAMS: How do source and class files of earlier platform versions interact with the feature? Can any new overloadings occur? Can any new overriding occur?
REFERENCES
URL FOR PROTOTYPE (optional):

(2009-03-01 10:00:00.0) Permalink Comments [7]

The Project Coin OpenJDK page and mailing list are now live. The call for proposals period will run until March 30, 2009. Let the proposing begin!

(2009-02-27 15:17:18.0) Permalink Comments [2]

Project Coin: Small Language Change Proposal Form Available

The name of the OpenJDK project hosting small language changes for JDK 7 will be Project Coin. Besides a coin literally being small change, to "coin a phrase" is to create a little bit of new language. The website for the project and its mailing lists will come into being this February. In the meantime, the initial form to use to propose a language change is listed below. If you have an idea for a change, please work on the form and post it to the Project Coin mailing list once that gets started. Small language changes I think would improve the language according to the previously discussed criteria include (related Sun bugs in parentheses):
PROJECT COIN SMALL LANGUAGE CHANGE PROPOSAL FORM v1.0 INSTRUCTIONS: For a proposal to be considered, this document must be complete and stand-alone in and of itself. No URLs, citations of papers, etc. can appear except for the limited supplementary information requested in the "REFERENCES" section. A new class file version number can be assumed to be available for -target 7. The proposal must not remove existing features of the language; for example, "Get rid of checked exceptions" would not be considered. As part of being stand-alone, the proposal must not rely on any other language changes that have not already been accepted. AUTHOR(S): Who are you? OVERVIEW
MAJOR ADVANTAGE: What makes the proposal a favorable change?
MAJOR BENEFIT: Why is the platform better if the proposal is adopted?
MAJOR DISADVANTAGE: There is always a cost.
ALTERNATIVES: Can the benefits and advantages be had some way without a language change?
EXAMPLES
ADVANCED EXAMPLE: Show advanced usage(s) of the feature.
DETAILS
COMPILATION: How would the feature be compiled to class files? Show how the simple and advanced examples would be compiled. Compilation can be expressed as at least one of a desugaring to existing source constructs and a translation down to bytecode. If a new bytecode is used or the semantics of an existing bytecode are changed, describe those changes, including how they impact verification. Also discuss any new class file attributes that are introduced. Note that there are many downstream tools that consume class files and that they may need to be updated to support the proposal!
TESTING: How can the feature be tested?
LIBRARY SUPPORT: Are any supporting libraries needed for the feature?
REFLECTIVE APIS: Do any of the various and sundry reflection APIs need to be updated? This list of reflective APIs includes but is not limited to core reflection (java.lang.Class and java.lang.reflect.*), javax.lang.model.*, the doclet API, and JPDA.
OTHER CHANGES: Do any other parts of the platform need be updated too? Possibilities include but are not limited to JNI, serialization, and output of the javadoc tool.
MIGRATION: Sketch how a code base could be converted, manually or automatically, to use the new feature.
COMPATIBILITY
EXISTING PROGRAMS: How do source and class files of earlier platform versions interact with the feature? Can any new overloadings occur? Can any new overriding occur?
REFERENCES
URL FOR PROTOTYPE (optional):

(2009-01-27 09:05:00.0) Permalink Comments [42]

Criteria for desirable small language changes

The two primary goals of making small language changes in JDK 7 are to:
Over the years, certain common coding patterns have been recognized as needlessly verbose including:
These patterns can be replaced with new constructs that are more concise and more clear without fundamentally altering the language. Besides improvements to support existing Java programs, language changes should also be made to allow appropriate access to new JVM capabilities, such as those being enabled by the Da Vinci Machine project. While language changes can fundamentally improve the modes of expression in a language, language changes have a number of drawbacks as solutions to programming problems:
Therefore, language changes are rarely the preferred solution if other workable solutions are available. Since IDEs are now commonly used for Java development, mitigating or solving problems using IDE tooling is one possibility. As of Java SE 6, compliant compilers are required to support annotation processing as standardized by JSR 269, see javax.annotation.processing and javax.lang.model. Annotation processing provides a general meta-programming framework; beyond processing annotations directly, annotation processors can be used to implement many currently extra-lingual checks based on a program's structure. Checks which previously would have required language changes can now be implemented by developers and just used by convention. JSR 308, Annotations on Java Types, would enable more detailed checking by allowing annotations in more program locations. When judging whether or not any change to the platform is worthwhile, a useful notion is estimating the feature's "thrust to weight ratio," that is estimating whether the benefits of making the change exceed the full cost of implementing the change. For language changes, this metric is improved by having a larger fraction of programs potentially benefiting from the change. For example, it would be roughly the same amount of engineering to add numerical operator overloading support for classes like BigInteger and BigDecimal as to add support for bracket, "[]", syntax for Lists and Maps. Besides complications with the == operator in the numerical case, bracket syntax for Maps and Lists has much higher utility since many more Java programs use Collections than large numbers. Especially with the maturity of the Java platform, the onus is on the proposer to convince that a language change should go in; the onus is not to prove the change should stay out. Given the upcoming holidays, the language change proposal form and the seeding proposals will both be coming in January 2009. 
(2008-12-23 09:00:00.0) Permalink Comments [17]Guidance on measuring the size of a language change Soon a project will be starting to consider adding a to-be-determined set of small language changes to JDK 7. Given the rough timeline for JDK 7 and other on-going efforts to change the language, such as modules and annotations on types, only a limited number of small changes can be considered for JDK 7. That does not imply that larger changes aren't appropriate or worthwhile at some point in the future; in the mean time such changes can be explored and honed for JDK 8 or later. Separate from its size, criteria to evaluate the utility of a language change will be discussed in a future blog entry. The JCP process defines three deliverables for a JSR:
These three distinct aspects of a language change, specification, implementation, and general testing, exist whether or not the change is managed under a JSR. For this project, a language change will be judged small if it is simultaneously a small-enough effort under all three of specification, implementation, and testing. In other words, if a change is medium sized or larger in a single area, it is not a small change. (This corresponds to using an infinity norm to measure size; see "Norms: How to Measure Size".) Another concern is the size of change to developers, but if the change is small in these three areas, it is likely to be small for developers to learn and adopt too. Because there is limited fungibility between the people working on specification, implementation, and testing, a single oversize component can't necessarily be compensated for by the other two components being small enough to be managed on their own.

The size of a specification change is not just related to the amount of text that is altered; it also depends on which text, how many new concepts are needed, and the complexity of those concepts. Similarly, the implementation effort can be large if a limited amount of tricky code is involved as well as if a large volume of prosaic code is needed. An estimate of the future maintenance effort should factor into judging the net implementation cost too. The specification size and implementation size are often not closely related; a small spec change can require large implementation efforts and vice versa.

JCK-style conformance testing is based on testing assertions in the specification, so the size of this kind of testing effort should have some positive correlation with the size of the specification change. Likewise, regression testing should have at least a weak positive correlation with the size of the implementation change.
However, adequate conformance testing can be disproportionately large compared to the size of the specification change depending on how the assertions interact and how many programs they affect. Due to complexity of the Java type system and the desire to maintain backwards compatibility, almost any type system change will be at least a medium-sized effort for the implementation, specification, or both. Each new feature of the type system can interact with all the existing features, as well as all the future ones, so type system changes must be approached with healthy skepticism. As a point of reference, the set of Java SE 5 language features will be sized according to the above criteria; from smallest to largest:
Some examples of bigger-than-small language changes that have been discussed in the community include:
Specific small language changes we at Sun are advocating for JDK 7 will be discussed in the near future. (2008-12-11 09:00:00.0) Permalink Comments [8]Coming Soon: A JSR for small language changes in JDK 7 I'm happy to announce that I'll be leading up Sun's efforts to develop a set of small language changes in JDK 7; we intend to submit a JSR covering those changes during the first half of 2009. However, before the JSR proposal is drafted and submitted to the JCP, we'll first be running a call for proposals so Java community members can submit detailed, thoughtful changes for consideration too. We'll be seeding the discussion with a few proposals we think would improve the language. More information on our proposed changes, guidance for measuring the size of a change, and criteria for judging the desirability of a language change will be coming over the next several weeks. I've proposed an OpenJDK project to host the discussion of the proposals and potentially some prototype implementations. Suggested Reading API Design: Interfaces versus Abstract Classes
Quoting, Effective Java, first edition, Item 16: Prefer Interfaces to abstract classes
As discussed in that item, the ease of evolution of abstract classes comes from the ability to add new methods having "reasonable default implementations" without almost surely causing source of all existing subtypes to no longer compile. The flexibility and power of interfaces involve ease of retrofitting to existing classes, allowing nonhierarchical type relations, and so on. An additional benefit of interfaces is the ability to use dynamic proxies; one notable use of dynamic proxies is creating the annotation objects returned at runtime by getAnnotation. One potential difference not worth considering with modern virtual machines is the speed difference between invoking a method on an interface versus invoking a method on a class. While there is a sound rationale backing the conventional wisdom, in my estimation the compatible evolution advantages of abstract classes are smaller than they appear at first, further tipping the balance in favor of using interfaces in more situations. The two alternatives to be considered to define the initial desired type abstraction are:
In neither case are fields being defined. In both cases a skeletal abstract implementation class, like java.util.AbstractList, could be used to share implementation code. If the type abstraction is defined by an abstract class, the skeletal class and abstract class might be able to be combined, saving a type compared to the pair of an interface plus a skeletal class. However, forcing all implementations to be based on the same skeletal class may be awkward. Interfaces can easily have multiple independent skeletal helper classes. Subclasses can blunt inheritance issues by using an intermediate subclass to abstract-ify any problematic implementations from the parent. Table 1 outlines the different kinds of compatibility impacts, source, binary, and behavioral, from adding a method to an interface and an abstract class. The effects of adding a method to an abstract class depend on whether or not the added method is abstract or has an implementation. For the purposes of discussion, we will assume the method does have an implementation (otherwise, there would be no advantage to using an abstract class).
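As a rough sketch of the interface-plus-skeletal-class arrangement described above, consider the following. The Shape, AbstractShape, Square, and UnitShape types are hypothetical names invented for illustration. The enum is included because an enum cannot extend a class, which is one situation where forcing every implementation through a single abstract class would be awkward while an interface still works.

```java
class SkeletalSketch {

    /** The type abstraction, defined as an interface (hypothetical example type). */
    interface Shape {
        double area();
        String describe();
    }

    /** Skeletal implementation sharing code, in the spirit of java.util.AbstractList. */
    abstract static class AbstractShape implements Shape {
        @Override
        public String describe() {
            // Shared behavior written in terms of the abstract primitive area().
            return getClass().getSimpleName() + " with area " + area();
        }
    }

    /** Most implementations can extend the skeletal class and supply only the primitive. */
    static final class Square extends AbstractShape {
        private final double side;

        Square(double side) {
            this.side = side;
        }

        @Override
        public double area() {
            return side * side;
        }
    }

    /** An enum cannot extend AbstractShape, but it can still implement the interface. */
    enum UnitShape implements Shape {
        UNIT_SQUARE;

        @Override
        public double area() {
            return 1.0;
        }

        @Override
        public String describe() {
            return "unit square";
        }
    }

    /** Client code depends only on the interface, not on the skeletal class. */
    static double totalArea(Shape... shapes) {
        double sum = 0.0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(new Square(2).describe());
        System.out.println(UnitShape.UNIT_SQUARE.describe());
        System.out.println(totalArea(new Square(2), UnitShape.UNIT_SQUARE));
    }
}
```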
Technically, adding a method to an interface and adding a method to an abstract class are both binary compatible since programs using those types will continue to link. However, in the case of an interface type, if a program calls the new method on an existing implementation of the interface (unless the implementation presciently had a method with a matching signature declared), an AbstractMethodError will be thrown, which is an awkward situation to recover from. Also, for the call to the new interface method to work on an existing implementor of the old interface, the method in the implementor must be an exact match, signature and return type, for the added method; if the return type in the implementor is a subtype of the added method, a covariant return, a recompile of the implementor is needed to create the bridge method joining the method from the interface with the method declared in the class. Adding a method to an interface has a wide range of possible source compatibility effects on existing code. It is possible that an implementation anticipated future developments and already has a method matching the newly added method. In that case, adding the method is binary-preserving source compatible with that particular class. Of course in general it is much more likely that existing implementations do not already have the new method, in which case they won't compile against the modified interface declaration. Therefore, the worst possible outcome is that existing implementations will stop compiling after the method is added to the interface; this worst case outcome is also the most likely outcome in the absence of other information. Adding a concrete method to an abstract class also has a range of source compatibility outcomes. If no existing extending class has a method with the new name, there is no conflict and the addition is binary-preserving source compatible given the set of actual programs. 
If not the expected outcome, this is certainly the hoped-for outcome of adding a method to an abstract class! However, it is possible existing subclasses already declare a method with the new name. If the parameter types match but the return types conflict, existing subclasses will stop compiling after the method is added. If the parameter types are not the same, an overloading situation is introduced or expanded. This can change method resolution of call sites using the existing subclass, which may or may not lead to behaviorally equivalent class files since different methods might be called. One technique to avoid changing resolution at existing call sites is for the new method to include in its parameter list a new type added at the same time as the method. If the new type is not related to existing types, then no method in an existing subclass will interact with the new method during method resolution. Therefore, the worst possible outcome is that some existing subclasses will stop compiling after the method is added to the abstract class; this can be avoided depending on the parameter list of the new method, at the potential cost of introducing new overloadings that change existing method resolution.

Not counting introspective operations like core reflection, adding methods to an interface or abstract class does not have much direct appreciable behavioral compatibility impact because adding methods doesn't directly affect the code run by existing clients of the class. If an abstract class were not at the conceptual root of a type hierarchy, adding a concrete method could intercept calls to a method with the same signature in the superclass. However, if the children of an abstract superclass already have a concrete implementation for the newly added method, existing calls to the children's method would not be intercepted by the method added in the superclass.
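The overloading hazard and the new-parameter-type technique can be seen in a small sketch of the state after the abstract class has gained its new methods. All names here (Handler, LegacyHandler, AuditContext, process, record) are hypothetical. The sketch assumes LegacyHandler was written before either method was added to Handler and that the call sites are being recompiled against the new Handler.

```java
class OverloadSketch {

    /** Hypothetical type introduced at the same time as record(AuditContext). */
    static final class AuditContext {
    }

    abstract static class Handler {
        // Assume this method was added later. Its parameter type, String, is an
        // existing type, so it can compete with overloads in existing subclasses.
        String process(String input) {
            return "Handler.process(String)";
        }

        // Assume this method was also added later. Its parameter type is brand new,
        // so no existing call site can be passing an argument of that type.
        String record(AuditContext context) {
            return "Handler.record(AuditContext)";
        }
    }

    /** Stands in for a subclass that predates both additions. */
    static class LegacyHandler extends Handler {
        String process(Object input) {
            return "LegacyHandler.process(Object)";
        }

        String record(Object event) {
            return "LegacyHandler.record(Object)";
        }
    }

    /** On recompilation this call now selects the more specific inherited overload. */
    static String riskyCallSite() {
        return new LegacyHandler().process("text");
    }

    /** A String argument is not an AuditContext, so resolution is unchanged. */
    static String safeCallSite() {
        return new LegacyHandler().record("text");
    }

    public static void main(String[] args) {
        System.out.println(riskyCallSite());
        System.out.println(safeCallSite());
    }
}
```

Before the additions, both call sites would have resolved to the LegacyHandler methods; after recompiling, only the one competing with a String parameter changes its target.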
Since adding a method to an interface or an abstract class is binary compatible and in both cases the worst case source compatibility outcome is breaking compilation of existing subtypes, any evolution advantage of abstract classes hinges on the ability to have a reasonable default implementation for new methods. But what can such a new method implementation really do? Some viable options are:
(Other sorts of behavior could potentially be added to skeletal classes, but those classes aren't an alternative to interfaces.) Adding a default implementation that throws an exception isn't necessarily very useful; throwing AbstractMethodError would mimic adding a method to an interface! If the functionality of the new method can be expressed in terms of existing methods on the abstract class, the new method could also be written as a convenience static method in a helper class. In that case, the convenience method could just as easily be written in terms of methods on an interface instead. Proposals for extension methods would add syntactic support for this helper class pattern. A no-op method could be added to optionally advise subclasses of some condition or event, but it would have no useful effect on existing subclasses.

While it is straightforward to add simple concrete methods to an abstract class, with sufficient advance planning, such methods could also be automatically added to implementations of an interface at compile time. Starting in JDK 6, Java compilers must support standardized annotation processing. Annotation processing is a general meta-programming framework not directly tied to annotations. Before annotation processing, the types being compiled can be incomplete, including references to types to be generated during annotation processing. The to-be-generated types can include the superclass of a class being compiled. Supporting the generation of superclasses is a very powerful technique for modifying the semantics of the child class. In this case, a class implementing an interface expected to change in the future could refer to a private superclass. With the original definition of the interface, the superclass would be empty. However, when methods were added to the interface, the annotation processor could generate implementations of those methods in the superclass.
This would have the effect of adding the new methods to the class at compile time. Annotations could drive what the synthesized implementation actually did, such as throwing an exception or acting as a no-op. Compared to adding methods to an interface, adding concrete methods to an abstract class seems to be much more compatible. However, both operations are binary compatible, and while adding a method to an abstract class usually has a better "average" impact on existing subtypes, the worst possible impact is the same, breaking the compilation of existing code. As for the functionality that can be added in a concrete method, convenience methods can be put in a separate class and the other sorts of limited functionality methods that can readily be added could also be generated via annotation processing for implementors of an interface. Therefore, the practical evolutionary benefits of using an abstract class rather than an interface should be considered carefully since interfaces may still be a better choice when limited evolution is anticipated. (2008-05-12 18:43:29.0)
Compatibly Evolving BigDecimal
Back in JDK 5, JSR 13 added true floating-point arithmetic to BigDecimal, which involved many new methods and constructors along with new supporting classes in the java.math package. I was actively involved in the JSR 13 expert group and integrated the code into the JDK. These changes had some surprising compatibility impacts which can be classified according to their source, binary, and behavioral effects. The numerical values representable in BigDecimal are (unscaledValue × 10^(-scale)) where unscaledValue is a BigInteger and scale is a 32-bit integer. Before Java SE 5, scale was constrained to be positive or zero (in other words, 10 raised to a negative or zero exponent) and JSR 13 removed this restriction to allow any integer exponent.
Consequently, prior to JSR 13 BigDecimal integral values with trailing zeros had to have them explicitly represented; for example the value one million had to be stored as (1,000,000 × 10^0) rather than (1 × 10^6) or (10 × 10^5), etc. One behavioral consequence of JSR 13 was that all the methods operating on BigDecimal values understand and accept numbers without the old exponent restriction. The new API elements added by JSR 13 are listed in table 1; the additions will be examined under each kind of compatibility.
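The representation described above can be inspected directly. The following is a minimal sketch, assuming a JDK 5 or later BigDecimal; the class name ScaleDemo and the helper of are hypothetical names introduced only for illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class ScaleDemo {
    // Builds the value unscaledValue * 10^(-scale) directly from its two components.
    static BigDecimal of(long unscaled, int scale) {
        return new BigDecimal(BigInteger.valueOf(unscaled), scale);
    }

    public static void main(String[] args) {
        BigDecimal plain = of(1000000, 0); // 1,000,000 * 10^0, the only form available before JSR 13
        BigDecimal compact = of(1, -6);    // 1 * 10^6, requires a negative scale
        BigDecimal other = of(10, -5);     // 10 * 10^5, also requires a negative scale

        // All three denote one million, so compareTo treats them as equal.
        System.out.println(plain.compareTo(compact) == 0);
        System.out.println(compact.compareTo(other) == 0);
        // equals also compares the representation, so these differ.
        System.out.println(plain.equals(compact));
        System.out.println(compact.unscaledValue() + " with scale " + compact.scale());
    }
}
```

The distinction between compareTo and equals shown here is the same one the behavioral compatibility discussion relies on.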
Binary Compatibility
Adding new public methods and constructors, even ones that overload existing names, is binary compatible. Adding public static final fields is binary compatible, meaning existing clients of the library will continue to link. However, there is a possible complication here since BigDecimal is not final and since it has public constructors, it can be subclassed. (As discussed in Effective Java, Item 13, Favor Immutability, this was a design oversight when the class was written.) Adding fields to classes can be binary incompatible, but the needed combination of circumstances does not arise in this case. Therefore, individually and as a whole, the BigDecimal API additions are binary compatible.
Source Compatibility
For source compatibility, we can distinguish between clients of a type and extenders/implementors of a type; certain changes can inconvenience extenders/implementors but not clients. Adding the public static final fields is binary-preserving source compatible. If a subclass, say MyDecimal, already has a field with the same name as a field being added to BigDecimal, the existing declaration in MyDecimal hides the new declaration in the parent class BigDecimal. Therefore, existing uses of, say, MyDecimal.TEN, would continue to resolve to the same binary name. Since constructors are not inherited and all the new constructors are public rather than protected, just the uses of constructors in clients need to be considered; there are no distinct special issues for subclasses. The constructors in BigDecimal during Java SE 1.4.x, the platform version immediately predating JSR 13, are listed in table 2.
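The field hiding argument for MyDecimal.TEN can be sketched as follows. MyDecimal here is a hypothetical subclass written only for illustration, and its TEN is deliberately given a different scale so the two fields can be told apart.

```java
import java.math.BigDecimal;

// Hypothetical subclass that declared its own TEN before BigDecimal.TEN existed.
class MyDecimal extends BigDecimal {
    // Hides the field of the same name later added to the parent class.
    public static final BigDecimal TEN = new BigDecimal("10.0");

    public MyDecimal(String val) {
        super(val);
    }
}

final class FieldHidingDemo {
    public static void main(String[] args) {
        // Resolves to the subclass field, which has scale 1.
        System.out.println(MyDecimal.TEN.scale());
        // The parent field is still reachable by its own name and has scale 0.
        System.out.println(BigDecimal.TEN.scale());
        // A field the subclass does not redeclare is simply inherited.
        System.out.println(MyDecimal.ONE == BigDecimal.ONE);
    }
}
```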
To assess the source compatibility impact, we can compare the new constructors with the old constructors and see if any possible overload resolutions would change, including the possibility of stopping an existing compilation by removing the existence of a most specific method. Of the twelve new constructors, ten are clearly not problematic and binary-preserving source compatible; the ten either have more parameters than the existing constructors or are not applicable to the same invocations, see table 3. For example, eight of the new constructors have the new type MathContext as a parameter. Because of primitive subtyping the other two new constructors, BigDecimal(int val) and BigDecimal(long val) are both applicable to and more specific than invocations that would previously resolve to BigDecimal(double val). Therefore, adding these two new constructors is not binary-preserving source compatible because a different constructor can be resolved for the same existing source code, code with one-argument calls to a BigDecimal constructor where the argument is a primitive type. These two constructors need a secondary screening to assess their behavioral equivalence.
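One way to carry out that secondary screening is to invoke each constructor explicitly and compare the results. This is a minimal sketch, assuming JDK 5 or later; the class ConstructorScreening and its helper methods are hypothetical names, and the explicit cast to double stands in for the widening conversion that older compilers applied implicitly.

```java
import java.math.BigDecimal;

final class ConstructorScreening {
    // What pre-JDK 5 source effectively did: widen to double, then call BigDecimal(double).
    static BigDecimal viaDouble(long val) {
        return new BigDecimal((double) val);
    }

    // What the same source resolves to once BigDecimal(long) exists.
    static BigDecimal viaLong(long val) {
        return new BigDecimal(val);
    }

    public static void main(String[] args) {
        // Small integral values give equal results under either resolution.
        System.out.println(viaDouble(123).equals(new BigDecimal(123)));
        System.out.println(viaDouble(123).equals(viaLong(123)));

        // Long.MAX_VALUE is not exactly representable as a double, so the results differ.
        System.out.println(viaDouble(Long.MAX_VALUE)); // 9223372036854775808
        System.out.println(viaLong(Long.MAX_VALUE));   // 9223372036854775807
    }
}
```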
Before JDK 5, the expressions BigDecimal(123) and BigDecimal(123L) in source code would resolve to a call to BigDecimal(double); as part of that resolution primitive widening conversion converts the argument expression to double before the constructor is invoked. All int values are exactly representable as double and the double constructor when given an integral value will return a BigDecimal with the numerical value in question and a scale of zero. The new int constructor will also return a BigDecimal with the numerical value of the argument and a scale of zero. Therefore, adding the int constructor will result in behaviorally equivalent programs; although the new constructor will cause some invocations to resolve to a different constructor, calling the other constructor will still always result in an equivalent BigDecimal, that is, bd1.equals(bd2)==true. However, the new long constructor does not have behavioral equivalence for all values. Some long values are not exactly representable in double and the old long → double conversion can silently lose precision. For example, printing the value of (new BigDecimal(Long.MAX_VALUE)) gives a different result depending on whether the old double constructor or the new long constructor is resolved.
Partially because of the unintentional, if beneficial, change in source meaning as well as some of the usual reasons (possibility to cache, etc.), in retrospect I think it would have been preferable for the functionality of all twelve new constructors to be provided through static factories instead. (While not directly applicable in BigDecimal, in general even if constructors aren't considered harmful, static factories can have better generics support.)
A similar analysis can be undertaken for all the new methods. Additionally, since subclasses are possible, inheritance conflicts need to be considered too. Note that the new methods taking MathContext and RoundingMode parameters cannot conflict with existing methods in subclasses so all those additions are binary-preserving source compatible. However, if all the parameters of a new method are existing types, a subclass could potentially have a conflicting method with an unrelated return type. For example, MyDecimal could have a (strange) public double divide(BigDecimal divisor) method which would conflict with the addition of public BigDecimal divide(BigDecimal divisor). While BigDecimal generally shouldn't be subclassed, the addition of some of these new methods could prevent existing subclasses from compiling, yet another reason to favor composition over inheritance.
Behavioral Compatibility
In terms of evolving the behavior of existing methods after introducing the expanded exponent range, the main issues were the behavior of arithmetic operations and text ↔ BigDecimal conversion operations; the latter would prove to be unexpectedly troublesome.
As summarized in table 4, the behavior of arithmetic operations was quite compatible with a number of strong invariants. Given input values a1 and b1 representable under the old system, and given an existing method, say add, and its result c1, in the old and new BigDecimal if the inputs to an operation are .equals, same numerical value and same representation, then the output is exactly equivalent too, same numerical value with the same representation. More generally, in the old and new BigDecimal if the inputs to an operation satisfy the weaker property of being compareTo() == 0, meaning they have same numerical value but with a possibly different representation, then the output will be numerically equal, but possibly with a different representation.
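The two invariants above can be illustrated with add. This is a minimal sketch on a current BigDecimal only (it cannot exercise the pre-JSR 13 implementation); the class ArithmeticInvariants and the helper sum are hypothetical names.

```java
import java.math.BigDecimal;

final class ArithmeticInvariants {
    // Parses both operands and adds them.
    static BigDecimal sum(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }

    public static void main(String[] args) {
        BigDecimal c1 = sum("2.50", "1.5"); // inputs with scales 2 and 1
        BigDecimal c2 = sum("2.50", "1.5"); // inputs that are .equals to those of c1
        BigDecimal c3 = sum("2.5", "1.5");  // same numerical values, different representation

        // Inputs that are .equals give outputs that are .equals.
        System.out.println(c1.equals(c2));
        // Inputs that are only compareTo() == 0 give outputs that are numerically equal...
        System.out.println(c1.compareTo(c3) == 0);
        // ...but possibly with a different representation (4.00 versus 4.0).
        System.out.println(c1.equals(c3));
    }
}
```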
A main advantage of decimal arithmetic over binary arithmetic is what-you-see-is-what-you-get for input and output values; the complicated vagaries of binary ↔ decimal conversion can be avoided and exact computation can be straightforward. Therefore, when removing the restriction on exponent values, being able to have a textual representation that readily mapped to all possible unscaled value and exponent pairs was paramount to make the new arithmetic usable. Before JSR 13, the toString method did not use exponential notation; all leading and trailing zeros were explicit. For fractional values, the length of the output grew linearly with the size of the exponent, as well as the number of digits of precision. Conversely, without negative exponents, the internal representation and string output of integer-valued BigDecimal numbers grew with the magnitude of the number, even when it was inherently low-precision. To take advantage of the new unrestricted exponent range, a textual notation was needed that allowed the positive or negative exponent to be recovered; this was accomplished by changing to using scientific notation in the toString output. When converting from text to BigDecimal, a positive exponent could be reconstructed from integer values that previously would have been forced to have a zero exponent. However, the new output was legal input to the old constructors, so properties similar to the old and new arithmetic behavior applied:
If needed, in the new BigDecimal on textual input the old semantics on exponents is easy to code: BigDecimal bd = new BigDecimal(myString); if (bd.scale() < 0) bd = bd.setScale(0); and a toPlainString method was added to provide the old-style output when needed. Staying within the realm of old and new BigDecimal versions, these arrangements solidly preserve a very reasonable kind of behavioral compatibility: numerical value and representation are kept constant when possible; otherwise, numerical value is preserved, possibly with a different representation. Backwards serial compatibility is slightly weaker; rather than being converted to exponent-zero values as done for textual inputs, new serial streams holding positive exponents are rejected by old BigDecimal implementations. Unfortunately, despite these consistencies across JDK versions, some users of BigDecimal still ran into compatibility issues from the textual output changes made by JSR 13. A common use for BigDecimal is interfacing to databases and while the new scientific notation was legal input to the old BigDecimal string constructors, scientific notation was not legal notation to databases. The addition of the toPlainString method did not help the situation without recompiling the source of the application in question; such recompilation could be unwanted since it would tie the application to JDK 5 with the new method. Other unpalatable workarounds include subclassing BigDecimal to enforce the old toString behavior or using reflection to see if the toPlainString method is available to call to avoid introducing a hard dependency on the new method. While the changes in textual input and output of BigDecimal were reasonable in the context of direct Java compatibility, the expert group underestimated the behavioral compatibility impact of these changes when dealing with databases.
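The textual behavior and two of the workarounds mentioned above can be sketched together. This is an illustrative sketch rather than code from the JDK or from any affected application; TextOutputDemo, parseOldStyle, and plainText are hypothetical names, and the reflective lookup is only one possible way to avoid a hard dependency on toPlainString.

```java
import java.lang.reflect.Method;
import java.math.BigDecimal;

final class TextOutputDemo {
    // Old-style input semantics: never keep a negative scale after parsing.
    static BigDecimal parseOldStyle(String text) {
        BigDecimal bd = new BigDecimal(text);
        return bd.scale() < 0 ? bd.setScale(0) : bd;
    }

    // Reflection-based workaround: call toPlainString if present, else fall back to toString.
    static String plainText(BigDecimal bd) {
        try {
            Method m = BigDecimal.class.getMethod("toPlainString");
            return (String) m.invoke(bd);
        } catch (ReflectiveOperationException e) {
            return bd.toString();
        }
    }

    public static void main(String[] args) {
        BigDecimal million = new BigDecimal("1E+6");
        System.out.println(million);                // 1E+6, scientific notation
        System.out.println(plainText(million));     // 1000000
        System.out.println(parseOldStyle("1E+6"));  // 1000000, scale forced to zero
    }
}
```

On a pre-JDK 5 runtime the fallback branch would be taken, and toString there already produced the plain form.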
While the changes remain justifiable in terms of supporting the new values, if the compatibility cost had been known, the expert group could have and should have worked with database vendors to mitigate the migration cost associated with this change.
Conclusion
Fully understanding the compatibility impact of changes is subtle and shortcomings are quick to lead to user anger. Merely maintaining binary compatibility is not sufficient for many purposes. Following good coding guidelines from the beginning can pay silent rewards when later evolving the class by reducing the space of possible concerns.
Acknowledgments
Alex provided helpful comments on a draft of this entry.
Further Reading
Kinds of Compatibility: Source, Binary, and Behavioral
When evolving the JDK, compatibility concerns are taken very seriously. However, different standards are applied to evolving various aspects of the platform. From a certain point of view, it is true that any observable difference could potentially cause some unknown application to break. Indeed, just changing the reported version number is incompatible in this sense because, for example, a JNLP file can refuse to run an application on later versions of the platform. Therefore, since not making any changes at all is clearly not viable for evolving the platform, changes need to be evaluated against and managed according to a variety of compatibility contracts. For Java programs, there are three main categories of compatibility:
Note that non-source compatibility is sometimes colloquially referred to as "binary compatibility." Such usage is incorrect since the JLS spends an entire chapter precisely defining the term binary compatibility; often behavioral compatibility is the intended notion instead. There are many other observable aspects of the JDK not related to Java programs, such as file layout, etc. Those will not be further discussed in this note. The basic challenge of compatibility is the difficulty of finding and modifying all the software and systems impacted by a change. In a closed-world scenario where all the clients of an API are known and can in principle be simultaneously changed, introducing "incompatible" changes is just a matter of being able to coordinate the engineering necessary to evaporate the liquid in a small body of water, perhaps only a puddle or pot on a stove. In contrast, for APIs that are used as widely as the JDK, rigorously finding all the possible programs impacted by an incompatible change is as impractical as boiling the oceans, so evolving such APIs is quite constrained by comparison. Generally, we will consider whether a program P is compatible in some fashion (or not) with respect to two versions of a library L1 and L2 that differ in some way. (We will not consider the compatibility impact of such changes to independent implementers of L.) Sometimes only a particular program is of interest; is the change from L1 to L2 compatible with this program? When evaluating how the platform should evolve, a broader consideration of the programs of concern is used. For example, does the change from L1 to L2 cause a problem for any program that currently exists? If so, what fraction of existing programs is affected? Finally, the broadest consideration: does the change affect any program that could exist?
Often once a platform version is released, the latter two notions are similar because imperfect knowledge about the set of actual programs means it can be more tractable to consider the worst possible outcome for any potential program rather than estimate the impact over actual programs. Stated more formally, depending on the change being considered, judging the change based on the worst possible outcome for any program is more appropriate than judging based on some other kind of norm of the disruption over the space of known programs. Generally each kind of compatibility has both positive and negative aspects; that is, the positive aspect of keeping things that "work" working and the negative aspect of keeping things that "don't work" not working. For example, the TCK tests for Java compilers include both positive tests of programs that must be accepted and negative tests of programs that must be rejected. In many circumstances, preserving or expanding the positive behavior is more acceptable and important than maintaining the negative behavior and we will focus on positive compatibility in this entry. In terms of relative severity, source compatibility problems are usually the mildest since there are often straightforward workarounds, such as adjusting import statements or switching to fully qualified names. Gradations of source compatibility are identified and discussed below. Behavioral compatibility problems can have a range of impacts while true binary compatibility issues are problematic since linking is prevented.
Source Compatibility
A Java compiler's job also includes mapping more abstract names to more concrete ones, specifically mapping simple and qualified names appearing in source code into binary names in class files. Source compatibility concerns this mapping of source code into class files, not only whether or not such a mapping is possible, but also whether or not the resulting class files are suitable. Source compatibility is influenced by changing the set of types available during compilation, such as adding a new class, as well as changes within existing types themselves, such as adding an overloaded method. There is a large set of possible changes to classes and interfaces examined for their binary compatibility impact. All these changes could also be classified according to their source compatibility repercussions, but only a few kinds of changes will be analyzed below. The most rudimentary kind of positive source compatibility is whether code that compiles against L1 will continue to compile against L2; however, that is not the entirety of the space of concerns since the class file resulting from compilation might not be equivalent. Java source code often uses simple names for types; using information about imports, the compiler will interpret these simple names and transform them into binary names for use in the resulting class file(s). In a class file, the binary name of an entity (along with its signature in the case of methods and constructors) serves as the unique, universal identifier to allow the entity to be referenced. So different degrees of source compatibility can be identified:
Whether or not a program is valid can also be affected by language changes. Usually previously invalid programs are made valid, as when generics were added, but sometimes existing programs are rendered invalid, as when keywords were added (strictfp, assert, and enum). The version number of the resulting class file is also an external compatibility issue of sorts since that affects which platform versions the code can be run on.
Full source compatibility with any existing program is usually
not achievable because of
will compile under L1 but not under
L2 since the name "
An adversarial program could almost always include
Due to the
Adding overloaded methods has the potential to change method resolution and thus change the signatures of the method call sites in the resulting class file. Whether or not such a change is problematic with respect to source compatibility depends on what semantics are required and how the different overloaded methods operate on the same inputs, which interacts with behavioral equivalence notions. Assume class
If a new method cannot change resolution, then it is a binary-preserving source transformation. If a new method can change resolution but the different class file that results has acceptably similar behavior, the change may still be acceptable, while changing resolution in such a way that does not preserve semantics is likely problematic. Changing a library in such a way that current clients no longer compile is seldom appropriate.
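A change in overload resolution can be seen by placing two stand-ins for library versions side by side. This is a minimal sketch; the nested classes L1 and L2 and the method describe are hypothetical and only mimic a library before and after an overload is added.

```java
final class OverloadDemo {
    // Stand-in for library version L1: a single method taking double.
    static final class L1 {
        static String describe(double d) {
            return "double:" + d;
        }
    }

    // Stand-in for library version L2: an int overload has been added.
    static final class L2 {
        static String describe(double d) {
            return "double:" + d;
        }

        static String describe(int i) {
            return "int:" + i;
        }
    }

    public static void main(String[] args) {
        // The same source expression, describe(1), resolves differently in each version.
        System.out.println(L1.describe(1)); // double:1.0
        System.out.println(L2.describe(1)); // int:1
    }
}
```

Whether the new resolution is acceptable depends on whether the two methods behave equivalently for the arguments in question, which is the behavioral screening discussed above.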
Binary Compatibility
The JLS defines binary compatibility strictly according to linkage; if P links with L1 and continues to link with L2, the change made in L2 is binary compatible. The runtime behavior after linking is not included in binary compatibility:
As an extreme example, if the body of a method is changed to throw an error instead of compute a useful result, while the change is certainly a compatibility issue, it is not a binary compatibility issue since client classes would continue to link. Also, it is not a binary compatibility issue to add methods to an interface. Class files compiled against the old version of the interface will still link against the new interface despite the class not having an implementation of the new method. If the new method is called at runtime, an AbstractMethodError is thrown; if the new method is not called, the existing methods can be used without incident. (Adding a method to an interface is a source incompatibility that can break compilation though.) A design requirement from the addition of generics via JSR 14 was migration compatibility. Migration compatibility requires that a library can be generified and existing (nongeneric) clients can continue to compile and link against the generic version. Meeting this constraint led to the use of erasure, a controversial aspect of the generics design. During JSR 14, it was not known how to add generics in a way that supported both reification and migration compatibility; future work might address this shortcoming.
Behavioral Compatibility
Intuitively, behavioral compatibility should mean that with the same inputs program P does "the same" or an "equivalent" operation under different versions of libraries or the platform. Defining equivalence can be a bit involved; for example, even just defining a proper equals method in a class can be nontrivial. In this case, to formalize this concept would require an operational semantics for the JVM for the aspects of the system a program was interested in. For example, there is a fundamental difference in visible changes between programs that introspect on the system and those that do not.
Examples of introspection include calling core reflection, relying on stack trace output, using timing measurements to influence code execution, and so on. For programs that do not use, say, core reflection, changes to the structure of libraries, such as adding new public methods, are entirely transparent. In contrast, a (poorly behaved) program could use reflection to look up the set of public methods on a library class and throw an exception if any unexpected methods were present. A tricky program could even make decisions based on information like a timing side channel. For example, two threads could repeatedly run different operations and make some indication of progress, for example, incrementing an atomic counter, and the relative rates of progress could be compared. If the ratio is over a certain threshold, some unrelated action could be taken, or not. This allows a program to create a dependence on the optimization capabilities of a particular JVM, which is generally outside a reasonable behavioral compatibility contract. The evolution of a library is constrained by the library's contract included in its specification; for final classes this contract doesn't usually include a prohibition of adding new public methods! While an end-user may not care why a program does not work with a newer version of a library, what contracts are being followed or broken should determine which party has the onus for fixing the problem. That said, there are times in evolving the JDK when differences are found between the specified behavior and the actual behavior (for example 4707389, 6365176). The two basic approaches to fixing these bugs are to change the implementation to match the specified behavior or to change the specification (in a platform release) to match the implementation's (perhaps long-standing) behavior; often the latter option is chosen since it has a lower de facto impact on behavioral compatibility.
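The poorly behaved reflective client mentioned above might look like the following sketch. Everything here is hypothetical: Library stands in for a library class, and checkExactApi is the kind of brittle check that breaks as soon as a public method is added, even though adding one is within the library's contract.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

final class IntrospectionDemo {
    // Hypothetical library class; imagine a later version adding another public method.
    static final class Library {
        public int size() {
            return 0;
        }

        public void clear() {
        }
    }

    // Collects the names of the public methods declared directly on the class.
    static Set<String> declaredPublicMethods(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    // Brittle client check: fails if the class has any method the client did not expect.
    static void checkExactApi(Class<?> type, Set<String> expected) {
        Set<String> actual = declaredPublicMethods(type);
        if (!actual.equals(expected)) {
            throw new IllegalStateException("Unexpected API: " + actual);
        }
    }

    public static void main(String[] args) {
        Set<String> expected = new TreeSet<>();
        expected.add("size"); // the client only knows about size()
        try {
            checkExactApi(Library.class, expected);
        } catch (IllegalStateException e) {
            System.out.println("Client broke on a compatible addition: " + e.getMessage());
        }
    }
}
```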
Case Study
Consider two versions of a simple enum representing the crew of the USS Enterprise, one for the first season:
and another for the second season:
Compared to the first season, the second season:
These changes have varying source, binary, and behavioral compatibility effects:
JDK Platform and Update Release Compatibility Policies
The compatibility policies we apply to platform releases, like JDK 7, differ from those applied to maintenance and update releases, like JDK 6 updates. For both kinds of releases, binary compatibility must be maintained for JCP-managed APIs. Update releases must maintain source compatibility, but platform releases are able to break source compatibility given sufficient justification. In update releases, behavioral compatibility is regarded as very important; programs may be relying on specified-to-be-unspecified behavior of a particular implementation and switching to another update in the same release family should be seamless whenever possible. In contrast, platform releases have fewer restrictions on changing such behavior. So, for example, modifying the order of iteration of elements in a HashMap to allow faster hashing algorithms would be quite appropriate for a platform release ("This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time."), but would be much less suited to an update release.
Managing Compatibility
The above statement from the original JLS could be regarded as vacuously true about any platform: except for the non-determinisms, a program is deterministic. The difference was that in Java, with programmer discipline, the set of deterministic programs was nontrivial and the set of predictable programs was quite large. In other words, the platform provider and the programmer both have responsibilities in making programs portable in practice; the platform should abide by the specification and conversely programs should tolerate any valid implementation of the specification. To make continued evolution of the platform more tractable, it may be helpful to introduce more structured ways of tracking behavioral changes so that programs could in principle be audited for depending on aspects of the platform in ways that are not recommended. For example, potentially annotations could be used to:
Annotation processing is a general purpose meta-programming framework, standardized as part of the platform as of JDK 6. Annotation processors, probably also using the tree API, could be written to check for usage of changed or problematic APIs in source code. The D compiler in DTrace can enforce analogous limits on the stability levels and dependency classes of D scripts. While there would be considerable cost and complication to designing such a scheme and retrofitting it onto at least a subset of the JDK, the ability to define and then programmatically test policies for behavioral compatibility issues could enable platform providers and programmers to have a smoother joint stewardship of keeping applications running and Java usage growing.
Conclusion
Compatibility is a multifaceted concept, with nuances within each broad category. In the future, annotation processors or other program analyzers might help manage source, binary, and behavioral analysis by direct analysis or program markup.
Acknowledgments
Éamonn McManus gave useful feedback on a draft of this entry.
Notes
Further Reading