Tuesday Apr 04, 2006

Developing Modules for Development

I can’t control my masochistic tendencies, so I’m about to start yet another JSR. This time, the topic is modularity. Specifically, some notion of modules for development of Java programming language source code.

The upcoming JSR should not be confused with JSR 277, which deals with deployment modules - you know, JAR files and packaging, run time name space issues in support of side-by-side deployment and the class loading issues. They have the hard problem, and I’ve been quite involved in their discussions and in the design of a reflective module API that looks like it will end up being a key component of their solution.

No, the new language modules JSR deals with relatively simple issues.

The most important of these is information hiding. Currently, the only module-like mechanism available in the Java programming language is the package. As a namespace mechanism, packages have been very successful: the inverted internet domain convention has worked very well, and has been an important contributing factor in the success of the Java platform.

Alas, as a modularity mechanism, packages leave a lot to be desired. Here is the basic problem: suppose you are building a software application. Your system consists of several parts/subsystems.

These subsystems are typically more tightly coupled to each other than to the outside world. So you’d like each to expose an internal API available to the other subsystems but not to external clients. To do this, you find that you have to place all the subsystems in one big package. The internal API is then package-private. This approach has problems. Your system is all in one giant, unwieldy package. You cannot protect elements of a subsystem from other subsystems.

The obvious alternative is to place each subsystem in its own package. Now the package structure nicely mirrors the design of your application. Each subsystem can protect its internals from the rest of the system. However, you cannot provide that privileged internal API you wanted. You can make things public so that other subsystems can get at them - but now you’ve gone and exposed them to the whole universe.

Here is a strawman solution: we introduce superpackages (The name was NOT my idea). A superpackage lists a series of member packages. Public members of member packages are accessible in all other member packages - but not to the wider world. Only members explicitly exported by the superpackage are available outside the superpackage. This might look something like this:

super package com.sun.myModule {
export com.sun.myModule.myStuff.*;
export com.sun.myModule.yourStuff.Interface;
com.sun.myModule.myStuff;
com.sun.myModule.yourStuff;
com.sun.SomeOtherModule.theirStuff;
org.someOpenSource.someCoolStuff;
}


Don’t fuss over the syntax. It is bound to change.
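
To make the intended semantics concrete, here is a toy model of the access rule written in ordinary Java. It is only a sketch under assumptions: the class SuperpackageModel, its methods, the client package org.example.client, and the type name Helper are all made up for illustration, and the real rules (including how the VM would enforce them) are for the expert group to decide.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model of the strawman access rule. All names here are hypothetical.
class SuperpackageModel {
    private final Set<String> memberPackages;
    private final Set<String> exports; // type names, or "some.package.*"

    SuperpackageModel(String[] memberPackages, String[] exports) {
        this.memberPackages = new HashSet<String>(Arrays.asList(memberPackages));
        this.exports = new HashSet<String>(Arrays.asList(exports));
    }

    // "a.b.C" -> "a.b"
    static String packageOf(String typeName) {
        int lastDot = typeName.lastIndexOf('.');
        return lastDot < 0 ? "" : typeName.substring(0, lastDot);
    }

    // May code in clientPackage use the public type publicType?
    boolean canAccess(String clientPackage, String publicType) {
        String owner = packageOf(publicType);
        if (!memberPackages.contains(owner)) {
            return true; // not governed by this superpackage: ordinary public access
        }
        if (memberPackages.contains(clientPackage)) {
            return true; // fellow member package
        }
        // Outside world: only explicitly exported types are visible.
        return exports.contains(publicType) || exports.contains(owner + ".*");
    }

    // The superpackage from the strawman declaration above.
    static SuperpackageModel example() {
        return new SuperpackageModel(
            new String[] {
                "com.sun.myModule.myStuff",
                "com.sun.myModule.yourStuff",
                "com.sun.SomeOtherModule.theirStuff",
                "org.someOpenSource.someCoolStuff"
            },
            new String[] {
                "com.sun.myModule.myStuff.*",
                "com.sun.myModule.yourStuff.Interface"
            });
    }

    public static void main(String[] args) {
        SuperpackageModel m = example();
        String helper = "com.sun.myModule.yourStuff.Helper"; // hypothetical, not exported
        System.out.println(m.canAccess("com.sun.myModule.myStuff", helper)); // true: fellow member
        System.out.println(m.canAccess("org.example.client", helper));       // false: not exported
        System.out.println(m.canAccess("org.example.client",
                "com.sun.myModule.yourStuff.Interface"));                     // true: exported
    }
}
```

The point of the model is the middle case: a public type that is not exported stays reachable from sibling member packages but is invisible to everyone else, which is exactly what neither of today's two package layouts can express.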

Tangent: Not that syntax doesn’t matter - but semantics are a much deeper issue. That is in fact why most people focus on syntax. It doesn’t require much expertise or understanding, so everyone can have an opinion. This phenomenon is closely related to Parkinson’s law of Triviality:


The time spent on any item of the agenda will be in inverse proportion to the sum involved.




So, repeat after me: fussing over syntax is for the language-design challenged, and for marketing to the illiterati.


Another, less critical issue we’d like to deal with is separate compilation. Today, if you import code from another package, you need to have that code available - either as source or as binary. In principle, all you really need are the declarations of the public members you are using. You can get around this fairly easily by constructing dummy declarations - but it is an ugly nuisance.

Here is an example:

package interface fully.qualified.packageName;
// implicitly public types and members
class C implements fully.qualified.interface {
String someMethod();
C(int i);
protected Object aFieldName;
}
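
For comparison, the dummy-declaration workaround mentioned above might look roughly like this in today's language. This is a sketch under assumptions: the wrapper class DummyDeclarationDemo and the helpers messageLength and isDummy are invented for illustration, and C is nested only to keep the example in a single file. In practice the dummy would be a top-level class in the imported package.

```java
// Hypothetical stand-in for a class whose real source or binary is not at hand.
class DummyDeclarationDemo {
    // Dummy declaration: same public shape as the real C, no real behavior.
    public static class C {
        protected Object aFieldName;

        public C(int i) {
            // nothing to do; this constructor exists only so callers compile
        }

        public String someMethod() {
            throw new UnsupportedOperationException("dummy declaration of C");
        }
    }

    // Client code that compiles against the dummy and is meant to run against the real C.
    static int messageLength(C c) {
        return c.someMethod().length();
    }

    // True if c is the dummy rather than a real implementation.
    static boolean isDummy(C c) {
        try {
            c.someMethod();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isDummy(new C(42))); // true
    }
}
```

The nuisance is visible here: the dummy needs method bodies that exist only to satisfy the compiler, and nothing stops it from being deployed by mistake in place of the real class.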


Remember, this is a strawman; there should be a JSR dealing with this soon, and the expert group will resolve all of these issues. For example, it will have to decide whether superpackages nest, and what the exact VM access rules are, what the source syntax and binary formats are etc. And very importantly, all of this stuff will have to play well with JSR 277, which is why the two EGs are certain to have several members in common.

Comments:

Gilad, wouldn't "superpackage" overcomplicate development and make deployment and packaging more difficult? Existing solutions (OSGi for instance) deal with this by breaking code into modules that give the isolation you are looking for. The missing part is probably tooling support (well, again only outside Eclipse). Though separate compilation is an interesting problem. I wonder what impact it has at the bytecode level. By the way, Eclipse has an interesting feature for applying refactoring scripts right to compiled code (e.g. jars built from a different code branch). It would be interesting to find out if there is something in common with this issue.

Posted by eu on April 04, 2006 at 03:13 PM PDT #

No.

Superpackages have the nice property that existing code doesn't change at all. So if you are happy now, you can keep doing what you're doing.

Note also that the interface between the language features and JSR 277 is much narrower than people think at first. If anything, this stuff facilitates JSR277, which facilitates deployment.

OSGi is irrelevant to the concerns here. It attempts to address deployment, while this is a development solution.

Posted by Gilad Bracha on April 04, 2006 at 05:20 PM PDT #

Is there any work on modules in Java for the purpose of extending an entire module (a la Scala's traits)? For that matter, is there any work on actual mixins (perhaps like "Classes and Mixins")?

Posted by Sean on April 04, 2006 at 10:40 PM PDT #

It was about time someone did this "superpackage" thing. I just hate to make things public when they really shouldn't be. It would be interesting to see if Sun will be able to hide its "com.sun.*" classes that so many are using :D

Posted by Ionutz on April 04, 2006 at 11:52 PM PDT #

Sean,
I haven't had a chance yet to check out Scala but the language ObjectTeams/Java might be interesting to you. It defines a new modularization concept: the team. A team is like a package in that it consists of several classes. A team is also a class, meaning that it can have its own methods and fields and can be inherited from. This way you can extend modules.
As to mixins, you also might want to have a look at the language CaesarJ, which supports mixin inheritance. In CaesarJ you can write something like "C1 extends C2 & C3", which defines the class C1 to be a mixin composition of C2 and C3. Other than that, CaesarJ allows for the definition of AspectJ pointcuts.

Posted by Paul Häder on April 05, 2006 at 02:35 AM PDT #

You'll probably want to look at the ModJava work out of IBM. They go farther than mere "Modules as information hiding" and use modules to solve versioned dependency specification issues ("jar hell"). There's also a handy-looking bit about being able to "seal" classes, so that they can be used outside a module, but can't be subclassed. Some pretty sweet stuff, although I can no longer find the paper online.

Posted by Dave on April 05, 2006 at 05:10 AM PDT #

An alternative strawman solution - how about doing the opposite? Rather than restricting what's visible via a "superpackage", why not encourage clients of your interface to use "published" methods?

Something like:

@Published public interface MyInterface {
    @Published public String publishedMethod();
    public String methodUsedInternally();
}

Then, my IDE can list the published methods first when it's autocompleting, I can tell my IDE/build tool/classloader to warn/stop me if I use unpublished methods from certain packages, I can build JavaDoc for just the published methods - but I can still ignore the warnings and use unpublished methods if I want to.
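
A rough sketch of how a tool could pick out the published methods, assuming the annotation is retained at run time. Everything here is hypothetical: @Published is not an existing annotation, and PublishedDemo and publishedMethods are names invented for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class PublishedDemo {
    // Hypothetical marker annotation, kept at run time so tools can read it reflectively.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD})
    public @interface Published {}

    @Published
    public interface MyInterface {
        @Published String publishedMethod();
        String methodUsedInternally();
    }

    // Names of the methods declared on type that carry @Published, sorted alphabetically.
    static List<String> publishedMethods(Class<?> type) {
        List<String> names = new ArrayList<String>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isAnnotationPresent(Published.class)) {
                names.add(method.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(publishedMethods(MyInterface.class)); // [publishedMethod]
    }
}
```

Note that this only reports; nothing prevents a client from calling methodUsedInternally, which is the difference between a tool convention and a language-level access rule.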

Posted by Inigo on April 05, 2006 at 05:30 AM PDT #

It's probably both (1) too complex and (2) too open-ended and rule-based to be part of the language, but you might take a look at Macker for ideas:

http://innig.net/macker/

It lets you define pattern-based rules that enforce exactly the kinds of large-scale modularity you're talking about. An interesting effect of its pattern-based approach is that modularity rules can cut across packages in interesting ways, and can limit both ends of the dependency -- i.e. "only database classes can use this database API," instead of a superpackage's more limited "anybody anywhere can access this database API."
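
To illustrate the general idea of a rule that constrains both ends of a dependency, here is a minimal sketch in plain Java. It does not use Macker's rule syntax or API; the class DependencyRule, its methods, and the com.example package names are assumptions made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical two-sided rule: only clients matching one pattern may use targets matching another.
class DependencyRule {
    private final Pattern allowedClients;
    private final Pattern guardedTargets;

    DependencyRule(String allowedClientsRegex, String guardedTargetsRegex) {
        this.allowedClients = Pattern.compile(allowedClientsRegex);
        this.guardedTargets = Pattern.compile(guardedTargetsRegex);
    }

    // Is a dependency from class `client` on class `target` permitted by this rule?
    boolean permits(String client, String target) {
        if (!guardedTargets.matcher(target).matches()) {
            return true; // the rule says nothing about this target
        }
        return allowedClients.matcher(client).matches();
    }

    // "Only database classes can use this database API."
    static DependencyRule databaseOnly() {
        return new DependencyRule("com\\.example\\.db\\..*", "com\\.example\\.db\\.api\\..*");
    }

    public static void main(String[] args) {
        DependencyRule rule = databaseOnly();
        System.out.println(rule.permits("com.example.db.impl.Pool", "com.example.db.api.Connection")); // true
        System.out.println(rule.permits("com.example.ui.Window", "com.example.db.api.Connection"));    // false
    }
}
```

A real checker would extract the client/target pairs from compiled class files; this sketch covers only the decision for one pair.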

That said, the more hierarchical approach you're outlining makes a lot more sense as a language feature.

Aside 1: I think the annotation approach Inigo suggests is promising.

Aside 2: Should superpackages nest?

Posted by Paul Cantrell on April 05, 2006 at 12:45 PM PDT #

Gilad, I was referring to the "export" feature of OSGi bundles, which can be used by the compiler to decide whether an imported API is hidden or not allowed to be used. I'd say that it won't require code changes either, but may require repackaging or breaking a monolithic source tree into separate modules, which would be a good thing in most cases, though it would require some extra work.

Posted by eu on April 05, 2006 at 02:11 PM PDT #

Gilad, from your description it sounds like you want the concept of inner classes for packages, and a way to introduce autogenerated headers into the language. Both ideas have a nice sort of "workable hack" feel to them, though I am not sure if they don't just add more cruft on top of an existing pile of hacks. Macker's rule/pattern based approach seems to be a more general way to deal with the problem, if it were supported in compilers/jars/classes/classloaders. It may also mean less explicit verbosity, which I think is crucial to get something like that actually used in practice, outside of a JSR setting. I doubt that your regular Java developer is going to spend time annotating their private classes in real life. It's not like they care about not using private sun.* classes, after all, so a lot of real life, 'enterprisey' Java code is just horrible wrt modularity.

Posted by Dalibor Topic on April 06, 2006 at 06:49 AM PDT #

By the way, the goal of super packages can be achieved using class level annotations (sort of tackling it from the other end). An interesting aspect is that same class can belong to multiple super packages. See related comments at http://tinyurl.com/q6w57

Posted by eu on April 06, 2006 at 01:08 PM PDT #

If you don't want a package to be visible to the entire world but only to a specific scope, what about adding a new 'protected' class modifier, where the class isn't just visible in the single package containing it (as with the default modifier), but also in the sub-packages and in the 'root' package of the hierarchy? A simple example:

com.pany.api
com.pany.api.internal

In this case, we could think of the package 'internal' as containing only 'protected' classes, i.e. those we use throughout the API but do not expose to the entire world.

The 'root' package could be 'com' or maybe the first package in the hierarchy containing a class. In our example 'com' and 'pany' probably wouldn't contain anything, so the root package could be 'api'. This way we could use the 'internal' package from 'api' down through the inside of the hierarchy, but not outside of it.
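
The visibility test this proposal implies can be sketched as a simple prefix check on package names. This is only an illustration: ProtectedPackageRule and inHierarchy are hypothetical names, and the sketch assumes the root package has already been determined.

```java
// Hypothetical check for the proposed 'protected' class modifier.
class ProtectedPackageRule {
    // True if clientPackage is rootPackage itself or one of its sub-packages.
    static boolean inHierarchy(String rootPackage, String clientPackage) {
        return clientPackage.equals(rootPackage)
                || clientPackage.startsWith(rootPackage + ".");
    }

    public static void main(String[] args) {
        String root = "com.pany.api";
        System.out.println(inHierarchy(root, "com.pany.api"));          // true: the root itself
        System.out.println(inHierarchy(root, "com.pany.api.internal")); // true: a sub-package
        System.out.println(inHierarchy(root, "com.pany.apix"));         // false: merely shares a prefix
    }
}
```

The trailing dot in the prefix comparison matters: without it, an unrelated package such as com.pany.apix would be treated as part of the hierarchy.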

I have a project with some classes I would like to put into a separate package, since they are used throughout the project (so package visibility doesn't work), but I wouldn't like to expose them to the user. Currently the only thing I can do is suggest that the user not use that package, through a naming convention such as calling the package 'support' or 'private'.

I hope something good for us comes of your idea :)

Posted by Renato on April 06, 2006 at 02:29 PM PDT #

Gilad, on a side note, any chance of conducting this JSR, as well as the others you are the spec lead for, in an open fashion like Doug Lea did for the memory model JSR? I was trying to find out why the public spec draft does not describe the reality of how finals work on 1.5 (you can set them outside of the static initializer without linkage errors), and of course not being able to look at the JSR 924 mailing list archives makes the whole exercise a bit futile. So, how about making the "age of participation" adage from Schwartz true for the JSRs you're spec-leading? cheers, dalibor topic

Posted by Dalibor Topic on April 06, 2006 at 06:37 PM PDT #

I am glad that I am not the only one who thinks that internal APIs are useful.
The superpackages, though, I don't really like. They weaken encapsulation outside the realm where the code has been written originally. I'd favour the introduction of a "friend" keyword that can apply to packages, classes and members. C++ was not all bad.
Another thing I observed with my internal APIs: The types used internally are usually extensions of the exported types. This creates the need for frequent downcasts when implementing a public method, because you cannot declare arguments to be of the internal type. I have no clever idea how to circumvent this besides an annotation
@implements("do(PublicThing)")
void do(InternalThing arg1){...
or an assertion like
do(PublicThing arg1 instanceof InternalThing) {...
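
The downcast pattern being described looks roughly like this in current Java. It is a sketch with invented names: DowncastDemo, ThingImpl, handle and create are hypothetical (handle stands in for "do", which is a reserved word).

```java
class DowncastDemo {
    // Exported type: the only one a public signature may mention.
    public interface PublicThing {
        String name();
    }

    // Internal extension, package-private on purpose.
    interface InternalThing extends PublicThing {
        int internalId();
    }

    private static final class ThingImpl implements InternalThing {
        private final String name;
        private final int id;

        ThingImpl(String name, int id) {
            this.name = name;
            this.id = id;
        }

        public String name() { return name; }
        public int internalId() { return id; }
    }

    // Factory handing out the internal implementation behind the public type.
    public static PublicThing create(String name, int id) {
        return new ThingImpl(name, id);
    }

    // Public method: must accept PublicThing, then downcast to reach the internals.
    public static String handle(PublicThing arg) {
        if (!(arg instanceof InternalThing)) {
            throw new IllegalArgumentException("not an instance created by this library");
        }
        InternalThing thing = (InternalThing) arg;
        return thing.name() + "#" + thing.internalId();
    }

    public static void main(String[] args) {
        System.out.println(handle(create("widget", 7))); // widget#7
    }
}
```

The instanceof check and cast have to be repeated in every public method that needs the internal view, which is the boilerplate the two suggested notations try to remove.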

Posted by Carsten Saager on April 07, 2006 at 03:34 AM PDT #

I want to better understand the problem you wish to solve.

Your example is of the variety "here's the problem: you'd like to do X, but you can't". I don't find this compelling: in particular, you don't say why I'd like to do X, other than I'm anal.

So let's try to delve further into the existing pain that your proposal would alleviate.

In your example, if you place each subsystem in its own package, obviously you would only document and officially support specific packages of your choice. Users could still access unsupported APIs, though. Painful consequences:

  • They could get at functionality you'd prefer they didn't use randomly, e.g. sending requests to your secure server.
  • They can screw up your package's internal state. (This is the same reason we have private data in classes.)
  • Once that has happened, they might blame you, claiming your software doesn't work properly. You might spend a lot of effort on this before realizing that they're doing it to themselves, using an undocumented API.
  • Users might get mad when you change the undocumented API, because they have software that uses it.

Which of these problems are you trying to address with this feature? Are there other important problems you think this will solve?

Generally, superpackages would fix all of the above by converting the judgement error involved in using an unsupported API into a compilation error. Is that a fair assessment? Does the feature do anything else?

Posted by Jason Orendorff on April 07, 2006 at 05:27 AM PDT #

A few points worth responding to: 1. No, no plans for mixins. 2. Many of the suggestions are the kind of horrible abuses of the annotation facility that I warned against in an earlier entry. Annotations are NOT used to add language features! 3. Superpackages are not at all like inner classes, and block structure is not a hack.

Posted by Gilad Bracha on April 07, 2006 at 03:03 PM PDT #

1) "Public members of member packages are accessible in all other member packages - but not to the wider world." How do you enforce this? Anyone who has the classes for the superpackage + its members can just delete the superpackage declaration and access the member packages' public classes. And I'd worry that member packages don't appreciate the context they're being constrained in. 2) Surprised not to see a plug for Fortress here, particularly when it recognises that a dev-time component system is lacking if it "exposes to everyone all the apis used in the development of a project." Consequently: "We can mitigate...by providing two simple operations, hide and constrain. Informally, hide makes apis no longer visible from outside the component and constrain merely prevents them from being exported. An api that is constrained but not hidden can still be upgraded."

Posted by Alex on April 08, 2006 at 04:33 AM PDT #

You can already do super packages by using static inner classes, a conventional package, and static imports. E.g.:
package superpackage;

/** Collection of a public classes (imported with one import statement) */
public class SuperPackage0 {
    public static class Public0 {}
    public static class Public0a {}
}


package superpackage;

/**
 * Collection of a package class and an internal class
 * (imported with one import statement)
 */
public class SuperPackage1 {
    static class Package1 {}
    private static class Internal1 {}
}

package superpackage;

// import other super packages - one import per collection
import static superpackage.SuperPackage0.*;
import static superpackage.SuperPackage1.* ;

/** Demonstrate package access */
public class SuperPackage2 {
    static class Package2 {}
    private static class Internal2 {}
    private static class Test {
        Public0 publicAccessAnotherSuperPackage = new Public0();
        Public0a publicAccessAnotherSuperPackageA = new Public0a();
        Package1 packageAccessAnotherSuperPackage = new Package1();
        // Internal1 internalAccessAnotherSuperPackage = new Internal1(); - No Access!!!!
        Package2 packageAccessOwnSuperPackage = new Package2();
        Internal2 internalAccessOwnSuperPackage = new Internal2();
    }
}
Probably the only language extension that would be desirable (though not vital) using this solution is to be able to declare top level classes as static packages and have all their members automatically static, e.g.
public static package SuperPackage1 {
    class Package1 {}
    private class Internal1 {}
}
This technique is only one extra level deep; it gives internal access, this is probably sufficient.

Posted by Howard Lovatt on April 09, 2006 at 05:08 PM PDT #

Could you indicate where this superpackage declaration will be stored? There seem to be only a few places, and each has drastic consequences for its practical use. If the superpackage is declared in the source files (like we do now for packages) we will add a lot of redundancy because each class must agree on the same definition. Not only does this seem very error prone, opening the door to unnecessary errors, it also makes it impossible to use the same classes in different superpackages. Creating a new package artefact has the same problems, albeit on a smaller scale. The only other place I can think of is in a deployment/packaging unit. However, then I fail to see the difference from the OSGi solution, which seems to add a lot of mature additional functions that were developed in response to real-life use cases. For example, class filtering to hide implementation classes that had to be part of a package, versioning so you can actually safely share classes for efficiency and logical necessity, custom attributes so developers can implement custom schemes like friends, reference integrity so you are ensured of using the right set of packages, even if you share, and so much more. BTW, OSGi is a packaging solution, not just a deployment solution. See the Eclipse compiler, which honors the declarations. It would be a lot easier to add the OSGi declarations to Dolphin instead of developing something from scratch. Can you elaborate on the key difference from the OSGi-based solution? I am really puzzled about what kind of functionality you add over this solution, which has been around for some time. Kind regards, Peter Kriens

Posted by Peter Kriens on April 10, 2006 at 01:14 AM PDT #

Gilad replied by email (by email because this site was down) to my post above about using static class members as a super package. His points were:

1. Using static classes in this way is a bit ugly (especially at the binary level) and a bit beyond many developers.

2. It doesn't provide VM enforcement of all the access restrictions that appear at the language level.

3. It doesn't nest.

All of which are of course correct, but I don't think they are show stoppers (but see below). The static packages have other advantages; in particular, anything static can be included. This is very useful for static factories for generics, e.g.:

package superpackage;

import static java.lang.System.*; // using an existing super package!

public class Tuples {
    protected Tuples() {} // Nearest to: no instances, but can still extend
    public static class T1< E1 > {
        public final E1 e1;
        public T1( final E1 e1 ) { this.e1 = e1; }
        public String elementsAsString() { return "e1=" + e1; }
        @Override public String toString() {
            return "[" + elementsAsString() + "]";
        }
    }
    public static < E1 > T1< E1 > t( final E1 e1 ) {
        return new T1< E1 >( e1 );
    }
    public static class T2< E1, E2 > extends T1< E1 > {
        public final E2 e2;
        public T2( final E1 e1, final E2 e2 ) {
            super( e1 );
            this.e2 = e2;
        }
        @Override public String elementsAsString() {
            return super.elementsAsString() + ", e2=" + e2;
        }
    }
    public static < E1, E2 > T2< E1, E2 > t( final E1 e1, final E2 e2 ) {
        return new T2< E1, E2 >( e1, e2 );
    }
    // Etc.
    public static void main( final String[] notUsed ) {
        out.println( t( 'A', 'B' ) ); // Very nice syntax!
    }
}

Which prints [e1=A, e2=B].

In light of Gilad's comments, maybe a slight language extension would be useful:

1. Allow "package" to be used like "class" (suggested above), e.g.:

public package {
  // Only static members
  // No constructors
  // Can be extended - adds concept of extending a package
  // Illegal to make an instance of it
  // But otherwise behaves like a class with static members
}

2. Allow "import" to appear inside a block so that it becomes block structured. Allow it to be placed anywhere in the block, not just at the start; imported members are available throughout the block, just like other members.

3. Add "export", which is syntactically like the block "import" described above; if a package or class is imported, then the exported classes etc. are also imported.

4. Allow "import export [static] package_name_pattern;" to mean that the members are both imported and exported.

Posted by Howard Lovatt on April 10, 2006 at 07:21 PM PDT #
