Let me jump right into it. I propose an alternative to type inference for local variables. I'll explain why later. Everybody knows this example:
Map<String,List<Integer>> map = new HashMap<String,List<Integer>>();
I propose this solution:
Map<String,List<Integer>> map = HashMap.new();
Before I get too caught up in the details, let me examine why we are talking about this in the first place. As far as I'm aware, there are primarily two reasons for looking at examples such as the one above:
I'm currently aware of two serious proposals, the first from James:
map := new HashMap<String,List<Integer>>();
The second is from Christian:
final map = new HashMap<String,List<Integer>>();
Rémi has implemented both of these proposals. But let us not forget what is going on in other languages:
var map = new HashMap<String,List<Integer>>();
This is well known from JavaScript and has recently been added to C#. However, adding it to Java would require adding a new keyword (which is bad) or having some really strange rules. So var is not really a contender for Java.
All of these proposals (mine excluded) essentially solve the problem using the same technique: type inference. The problem is that this promotes bad behavior:
This motivated Neal to suggest:
Map<String,List<Integer>> map = new HashMap<>();
Neal and I just had a chat about this. I think we both agree that it is ugly but the alternative could have compatibility problems (as well as other problems):
Map<String,List<Integer>> map = new HashMap();
So you can probably see how I got the idea:
Map<String,List<Integer>> map = HashMap.new();
However, there is more to it than just a simple syntax for constructing instances. My initial reaction to this problem was: "we need more static factories throughout the JDK". For example:
Map<String,List<Integer>> map = HashMap.create();
Where the definition of create is:
public static <K,V> HashMap<K,V> create() { return new HashMap<K,V>(); }
What I propose is actually a new syntax for declaring static factory methods and that the compiler adds them by default (as it does with constructors). For example, if you declare a class:
public class Box<T> { public Box() {} }
The compiler would automatically add this method:
public static <T> Box<T> new() { return new Box<T>(); }
The programmer can specify any number of new methods, just as it is possible to have multiple constructors. If the programmer provides a new method with a signature that matches that of a public constructor, that will be used instead of the compiler providing one.
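In current Java the synthesized method has to be written by hand and given an ordinary name, because new is a reserved word. A minimal sketch of roughly what the compiler would add for Box, assuming the stand-in name create and an extra value field (neither is part of the proposal):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: `create` stands in for the proposed `new` method, which is
// not legal Java today. The `value` field is added purely for illustration.
class Box<T> {
    private T value;

    public Box() {}

    // Hand-written equivalent of the factory the compiler would add.
    // T is inferred from the assignment context at the call site.
    public static <T> Box<T> create() {
        return new Box<T>();
    }

    public T get() { return value; }

    public void set(T value) { this.value = value; }
}

class BoxDemo {
    public static void main(String[] args) {
        // The type arguments appear once, on the left-hand side only.
        Box<List<Integer>> box = Box.create();
        box.set(new ArrayList<Integer>());
        box.get().add(42);
        System.out.println(box.get());
    }
}
```

A programmer-supplied version would simply replace the body of create, for example to return a cached instance.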
Thanks to Neal M Gafter for his input on this idea and this posting.
Integer i = new Integer(5)
or
String s = new String("hello");
? We're still repeating ourselves, and the repetition adds nothing to making the program more understandable. I would agree that generics make the repetition even more ugly (and pointless), but perhaps by including James' option as well we can implement DRY more thoroughly. So I'd be in favor of:
s := new String("hello");
and
Map<String,List<Integer>> map = HashMap.new();
Also, what about factories that (reasonably) take parameters, like the starting size of the HashMap?
Regards, Patrick
Posted by Patrick Wright on January 18, 2007 at 01:24 AM PST #
Posted by Stefan Schulz on January 18, 2007 at 03:10 AM PST #
I believe that a key part of the 'look and feel' of Java code is the type declaration being on the left. That's why I *really* dislike the suggestions from James and Christian (although I understand the motivation). Neal's suggestion comes originally from me (AFAIK) via the Javapolis whiteboards. I won't pretend it's beautiful, but it is effective.
As to your static factory idea, I think it might work OK. You won't be able to block <code>Integer i = new Integer(5)</code> if I understand it correctly (due to backwards compatibility). As a result, you won't be able to use it to fix old code where you later realise that you should have had a factory, for caching, not a public constructor.
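To make the caching point concrete, here is a minimal sketch in current Java. The Temperature class and its of method are hypothetical names chosen for illustration; the point is only that a factory added after the fact cannot intercept call sites that already use the public constructor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical value class, for illustration only.
final class Temperature {
    private static final Map<Integer, Temperature> CACHE = new HashMap<>();
    private final int degrees;

    // Shipped as public in the first release, so it can never be withdrawn.
    public Temperature(int degrees) { this.degrees = degrees; }

    // Added later: a caching factory that old call sites do not go through.
    public static synchronized Temperature of(int degrees) {
        return CACHE.computeIfAbsent(degrees, d -> new Temperature(d));
    }

    public int degrees() { return degrees; }
}

class TemperatureDemo {
    public static void main(String[] args) {
        // The factory hands out the cached instance every time...
        System.out.println(Temperature.of(5) == Temperature.of(5));
        // ...but existing code that calls the constructor still bypasses the cache.
        System.out.println(new Temperature(5) == Temperature.of(5));
    }
}
```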
And finally, one more possibility to throw in the mix:
SomeVeryLongClassName someEvenLongerVariablename = new(args);
which I think is within the Java 'look and feel'.
Posted by Stephen Colebourne on January 18, 2007 at 03:34 AM PST #
HashMap<String,String> hmap = new(10, 0.75);
Map<String,String> map = new HashMap(10, 0.75);
Another point, about the concerns over existing #new(...) methods: how could this be an issue? 'new' is a keyword, and thus cannot appear as an identifier anywhere in the code. Thus 'new' methods would be purely synthetic, and no user code could ever refer to them without using the proposed syntax. The only problematic point is that you couldn't let a user write custom ones, and would have to rely purely on synthetic ones. If so, then why even need a generated method? The compiler could simply macro-expand the call into a constructor call for free. Also, when instantiating the lhs type, no explicit type would be required.
HashMap<String,String> hmap = new(10, 0.75);
Map<String,String> map = HashMap.new(10, 0.75);
Posted by Philippe Mulet on January 18, 2007 at 04:48 AM PST #
public class Test {
    class A {
    }
    public static void main(String[] args) {
        Test Test=new Test();
        Test.new A();
    }
}
And I don't understand why javac needs to automatically generate static factories. The compiler could try to call a static factory first and, if it doesn't find an applicable one, try to call a constructor.
Rémi
Posted by Rémi Forax on January 18, 2007 at 04:58 AM PST #
MyClass.new({somearguments});
could be compiled similar to
new MyClass<{inferredtypes}>({somearguments});
Posted by Stefan Schulz on January 18, 2007 at 05:29 AM PST #
@ Philippe
I'm not sure I get the reference to "array initializer simplification for variable declarations".
@ Philippe and Rémi
The syntax could be used without the static factory methods. However, then it would be a different proposal ;-)
I think it is more powerful, yet simple, to add factory methods that can be provided by the user.
Posted by Peter von der Ahé on January 18, 2007 at 06:25 AM PST #
Posted by Peter von der Ahé on January 18, 2007 at 06:33 AM PST #
Posted by Peter von der Ahé on January 18, 2007 at 06:41 AM PST #
As you've shown, we already have type inference on method calls, just not on constructor calls (why?). It's possible to write a generic method 'hashMap()' that returns the right kind of Map, used like so:
Map<String,Integer> aMap=hashMap();
Rather than a language change, this could be a simple API change to add hashMap(), hashSet(), etc., to the APIs. If ArrayList et al had been designed from the start to be instantiated via a static method, then I doubt we'd even be discussing this.
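A minimal sketch of that API-only approach, assuming a hypothetical utility class called NewCollections (no such class exists in the JDK). It relies only on the method type inference Java already has:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical utility class; the name and methods are assumptions.
final class NewCollections {
    private NewCollections() {}

    // K and V are inferred from the type of the variable being assigned.
    static <K, V> Map<K, V> hashMap() { return new HashMap<K, V>(); }

    static <E> Set<E> hashSet() { return new HashSet<E>(); }
}

class NewCollectionsDemo {
    public static void main(String[] args) {
        Map<String, Integer> aMap = NewCollections.hashMap();
        Set<String> aSet = NewCollections.hashSet();
        aMap.put("one", 1);
        aSet.add("one");
        // Without an assignment target, the type arguments can be given explicitly.
        int size = NewCollections.<String, Integer>hashMap().size();
        System.out.println(aMap + " " + aSet + " " + size);
    }
}
```

With a static import of the two methods, the call sites shrink to hashMap() and hashSet() as written above.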
Ok, discounting my uninteresting solution, onto your idea:
If the compiler had to generate code to do this, then the language feature would be incompatible with existing bytecode, unless it could generate it at call-sites.
A problem with allowing programmers to write methods called 'new' is that they might not actually generate a 'new' object, whereas as far as I know, in Java 'new' always creates a new object.
Is there a reason why HashMap.new() could not use the existing constructor and apply type inference?
Further, is there a reason why 'new HashMap()' could not have type inference applied, then there's no new syntax?
Currently, assigning a raw type to a parameterised type is a compile warning. I don't see how allowing inference on this will adversely affect any existing code; other than removing the warning. Is there a test case that shows this problem?
The older idea, using 'final' to mean 'infer local variable type', has an interesting repercussion. The type could actually be an anonymous type, e.g.:
final o=new Object(){int a=5,b=6;}; o.a=6; o.b=5;
If that looks odd, note that this already works.
System.out.println(new Object(){int a=5;}.a);
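For reference, Java 10 later added local variable type inference under the spelling var, and it behaves in the way described here: the inferred type is the anonymous class itself. A small sketch (the class and method names are made up for the example):

```java
class AnonymousTypeDemo {
    static int swapFields() {
        // The inferred type is the anonymous class, so a and b are accessible.
        var o = new Object() { int a = 5, b = 6; };
        o.a = 6;
        o.b = 5;
        // Encode both fields in one number so the caller can check them.
        return o.a * 10 + o.b;
    }

    public static void main(String[] args) {
        System.out.println(swapFields());
    }
}
```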
If I had to choose between your proposal and the 'final' proposal, I'd go with yours for readability and security (I treat variables defined in anonymous classes as private, regardless of their actual visibility).
Another idea is, when you have a concrete type, simply HashMap<whatever> map=new(); without having to statically import 'new' from somewhere. Getting closer to C++'s stack-allocation syntax there..
Posted by Ricky Clarkson on January 18, 2007 at 07:45 AM PST #
(1) DRY is good (2) boilerplate repeated type definitions obscure rather than enlighten so (3) this seems like a generally good idea.
However, aren't there problems with completeness and O() of runtime when you try to infer types from right to left? I can't remember whether SML/ML allowed it for example. And wasn't it one of the hurdles in front of covariant Java method return types?
I'm still deciding whether I like the auto boxing/unboxing in Java SE 5. I have just started using it to save considerable keystrokes, but I now have to start wondering again (very C++ like) whether something as apparently benign as "final long l = <simpleexpr>;" can cause an NPE. Before, I knew that it could not unless I could see a function call. Now it might, if simpleexpr involves Long rather than long by design or accident.
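A minimal sketch of that hazard, assuming a hypothetical field boxedCount that was declared as Long rather than long:

```java
class UnboxingDemo {
    // Hypothetical field that is a Long (not a long) by accident.
    static Long boxedCount = null;

    static boolean simpleExpressionThrows() {
        try {
            // No method call in sight, yet the implicit unboxing of
            // boxedCount dereferences null.
            final long l = boxedCount + 1;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(simpleExpressionThrows());
    }
}
```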
Are there any such slight-anxiety-raising features lurking in your suggestion(s)?
Rgds
Damon
Posted by Damon Hart-Davis on January 18, 2007 at 08:24 AM PST #
HashMap<String, List<Integer>> map = new({parameters});
HashMap<String, List<Integer>> map = new HashMap<String, List<Integer>>({parameters});
Posted by Stefan Schulz on January 18, 2007 at 08:26 AM PST #
Posted by Ricky Clarkson on January 18, 2007 at 08:53 AM PST #
I will apologize in advance for putting 4 things into one post, but I wanted to make 4 separate points! If you get bored reading such a long post :) stick to item 1 - it's the most important!
1. You could add new keywords without pain if you added a keyword, source, that accepted the Java version number. The source declaration would go before the package declaration. No code would be broken, because no statement could go before package previously. Then, if a package or import statement included something that was a keyword in the declared language version, e.g. source 7, that name could be mangled. So, for example, the proposal C3S proposes the keyword declare, as in declare map = HashMap.new( collection ), but if the name declare was included from elsewhere it could be mangled to _declare_, say. Note that this suggestion will work well now that other languages are on the JVM. These languages don't have the same keywords as Java and might well have a method called final, say, in one of their libraries.
2. Removing the type declaration in a local/field declaration from the left hand side of = has more use cases than removing it from the right. There are a lot of examples of a method returning a value that you assign to a local/field, e.g. final x = someMethod( ... ) as opposed to final SomeType x = someMethod( ... ).
3. None of the ideas you are presenting for Java 7 are that new; you will find very similar proposals from many people on the web, some going back years, e.g. from myself: RFE 6389769, SSCO, and C3S.
4. The latest of the proposals referenced above, C3S, includes an alternative treatment for new: defining a constructor instead of defining a static factory, but retaining the type inference that the static factory has. This addresses the issue raised by others in this forum of why generate a static factory - just call the constructor.
Just to reiterate: the most important point raised above, item 1, is a suggestion as to how new keywords can be added painlessly, in a way that is compatible with other languages.
Posted by Howard Lovatt on January 18, 2007 at 05:56 PM PST #
Although it solves the most common use case, I still would be disappointed about not getting type inference for local variables. Avoiding clutter in a method body seems well worth the minor inconvenience of using class types rather than instance types. In well-written code, most methods should be short, so this is a minor issue.
I also agree with the previous poster that we shouldn't accept keywords being hard to add as an immutable rule. Adding keywords makes the code for new language features more readable, so we should figure out how to do it.
Posted by Brian Slesinsky on January 18, 2007 at 08:22 PM PST #
Following on from Brian Slesinsky's post and his comment re. keywords: I started off trying to think of syntax without new keywords in RFE 6389769 and SSCO, but I kept getting feedback that keywords are the essence of Java. Hence, in version 3 of my proposal, C3S, I put in keywords, and this has received largely positive comments.
Posted by Howard Lovatt on January 18, 2007 at 09:01 PM PST #
Posted by Stephen Colebourne on January 19, 2007 at 02:48 AM PST #
Looking at this proposal, I can't help but think it's a special case of something more general: being able to define 'static interfaces', which is to say being able to require that a class implement certain static methods. Doing that would move Java towards Smalltalk and other languages where classes are extendable. Sketching it out would look something like this:
<code>
package java.lang;
static interface Creatable<C> {
static C create(Object...);
}
class Foo implements Bar, static Creatable<Foo> { //how would you enforce creatable<Foo> though?
public static Foo create(Object...) { //required by interface
return new Foo();
}
}
//now use it
for(Creatable<?> c : listOfFactories) {
instances.add(c.create()); //what's the type of 'c' here? some kind of generated subclass of java.lang.Class?
}
</code>
Of course, what's above isn't very coherent. Introducing extendable classes into Java might well be impossible, but what you're proposing does make me think there's a more general feature hiding in there somewhere.
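One way to approximate the loop in the sketch with features that Java 8 later provided is to pass factories around as java.util.function.Supplier values built from constructor references. This is only an approximation under that assumption: it answers the "what's the type of 'c'" question, but it does not force any class to supply a factory. Foo and CreatableDemo are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class Foo {}

class CreatableDemo {
    // Each Supplier plays the role of Creatable<C> in the sketch above.
    static List<Object> createAll(List<Supplier<?>> listOfFactories) {
        List<Object> instances = new ArrayList<>();
        for (Supplier<?> c : listOfFactories) {
            instances.add(c.get());
        }
        return instances;
    }

    public static void main(String[] args) {
        List<Supplier<?>> listOfFactories = new ArrayList<>();
        listOfFactories.add(Foo::new);           // constructor reference
        listOfFactories.add(StringBuilder::new); // another no-argument constructor
        System.out.println(createAll(listOfFactories).size());
    }
}
```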
Posted by Andrew Thompson on January 20, 2007 at 08:33 AM PST #
public interface Metaclass<I, C<I> extends Metaclass<I, C<I>>> {
<U> C<? extends U> asSubclass(C<U> clazz);
C<? super I> getSuperclass();
I cast(Object obj);
I newInstance();
...
}
public class Class<T> implements Metaclass<T, Class<T>> {
<U> Class<? extends U> asSubclass(Class<U> clazz) {...}
Class<? super T> getSuperclass();
T cast(Object obj) {...}
T newInstance() {...}
...
}
public interface Stringable<T> {
force static T valueOf(String string);
String toString();
}
public interface Clonable<T> {
force T clone();
}
public interface Factory<T> {
force this();
}
Posted by Stefan Schulz on January 20, 2007 at 12:20 PM PST #
import static java.util.HashMap.new;
For the majority of cases, we know which implementation will be required. Defaults are a smart move. If I say I want a new Map<K,V>, you know that 95% of the time it's a new java.util.HashMap<K, V> I want (even if some PuTTY author calls it a cowboy algorithm). new List<E> will be a new ArrayList<E>. new Font will be a new Font (ignoring PL&Fs). So why not just allow interfaces and classes to nominate a default implementation:
interface Map<K,V> default new HashMap<K,V> { ... }
...
Map<MyKey, MyValue> map = new();
I still haven't worked out what the problem is with inferring types in the currently legal syntax: Map<MyKey, MyValue> map = new HashMap();
Posted by Tom Hawtin on January 21, 2007 at 01:36 PM PST #
Posted by Ricky Clarkson on January 21, 2007 at 05:30 PM PST #
One of the examples you've given already compiles in Java 5, albeit with an "unchecked warning":
Map<String,List<Integer>> map = new HashMap();
Note that you don't need any type inference to make it work; you just need some "smart rule" which says that this assignment of a freshly constructed raw type to a concrete instantiation of the Map interface is actually _safe_, since the pointer to this raw-typed instance did not leak anywhere else and could not possibly have become polluted with something else. So the unchecked warning should not have been emitted for this code in the first place. I admit that this linear property of this particular constructor cannot be checked without additional changes to the language; however, we could recruit some new annotation to do the trick.
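The difference between the leaked and the fresh case can be sketched as follows. The class and method names are made up for the example, and a List is used instead of a Map only to keep it short:

```java
import java.util.ArrayList;
import java.util.List;

class RawTypeDemo {
    // Unsafe case: the raw reference is used before the assignment.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean leakedRawListFails() {
        List raw = new ArrayList();
        raw.add(Integer.valueOf(1));      // pollution through the raw alias
        List<String> strings = raw;       // unchecked conversion
        try {
            // The compiler-inserted cast to String fails at this point.
            return strings.get(0).length() < 0;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Safe case: the raw instance is freshly constructed and has no other alias.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<String> freshRawList() {
        List<String> strings = new ArrayList();
        strings.add("ok");
        return strings;
    }

    public static void main(String[] args) {
        System.out.println(leakedRawListFails());
        System.out.println(freshRawList());
    }
}
```

Both assignments draw the same unchecked warning, although only the first can go wrong, which is the distinction the "smart rule" would have to capture.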
Posted by Roman Elizarov on January 22, 2007 at 01:45 AM PST #
Posted by Michael Mangeng on January 23, 2007 at 01:20 AM PST #
import java.util.Collections;
import java.util.List;

public abstract class NewClass<T> {
    public static final NewClass<List<String>> STATIC_INSTANCE
            = new NewClass<List<String>>(Collections.emptyList()) {
            };

    private NewClass(final T t) {}
}
Posted by Steven Coco on January 23, 2007 at 07:13 PM PST #
Posted by Steven Coco on January 23, 2007 at 07:20 PM PST #
Posted by Stephen Colebourne on January 24, 2007 at 03:17 PM PST #
Anyone still reading this entry?!
@Stephen C.:
Thanks a lot for responding with that help. I already did implement something just like that. I thought it might be a bug though. I will actually go look at the JLS and see how that type is inferred. Thanks again for the help.
On this exact topic: in fact, on the exact method .emptyList(): I had previously filed a bug against javac about the implementation that's used by that method. I was writing a utility class of my own, and I wanted to do something just like .emptyList(); and I got stuck realizing it didn't seem possible. I peeked at that code and found a bug -- javac has a bug, and that method seems to rely on it. For those interested, that bug id is 6467183.
Thanks. - Steev Coco.
Hi Peter,
You write: "we need more static factories throughout the JDK". This is very true, but on the other hand, there are very serious problems with static factories in their current form. For example, if class A has a static factory called getInstance(), and we wish to mandate using that method, then the class's constructor must be private. Which means A can't be subclassed! Or, if the constructor isn't private (e.g., package-level access) and some other class B extends it, then --- if you forget to re-define getInstance() for B --- the expression B.getInstance() will return an instance of A!
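The inheritance pitfall is easy to reproduce in current Java. A minimal sketch, using the class names A and B from the paragraph above:

```java
class A {
    A() {}  // package-level access, so that A can be subclassed

    public static A getInstance() { return new A(); }
}

class B extends A {
    // getInstance() was not redefined here.
}

class InheritedFactoryDemo {
    public static void main(String[] args) {
        // Compiles, because static methods are inherited, but it runs A's factory.
        A fromB = B.getInstance();
        System.out.println(fromB.getClass().getSimpleName());
    }
}
```

The call site reads as if it creates a B, yet the object it receives is a plain A.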
Another problem is that static factory methods are not "transparent", in the sense that you cannot introduce a new such method and have every existing client immediately use it. Let's say we have an old class S and we wish to make it a singleton (or manage an instance pool, or whatever). We can add a new factory method getInstance(), but existing clients are now broken, since they still use the old constructor...
The solution is to allow classes to control how instances are generated, by default; i.e., every call to new S() is always a factory call. By default, a factory is generated for each constructor; but developers can choose to specify their own factories (in which case no default factories are generated). The syntax I suggest is:
class S {
    public static S instance = null;
    public static new() {
        if (instance == null) instance = this();
        return instance;
    }
    public S() { ... constructor code ... }
}
A few things to note: the keyword new is used as the name of factories. The keyword this is used for invoking the constructor from within the factory; constructors can only be invoked from within factories -- external clients never have direct access to constructors. However, existing code is not broken since the syntax used so far ("new S") now implies a factory invocation. We use "this" rather than "new S" to invoke the constructor in order to prevent a recursive call.
Factories are generated by default only for regular classes, but can be manually added to abstract classes or even to interfaces. Think of adding a factory method to an interface like List:
interface List {
public static new(boolean synch, boolean randomAccess) { ... }
...
}
Which will return an instance of Vector, ArrayList, LinkedList or a synchronized LinkedList, depending on the arguments provided by the user. The user is now completely shielded from explicitly naming any specific implementation of List in his code. And if a future version of the JDK includes a new SuperDuperList class, any code using "new List(...)" with relevant parameters will immediately benefit from this new class.
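Without the proposed syntax, the same selection logic can be approximated today with an ordinary static method. A minimal sketch, assuming a hypothetical ListFactory class and one possible mapping of the two flags onto the four implementations named above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedList;
import java.util.List;
import java.util.Vector;

// Hypothetical stand-in for the proposed factory on the List interface.
final class ListFactory {
    private ListFactory() {}

    static <E> List<E> newList(boolean synch, boolean randomAccess) {
        if (randomAccess) {
            // Array-backed lists: Vector is the synchronized variant.
            return synch ? new Vector<E>() : new ArrayList<E>();
        }
        List<E> linked = new LinkedList<E>();
        // Wrap the linked list only when synchronization was requested.
        return synch ? Collections.synchronizedList(linked) : linked;
    }
}

class ListFactoryDemo {
    public static void main(String[] args) {
        List<String> names = ListFactory.newList(false, true);
        names.add("example");
        System.out.println(names.getClass().getSimpleName() + " " + names);
    }
}
```

Callers depend only on List, so the method body could start returning a different implementation without any call site changing.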
Factories can be viewed as an overriding of the "new" operator. However this is different than "new" overriding in C++, since there, you only get to control memory allocation; you cannot, for example, return an existing instance (or an instance of a subclass, etc.). In fact, this mechanism cannot be easily introduced into C++, since in C++ objects can be stored on the stack -- but here, we cannot know in advance what specific type will be returned by the factory, so the exact memory requirement is not known in advance.
Additional details and discussion can be found in Better Construction with Factories.
var map = new HashMap<String,List<Integer>>();
is the best syntax. It is the most beautiful and most obvious, and it could be used in for loops as well:
for (var e : someEnum)
I don't think that other suggestions could be used inside for loops. And what is the problem with adding a new "var" keyword (when compiled into JDK7)?
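For what it is worth, Java 10 eventually adopted this spelling, treating var as a reserved type name rather than a full keyword, and it works in both positions shown above. A short sketch; the Color enum and the class name are made up for the example:

```java
import java.util.HashMap;
import java.util.List;

class VarDemo {
    enum Color { RED, GREEN, BLUE }

    static int countColors() {
        // The declared type is inferred from the right-hand side.
        var map = new HashMap<String, List<Integer>>();
        // var also works for the loop variable of an enhanced for loop.
        for (var e : Color.values()) {
            map.put(e.name(), List.of(e.ordinal()));
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(countColors());
    }
}
```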
Posted by Harri Pesonen on March 16, 2007 at 05:50 AM PDT #
If introducing a new keyword is really a problem (breaks existing code), how about using "*" instead of "var"? After all, "*" is the universal "wildcard" symbol. So we'll have:
* map = new HashMap<String,List<Integer>>();
or:
for (* e : someEnum)
Posted by Harri Pesonen on March 19, 2007 at 12:21 AM PDT #