Search

Categories

Links

Referers

Considered Harmful

Jul 27 2005, 10:44:51 PM PDT »Java Comments [16]
Dijkstra wrote a classical paper Go To Statement Considered Harmful. A paper that was considered classical in 1995 is truly a classic in our business ;-)

Dijkstra did understand what he was talking about.

Dijkstra was urging the community to move forward and work with higher abstractions.

Dijkstra points out that our intellectual powers are limited in a fashion that should make us strive to shorten the conceptual gap between the static program and the dynamic process.

A modern example of this conceptual gap is found when adding an element to a collection and extracting it in the Java™ Programming Language before the advent of generic types:

List strings = new ArrayList(); // list of strings
strings.add("Hello");
...
String s = (String)strings.get(0);

Notice how the programmer is required to bridge a conceptual gap by adding comments and casts that aren't enforced by the compiler as it is with generic types:

List<String> strings = new ArrayList<String>();
strings.add("Hello");
...
String s = strings.get(0);

Also notice that the compiler will notify the developer if he makes mistakes:

Integer i = (Integer)strings.get(0); // compile-time error
strings.add(1); // compile-time error

The benefits should be clear and I like to think if Dijkstra had been consulted on generics he would have said: "not having generic types should be considered harmful".

Ken Arnold doesn't understand generics and thus he suggest that generics should be considered harmful. Let's examine Ken's arguments:

My colleague told me something I misunderstood, but when I follow my wrong conclusions to the end, it doesn't make sense. Thus generics is considered harmful. And so Ken declares that you shouldn't do this:

    interface Holder<T> {
        T[] toArray();
    }

Allow me to correct your misunderstanding. You can't do this:

class Box<T> {
    T[] array = new T[10]; // compile-time error
}

You should not try to work around it:

class Box<T> {
    T[] array = (T[])new Object[10]; // BAD and causes a warning
}

This is an unfortunate side-effect of erasure, the VM doesn't know what T is at runtime, so the compiler has no way to generate code that can construct an array of the right type. In the VM, all arrays must record their element type. This is because arrays should be considered harmful:

public class Test {
    public static void main(String... args) {
        Object[] array = new String[10];
        array[0] = new Object(); // causes a run-time error, ArrayStoreException
    }
}
The "intuitively correct" type rule that allows the first assignment demonstrates that arrays aren't statically type safe. However, they are type safe: the VM eventually rejects the problematic assignment.

If you replace arrays with the generic ArrayList, you get this program:

import java.util.*;
public class Test {
    public static void main(String... args) {
        List<Object> array = new ArrayList<String>(10); // compile-time error
        array.add(0, new Object());
    }
}

So Ken and others should consider arrays harmful, not generics.

Ken also says that because he and David can't understand the declaration of Enum then it must be evil.

Let's look at Enum, it's not that complicated:

class Enum<E extends Enum<E>> { ... }

This is called an F-bound, I'm not exactly sure what the F means but think of it as function bounds. The bound on the type variable is a function (F) of the type variable. Yes, I know that probably didn't help your understanding. Hold on, here goes...

Since Enum is a special implementation class used for the new enum types, let's look at Comparable instead. The definition of Collections.sort looks like this:

public static <T extends Comparable<? super T>> void sort(List<T> list)

T should be a type which implements Comparable of a supertype of T. In other words: T must be comparable to it self (or a supertype of itself). This is a simple recursive definition.

So how do you validate that a given type meets the criteria (is within the bounds of T). It's simple: substitute T for the type and check if the statement is true. For example, Integer implements Comparable<Integer>, so the question is does

Integer extend Comparable<? super Integer>

The answer is yes. Integer implements Comparable<Integer> which is a subtype of Comparable<? super Integer>. Piece of cake.

Post a Comment:
Comments are closed for this entry.
Comments:

How long did you consider subjecting arrays to erasure, too? I _know_ you did. Getting rid of the runtime element type would have broken quite some software but solved the mess, right? Say hello to Gilad, BTW.

Posted by Matthias Ernst on July 27, 2005 at 11:40 PM PDT #

A comment on F-Bounds: quite a few years ago we were developing business applications with our own programming language "TL 2" [1]; basically a Smalltalk-like language with an optional, polymorphic, higher-order type system. I do not have a fond memory of using F-bounded parametric polymorphism from there. It worked well in the base library but from a type-checking session with a colleague I remember how the F-bounds were basically viral: in order to correctly typecheck a class that used another class with an F-bound we had to introduce the F-bound to the using class as well. And so on. In the end the whole system was parameterized. I don't remember the exact circumstances; may have been a design mistake or a bug in the type checker ... Or maybe the lack of wildcards in our system. [1] http://www.sts.tu-harburg.de/projects/Tycoon2/entry.html

Posted by Matthias Ernst on July 28, 2005 at 02:26 AM PDT #

I don't think it was ever considered subjecting arrays to erasure. It is well-known that the array store check requires type information. We did consider a new addition to the language, statically type-safe arrays. However, these arrays proved to be of little value as you can't easily convert between "old" arrays and the new kind. Also, the syntax was extremely confusing. I don't believe in this approach anymore.

Posted by Peter von der Ahé on July 28, 2005 at 10:48 AM PDT #

> It is well-known that the array store check requires type information. My proposal would have entailed getting rid of that check. Just the same semantics as for an ArrayList: no element type check at runtime whatsoever. But I know it's not realistic to change that behaviour.

Posted by Matthias on July 29, 2005 at 12:10 AM PDT #

Finding this in your blog:

You should not try to work around it: 

class Box<T> {
    T[] array = (T[])new Object[10]; // BAD and causes a warning
}
made me curious of the solution sought in <T> T[] List.toArray(T[] a) and ArrayList.ensureCapacity. In the latter I found exactly the above statement, in the former an analogous one using java.lang.reflect.Array for the implementation in ArrayList. In Gilad's tutorial on generics I did not find a hint how to circumvent this cleanly without getting the compiler's type erasure warning.

Wouldn't a method in java.lang.reflect like public static <T> T[] newInstance(Class<T> type, int length) do the job without the ugly cast?

For sure there is a good reason, but I ask myself the following: at compiletime, the compiler has the type information about the type to form an array of. Therefore this information could be used to declare the array at least in those cases where only arrays of the type variable are required. So the (non-static) statement T[] a = new T[10] would be perfectly determined at compile time and the correct statement could be generated. Where I am wrong?

Regards
Richard

Posted by Richard Birenheide on August 10, 2005 at 01:45 AM PDT #

First question: We are considering something like newInstance. It is not without problems as the type of int.class is Class<Integer> so that also needs to change. Second question: The compiler doesn't know what type a type variable holds a compile time. This is similar to regular variables. The compiler doesn't know what value they hold.

Posted by Peter von der Ah&eacute; on August 31, 2005 at 06:45 PM PDT #

So Ken and others should consider arrays harmful, not generics.

Peter, oh, Peter.

DISCLAIMER: I want generics, I love generics, I understand generics (IMHO)

BUT: current generics not only make things uninuitive, they break well-established contract. <code>

(1)
Object o = new Something(); // compiles
((Something) o).doSomething(); // compiles, runs, includes runtime checking...
((Whatever) o).run(); // compiles, fails at runtime, obviously

(2) 
Object[] oa = new Something[1]; // compiles
...
((Something[]) oa)[0].doSomething(); // runs
((Whatever[]) oa)[0].run(); // CCE on array
((Whatever) oa[0]).run(); // CCE on array item
// and 
oa[0] = new Something(); // runs
oa[0] = new Whatever(); // rt-failure

3)
List<Object> invalid = new ArrayList<Something>(); // does not compile
List l = new ArrayList<Something>(); // compiles
l.add(new Something()); // compiles, runs
l.add(new Whatever()); // compiles (ok), runs (but should fail at runtime)
</code>

Q1) Why should generic collections (parameterized types, to be precise) be any different from type-safe arrays?
Q2) Static type-safety without dynamic runtime checks? What is this security hole good for?

Please stop with excuses and explanations why things are the way they are. They are most definitely WRONG

(I understand why they are like that but that does not change anything on the fact it is a BAD thing)

I can't believe this. I am too tired of fighting against this generic-by-erasure thing :-(

Posted by Patrik Beno on September 07, 2005 at 02:13 PM PDT #

Hi Peter.

I've got a question regarding your comments:
T[] array = (T[])new Object[10]; // BAD and causes a warning

Actually, this is the problem what I'm currently trying to settle.

Please take a look at following code:

class A { public String toString() { return "A"; } }
class B { public String toString() { return "B"; } }

public class Test {

public static void main(String[] args) {
A a = null;
foo(a);
}

public static <T> void foo(T a) {

Object []o = new Object[10];
o[0] = new B(); // correct. Insert B into object[].
T val = (T) o[0]; // Type error if T is not A!!

System.out.println(val);
}
}

In this code, T is obviously A. And there's no conversion from B to A. However, T val = (T) o[0] succeeds, giving only one warning: The cast from Object to T is actually checking against the erased type Object.

However, THIS CODE DOES NOT RAISE ANY RUNTIME EXCEPTION, and that is the important problem. Could you let me know the point I've missed from java generics?

Thanks.

Posted by Minkoo Seo on September 12, 2005 at 03:17 AM PDT #

Sorry for my previous post, HTML ate type parameters in my generics. Sample (3) should read:
<code>
3)
List<Object> invalid = new ArrayList<Something>(); // does not compile
List l = new ArrayList<Something>(); // compiles
l.add(new Something()); // compiles, runs
l.add(new Whatever()); // compiles (ok), runs (but should fail at runtime)
</code>

Posted by Patrik Beno on September 16, 2005 at 04:09 PM PDT #

Here is Patrik Beno's example again:
List<Object> invalid = new ArrayList<Something>(); // does not compile
List l = new ArrayList<Something>(); // compiles
l.add(new Something()); // compiles, runs
l.add(new Whatever()); // compiles (ok), runs (but should fail at runtime)

Posted by Peter von der Ahé on September 23, 2005 at 03:17 PM PDT #

Response to Patrik Beno's questions:

<em>Q1) Why should generic collections (parameterized types, to be precise) be any different from type-safe arrays?</em>

Java arrays are not statically type safe. Generic types are if you don't ignore the warnings.

<em>Q2) Static type-safety without dynamic run-time checks? What is this security hole good for?</em>

Where is the security hole? Security is not affected by generics.

Generics is a compile time feature which allows you to catch a certain category of programming errors at compile time. If you use arrays, this kind of errors will not be caught until the product is shipped and an important customer tries something you forgot to test.

Generics fixes this problem. If you compile your entire program with -Xlint:unchecked and -Werror, then such errors will be caught before your customer finds them.

Posted by Peter von der Ahé on September 23, 2005 at 03:32 PM PDT #

Reply to Minkoo Seo:

<em>Could you let me know the point I've missed from java generics?</em>

Don't use arrays. Use ArrayList instead.

Posted by Peter von der Ahé on September 23, 2005 at 04:02 PM PDT #

Considering that arrays aren't statically type-safe why were they chosen to implement varargs? Why not Iterable<T>?

Posted by Rafael de F. Ferreira on June 12, 2006 at 05:14 PM PDT #

This was added before my time. You will have to ask the 202 expert group. We were experimenting with a type safe version of arrays (which I now believe is infeasible) and perhaps they thought that would come to the rescue. Or perhaps they were more concerned about retrofitting existing methods.

Posted by Peter von der Ahe on June 12, 2006 at 05:23 PM PDT #

Thanks for your response. Just one more question, though. I couldn't think of a case where passing objects declared as instances of parametrized types to a varargs method can cause problems. Could you give me a sample use case?

Posted by Rafael de F. Ferreira on June 23, 2006 at 05:29 PM PDT #

Rafael,

This method:

    static <T> T[] makeArray(T... args) {
        return args;
    }

Posted by Peter von der Ahé on August 03, 2006 at 09:14 AM PDT #

Java is a trademark of Sun Microsystems, Inc.
Copyright © 2006,2007 Peter von der Ahé