Generics in the Java Programming Language
Gilad Bracha
July 5, 2004
Contents
1 Introduction
2 Defining Simple Generics
3 Generics and Subtyping
4 Wildcards
  4.1 Bounded Wildcards
5 Generic Methods
6 Interoperating with Legacy Code
  6.1 Using Legacy Code in Generic Code
  6.2 Erasure and Translation
  6.3 Using Generic Code in Legacy Code
7 The Fine Print
  7.1 A Generic Class is Shared by all its Invocations
  7.2 Casts and InstanceOf
  7.3 Arrays
8 Class Literals as Run-time Type Tokens
9 More Fun with Wildcards
  9.1 Wildcard Capture
10 Converting Legacy Code to Use Generics
11 Acknowledgements
1 Introduction

JDK 1.5 introduces several extensions to the Java programming language. One of these is the introduction of generics.

This tutorial is aimed at introducing you to generics. You may be familiar with similar constructs from other languages, most notably C++ templates. If so, you’ll soon see that there are both similarities and important differences. If you are not familiar with look-alike constructs from elsewhere, all the better; you can start afresh, without unlearning any misconceptions.

Generics allow you to abstract over types. The most common examples are container types, such as those in the Collection hierarchy. Here is a typical usage of that sort:

    List myIntList = new LinkedList(); // 1
    myIntList.add(new Integer(0)); // 2
    Integer x = (Integer) myIntList.iterator().next(); // 3

The cast on line 3 is slightly annoying. Typically, the programmer knows what kind of data has been placed into a particular list. However, the cast is essential. The compiler can only guarantee that an Object will be returned by the iterator. To ensure the assignment to a variable of type Integer is type safe, the cast is required. Of course, the cast not only introduces clutter. It also introduces the possibility of a run time error, since the programmer might be mistaken. What if programmers could actually express their intent, and mark a list as being restricted to contain a particular data type? This is the core idea behind generics. Here is a version of the program fragment given above using generics:

    List<Integer> myIntList = new LinkedList<Integer>(); // 1’
    myIntList.add(new Integer(0)); // 2’
    Integer x = myIntList.iterator().next(); // 3’

Notice the type declaration for the variable myIntList. It specifies that this is not just an arbitrary List, but a List of Integer, written List<Integer>. We say that List is a generic interface that takes a type parameter - in this case, Integer. We also specify a type parameter when creating the list object. The other thing to pay attention to is that the cast is gone on line 3’.

Now, you might think that all we’ve accomplished is to move the clutter around. Instead of a cast to Integer on line 3, we have Integer as a type parameter on line 1’. However, there is a very big difference here. The compiler can now check the type correctness of the program at compile-time. When we say that myIntList is declared with type List<Integer>, this tells us something about the variable myIntList, which holds true wherever and whenever it is used, and the compiler will guarantee it. In contrast, the cast tells us something the programmer thinks is true at a single point in the code.

The net effect, especially in large programs, is improved readability and robustness.
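To make the contrast concrete, here is a minimal runnable sketch. The class and method names (CastVersusGenerics, rawListFailsAtRunTime, genericListNeedsNoCast) are invented for illustration and are not part of the tutorial. It shows a mistaken insertion that compiles against a raw list and fails only at the cast, next to the generic version in which the same mistake would be rejected by the compiler.

```java
import java.util.ArrayList;
import java.util.List;

class CastVersusGenerics {
    // Pre-generics style: the list accepts any object, so the mistake below
    // compiles and is detected only when the element is cast on retrieval.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRunTime() {
        List raw = new ArrayList();
        raw.add("not an Integer");              // accepted by the compiler
        try {
            Integer x = (Integer) raw.get(0);   // the cast fails here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic style: the element type is part of the declaration, so no cast
    // is needed and a wrong insertion does not compile.
    static int genericListNeedsNoCast() {
        List<Integer> ints = new ArrayList<Integer>();
        ints.add(Integer.valueOf(42));
        // ints.add("not an Integer");          // compile-time error if uncommented
        return ints.get(0);
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRunTime());
        System.out.println(genericListNeedsNoCast());
    }
}
```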

case characters in those names, making it easy to distinguish formal type parameters from ordinary classes and interfaces. Many container types use E, for element, as in the examples above. We’ll see some additional conventions in later examples.

3 Generics and Subtyping

Let’s test our understanding of generics. Is the following code snippet legal?

    List<String> ls = new ArrayList<String>(); // 1
    List<Object> lo = ls; // 2

Line 1 is certainly legal. The trickier part of the question is line 2. This boils down to the question: is a List of String a List of Object? Most people’s instinct is to answer: “sure!”. Well, take a look at the next few lines:

    lo.add(new Object()); // 3
    String s = ls.get(0); // 4: attempts to assign an Object to a String!

Here we’ve aliased ls and lo. Accessing ls, a list of String, through the alias lo, we can insert arbitrary objects into it. As a result ls does not hold just Strings anymore, and when we try to get something out of it, we get a rude surprise.

The Java compiler will prevent this from happening of course. Line 2 will cause a compile time error.

In general, if Foo is a subtype (subclass or subinterface) of Bar, and G is some generic type declaration, it is not the case that G<Foo> is a subtype of G<Bar>. This is probably the hardest thing you need to learn about generics, because it goes against our deeply held intuitions.

The problem with that intuition is that it assumes that collections don’t change. Our instinct takes these things to be immutable.

For example, if the department of motor vehicles supplies a list of drivers to the census bureau, this seems reasonable. We think that a List<Driver> is a List<Person>, assuming that Driver is a subtype of Person. In fact, what is being passed is a copy of the registry of drivers. Otherwise, the census bureau could add new people who are not drivers into the list, corrupting the DMV’s records.

In order to cope with this sort of situation, it’s useful to consider more flexible generic types. The rules we’ve seen so far are quite restrictive.
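The DMV scenario can be sketched as follows. Person, Driver and addNonDriver are hypothetical names chosen for this illustration. The sketch relies on the ArrayList copy constructor to hand the census a separate List<Person>, so additions made there never reach the drivers' registry.

```java
import java.util.ArrayList;
import java.util.List;

class RegistryCopyDemo {
    static class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    static class Driver extends Person {
        Driver(String name) { super(name); }
    }

    // Hypothetical census-side routine: it adds someone who is not a driver.
    static void addNonDriver(List<Person> people) {
        people.add(new Person("Pat"));
    }

    // Returns { size of the DMV list, size of the census copy }.
    static int[] sizesAfterCensusUpdate() {
        List<Driver> drivers = new ArrayList<Driver>();
        drivers.add(new Driver("Dana"));

        // List<Person> alias = drivers;    // compile-time error: not a subtype
        List<Person> copy = new ArrayList<Person>(drivers); // an independent copy
        addNonDriver(copy);

        return new int[] { drivers.size(), copy.size() };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterCensusUpdate();
        System.out.println(sizes[0] + " driver(s), " + sizes[1] + " person(s)");
    }
}
```

Only the copy grows; the original list of drivers still contains nothing but drivers.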

4 Wildcards

Consider the problem of writing a routine that prints out all the elements in a collection. Here’s how you might write it in an older version of the language:

    void printCollection(Collection c) {
        Iterator i = c.iterator();
        for (int k = 0; k < c.size(); k++) {
            System.out.println(i.next());
        }
    }

And here is a naive attempt at writing it using generics (and the new for loop syntax):

    void printCollection(Collection<Object> c) {
        for (Object e : c) {
            System.out.println(e);
        }
    }

The problem is that this new version is much less useful than the old one. Whereas the old code could be called with any kind of collection as a parameter, the new code only takes Collection<Object>, which, as we’ve just demonstrated, is not a supertype of all kinds of collections!

So what is the supertype of all kinds of collections? It’s written Collection<?> (pronounced “collection of unknown”), that is, a collection whose element type matches anything. It’s called a wildcard type for obvious reasons. We can write:

    void printCollection(Collection<?> c) {
        for (Object e : c) {
            System.out.println(e);
        }
    }

and now, we can call it with any type of collection. Notice that inside printCollection(), we can still read elements from c and give them type Object. This is always safe, since whatever the actual type of the collection, it does contain objects. It isn’t safe to add arbitrary objects to it however:

    Collection<?> c = new ArrayList<String>();
    c.add(new Object()); // compile time error

Since we don’t know what the element type of c stands for, we cannot add objects to it. The add() method takes arguments of type E, the element type of the collection. When the actual type parameter is ?, it stands for some unknown type. Any parameter we pass to add would have to be a subtype of this unknown type. Since we don’t know what type that is, we cannot pass anything in. The sole exception is null, which is a member of every type. On the other hand, given a List<?>, we can call get() and make use of the result. The result type is an unknown type, but we always know that it is an object. It is

List<? extends Shape> is an example of a bounded wildcard. The ? stands for an unknown type, just like the wildcards we saw earlier. However, in this case, we know that this unknown type is in fact a subtype of Shape^1. We say that Shape is the upper bound of the wildcard.

There is, as usual, a price to be paid for the flexibility of using wildcards. That price is that it is now illegal to write into shapes in the body of the method. For instance, this is not allowed:

    public void addRectangle(List<? extends Shape> shapes) {
        shapes.add(0, new Rectangle()); // compile-time error!
    }

You should be able to figure out why the code above is disallowed. The type of the second parameter to shapes.add() is ? extends Shape - an unknown subtype of Shape. Since we don’t know what type it is, we don’t know if it is a supertype of Rectangle; it might or might not be such a supertype, so it isn’t safe to pass a Rectangle there.

Bounded wildcards are just what one needs to handle the example of the DMV passing its data to the census bureau. Our example assumes that the data is represented by mapping from names (represented as strings) to people (represented by reference types such as Person or its subtypes, such as Driver). Map<K,V> is an example of a generic type that takes two type arguments, representing the keys and values of the map.

Again, note the naming convention for formal type parameters - K for keys and V for values.

    public class Census {
        public static void addRegistry(Map<String, ? extends Person> registry) { ... }
    }
    ...
    Map<String, Driver> allDrivers = ...;
    Census.addRegistry(allDrivers);
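As a rough, self-contained illustration of an upper bounded wildcard, the sketch below invents a small Shape hierarchy with an area() method. These classes and totalArea are assumptions made for this example, not the tutorial's own definitions. Reading elements as Shape is allowed; adding to the list is not.

```java
import java.util.ArrayList;
import java.util.List;

class BoundedWildcardDemo {
    // Hypothetical shape hierarchy, invented for this sketch.
    static abstract class Shape {
        abstract double area();
    }

    static class Rectangle extends Shape {
        final double width, height;
        Rectangle(double width, double height) {
            this.width = width;
            this.height = height;
        }
        double area() { return width * height; }
    }

    // Accepts List<Shape>, List<Rectangle>, or a list of any other Shape subtype.
    static double totalArea(List<? extends Shape> shapes) {
        double sum = 0;
        for (Shape s : shapes) {        // reading as Shape is always safe
            sum += s.area();
        }
        // shapes.add(new Rectangle(1, 1)); // compile-time error if uncommented
        return sum;
    }

    static double demo() {
        List<Rectangle> rects = new ArrayList<Rectangle>();
        rects.add(new Rectangle(2, 3));
        rects.add(new Rectangle(1, 1));
        return totalArea(rects);        // a List<Rectangle> is accepted
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```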

5 Generic Methods

Consider writing a method that takes an array of objects and a collection and puts all objects in the array into the collection. Here is a first attempt:

    static void fromArrayToCollection(Object[] a, Collection<?> c) {
        for (Object o : a) {
            c.add(o); // compile time error
        }
    }

(^1) It could be Shape itself, or some subclass; it need not literally extend Shape.

By now, you will have learned to avoid the beginner’s mistake of trying to use Collection<Object> as the type of the collection parameter. You may or may not have recognized that using Collection<?> isn’t going to work either. Recall that you cannot just shove objects into a collection of unknown type.

The way to deal with these problems is to use generic methods. Just like type declarations, method declarations can be generic - that is, parameterized by one or more type parameters.

    static <T> void fromArrayToCollection(T[] a, Collection<T> c) {
        for (T o : a) {
            c.add(o); // correct
        }
    }

We can call this method with any kind of collection whose element type is a supertype of the element type of the array.

    Object[] oa = new Object[100];
    Collection<Object> co = new ArrayList<Object>();
    fromArrayToCollection(oa, co); // T inferred to be Object
    String[] sa = new String[100];
    Collection<String> cs = new ArrayList<String>();
    fromArrayToCollection(sa, cs); // T inferred to be String
    fromArrayToCollection(sa, co); // T inferred to be Object
    Integer[] ia = new Integer[100];
    Float[] fa = new Float[100];
    Number[] na = new Number[100];
    Collection<Number> cn = new ArrayList<Number>();
    fromArrayToCollection(ia, cn); // T inferred to be Number
    fromArrayToCollection(fa, cn); // T inferred to be Number
    fromArrayToCollection(na, cn); // T inferred to be Number
    fromArrayToCollection(na, co); // T inferred to be Object
    fromArrayToCollection(na, cs); // compile-time error

Notice that we don’t have to pass an actual type argument to a generic method. The compiler infers the type argument for us, based on the types of the actual arguments. It will generally infer the most specific type argument that will make the call type-correct. One question that arises is: when should I use generic methods, and when should I use wildcard types? To understand the answer, let’s examine a few methods from the Collection libraries.

    interface Collection<E> {
        public boolean containsAll(Collection<?> c);
        public boolean addAll(Collection<? extends E> c);
    }

We could have used generic methods here instead:

    interface Collection<E> {
        public <T> boolean containsAll(Collection<T> c);
        public <T extends E> boolean addAll(Collection<T> c);
        // hey, type variables can have bounds too!
    }

Finally, again let’s take note of the naming convention used for the type parameters. We use T for type, whenever there isn’t anything more specific about the type to distinguish it. This is often the case in generic methods. If there are multiple type parameters, we might use letters that neighbor T in the alphabet, such as S. If a generic method appears inside a generic class, it’s a good idea to avoid using the same names for the type parameters of the method and class, to avoid confusion. The same applies to nested generic classes.
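For readers who want to experiment with the generic method from this section, here is a small runnable sketch. The wrapper class GenericMethodDemo and the helper copyStringsIntoObjects are invented for illustration. It shows one call where the compiler infers T, and one where the type argument is supplied explicitly.

```java
import java.util.ArrayList;
import java.util.Collection;

class GenericMethodDemo {
    // The generic method discussed in this section.
    static <T> void fromArrayToCollection(T[] a, Collection<T> c) {
        for (T o : a) {
            c.add(o);
        }
    }

    static int copyStringsIntoObjects() {
        String[] sa = { "a", "b", "c" };
        Collection<Object> co = new ArrayList<Object>();

        // T is inferred to be Object: String[] is acceptable where Object[] is expected.
        fromArrayToCollection(sa, co);

        // The same call with the type argument written out explicitly.
        GenericMethodDemo.<Object>fromArrayToCollection(sa, co);

        return co.size();   // three elements were added twice
    }

    public static void main(String[] args) {
        System.out.println(copyStringsIntoObjects());
    }
}
```

Explicit type arguments are rarely needed in practice, but writing one out can help when working out what the compiler inferred.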

6 Interoperating with Legacy Code

Until now, all our examples have assumed an idealized world, where everyone is using the latest version of the Java programming language, which supports generics. Alas, in reality this isn’t the case. Millions of lines of code have been written in earlier versions of the language, and they won’t all be converted overnight. Later, in section 10, we will tackle the problem of converting your old code to use generics. In this section, we’ll focus on a simpler problem: how can legacy code and generic code interoperate? This question has two parts: using legacy code from within generic code, and using generic code within legacy code.

6.1 Using Legacy Code in Generic Code

How can you use old code, while still enjoying the benefits of generics in your own code?

As an example, assume you want to use the package com.Fooblibar.widgets. The folks at Fooblibar.com^2 market a system for inventory control, highlights of which are shown below:

    package com.Fooblibar.widgets;
    public interface Part { ... }
    public class Inventory {
        /**
         * Adds a new Assembly to the inventory database.
         * The assembly is given the name name, and consists of a set
         * parts specified by parts. All elements of the collection parts
         * must support the Part interface.
         **/
        public static void addAssembly(String name, Collection parts) {...}
        public static Assembly getAssembly(String name) {...}
    }
    public interface Assembly {
        Collection getParts(); // Returns a collection of Parts
    }

(^2) Fooblibar.com is a purely fictional company, used for illustration purposes. Any relation to any real company or institution, or any persons living or dead, is purely coincidental.

Now, you’d like to add new code that uses the API above. It would be nice to ensure that you always called addAssembly() with the proper arguments - that is, that the collection you pass in is indeed a Collection of Part. Of course, generics are tailor made for this:

    package com.mycompany.inventory;
    import com.Fooblibar.widgets.*;
    public class Blade implements Part {
        ...
    }
    public class Guillotine implements Part {
    }
    public class Main {
        public static void main(String[] args) {
            Collection<Part> c = new ArrayList<Part>();
            c.add(new Guillotine());
            c.add(new Blade());
            Inventory.addAssembly("thingee", c);
            Collection<Part> k = Inventory.getAssembly("thingee").getParts();
        }
    }

When we call addAssembly, it expects the second parameter to be of type Collection. The actual argument is of type Collection<Part>. This works, but why? After all, most collections don’t contain Part objects, and so in general, the compiler has no way of knowing what kind of collection the type Collection refers to.

In proper generic code, Collection would always be accompanied by a type parameter. When a generic type like Collection is used without a type parameter, it’s called a raw type.

Most people’s first instinct is that Collection really means Collection<Object>. However, as we saw earlier, it isn’t safe to pass a Collection<Part> in a place where a Collection<Object> is required. It’s more accurate to say that the type Collection denotes a collection of some unknown type, just like Collection<?>.

But wait, that can’t be right either! Consider the call to getParts(), which returns a Collection. This is then assigned to k, which is a Collection<Part>. If the result of the call is a Collection<?>, the assignment would be an error.

In reality, the assignment is legal, but it generates an unchecked warning. The warning is needed, because the fact is that the compiler can’t guarantee its correctness. We have no way of checking the legacy code in getAssembly() to ensure that indeed the collection being returned is a collection of Parts. The type used in the code is Collection, and one could legally insert all kinds of objects into such a collection.

So, shouldn’t this be an error? Theoretically speaking, yes; but practically speaking, if generic code is going to call legacy code, this has to be allowed. It’s up to you, the programmer, to satisfy yourself that in this case, the assignment is safe because the contract of getAssembly() says it returns a collection of Parts, even though the type signature doesn’t show this.

So raw types are very much like wildcard types, but they are not typechecked as stringently. This is a deliberate design decision, to allow generics to interoperate with pre-existing legacy code.

Calling legacy code from generic code is inherently dangerous; once you mix generic code with non-generic legacy code, all the safety guarantees that the generic

The full details of erasure are beyond the scope of this tutorial, but the simple description we just gave isn’t far from the truth. It’s good to know a bit about this, especially if you want to do more sophisticated things like converting existing APIs to use generics (see section 10), or just want to understand why things are the way they are.
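The following sketch, with invented names ErasureDemo and describeFailure, illustrates why unchecked warnings deserve attention. Because type arguments are erased, nothing stops a raw alias from inserting an Integer into a List<String> at run time; the failure only appears later, at the cast the compiler inserts where the element is read back as a String.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String describeFailure() {
        List<String> strings = new ArrayList<String>();
        List raw = strings;               // raw alias, permitted for legacy interop
        raw.add(Integer.valueOf(42));     // unchecked warning; no run-time check here

        try {
            String s = strings.get(0);    // the compiler-inserted cast fails here
            return "no failure: " + s;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure());
    }
}
```

The exception surfaces far from the line that caused the problem, which is why mixing raw and generic code weakens the guarantees generics normally give.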

6.3 Using Generic Code in Legacy Code

Now let’s consider the inverse case. Imagine that Fooblibar.com chose to convert their API to use generics, but that some of their clients haven’t yet. So now the code looks like:

    package com.Fooblibar.widgets;
    public interface Part { ... }
    public class Inventory {
        /**
         * Adds a new Assembly to the inventory database.
         * The assembly is given the name name, and consists of a set
         * parts specified by parts. All elements of the collection parts
         * must support the Part interface.
         **/
        public static void addAssembly(String name, Collection<Part> parts) {...}
        public static Assembly getAssembly(String name) {...}
    }
    public interface Assembly {
        Collection<Part> getParts(); // Returns a collection of Parts
    }

and the client code looks like:

    package com.mycompany.inventory;
    import com.Fooblibar.widgets.*;
    public class Blade implements Part {
        ...
    }
    public class Guillotine implements Part {
    }
    public class Main {
        public static void main(String[] args) {
            Collection c = new ArrayList();
            c.add(new Guillotine());
            c.add(new Blade());
            Inventory.addAssembly("thingee", c); // 1: unchecked warning
            Collection k = Inventory.getAssembly("thingee").getParts();
        }
    }

The client code was written before generics were introduced, but it uses the package com.Fooblibar.widgets and the collection library, both of which are using generic types. All the uses of generic type declarations in the client code are raw types.

Line 1 generates an unchecked warning, because a raw Collection is being passed in where a Collection of Parts is expected, and the compiler cannot ensure that the raw Collection really is a Collection of Parts.

As an alternative, you can compile the client code using the -source 1.4 flag, ensuring that no warnings are generated. However, in that case you won’t be able to use any of the new language features introduced in JDK 1.5.

7 The Fine Print

7.1 A Generic Class is Shared by all its Invocations

What does the following code fragment print?

    List<String> l1 = new ArrayList<String>();
    List<Integer> l2 = new ArrayList<Integer>();
    System.out.println(l1.getClass() == l2.getClass());

You might be tempted to say false, but you’d be wrong. It prints true, because all instances of a generic class have the same run-time class, regardless of their actual type parameters.

Indeed, what makes a class generic is the fact that it has the same behavior for all of its possible type parameters; the same class can be viewed as having many different types.

As a consequence, the static variables and methods of a class are also shared among all the instances. That is why it is illegal to refer to the type parameters of a type declaration in a static method or initializer, or in the declaration or initializer of a static variable.
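A short sketch of both consequences follows. Counter and the method names are hypothetical, chosen for this example. It checks that two invocations of a generic class share one run-time class, and that a static field is shared across invocations rather than existing once per type argument.

```java
import java.util.ArrayList;
import java.util.List;

class SharedClassDemo {
    static class Counter<T> {
        static int created = 0;     // a single variable for every Counter<...>
        // static T last;           // illegal if uncommented: static member using T
        Counter() { created++; }
    }

    static boolean sameRuntimeClass() {
        List<String> l1 = new ArrayList<String>();
        List<Integer> l2 = new ArrayList<Integer>();
        return l1.getClass() == l2.getClass();
    }

    static int countAcrossInvocations() {
        Counter.created = 0;
        new Counter<String>();
        new Counter<Integer>();
        return Counter.created;     // both constructions bumped the same field
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(countAcrossInvocations());
    }
}
```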

7.2 Casts and InstanceOf

Another implication of the fact that a generic class is shared among all its instances, is that it usually makes no sense to ask an instance if it is an instance of a particular invocation of a generic type:

    Collection cs = new ArrayList<String>();
    if (cs instanceof Collection<String>) { ... } // illegal

Similarly, a cast such as

    Collection<String> cstr = (Collection<String>) cs; // unchecked warning

gives an unchecked warning, since this isn’t something the run time system is going to check for you.

The same is true of type variables:

    <T> T badCast(T t, Object o) {
        return (T) o; // unchecked warning
    }

Similarly, attempting to create an array object whose element type is a type variable causes a compile-time error:

    <T> T[] makeArray(T t) {
        return new T[100]; // error
    }

Since type variables don’t exist at run time, there is no way to determine what the actual array type would be. The way to work around these kinds of limitations is to use class literals as run time type tokens, as described in section 8.
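As a hedged preview of that workaround, the sketch below passes a Class<T> object so that the element type is available at run time. It uses java.lang.reflect.Array.newInstance to create the array and Class.cast for a cast that is actually checked. The class and method names (TypeTokenArrayDemo, makeArray, checkedCast, castFails) are invented for illustration.

```java
import java.lang.reflect.Array;

class TypeTokenArrayDemo {
    // The Class<T> argument carries the element type that "new T[n]" cannot see.
    @SuppressWarnings("unchecked")
    static <T> T[] makeArray(Class<T> type, int length) {
        return (T[]) Array.newInstance(type, length);
    }

    // Unlike "(T) o", this cast is verified at run time.
    static <T> T checkedCast(Class<T> type, Object o) {
        return type.cast(o);    // throws ClassCastException if o is not a T
    }

    static boolean castFails() {
        try {
            checkedCast(Integer.class, "not a number");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] names = makeArray(String.class, 3);
        System.out.println(names.getClass().getSimpleName() + " of length " + names.length);
        System.out.println(castFails());
    }
}
```

The unchecked cast inside makeArray is safe here because the array really was created with the element type named by the token.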

8 Class Literals as Run-time Type Tokens

One of the changes in JDK 1.5 is that the class java.lang.Class is generic. It’s an interesting example of using genericity for something other than a container class.

Now that Class has a type parameter T, you might well ask, what does T stand for? It stands for the type that the Class object is representing.

For example, the type of String.class is Class<String>, and the type of Serializable.class is Class<Serializable>. This can be used to improve the type safety of your reflection code.

In particular, since the newInstance() method in Class now returns a T, you can get more precise types when creating objects reflectively.

For example, suppose you need to write a utility method that performs a database query, given as a string of SQL, and returns a collection of objects in the database that match that query.

One way is to pass in a factory object explicitly, writing code like:

    interface Factory<T> { T make(); }
    public <T> Collection<T> select(Factory<T> factory, String statement) {
        Collection<T> result = new ArrayList<T>();
        /* run sql query using jdbc */
        for ( /* iterate over jdbc results */ ) {
            T item = factory.make();
            /* use reflection and set all of item's fields from sql results */
            result.add(item);
        }
        return result;
    }

You can call this either as

    select(new Factory<EmpInfo>() {
        public EmpInfo make() { return new EmpInfo(); }
    }, "selection string");

or you can declare a class EmpInfoFactory to support the Factory interface

    class EmpInfoFactory implements Factory<EmpInfo> {
        ...
        public EmpInfo make() { return new EmpInfo(); }
    }

and call it

    select(getMyEmpInfoFactory(), "selection string");

The downside of this solution is that it requires either:

  • the use of verbose anonymous factory classes at the call site, or
  • declaring a factory class for every type used and passing a factory instance at the call site, which is somewhat unnatural.

It is very natural to use the class literal as a factory object, which can then be used by reflection. Today (without generics) the code might be written:

    Collection emps = sqlUtility.select(EmpInfo.class, "select * from emps");
    ...
    public static Collection select(Class c, String sqlStatement) {
        Collection result = new ArrayList();
        /* run sql query using jdbc */
        for ( /* iterate over jdbc results */ ) {
            Object item = c.newInstance();
            /* use reflection and set all of item's fields from sql results */
            result.add(item);
        }
        return result;
    }

However, this would not give us a collection of the precise type we desire. Now that Class is generic, we can instead write

    Collection<EmpInfo> emps =
        sqlUtility.select(EmpInfo.class, "select * from emps");
    ...
    public static <T> Collection<T> select(Class<T> c, String sqlStatement) {
        Collection<T> result = new ArrayList<T>();
        /* run sql query using jdbc */
        for ( /* iterate over jdbc results */ ) {
            T item = c.newInstance();
            /* use reflection and set all of item's fields from sql results */
            result.add(item);
        }
        return result;
    }

giving us the precise type of collection in a type safe way.

This technique of using class literals as run time type tokens is a very useful trick to know. It’s an idiom that’s used extensively in the new APIs for manipulating annotations, for example.
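The idiom can be tried without a database. In the sketch below, createAll is a stand-in invented for this example (it is not the tutorial's select method) and simply builds a fixed number of objects. It calls getDeclaredConstructor().newInstance(), since Class.newInstance() has been deprecated in later Java releases; EmpInfo here is a placeholder class with a no-argument constructor.

```java
import java.util.ArrayList;
import java.util.Collection;

class TypeTokenFactoryDemo {
    // Placeholder record type; it must have an accessible no-argument constructor.
    public static class EmpInfo {
        public String name = "unknown";
    }

    // The class literal acts as the factory: the result is typed Collection<T>.
    static <T> Collection<T> createAll(Class<T> c, int rows)
            throws ReflectiveOperationException {
        Collection<T> result = new ArrayList<T>();
        for (int i = 0; i < rows; i++) {
            T item = c.getDeclaredConstructor().newInstance();
            result.add(item);
        }
        return result;
    }

    static int demo() {
        try {
            Collection<EmpInfo> emps = createAll(EmpInfo.class, 3);
            for (EmpInfo e : emps) {            // no cast needed
                if (!"unknown".equals(e.name)) {
                    throw new IllegalStateException("unexpected field value");
                }
            }
            return emps.size();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```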

Now let’s turn to a more realistic example. A java.util.TreeSet<E> represents a tree of elements of type E that are ordered. One way to construct a TreeSet is to pass a Comparator object to the constructor. That comparator will be used to sort the elements of the TreeSet according to a desired ordering.

    TreeSet(Comparator<E> c)

The Comparator interface is essentially:

    interface Comparator<T> {
        int compare(T fst, T snd);
    }

Suppose we want to create a TreeSet<String> and pass in a suitable comparator. We need to pass it a Comparator that can compare Strings. This can be done by a Comparator<String>, but a Comparator<Object> will do just as well. However, we won’t be able to invoke the constructor given above on a Comparator<Object>. We can use a lower bounded wildcard to get the flexibility we want:

    TreeSet(Comparator<? super E> c)

This allows any applicable comparator to be used.

As a final example of using lower bounded wildcards, let’s look at the method Collections.max(), which returns the maximal element in a collection passed to it as an argument.

Now, in order for max() to work, all elements of the collection being passed in must implement Comparable. Furthermore, they must all be comparable to each other.

A first attempt at generifying this method signature yields

    public static <T extends Comparable<T>> T max(Collection<T> coll)

That is, the method takes a collection of some type T that is comparable to itself, and returns an element of that type. This turns out to be too restrictive.

To see why, consider a type that is comparable to arbitrary objects:

    class Foo implements Comparable<Object> {...}
    ...
    Collection<Foo> cf = ...;
    Collections.max(cf); // should work

Every element of cf is comparable to every other element in cf, since every such element is a Foo, which is comparable to any object, and in particular to another Foo. However, using the signature above, we find that the call is rejected. The inferred type must be Foo, but Foo does not implement Comparable<Foo>.

It isn’t necessary that T be comparable to exactly itself. All that’s required is that T be comparable to one of its supertypes. This gives us:^4

    public static <T extends Comparable<? super T>> T max(Collection<T> coll)

(^4) The actual signature of Collections.max() is more involved. We return to it in section 10.

This reasoning applies to almost any usage of Comparable that is intended to work for arbitrary types: You always want to use Comparable<? super T>. In general, if you have an API that only uses a type parameter T as an argument, its uses should take advantage of lower bounded wildcards (? super T). Conversely, if the API only returns T, you’ll give your clients more flexibility by using upper bounded wildcards (? extends T).
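Both uses of lower bounded wildcards can be exercised in a small sketch. Fruit, Apple, the length-based comparator and the local max method are all assumptions made for this illustration. The first part builds a TreeSet<String> from a Comparator<Object>; the second calls a max method with the Comparable<? super T> bound on a collection whose element type is comparable only via its supertype.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

class LowerBoundDemo {
    // Hypothetical hierarchy: Apple is Comparable<Fruit>, not Comparable<Apple>.
    static class Fruit implements Comparable<Fruit> {
        final int weight;
        Fruit(int weight) { this.weight = weight; }
        public int compareTo(Fruit other) { return Integer.compare(weight, other.weight); }
    }

    static class Apple extends Fruit {
        Apple(int weight) { super(weight); }
    }

    // Simplified max with the bound discussed above (assumes a non-empty collection).
    static <T extends Comparable<? super T>> T max(Collection<T> coll) {
        Iterator<T> it = coll.iterator();
        T best = it.next();
        while (it.hasNext()) {
            T next = it.next();
            if (next.compareTo(best) > 0) {
                best = next;
            }
        }
        return best;
    }

    static String shortestFirst() {
        // A comparator on Object is accepted because the TreeSet constructor
        // takes Comparator<? super E>.
        Comparator<Object> byTextLength = new Comparator<Object>() {
            public int compare(Object a, Object b) {
                return Integer.compare(a.toString().length(), b.toString().length());
            }
        };
        TreeSet<String> set = new TreeSet<String>(byTextLength);
        set.add("ccc");
        set.add("a");
        set.add("bb");
        return set.first();
    }

    static int heaviestApple() {
        Collection<Apple> apples = new ArrayList<Apple>();
        apples.add(new Apple(1));
        apples.add(new Apple(3));
        apples.add(new Apple(2));
        return max(apples).weight;  // with <T extends Comparable<T>> this would not compile
    }

    public static void main(String[] args) {
        System.out.println(shortestFirst());
        System.out.println(heaviestApple());
    }
}
```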

9.1 Wildcard Capture

It should be pretty clear by now that given

    Set<?> unknownSet = new HashSet<String>();
    ...
    /** Add an element t to a Set s */
    public static <T> void addToSet(Set<T> s, T t) {...}

the call below is illegal.

    addToSet(unknownSet, "abc"); // illegal

It makes no difference that the actual set being passed is a set of strings; what matters is that the expression being passed as an argument is a set of an unknown type, which cannot be guaranteed to be a set of strings, or of any type in particular.

Now, consider

    class Collections {
        ...
        public static <T> Set<T> unmodifiableSet(Set<T> set) { ... }
    }
    ...
    Set<?> s = Collections.unmodifiableSet(unknownSet); // this works! Why?

It seems this should not be allowed; yet, looking at this specific call, it is perfectly safe to permit it. After all, unmodifiableSet() does work for any kind of Set, regardless of its element type.

Because this situation arises relatively frequently, there is a special rule that allows such code under very specific circumstances in which the code can be proven to be safe. This rule, known as wildcard capture, allows the compiler to infer the unknown type of a wildcard as a type argument to a generic method.
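A common way to take advantage of wildcard capture in your own code is a private generic helper method. The sketch below uses invented names (swapFirstTwo, swapHelper). Elements read from a List<?> cannot be written back directly, but passing the list to a generic helper lets the compiler capture the unknown element type as T, after which the writes type-check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WildcardCaptureDemo {
    // Public API: callers may pass a list of any element type.
    static void swapFirstTwo(List<?> list) {
        // list.set(0, list.get(1));   // compile-time error if uncommented
        swapHelper(list);              // the wildcard is captured as T
    }

    // Inside the helper the element type has a name, so reads and writes agree.
    private static <T> void swapHelper(List<T> list) {
        T first = list.get(0);
        list.set(0, list.get(1));
        list.set(1, first);
    }

    static String demo() {
        List<String> names = new ArrayList<String>(Arrays.asList("a", "b", "c"));
        List<?> unknown = names;
        swapFirstTwo(unknown);
        return names.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```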

10 Converting Legacy Code to Use Generics

Earlier, we showed how new and legacy code can interoperate. Now, it’s time to look at the harder problem of “generifying” old code. If you decide to convert old code to use generics, you need to think carefully about how you modify the API.