Monday, April 12, 2010

The Contains Trap in Java Collections

One of the nasty little traps a Java developer can run into occurs when Collection.contains(Object) is not used with appropriate understanding. I demonstrate this potential trap in this post.

Collection.contains(Object) accepts an Object, which means it essentially accepts an instance of any Java class. This is where the potential trap lies. If one passes in an instance of a class other than the the type of classes that can be stored in a particular collection, it can be that this method will simply return false. This isn't really wrong because a different type than the collection holds is obviously not part of that collection. However, it can be a trap if the developer relies on the returned value from the contains call to implement other logic.

This is demonstrated in the next code listing.


public void demonstrateIllConceivedContainsBasedCode()
{
final Set<String> favoriteChildrensBooks = new HashSet<String>();
favoriteChildrensBooks.add("Mrs. Frisby and the Rats of NIMH");
favoriteChildrensBooks.add("The Penguin that Hated the Cold");
favoriteChildrensBooks.add("The Bears' Vacation");
favoriteChildrensBooks.add("Green Eggs and Ham");
favoriteChildrensBooks.add("A Fish Out of Water");
favoriteChildrensBooks.add("The Lorax");
final Date date = new Date();
if (favoriteChildrensBooks.contains(date))
{
out.println("That is a great book!");
}
}


In the code listing above, the statement printing out "That is a great book!" will never be executed because a Date will never be contained in that Set.

There is no warning condition for this, even with javac's -Xlint option set. However, NetBeans 6.8 does provide a warning for this as demonstrated in the next screen snapshot.



As the screen snapshot indicates, NetBeans 6.8 provides the nice and fairly clear warning message, "Suspicious call to java.util.Collection.contains: Given object cannot contain instances of Date (expected String)." This is definitely "suspicious" and is almost never what the developer really desired.

It's not necessarily surprising that the contains method returns false rather than some type of error message or exception because it is certainly true that the Set in this example did not contain the Date for which the question was asked. One tactic that can be used to have at least a runtime check for the proper class in a call to contains is to use a collection type that implements the optional throwing of a ClassCastException when appropriate.

The Javadoc documentation for the Collection, Set, List, and Map interfaces' respective contains methods all state that they throw ClassCastException "if the type of the specified element is incompatible with this collection (optional)" (Collection), "if the type of the specified element is incompatible with this set (optional)" (Set), "if the type of the specified element is incompatible with this list (optional)" (List), and "if the key is of an inappropriate type for this map (optional) " (Map.containsKey). The most important thing to note is that each of these declares the throwing of ClassCastException as optional.

In the code above, I used a HashSet, which does not throw a ClassCastException when an incompatible object type is passed to its contains method. Indeed, the Javadoc documentation for HashSet.contains(Object) makes no mention of throwing a ClassCastException. Similarly, LinkedHashSet extends HashSet and inherits the same contains as HastSet. The TreeSet, on the other hand, has Javadoc comments that state that TreeSet.contains(Object) does throw a ClassCastException "if the specified object cannot be compared with the elements currently in the set." So the TreeSet does throw an exception when an incomparable object is provided to its contains method.

I will now demonstrate the differences in these behaviors with some code samples.

The first class to be used here is the Person class.

Person.java

/*
* http://marxsoftware.blogspot.com/
*/
package dustin.examples;

import java.io.Serializable;

public final class Person implements Comparable, Serializable
{
private final String lastName;
private final String firstName;

public Person(final String newLastName, final String newFirstName)
{
this.lastName = newLastName;
this.firstName = newFirstName;
}

public String getLastName()
{
return this.lastName;
}

public String getFirstName()
{
return this.firstName;
}

@Override
public boolean equals(Object obj)
{
if (obj == null)
{
return false;
}
if (getClass() != obj.getClass())
{
return false;
}
final Person other = (Person) obj;
if (this.lastName == null ? other.lastName != null : !this.lastName.equals(other.lastName))
{
return false;
}
if (this.firstName == null ? other.firstName != null : !this.firstName.equals(other.firstName))
{
return false;
}
return true;
}

@Override
public int hashCode()
{
int hash = 5;
hash = 59 * hash + (this.lastName != null ? this.lastName.hashCode() : 0);
hash = 59 * hash + (this.firstName != null ? this.firstName.hashCode() : 0);
return hash;
}

public int compareTo(Object anotherPerson) throws ClassCastException
{
if (!(anotherPerson instanceof Person))
{
throw new ClassCastException("A Person object expected.");
}
final Person theOtherPerson = (Person) anotherPerson;
final int lastNameComparisonResult =
this.lastName.compareTo(theOtherPerson.lastName);
return lastNameComparisonResult != 0
? lastNameComparisonResult
: this.firstName.compareTo(theOtherPerson.firstName);
}

@Override
public String toString()
{
return this.firstName + " " + this.lastName;
}
}


Another classed used in my examples is the InanimateObject class.

Non-Comparable InanimateObject.java

/*
* http://marxsoftware.blogspot.com/
*/
package dustin.examples;

public class InanimateObject
{
private final String name;
private final String secondaryName;
private final int yearOfOrigin;

public InanimateObject(
final String newName, final String newSecondaryName, final int newYear)
{
this.name = newName;
this.secondaryName = newSecondaryName;
this.yearOfOrigin = newYear;
}

public String getName()
{
return this.name;
}

public String getSecondaryName()
{
return this.secondaryName;
}

public int getYearOfOrigin()
{
return this.yearOfOrigin;
}

@Override
public boolean equals(Object obj)
{
if (obj == null)
{
return false;
}
if (getClass() != obj.getClass())
{
return false;
}
final InanimateObject other = (InanimateObject) obj;
if (this.name == null ? other.name != null : !this.name.equals(other.name))
{
return false;
}
if (this.yearOfOrigin != other.yearOfOrigin)
{
return false;
}
return true;
}

@Override
public int hashCode()
{
int hash = 3;
hash = 23 * hash + (this.name != null ? this.name.hashCode() : 0);
hash = 23 * hash + this.yearOfOrigin;
return hash;
}

@Override
public String toString()
{
return this.name + " (" + this.secondaryName + "), created in " + this.yearOfOrigin;
}
}



The main executable class for testing these things is SetContainsExample.

SetContainsExample.java

/*
* http://marxsoftware.blogspot.com/
*/
package dustin.examples;

import static java.lang.System.out;

import java.util.Arrays;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetContainsExample
{
final Person davidLightman = new Person("Lightman", "David");
final Person willFarmer = new Person("Farmer", "Will");
final Person daveBowman = new Person("Bowman", "Dave");
final Person jerryShaw = new Person("Shaw", "Jerry");
final Person delSpooner = new Person("Spooner", "Del");

final InanimateObject wopr =
new InanimateObject("War Operation Plan Response", "WOPR", 1983);
final InanimateObject ripley =
new InanimateObject("R.I.P.L.E.Y", "R.I.P.L.E.Y", 2008);
final InanimateObject hal =
new InanimateObject("Heuristically programmed ALgorithmic Computer", "HAL9000", 1997);
final InanimateObject ariia =
new InanimateObject("Autonomous Reconnaissance Intelligence Integration Analyst", "ARIIA", 2009);
final InanimateObject viki =
new InanimateObject("Virtual Interactive Kinetic Intelligence", "VIKI", 2035);

public Set<Person> createPeople(final Class setType)
{
Set<Person> people = new HashSet<Person>();
if (validateSetImplementation(setType))
{
if (HashSet.class.equals(setType))
{
people = new HashSet<Person>();
}
else if (LinkedHashSet.class.equals(setType))
{
people = new LinkedHashSet<Person>();
}
else if (TreeSet.class.equals(setType))
{
people = new TreeSet<Person>();
}
else if (EnumSet.class.equals(setType))
{
out.println("ERROR: EnumSet is inappropriate type of Set here.");
}
else
{
out.println("WARNING: " + setType.getName() + " is an unexpected Set implementation.");
}
}
else
{
out.println("WARNING: " + setType.getName() + " is not a Set implementation.");
people = new HashSet<Person>();
}

people.add(davidLightman);
people.add(willFarmer);
people.add(daveBowman);
people.add(jerryShaw);
people.add(delSpooner);
return people;
}

private boolean validateSetImplementation(final Class candidateSetImpl)
{
if (candidateSetImpl.isInterface())
{
throw new IllegalArgumentException(
"Provided setType needs to be an implementation, but an interface ["
+ candidateSetImpl.getName() + "] was provided." );
}
final Class[] implementedInterfaces = candidateSetImpl.getInterfaces();
final List<Class> implementedIFs = Arrays.asList(implementedInterfaces);
return implementedIFs.contains(java.util.Set.class)
|| implementedIFs.contains(java.util.NavigableSet.class)
|| implementedIFs.contains(java.util.SortedSet.class);
}

public void testSetContains(final Set<Person> set, final String title)
{
printHeader(title);
out.println("Chosen Set Implementation: " + set.getClass().getName());

final Person person = davidLightman;
out.println(
set.contains(person)
? person + " is one of my people."
: person + " is NOT one of my people.");

final Person luke = new Person("Skywalker", "Luke");
out.println(
set.contains(luke)
? luke + " is one of my people."
: luke + " is NOT one of my people.");

out.println(
set.contains(wopr)
? wopr + " is one of my people."
: wopr + " is NOT one of my people.");
}

private void printHeader(final String headerText)
{
out.println();
out.println(
"==================================================================");
out.println("== " + headerText);
out.println(
"==================================================================");
}

public static void main(final String[] arguments)
{
final SetContainsExample me = new SetContainsExample();

final Set<Person> peopleHash = me.createPeople(HashSet.class);
me.testSetContains(peopleHash, "HashSet");

final Set<Person> peopleLinkedHash = me.createPeople(LinkedHashSet.class);
me.testSetContains(peopleLinkedHash, "LinkedHashSet");

final Set<Person> peopleTree = me.createPeople(TreeSet.class);
me.testSetContains(peopleTree, "TreeSet");
}
}


When the above code is run as-is (without InanimateObject being Comparable), the output appears as shown in the next screen snapshot.



The ClassCastException tells us, "dustin.examples.InanimateObject cannot be cast to java.lang.Comparable." This makes sense because that class does not implement Comparable, which is necessary for use with the TreeMap.contains(Object) method. When made Comparable, the class looks something like that shown next.

Comparable InanimateObject.java

/*
* http://marxsoftware.blogspot.com/
*/
package dustin.examples;

public class InanimateObject implements Comparable
{
private final String name;
private final String secondaryName;
private final int yearOfOrigin;

public InanimateObject(
final String newName, final String newSecondaryName, final int newYear)
{
this.name = newName;
this.secondaryName = newSecondaryName;
this.yearOfOrigin = newYear;
}

public String getName()
{
return this.name;
}

public String getSecondaryName()
{
return this.secondaryName;
}

public int getYearOfOrigin()
{
return this.yearOfOrigin;
}

@Override
public boolean equals(Object obj)
{
if (obj == null)
{
return false;
}
if (getClass() != obj.getClass())
{
return false;
}
final InanimateObject other = (InanimateObject) obj;
if (this.name == null ? other.name != null : !this.name.equals(other.name))
{
return false;
}
if (this.yearOfOrigin != other.yearOfOrigin)
{
return false;
}
return true;
}

@Override
public int hashCode()
{
int hash = 3;
hash = 23 * hash + (this.name != null ? this.name.hashCode() : 0);
hash = 23 * hash + this.yearOfOrigin;
return hash;
}

public int compareTo(Object anotherObject) throws ClassCastException
{
if (!(anotherObject instanceof InanimateObject))
{
throw new ClassCastException("An InanimateObject object expected.");
}
final InanimateObject theOtherObject = (InanimateObject) anotherObject;
final int yearOfOriginComparisonResult =
this.yearOfOrigin - theOtherObject.yearOfOrigin;
return yearOfOriginComparisonResult != 0
? yearOfOriginComparisonResult
: this.name.compareTo(theOtherObject.name);
}

@Override
public String toString()
{
return this.name + " (" + this.secondaryName + "), created in " + this.yearOfOrigin;
}
}


With InanimateObject now implementing Comparable, the error message when trying to call Set.contains(Object) on an object whose type does not compare well with that type held in that Set can now be seen. In fact, the message now seen this the one provided in the compareTo method of the InanimateObject. This is demonstrated in the next screen snapshot.



By virtue of TreeSet checking a Comparable before trying to determine if the TreeSet contains that object, we get a runtime exception and message saying that this makes no sense.


Dealing with the Collection.contains(Object) Issue

In this post, I have demonstrated how the Collection.contains(Object) method (including the Map.containsKey(Object) method) can behave differently depending on how the particular collection implementation decides to throw or not throw the option ClassCastException when a provided object is not of a relevant class to the objects stored in that collection. In the many cases where the ClassCastException is not thrown, this can be a potential error that is difficult to track down. There are some ways to reduce this possibility.

1. Careful coding and code reviews might lead to developers or reviewers seeing that an object of an irrelevant class is being passed to the contains method. I'm typically uncomfortable leaving it with just this form of protection.

2. Use of a modern version of NetBeans or of similar tools that flag the "suspicious" behavior can be very helpful.

3. Unit tests combined with code coverage tests can be helpful in identifying seemingly strange flows through the code that can be attributed to issues such as the one described in this post.

4. Collection implementations that do throw ClassCastException at least provide a runtime error to let developers know that an improper action is being taken. It may not be as effective as compile-time detection, but it does make it much easier to identify that there is a problem and what it is when it happens then without it. However, the major disadvantage with this approach is that the choice of which collection implementation to use is often driven by important considerations that often outweigh the desire to have runtime detection of an errant call to contains.

5. Logging the results of calls to contains (both the positive and negative results) can be helpful, but can clutter the code and the log files.

6. Debugging is an obvious way to detect this problem, but is often more time-consuming than some of the approaches addressed earlier.

I generally prefer the first four steps over the last two because the first three steps are more preventative than the last two and the fourth step gives a clear indication of a problem when it occurs.

1 comment:

Infnordz said...

It really is quit ridiculous and IMHO a stupid bug that the Generics changes, introduced with Java 1.5, allowed any Object parameters, for any methods in Collections classes!

A Collections.checkedMap() wrapper should provide a solution, but are not in Java 1.5 at least, because all he contains methods are unchecked there too; WTF! This is IMHO a serious bug, because it is counter-intuitive and really quite stupid!