Monday, October 31, 2011

Filtering and Transforming Java Collections with Guava's Collections2

One of the conveniences of Groovy is the ability to easily perform filtering and transformation operations on collections via Groovy's closure support. Guava brings filtering and transformation on collections to standard Java and that is the subject of this post.

As of this writing, Guava's Collections2 class features two public methods, both of which are static. The methods filter(Collection, Predicate) and transform(Collection, Function) do what their names imply: they filter and transform a given collection, respectively. The collection to be filtered or transformed is the first parameter to each static method. The filtering method's second argument is an implementation of Guava's Predicate interface, and the transformation method's second argument is an implementation of Guava's Function interface. The remainder of this post demonstrates combining all of this together to filter and transform Java collections.

Filtering Collections with Guava

Filtering collections with Guava is fairly straightforward, as the following code snippet demonstrates. A Set of Strings is provided (built by buildSetStrings(), which appears in the full listing later in this post) and that Set is filtered for only entries beginning with a capital 'J'. This is done with Guava's Predicates.containsPattern(String), which relies on Java's regular expression support; the pattern "^J" matches only at the start of each String. Other types of Predicates can be specified as well.

Filtering Strings Beginning with 'J'
   /**
    * Demonstrate Guava's Collections2.filter method. Filter Strings beginning
    * with letter 'J'.
    */
   public static void demonstrateFilter()
   {
      printHeader("Collections2.filter(Collection,Predicate): 'J' Languages");
      final Set<String> strings = buildSetStrings();
      out.println("\nOriginal Strings (pre-filter):\n\t" + strings);
      final Collection<String> filteredStrings =
              Collections2.filter(strings, Predicates.containsPattern("^J"));
      out.println("\nFiltered Strings:\n\t" + filteredStrings);
      out.println("\nOriginal Strings (post-filter):\n\t" + strings);
   }

Running the above method prints the lengthy list of programming languages that make up the original Set of Strings returned by buildSetStrings() [source code shown later in the post], followed by the results of the filter call, which feature only those programming languages that begin with 'J', and finally the original Set once more.
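Predicates.containsPattern(String) is only one way to obtain a Predicate. As a minimal sketch of another, the listing below writes a Predicate by hand as an anonymous class. The class name ShortNameFilterSketch, the helper noLongerThan(int), and the length limit of 4 are all hypothetical names and values chosen for illustration; only Collections2.filter and the Predicate interface come from Guava.

```java
import com.google.common.base.Predicate;
import com.google.common.collect.Collections2;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/**
 * Illustrative sketch (not part of the original example class): filtering
 * with a hand-written Predicate instead of Predicates.containsPattern.
 */
class ShortNameFilterSketch
{
   /** Hypothetical helper: keeps Strings of at most maxLength characters. */
   static Predicate<String> noLongerThan(final int maxLength)
   {
      return new Predicate<String>()
      {
         @Override
         public boolean apply(final String input)
         {
            return input != null && input.length() <= maxLength;
         }
      };
   }

   /** Returns a filtered view containing only names of four characters or fewer. */
   static Collection<String> shortNames(final Collection<String> names)
   {
      return Collections2.filter(names, noLongerThan(4));
   }

   public static void main(final String[] arguments)
   {
      final List<String> languages =
         Arrays.asList("Java", "Groovy", "C", "Perl", "Fortran");
      System.out.println(shortNames(languages));   // prints [Java, C, Perl]
   }
}
```

A List is used here rather than a HashSet so that the printed order is predictable; the filtered view iterates in the same order as its source.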

Transforming Collections with Guava

Transforming collections with Guava is syntactically similar to filtering, but a Function specifies how each entry of the source collection is "transformed" into an entry of the output collection, whereas a Predicate determines which entries to keep. The following code snippet demonstrates transforming the entries of a given collection to their uppercase versions.

Transforming Entries to Uppercase
   /**
    * Demonstrate Guava's Collections2.transform method. Transform input
    * collection's entries to uppercase form.
    */
   public static void demonstrateTransform()
   {
      printHeader("Collections2.transform(Collection,Function): Uppercase");
      final Set<String> strings = buildSetStrings();
      out.println("\nOriginal Strings (pre-transform):\n\t" + strings);
      final Collection<String> transformedStrings = 
              Collections2.transform(strings, new UpperCaseFunction<String, String>());
      out.println("\nTransformed Strings:\n\t" + transformedStrings);
      out.println("\nOriginal Strings (post-transform):\n\t" + strings);
   }

The above transformation code snippet made use of a class called UpperCaseFunction, but you won't find that class in the Guava API documentation. That is a custom class defined as shown in the next code listing.

UpperCaseFunction.java
package dustin.examples;

import com.google.common.base.Function;

/**
 * Simple Guava Function that converts provided object's toString() representation
 * to upper case.
 * 
 * @author Dustin
 */
public class UpperCaseFunction<F, T> implements Function<F, T>
{
   @Override
   public Object apply(Object f)
   {
      return f.toString().toUpperCase();
   }
}

Running the transformation code snippet that uses the UpperCaseFunction class prints the original Set, then the transformed collection with every entry in uppercase, and then the original Set once more.
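One thing to be aware of in UpperCaseFunction is that its apply method is declared in terms of Object rather than the type parameters F and T, so the compiler cannot check that the declared output type is really String. For instance, an UpperCaseFunction<String, Integer> would appear to compile, and the mismatch would only surface as a ClassCastException when an element is read as an Integer. The following is a sketch of a more strictly typed alternative, not a replacement for the class above; the names TypedUpperCaseSketch, ToUpperCase, and upperCased are hypothetical.

```java
import com.google.common.base.Function;
import com.google.common.collect.Collections2;
import java.util.Arrays;
import java.util.Collection;

/**
 * Illustrative sketch (not part of the original example class): an
 * uppercase Function whose input and output types are fixed.
 */
class TypedUpperCaseSketch
{
   /** Hypothetical alternative to UpperCaseFunction: any Object in, String out. */
   static final class ToUpperCase implements Function<Object, String>
   {
      @Override
      public String apply(final Object input)
      {
         return input.toString().toUpperCase();
      }
   }

   /** Returns a transformed view of the source with every entry uppercased. */
   static Collection<String> upperCased(final Collection<String> source)
   {
      return Collections2.transform(source, new ToUpperCase());
   }

   public static void main(final String[] arguments)
   {
      System.out.println(upperCased(Arrays.asList("Java", "Groovy", "Scala")));
      // prints [JAVA, GROOVY, SCALA]
   }
}
```

Because Collections2.transform accepts a Function<? super F, T>, a Function<Object, String> can be applied to a Collection<String> without any casts.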

The above code snippets showed methods devoted to filtering and transforming collections' entries with Guava. The entire code listing for the main class is shown next.

GuavaCollections2.java
package dustin.examples;

import static java.lang.System.out;

import com.google.common.base.Predicates;
import com.google.common.collect.Collections2;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/**
 * Class whose sole reason for existence is to demonstrate Guava's Collections2
 * class.
 * 
 * @author Dustin
 */
public class GuavaCollections2
{
   /**
    * Provides a Set of Strings.
    * 
    * @return Set of strings representing some programming languages.
    */
   private static Set<String> buildSetStrings()
   {
      final Set<String> strings = new HashSet<String>();
      strings.add("Java");
      strings.add("Groovy");
      strings.add("Jython");
      strings.add("JRuby");
      strings.add("Python");
      strings.add("Ruby");
      strings.add("Perl");
      strings.add("C");
      strings.add("C++");
      strings.add("C#");
      strings.add("Pascal");
      strings.add("Fortran");
      strings.add("Cobol");
      strings.add("Scala");
      strings.add("Clojure");
      strings.add("Basic");
      strings.add("PHP");
      strings.add("Flex/ActionScript");
      strings.add("JOVIAL");
      return strings;
   }

   /**
    * Demonstrate Guava's Collections2.filter method. Filter Strings beginning
    * with letter 'J'.
    */
   public static void demonstrateFilter()
   {
      printHeader("Collections2.filter(Collection,Predicate): 'J' Languages");
      final Set<String> strings = buildSetStrings();
      out.println("\nOriginal Strings (pre-filter):\n\t" + strings);
      final Collection<String> filteredStrings =
              Collections2.filter(strings, Predicates.containsPattern("^J"));
      out.println("\nFiltered Strings:\n\t" + filteredStrings);
      out.println("\nOriginal Strings (post-filter):\n\t" + strings);
   }

   /**
    * Demonstrate Guava's Collections2.transform method. Transform input
    * collection's entries to uppercase form.
    */
   public static void demonstrateTransform()
   {
      printHeader("Collections2.transform(Collection,Function): Uppercase");
      final Set<String> strings = buildSetStrings();
      out.println("\nOriginal Strings (pre-transform):\n\t" + strings);
      final Collection<String> transformedStrings = 
              Collections2.transform(strings, new UpperCaseFunction<String, String>());
      out.println("\nTransformed Strings:\n\t" + transformedStrings);
      out.println("\nOriginal Strings (post-transform):\n\t" + strings);
   }

   /**
    * Print a separation header including the provided text.
    * 
    * @param headerText Text to be included in separation header.
    */
   private static void printHeader(final String headerText)
   {
      out.println("\n==========================================================");
      out.println("== " + headerText);
      out.println("==========================================================");
   }

   /**
    * Main function for demonstrating Guava's Collections2 class.
    * 
    * @param arguments 
    */
   public static void main(final String[] arguments)
   {
      demonstrateFilter();
      demonstrateTransform();
   }
}

Before concluding this post, there is an important caveat to note here. Both methods defined on the Collections2 class contain warnings in their Javadoc documentation about their use. Both methods provide collections that are considered "live views" of the original collections and thus "changes to one affect the other." For example, removing an element from a source collection similarly removes it from the transformed collection. The documentation for each method also warns that neither returns a collection that is Serializable or thread-safe even when the source collection was Serializable and/or thread-safe.
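The "live view" behavior described above can be seen in a small sketch like the following. The class name LiveViewSketch and the helper startingWithJ are hypothetical; the sketch assumes a mutable ArrayList as the source so that elements can be added and removed after the view is created. Copying the view into a new collection is one way to get an independent snapshot when that is what is wanted.

```java
import com.google.common.base.Predicates;
import com.google.common.collect.Collections2;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/**
 * Illustrative sketch (not part of the original example class): the
 * collection returned by Collections2.filter is a live view of its source.
 */
class LiveViewSketch
{
   /** Returns a live, filtered view of the Strings beginning with 'J'. */
   static Collection<String> startingWithJ(final Collection<String> source)
   {
      return Collections2.filter(source, Predicates.containsPattern("^J"));
   }

   public static void main(final String[] arguments)
   {
      final List<String> source =
         new ArrayList<String>(Arrays.asList("Java", "Groovy", "Jython"));
      final Collection<String> jView = startingWithJ(source);
      System.out.println(jView);      // prints [Java, Jython]

      // Changes to the source show up in the view without filtering again.
      source.remove("Java");
      source.add("JRuby");
      System.out.println(jView);      // prints [Jython, JRuby]

      // Copying the view produces a snapshot that no longer tracks the source.
      final List<String> snapshot = new ArrayList<String>(jView);
      source.clear();
      System.out.println(jView);      // prints []
      System.out.println(snapshot);   // prints [Jython, JRuby]
   }
}
```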

Conclusion

Guava makes it easier to filter collections and transform collections' entries in Java. Although the code may not be as concise as that of Groovy for doing similar things, it beats writing straight Java code without use of Guava's Collections2 class. Java collections can be filtered with Collections2.filter(Collection,Predicate) or transformed with Collections2.transform(Collection,Function).

4 comments:

James said...

Hi Dustin

Excellent blog. I wonder if you'd be interested in joining up with DZone's MVB program?
You can read more about it at dzone.com/aboutmvb.

If you're interested, please contact me at james at dzone dot com.

Thanks
James

Joel said...

Nice helpful entry.

Your "Transforming Entries to Uppercase" snippet has an issue in the call to the UpperCaseFunction constructor. You have:

new UpperCaseFunction()

which is pretty funky for Java and doesn't match the same in the full text of the test class at the bottom of the article.

@DustinMarx said...

Joel,

Thanks for pointing that out. Every once in a while, I forget to escape the less-than sign or greater-than sign in use of generics and the browser tries to interpret them as HTML tags. That's what happened here, but I've fixed it now.

Thanks again for the feedback and for letting me know about that issue.

Dustin

Kamran Ali Khan said...

JFilter http://code.google.com/p/jfilter/ is a very simple and high performance api to filter java collection.