Comparing the Google Collections Library with the Apache Commons Collections

When I first learned about the Google Collections Library, I was puzzled. The Google Collections Library enhances the JDK’s Java Collections Framework, and its developers plan to make it part of JDK 7. As a user of Apache’s Jakarta Commons Collections, I didn’t understand why Google created an API to enhance the Java Collections Framework when Apache had already done so. I read in an interview with one of the Google Collections Library creators that a motivation for creating a new library was to provide JDK 5 Generics support, which Jakarta Commons Collections lacks. I also read that Jakarta Commons Collections contains a few classes that violate the Java Collections Framework specification.

Upon investigating these claims, I found that Jakarta Commons Collections indeed doesn’t support JDK 5 Generics, a definite flaw. However, I found a SourceForge project that claims to be a JDK 5 Generics-enabled version of Jakarta Commons Collections.

As for Java Collections Framework violations, I found that only the Bag interface methods in Jakarta Commons Collections violate the spec (as per the JavaDoc), and the Bag interface by its nature warrants this violation. Take the add method in the Bag interface as an example. The Java Collections Framework spec stipulates that add must return true whenever the collection changes as a result of the call. Adding to a Bag always changes it, so add should always return true, but Bag does not do this when the object added is equal to one already in the Bag. In that case, it only increases the count and the method returns false.

Coincidentally, I discovered that the src folder of the Jakarta Commons Collections project had not been updated in two months.

So I decided to explore the Google Collections Library further and find out what it had to offer a Java developer like me who’s been using Jakarta Commons Collections. In this article, I describe the packages and classes in the Google Collections Library (hereafter referred to simply as the Library). I then compare and contrast these classes with those in the Apache Jakarta Commons Collections (hereafter referred to simply as Commons Collections).

Google Collections Library Packages
The Library is organized into two packages:

  • com.google.common.base: This package contains a few common utility classes, which can be used even without the Java Collections Framework.
  • com.google.common.collect: This package hosts all the enhancements to the Java Collections Framework.

This organization enables a programmer to grasp the Library more quickly than he or she would learn Commons Collections. For instance, whereas decorator classes are scattered throughout all the packages in Commons Collections, the Library keeps them all in one package (com.google.common.collect) and prefixes them with ‘Forwarding’.

The com.google.common.base Package
This section details some of the common utility classes in the com.google.common.base package.

Working with Soft, Weak, and Phantom References
If you have used any of the SoftReference, WeakReference, or PhantomReference classes in the JDK package java.lang.ref directly and written code to work with the ReferenceQueue, you will appreciate the following Library classes:

  • FinalizablePhantomReference
  • FinalizableSoftReference
  • FinalizableWeakReference

These classes extend their respective java.lang.ref classes, take care of processing the ReferenceQueue, and call back a convenient method, finalizeReferent(), defined in these classes. So if you have to do some cleanup operation when an object is reclaimed by the garbage collector (GC), just overriding the finalizeReferent() method will do the trick.

For example, suppose you are using the ImageChunks class to store binary image data in memory, and it is acceptable for the GC to reclaim objects of this type if the JVM runs low on memory. You can extend the FinalizableSoftReference class (with, say, the SoftImageChunks class) and override the finalizeReferent method. This approach allows you to write the cleanup operation that should be called when the GC claims your object. The primary advantage of FinalizableSoftReference is that you don’t need to deal with the ReferenceQueue directly.
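To show what the Finalizable classes save you from writing, here is a minimal plain-JDK sketch of the ReferenceQueue handling you would otherwise do yourself. It does not use the Library. The cleanUp method, the drain helper, and the body of ImageChunks are hypothetical names I made up for illustration, and the call to enqueue() only simulates the GC so that the demo is deterministic.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;

// Hypothetical stand-in for the ImageChunks class mentioned in the text.
final class ImageChunks {
    final byte[] data;

    ImageChunks(byte[] data) {
        this.data = data;
    }
}

// A soft reference that carries its own cleanup hook (hypothetical design).
final class SoftImageChunks extends SoftReference<ImageChunks> {
    boolean cleanedUp = false;

    SoftImageChunks(ImageChunks referent, ReferenceQueue<ImageChunks> queue) {
        super(referent, queue);
    }

    // Release any resources tied to the reclaimed image here.
    void cleanUp() {
        cleanedUp = true;
    }
}

final class ReferenceQueueDemo {
    // Drains the queue and runs the cleanup hook of every enqueued reference.
    static int drain(ReferenceQueue<ImageChunks> queue) {
        int cleaned = 0;
        Reference<? extends ImageChunks> ref;
        while ((ref = queue.poll()) != null) {
            ((SoftImageChunks) ref).cleanUp();
            cleaned++;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        ReferenceQueue<ImageChunks> queue = new ReferenceQueue<>();
        SoftImageChunks ref = new SoftImageChunks(new ImageChunks(new byte[1024]), queue);
        ref.enqueue(); // simulate the GC enqueuing the reference
        System.out.println("cleaned up: " + drain(queue));
    }
}
```

In real code, something has to call drain periodically or block on the queue in a background thread, which is exactly the plumbing the article credits the Library classes with hiding.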

Commons Collections doesn’t have anything equivalent for working with the ReferenceQueue since it is not part of collections enhancement.

Function
The Function interface has the method apply, which is used for transforming one value into another (String to Integer, for example). Function is used in Maps and Comparators in the com.google.common.collect package for performing transformations. Here are a few notable Function uses:

  • Maps.uniqueIndex(..) lets you apply a Function to each value in a collection to compute the key for that value. As a result, you get a Map instance that maps each computed key to the value that produced it.
  • Comparators.fromFunction(Function function) lets you create a Comparator that will apply the Function to both of the items compared. Since the Comparator is created from within this method, the natural ordering of the Function results is used for comparison.
  • The method forMap in the Functions utility class is handy for performing a lookup on a map to do a transformation. This method takes in a Map instance and returns a Function. When the method apply is called, the input is used as the key and the corresponding value will be looked up from the Map.

You can find Function being used in the classes Lists, Iterators, Iterables, and Functions.
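The map-lookup and compare-by-function ideas above can be sketched with the JDK alone. This sketch assumes Java 8 or later and uses java.util.function.Function as a stand-in for the Library's Function interface; the helper names merely echo the methods described above and are not the Library's implementations. In particular, this forMap returns null for a missing key, which may differ from what the Library does.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class FunctionSketch {
    // Returns a function that treats its input as a key and looks it up in the map.
    static <K, V> Function<K, V> forMap(Map<K, V> map) {
        return key -> map.get(key);
    }

    // Returns a comparator that orders items by the natural order of the function's results.
    static <F, T extends Comparable<T>> Comparator<F> fromFunction(Function<F, T> function) {
        return (a, b) -> function.apply(a).compareTo(function.apply(b));
    }

    public static void main(String[] args) {
        Map<String, Integer> ranks = new HashMap<>();
        ranks.put("low", 1);
        ranks.put("high", 3);

        Function<String, Integer> rankOf = forMap(ranks);
        Comparator<String> byRank = fromFunction(rankOf);

        System.out.println(rankOf.apply("high"));
        System.out.println(byRank.compare("low", "high") < 0);
    }
}
```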

The equivalent Commons Collections interface to Function is org.apache.commons.collections.Transformer. It is used in the following classes:

  • TransformedBag
  • TransformedBuffer
  • TransformedList
  • TransformedSet
  • TransformedMap

Transformer can be used along with Predicates as well. (Refer to TransformerUtils in the JavaDoc for more details.)

Nullable
The Nullable annotation type is a cool thing. A general misunderstanding among business application developers is that their applications should not throw run-time exceptions. Nullable makes run-time exceptions an acceptable and neat way to write methods that bind the caller to a contract. That is, Nullable marks the arguments for which null is allowed; if a method cannot accept null for an argument, the argument is left unmarked and the developer can perform a null check that raises an error in the first line of the method.

Commons Collections doesn’t have an equivalent to Nullable.

Objects and Preconditions
Objects and Preconditions are utility classes that can be used even without the Java Collections Framework:

  • The Objects class contains a series of utility methods to aid calling toString, hashCode, and equals methods on more than one object. It uses Arrays internally to delegate calls to the appropriate type.
  • The Preconditions class is very useful for checking the value of the argument passed to a method and raising an error if appropriate.
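The style of first-line argument checking described here can be sketched with plain JDK code. This is not the Library's Preconditions class: Objects.requireNonNull comes from the JDK (Java 7 and later), while require and applyRaise are hypothetical names of my own.

```java
import java.util.Objects;

final class PreconditionSketch {
    // Hypothetical helper: rejects an argument that breaks the method's contract.
    static void require(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Fails fast in the first lines, before any work is done.
    static double applyRaise(Double salary, double percent) {
        Objects.requireNonNull(salary, "salary must not be null");
        require(percent >= 0, "percent must not be negative");
        return salary * (1 + percent / 100);
    }

    public static void main(String[] args) {
        System.out.println(applyRaise(1000.0, 10));
        try {
            applyRaise(null, 10);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The caller learns about a broken contract immediately, with a message that names the offending argument, instead of hitting an obscure failure deeper in the method.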

Predicates
Predicates can be used to model a scenario in which you want to iterate a collection (for example, when you want to go through a set of employee objects and filter them based on the attribute of salary). Using predicates will save you from having to write the same code repeatedly to perform if checks on the objects added in a list.

When compared to Commons Collections Functors, the Library predicates lack closures support. Predicates allow only filter and find operations, which can be run against a collection. For example, the following filter method in the com.google.common.collect.Iterators class allows filtering a collection based on the predicate passed:

static <T> Iterator<T> filter(Iterator<T> unfiltered, Predicate<? super T> predicate)

So the Library’s predicate is ideal for developers intending to get the subset of a collection based on a filtering criterion.
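To make the salary example concrete, here is a minimal plain-JDK sketch of predicate-based filtering. It assumes Java 8 or later and uses java.util.function.Predicate in place of the Library's Predicate. Unlike a filtering iterator, this hypothetical filter helper eagerly collects the matches into a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

final class FilterSketch {
    // Collects every element that satisfies the predicate, in iteration order.
    static <T> List<T> filter(Iterator<T> unfiltered, Predicate<? super T> predicate) {
        List<T> kept = new ArrayList<>();
        while (unfiltered.hasNext()) {
            T item = unfiltered.next();
            if (predicate.test(item)) {
                kept.add(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> salaries = Arrays.asList(30000, 52000, 75000);
        // The filtering rule is written once and passed in, not repeated as if checks.
        Predicate<Integer> aboveFiftyThousand = salary -> salary > 50000;
        System.out.println(filter(salaries.iterator(), aboveFiftyThousand));
    }
}
```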

The com.google.common.collect Package
This section details some of the enhancements to the Java Collections Framework that the com.google.common.collect package makes.

BiMap and ClassToInstance
The BiMap interface in the Library is analogous to the BidiMap interface in Commons Collections. It allows you to map a key to a value and vice versa. Therefore, both the key and value entries should be unique.
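The uniqueness requirement is easiest to see in a toy two-map implementation. The following is a plain-JDK sketch with hypothetical names (SimpleBiMap, getKey), not the code of HashBiMap or any BidiMap; rejecting a value that is already bound to another key is a choice I made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy bidirectional map: one map per direction, kept in sync on every put.
final class SimpleBiMap<K, V> {
    private final Map<K, V> forward = new HashMap<>();
    private final Map<V, K> backward = new HashMap<>();

    void put(K key, V value) {
        // A value may belong to only one key, otherwise the reverse lookup is ambiguous.
        if (backward.containsKey(value) && !key.equals(backward.get(value))) {
            throw new IllegalArgumentException("value already bound to another key: " + value);
        }
        V old = forward.put(key, value);
        if (old != null) {
            backward.remove(old); // drop the stale reverse entry
        }
        backward.put(value, key);
    }

    V get(K key) {
        return forward.get(key);
    }

    K getKey(V value) {
        return backward.get(value);
    }

    public static void main(String[] args) {
        SimpleBiMap<String, Integer> codes = new SimpleBiMap<>();
        codes.put("one", 1);
        codes.put("two", 2);
        System.out.println(codes.get("two"));
        System.out.println(codes.getKey(1));
    }
}
```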

The code download compares the performance of both implementations. I found HashBiMap from the Library and DualTreeBidiMap from Commons Collections to be very similar in their insertion and seek times. However, Commons Collections’ TreeBidiMap takes a little longer to insert but saves memory space because it does not use a dual map to represent the mapping from value to key.

If you use Enum types for keys and values, you can use the EnumBiMap class in the com.google.common.collect package to create a bi-directional map. But I haven’t seen a good use for this class yet. Why would someone want to map Enum types to Enum types? (A sampling of small programs that demonstrate the usage of some Library classes accompanies this article.)

Commons Collections has no substitute for the Library’s ClassToInstanceMap, which is useful for mapping a class type to an object instance of that type. (Refer to the code download for a sample that illustrates this usage.)

Constraint and MapConstraint
You can restrict object additions to a collection object by setting constraints. Suppose you want to treat a negative value added to a list as a fatal error. You could define a constraint and use it wherever an Integer is added to a List or Map. The Constraints and MapConstraints classes contain a lot of utility methods for working with Constraint and MapConstraint in the Library. PredicatedCollection in Commons Collections is similar to this.
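The negative-value scenario can be sketched with the JDK alone. The ConstrainedList class below is a hypothetical illustration that assumes Java 8 or later and uses a Predicate as the constraint; it is not the Library's Constraint interface or its constrainedList method.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// A list that checks every added element against a constraint first.
final class ConstrainedList<E> extends AbstractList<E> {
    private final List<E> delegate = new ArrayList<>();
    private final Predicate<? super E> constraint;

    ConstrainedList(Predicate<? super E> constraint) {
        this.constraint = constraint;
    }

    @Override
    public E get(int index) {
        return delegate.get(index);
    }

    @Override
    public int size() {
        return delegate.size();
    }

    // AbstractList routes add(E) and addAll(..) through this method.
    @Override
    public void add(int index, E element) {
        if (!constraint.test(element)) {
            throw new IllegalArgumentException("constraint violated by: " + element);
        }
        delegate.add(index, element);
    }

    public static void main(String[] args) {
        List<Integer> nonNegative = new ConstrainedList<Integer>(n -> n >= 0);
        nonNegative.add(5);
        try {
            nonNegative.add(-1);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(nonNegative);
    }
}
```

Defining the rule once and reusing it everywhere a list is created keeps the check out of the calling code.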

Commons Collections has no direct equivalent for this feature, but it contains predefined, constraint-enforced classes using the decorator pattern. A couple of the classes are:

  • org.apache.commons.collections.list.FixedSizeList: This class places a restriction on the add and remove operations.
  • org.apache.commons.collections.map.LRUMap: This class removes the least recently used entry if the map is full.

Multimap
The Library’s implementation of Multimap has two variants:

  • HashMultimap keeps a single entry when you add a key-value pair whose key and value are identical to those of an existing entry.
  • ArrayListMultimap allows duplicate key-value pairs.

Commons Collections has only one implementation, MultiValueMap, and it behaves similarly to HashMultimap. The code download includes an example, which I used to evaluate these classes.
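The difference between the two variants comes down to whether the values for a key behave like a set or like a list. Here is a minimal plain-JDK sketch of that difference, assuming Java 8 or later; the helper name is hypothetical and the code does not use any of the multimap classes discussed above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.function.Supplier;

final class MultimapSketch {
    // Puts the same key-value pair twice and reports how many values the key ends up with.
    static int valuesAfterDuplicatePut(Supplier<Collection<String>> valuesFactory) {
        Map<String, Collection<String>> multimap = new HashMap<>();
        for (int i = 0; i < 2; i++) {
            multimap.computeIfAbsent("color", k -> valuesFactory.get()).add("red");
        }
        return multimap.get("color").size();
    }

    public static void main(String[] args) {
        System.out.println(valuesAfterDuplicatePut(HashSet::new));   // set-backed values: prints 1
        System.out.println(valuesAfterDuplicatePut(ArrayList::new)); // list-backed values: prints 2
    }
}
```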

Multiset
The Multiset interface is similar to the Bag interface in Commons Collections. (Refer to my previous article to learn about Bag.) Basically, Multiset allows you to keep track of similar objects added to a Set. If the object you’re adding already exists in the Set, the count is incremented instead of the object being added again.
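The counting behavior can be sketched in a few lines of plain JDK code. CountingBag is a hypothetical class of my own that assumes Java 8 or later; it is neither a Multiset nor a Bag implementation, and it is not thread safe.

```java
import java.util.HashMap;
import java.util.Map;

// Stores each distinct element once, together with the number of times it was added.
final class CountingBag<E> {
    private final Map<E, Integer> counts = new HashMap<>();

    void add(E element) {
        counts.merge(element, 1, Integer::sum); // first add stores 1, later adds increment
    }

    int count(E element) {
        return counts.getOrDefault(element, 0);
    }

    int distinctElements() {
        return counts.size();
    }

    public static void main(String[] args) {
        CountingBag<String> words = new CountingBag<>();
        words.add("apple");
        words.add("apple");
        words.add("pear");
        System.out.println(words.count("apple"));
        System.out.println(words.distinctElements());
    }
}
```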

I did not find a match for ConcurrentMultiset in Commons Collections. The Library’s implementation of ConcurrentMultiset uses the JDK 5 ConcurrentMap to achieve better performance when using this class in a multi-threaded environment. Commons Collections has HashBag and SynchronizedBag, but they are not equivalent; the former is not thread safe and the latter locks the object for all method calls.

Comparators
Both the Library and Commons Collections enable you to create a Comparator implementation for use with a collection. If you have to use Comparator extensively in an application, read the JavaDoc of the com.google.common.collect.Comparators class. It contains static utility methods to facilitate Comparator and Function usage.

If you have two comparators and want to use both to get a compound result, the method Comparators.compound(list of comparators) is handy. This method returns a Comparator that will invoke other comparators until it finds a non-zero result.
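The "first non-zero result wins" rule can be sketched with the JDK alone. The compound helper below is my own illustration, assuming Java 8 or later, and is not the Library's implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CompoundSketch {
    // Tries each comparator in order and returns the first non-zero result.
    static <T> Comparator<T> compound(List<Comparator<T>> comparators) {
        return (a, b) -> {
            for (Comparator<T> comparator : comparators) {
                int result = comparator.compare(a, b);
                if (result != 0) {
                    return result;
                }
            }
            return 0; // every comparator considered the two items equal
        };
    }

    public static void main(String[] args) {
        Comparator<String> byLength = (x, y) -> Integer.compare(x.length(), y.length());
        Comparator<String> alphabetical = (x, y) -> x.compareTo(y);

        List<Comparator<String>> order = new ArrayList<>();
        order.add(byLength);
        order.add(alphabetical);

        List<String> words = new ArrayList<>();
        words.add("pear");
        words.add("fig");
        words.add("kiwi");
        words.sort(compound(order)); // shortest first, ties broken alphabetically
        System.out.println(words);
    }
}
```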

The org.apache.commons.collections.comparators package contains seven predefined comparators. In addition, the utility class org.apache.commons.collections.ComparatorUtils is very similar to the Library’s com.google.common.collect.Comparators.

Unlike Commons Collections, the Library defines all the methods as utility methods in one class, Comparators. This class contains methods for primitives comparison and natural-order comparison, as well as for using Function to transform the object before comparing.

Decorators
To perform an operation before the backing data structure is modified, you can use the classes prefixed with “Forwarding” under the com.google.common.collect package to decorate maps, lists, and sets. For example, the method Constraints.constrainedList(..) decorates a List to use a Constraint implementation whenever an add operation is performed. This lets you layer collection classes much as you layer the stream classes in the java.io.* package: you can create multiple decorators and wrap one inside the other to perform a series of operations before the actual call is delegated to the ultimate collection class.
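Here is a minimal plain-JDK sketch of wrapping one decorator inside another. All three class names are hypothetical and the base class only forwards the few methods the demo needs; the Library's Forwarding classes are not used.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;

// Hypothetical base class: forwards calls to the wrapped collection.
class ForwardingSketch<E> extends AbstractCollection<E> {
    private final Collection<E> delegate;

    ForwardingSketch(Collection<E> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Iterator<E> iterator() {
        return delegate.iterator();
    }

    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public boolean add(E element) {
        return delegate.add(element);
    }
}

// Decorator 1: rejects null elements before forwarding.
final class NoNulls<E> extends ForwardingSketch<E> {
    NoNulls(Collection<E> delegate) {
        super(delegate);
    }

    @Override
    public boolean add(E element) {
        if (element == null) {
            throw new NullPointerException("null element");
        }
        return super.add(element);
    }
}

// Decorator 2: counts add attempts before forwarding.
final class CountingAdds<E> extends ForwardingSketch<E> {
    int attempts = 0;

    CountingAdds(Collection<E> delegate) {
        super(delegate);
    }

    @Override
    public boolean add(E element) {
        attempts++;
        return super.add(element);
    }

    public static void main(String[] args) {
        // Outer decorator runs first, then the inner one, then the real list.
        CountingAdds<String> names =
                new CountingAdds<String>(new NoNulls<String>(new ArrayList<String>()));
        names.add("Ann");
        System.out.println(names.attempts + " attempt(s), size " + names.size());
    }
}
```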

Commons Collections also contains decorator classes, but they are located in the respective packages of the collection types. For instance, FixedSizeMap, LazyMap, ListOrderedMap, MultiValueMap, and UnmodifiableMap are located under the org.apache.commons.collections.map package.

Iterables and Iterator
Iterables and Iterators contain a bunch of utility methods to create numerous Iterator types for various purposes. For example, the Iterators.cycle method allows you to cycle through the elements of a list indefinitely, stopping only when the list becomes empty. Commons Collections has similar behavior in the class org.apache.commons.collections.iterators.LoopingIterator.
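The cycling behavior can be sketched with the JDK alone. This hypothetical cycle helper assumes Java 8 or later and is not the code of Iterators.cycle or LoopingIterator; it simply asks the source for a fresh iterator whenever the current one runs out.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class CycleSketch {
    // Returns an iterator that restarts from the beginning each time it reaches the end.
    static <T> Iterator<T> cycle(Iterable<T> source) {
        return new Iterator<T>() {
            private Iterator<T> current = source.iterator();

            @Override
            public boolean hasNext() {
                if (!current.hasNext()) {
                    current = source.iterator(); // start another pass
                }
                return current.hasNext(); // false only if the source is empty
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        List<String> lights = Arrays.asList("red", "green");
        Iterator<String> looping = cycle(lights);
        for (int i = 0; i < 5; i++) {
            System.out.println(looping.next());
        }
    }
}
```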

In Commons Collections, all the Iterator types are found under the package org.apache.commons.collections.iterators and the utility methods are found in the class org.apache.commons.collections.IteratorUtils.

Maps, Lists, and Sets
The Maps, Lists, and Sets classes respectively allow you to create map, list, and set objects easily, without having to call the constructor. For instance, you can replace Map m = new HashMap() with Map m = Maps.newHashMap().
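The benefit of a static factory is easiest to see with generic types, because the compiler infers the type parameters from the assignment target. The following plain-JDK sketch uses a hypothetical MapsSketch class of my own to show the idea; it is not the Library's Maps class.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MapsSketch {
    // K and V are inferred from the variable the result is assigned to.
    static <K, V> Map<K, V> newHashMap() {
        return new HashMap<K, V>();
    }

    public static void main(String[] args) {
        // Constructor style (JDK 5 and 6): the type parameters are written twice.
        Map<String, List<Integer>> verbose = new HashMap<String, List<Integer>>();

        // Factory style: the type parameters are written once.
        Map<String, List<Integer>> concise = newHashMap();

        System.out.println(verbose.equals(concise));
    }
}
```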

Apart from creating collection objects, these factory classes have numerous utility methods, including these notable ones:

  • Maps.immutableMap(..) lets you create an immutable Map from a given set of hard-coded key-value pairs. You no longer need to write four or five lines to populate a Map with hard-coded key-value pairs.
  • Lists.newArrayList(Iterator itr) creates an ArrayList object and populates it by iterating over the Iterator argument.

Similar utility classes, such as MapUtils, ListUtils, and SetUtils, can be found in the Commons Collections package org.apache.commons.collections.

The Library also contains the utility classes PrimitiveArrays and ObjectArrays to simplify the use of arrays. PrimitiveArrays contains utility methods to convert a List to a primitive array and vice versa. This is really helpful for reducing repeated code for these kinds of operations.

What’s Missing in Google Collections Library?
I found no equivalents for the following Commons Collections classes in the Library:

  • org.apache.commons.collections.keyvalue.MultiKey: This class allows you to use multiple attributes to create a key for use in a Map.
  • org.apache.commons.collections.buffer: The Buffer interface that this package implements defines a contract for object removal in a collection. According to the JavaDoc of the Buffer interface, “The removal order can be based on insertion order (for example, a FIFO queue or a LIFO stack), on access order (e.g., an LRU cache), on some arbitrary comparator (e.g., a priority queue) or on any other well-defined ordering.”
  • org.apache.commons.collections.functors: This package provides the Closure feature, which can be called if the Predicate evaluate method returns true. The Library doesn’t have this feature, but it does provide Predicate support for find or filter operations on collections.

I also did not find equivalents in the com.google.common.collect package for FixedSizeList, LazyList, and a few other specific list types, or for FixedSizeMap, Flat3Map, LazyMap, LRUMap, and ListOrderedSet.

More Choice for the Commons Collections User
Although Google already actively uses the Library in applications such as Gmail, Google Reader, and Blogger, the developers have kept the current API release (0.5) as Alpha because an official release would require backward compatibility. Keeping it an Alpha allows them the freedom to make changes. Still, the API achieves about 85 percent functional test coverage according to the Google Collections Library FAQ.

Now you know how the Google Collections Library compares with Apache Commons Collections. If you are not a fan of using JDK 5 Generics, Commons Collections probably will meet your needs. Personally, I like how few classes and packages are in Google’s implementation, which makes it very easy to learn and use in a project right away. If the Commons Collections community revamps its offering as a result of the Library release, it will be interesting to observe which one makes it into the JDK in the future.
