Dustin's Pages

Monday, January 5, 2015

Stream-Powered Collections Functionality in JDK 8

This post applies the Streams introduced in JDK 8 to Collections to accomplish commonly desired Collections-related functionality more concisely. Along the way, several key aspects of using Java Streams are demonstrated and briefly explained. Although JDK 8 Streams offer potential performance benefits through their support for parallelization, that is not the focus of this post.

The Sample Collection and Collection Entries

For purposes of this post, instances of Movie will be stored in a collection. The following code snippet is for the simple Movie class used in these examples.

Multiple instances of Movie are placed in a Java Set. The code that does this is shown below because it also shows the values set in these instances. This code declares the "movies" as a static field on the class and then uses a static initialization block to populate that field with five instances of Movie.
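The listings for the Movie class and the populated Set do not appear in this text, so what follows is only a minimal sketch of what they might look like. The class name MovieSetSketch, the field names, the accessor getYearReleased(), the enum constants other than PG and NA, and the IMDb rank values are all assumptions for illustration. The five entries were chosen only to agree with results discussed later in the post (exactly one movie ranked 1, oldest release in 1980, newest in 2010).

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class MovieSetSketch {

    /** Assumed enum; the post only names PG and NA explicitly. */
    enum MpaaRating { G, PG, PG13, R, NA }

    /** Assumed shape of the simple Movie class used in the examples. */
    static final class Movie {
        private final String title;
        private final int yearReleased;
        private final MpaaRating mpaaRating;
        private final int imdbTopRating; // rank within the IMDb Top 250

        Movie(String title, int yearReleased, MpaaRating mpaaRating, int imdbTopRating) {
            this.title = title;
            this.yearReleased = yearReleased;
            this.mpaaRating = mpaaRating;
            this.imdbTopRating = imdbTopRating;
        }

        String getTitle() { return title; }
        int getYearReleased() { return yearReleased; }
        MpaaRating getMpaaRating() { return mpaaRating; }
        int getImdbTopRating() { return imdbTopRating; }

        @Override
        public String toString() {
            return title + " (" + yearReleased + "), " + mpaaRating + ", #" + imdbTopRating;
        }
    }

    /** Static field populated once by the static initialization block below. */
    static final Set<Movie> movies;

    static {
        Set<Movie> temp = new HashSet<>();
        // Rank values are placeholders, apart from the single rank-1 entry.
        temp.add(new Movie("Raiders of the Lost Ark", 1981, MpaaRating.PG, 31));
        temp.add(new Movie("The Empire Strikes Back", 1980, MpaaRating.PG, 12));
        temp.add(new Movie("Inception", 2010, MpaaRating.PG13, 13));
        temp.add(new Movie("Back to the Future", 1985, MpaaRating.PG, 49));
        temp.add(new Movie("The Shawshank Redemption", 1994, MpaaRating.R, 1));
        movies = Collections.unmodifiableSet(temp);
    }

    public static void main(String[] args) {
        movies.forEach(System.out::println);
    }
}
```

Wrapping the Set with Collections.unmodifiableSet is a choice made for this sketch; it keeps the shared sample data from being changed by accident.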

One type of functionality commonly performed on collections is filtering. The next code listing shows how to filter the "movies" Set for all movies that are rated PG. I'll highlight some observations that can be made from this code after the listing.

One thing this first example shares with every other example in this post is the invocation of the stream() method on the collection. This method returns an object implementing the java.util.stream.Stream interface. Each returned Stream uses the collection on which stream() was invoked as its data source. All operations from this point on are performed on the Stream rather than on the collection that supplies its data.
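To make the distinction between the Stream and its source concrete, here is a small sketch that uses a Set of title Strings rather than Movie objects. The class name StreamSourceSketch, the method name countShortTitles, and the ten-character cutoff are assumptions made up for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Stream;

class StreamSourceSketch {

    /** Counts titles shorter than ten characters without modifying the source Set. */
    static long countShortTitles(Set<String> titles) {
        // stream() returns a java.util.stream.Stream backed by the Set.
        Stream<String> titleStream = titles.stream();
        // filter and count act on the Stream; the Set itself is left as it was.
        return titleStream.filter(title -> title.length() < 10).count();
    }

    public static void main(String[] args) {
        Set<String> titles = new HashSet<>(
            Arrays.asList("Inception", "Back to the Future", "Raiders of the Lost Ark"));
        System.out.println("Short titles: " + countShortTitles(titles));
        System.out.println("Source still has " + titles.size() + " entries");
    }
}
```

Filtering the Stream narrows what flows through the pipeline, but the Set passed in still holds all three titles afterward.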

In the code listing above, the filter(Predicate) method is called on the Stream based on the "movies" Set. In this case, the Predicate is given by the lambda expression movie -> movie.getMpaaRating() == MpaaRating.PG. This fairly readable representation tells us that the predicate matches each movie in the underlying data that has an MPAA rating of PG.

The Stream.filter(Predicate) method is an intermediate operation, meaning that it returns an instance of Stream that can be further operated on by other operations. In this case, there is another operation, collect(Collector), that is called upon the Stream returned by Stream.filter(Predicate). The Collectors class features numerous static methods that each provide an implementation of Collector that can be passed to this collect(Collector) method. In this case, Collectors.toSet() is used to get a Collector that gathers the stream's results into a Set. The Stream.collect(Collector) method is a terminal operation, meaning that it is the end of the line: it does NOT return a Stream instance, so no more Stream operations can be executed after collect has been executed.
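The filter-then-collect pipeline just described can be sketched as follows. This is not the original listing; the class name FilterPgSketch, the pared-down Movie stand-in, and the sample data are assumptions for illustration, while the lambda expression and Collectors.toSet() come straight from the discussion above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

class FilterPgSketch {

    enum MpaaRating { PG, PG13, R, NA }

    /** Pared-down stand-in for Movie with only what this example needs. */
    static final class Movie {
        private final String title;
        private final MpaaRating mpaaRating;

        Movie(String title, MpaaRating mpaaRating) {
            this.title = title;
            this.mpaaRating = mpaaRating;
        }

        String getTitle() { return title; }
        MpaaRating getMpaaRating() { return mpaaRating; }
    }

    /** Illustrative sample data (assumed, not the author's listing). */
    static Set<Movie> sampleMovies() {
        return new HashSet<>(Arrays.asList(
            new Movie("Raiders of the Lost Ark", MpaaRating.PG),
            new Movie("The Empire Strikes Back", MpaaRating.PG),
            new Movie("Inception", MpaaRating.PG13),
            new Movie("Back to the Future", MpaaRating.PG),
            new Movie("The Shawshank Redemption", MpaaRating.R)));
    }

    static Set<Movie> findPgMovies(Set<Movie> movies) {
        return movies.stream()
            // Intermediate operation: returns another Stream<Movie>.
            .filter(movie -> movie.getMpaaRating() == MpaaRating.PG)
            // Terminal operation: ends the pipeline and returns a Set<Movie>.
            .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        findPgMovies(sampleMovies()).forEach(movie -> System.out.println(movie.getTitle()));
    }
}
```

Because the result is a Set, the order in which the PG titles print is not guaranteed.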

When the above code is executed, it generates output like the following:

This example shares many similarities with the previous example. Like that previous code listing, this listing shows use of Stream.filter(Predicate), but this time the predicate is the lambda expression movie -> movie.getImdbTopRating() == 1. In other words, the Stream resulting from this filter should contain only instances of Movie for which getImdbTopRating() returns the number 1. The terminating operation Stream.findFirst() is then executed against the Stream returned by Stream.filter(Predicate). This returns an Optional holding the first entry encountered in the stream and, because our underlying Set of Movie instances has only one instance with an IMDb Top 250 rating of 1, it will be the first and only entry available in the stream resulting from the filter.
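A minimal sketch of the filter-then-findFirst pipeline follows. The class name FindFirstSketch, the reduced Movie stand-in, and the rank values other than 1 are assumptions for illustration and not the original listing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

class FindFirstSketch {

    /** Pared-down stand-in for Movie with only what this example needs. */
    static final class Movie {
        private final String title;
        private final int imdbTopRating;

        Movie(String title, int imdbTopRating) {
            this.title = title;
            this.imdbTopRating = imdbTopRating;
        }

        String getTitle() { return title; }
        int getImdbTopRating() { return imdbTopRating; }
    }

    /** Illustrative sample data; ranks other than 1 are placeholders. */
    static Set<Movie> sampleMovies() {
        return new HashSet<>(Arrays.asList(
            new Movie("Raiders of the Lost Ark", 31),
            new Movie("Inception", 13),
            new Movie("The Shawshank Redemption", 1)));
    }

    static Optional<Movie> findTopRanked(Set<Movie> movies) {
        return movies.stream()
            .filter(movie -> movie.getImdbTopRating() == 1)
            // Terminal operation: yields an Optional that is empty if nothing matched.
            .findFirst();
    }

    public static void main(String[] args) {
        Optional<Movie> top = findTopRanked(sampleMovies());
        System.out.println(top.isPresent() ? top.get().getTitle() : "No movie ranked 1");
    }
}
```

Checking isPresent() before calling get() covers the case in which no entry satisfies the predicate.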

The Stream.map(Function) method acts upon the Stream against which it is called (in our case, the Stream based on the underlying Set of Movie objects) and applies the provided Function to each element of that Stream, returning a new Stream of the results. In this case, the Function is represented by Movie::getTitle, which is an example of a JDK 8-introduced method reference. I could have used the lambda expression movie -> movie.getTitle() instead of the method reference Movie::getTitle for the same results. The Method References documentation explains that this is exactly the situation a method reference is intended to address:

You use lambda expressions to create anonymous methods. Sometimes, however, a lambda expression does nothing but call an existing method. In those cases, it's often clearer to refer to the existing method by name. Method references enable you to do this; they are compact, easy-to-read lambda expressions for methods that already have a name.

As you might guess from its use in the code above, Stream.map(Function) is an intermediate operation. This code listing applies a terminating operation of Stream.collect(Collector) just as the first example did, but in this case it's Collectors.toList() that is passed to it and so the resultant data structure is a List rather than a Set.
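The mapping of movies to their titles can be sketched as shown next, once with the method reference and once with the equivalent lambda expression. The class name MapTitlesSketch, the title-only Movie stand-in, and the sample data are assumptions for illustration rather than the original listing.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class MapTitlesSketch {

    /** Pared-down stand-in for Movie with only a title. */
    static final class Movie {
        private final String title;

        Movie(String title) { this.title = title; }

        String getTitle() { return title; }
    }

    /** Illustrative sample data (assumed). */
    static Set<Movie> sampleMovies() {
        return new HashSet<>(Arrays.asList(
            new Movie("Inception"),
            new Movie("Back to the Future"),
            new Movie("The Shawshank Redemption")));
    }

    /** Maps each Movie to its title using a method reference. */
    static List<String> titlesViaMethodReference(Set<Movie> movies) {
        return movies.stream().map(Movie::getTitle).collect(Collectors.toList());
    }

    /** Same mapping written as a lambda expression. */
    static List<String> titlesViaLambda(Set<Movie> movies) {
        return movies.stream().map(movie -> movie.getTitle()).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        titlesViaMethodReference(sampleMovies()).forEach(System.out::println);
    }
}
```

The result is a List, but because the source is a HashSet, the order of the titles in it is whatever order the Set happens to iterate in.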

The next example does not use Stream.filter(Predicate), Stream.map(Function), or even the terminating operation Stream.collect(Collector) that were used in most of the previous examples. In this example, the reduction and terminating operations Stream.allMatch(Predicate) and Stream.anyMatch(Predicate) are applied directly on the Stream based on our Set of Movie objects.

The code listing demonstrates that Stream.anyMatch(Predicate) and Stream.allMatch(Predicate) each return a boolean indicating, as their names respectively imply, whether the Stream has at least one entry matching the predicate or all of its entries matching the predicate. In this case, all movies come from the imdb.com Top 250, so the "allMatch" for that predicate returns true. Not all of the movies are rated PG, however, so the "allMatch" for the PG rating predicate returns false. Because at least one movie is rated PG, the "anyMatch" for the PG rating predicate returns true, but the "anyMatch" for the N/A rating predicate returns false because not even one movie in the underlying Set has an MpaaRating.NA rating. The output from running this code is shown next.
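The four checks described above can be sketched as follows. The class name MatchSketch, the reduced Movie stand-in, the placeholder ranks, and the use of a rank of at most 250 as the test for membership in the Top 250 are assumptions for illustration; the original listing may have expressed these predicates differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MatchSketch {

    enum MpaaRating { PG, PG13, R, NA }

    /** Pared-down stand-in for Movie with only what this example needs. */
    static final class Movie {
        private final MpaaRating mpaaRating;
        private final int imdbTopRating;

        Movie(MpaaRating mpaaRating, int imdbTopRating) {
            this.mpaaRating = mpaaRating;
            this.imdbTopRating = imdbTopRating;
        }

        MpaaRating getMpaaRating() { return mpaaRating; }
        int getImdbTopRating() { return imdbTopRating; }
    }

    /** Illustrative sample data; ranks are placeholders. */
    static Set<Movie> sampleMovies() {
        return new HashSet<>(Arrays.asList(
            new Movie(MpaaRating.PG, 31),
            new Movie(MpaaRating.PG13, 13),
            new Movie(MpaaRating.R, 1)));
    }

    /** True only if every movie has a rank within the Top 250 (assumed test). */
    static boolean allInTop250(Set<Movie> movies) {
        return movies.stream().allMatch(movie -> movie.getImdbTopRating() <= 250);
    }

    /** True only if every movie has the given rating. */
    static boolean allRated(Set<Movie> movies, MpaaRating rating) {
        return movies.stream().allMatch(movie -> movie.getMpaaRating() == rating);
    }

    /** True if at least one movie has the given rating. */
    static boolean anyRated(Set<Movie> movies, MpaaRating rating) {
        return movies.stream().anyMatch(movie -> movie.getMpaaRating() == rating);
    }

    public static void main(String[] args) {
        Set<Movie> movies = sampleMovies();
        System.out.println("All in Top 250? " + allInTop250(movies));
        System.out.println("All rated PG? " + allRated(movies, MpaaRating.PG));
        System.out.println("Any rated PG? " + anyRated(movies, MpaaRating.PG));
        System.out.println("Any rated NA? " + anyRated(movies, MpaaRating.NA));
    }
}
```

Both allMatch and anyMatch are terminal operations that return a plain boolean, so no collect call is needed.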

This convoluted example illustrates using Integer.min(int,int) to find the oldest movie in the underlying Set and using Integer.max(int,int) to find the newest movie in the Set. This is accomplished by first using Stream.map to get a new Stream of Integers holding the release year of each Movie in the original Stream. Stream.reduce(BinaryOperator) is then executed on this Stream of Integers, with the static Integer methods serving as the BinaryOperator.

For this code listing, I intentionally used lambda expressions for the Function and BinaryOperator in calculating the oldest movie (Integer.min(int,int)) and used method references instead of lambda expressions for the Function and BinaryOperator used in calculating the newest movie (Integer.max(int,int)). This shows that either lambda expressions or method references can be used in many cases.
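A minimal sketch of the two reductions follows, with lambda expressions used for the oldest movie and method references used for the newest. The class name ReduceMinMaxSketch, the accessor name getYearReleased(), and the year-only Movie stand-in are assumptions for illustration, and the sample years were chosen to include 1980 and 2010.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

class ReduceMinMaxSketch {

    /** Pared-down stand-in for Movie with only a release year. */
    static final class Movie {
        private final int yearReleased;

        Movie(int yearReleased) { this.yearReleased = yearReleased; }

        int getYearReleased() { return yearReleased; }
    }

    /** Illustrative sample data (assumed). */
    static Set<Movie> sampleMovies() {
        return new HashSet<>(Arrays.asList(
            new Movie(1981), new Movie(1980), new Movie(2010),
            new Movie(1985), new Movie(1994)));
    }

    /** Lambda expressions for both the Function and the BinaryOperator. */
    static Optional<Integer> oldestYear(Set<Movie> movies) {
        return movies.stream()
            .map(movie -> movie.getYearReleased())
            .reduce((a, b) -> Integer.min(a, b));
    }

    /** Method references for both the Function and the BinaryOperator. */
    static Optional<Integer> newestYear(Set<Movie> movies) {
        return movies.stream()
            .map(Movie::getYearReleased)
            .reduce(Integer::max);
    }

    public static void main(String[] args) {
        Set<Movie> movies = sampleMovies();
        Optional<Integer> oldest = oldestYear(movies);
        Optional<Integer> newest = newestYear(movies);
        System.out.println("Oldest movie was released in "
            + (oldest.isPresent() ? oldest.get() : "Unknown"));
        System.out.println("Youngest movie was released in "
            + (newest.isPresent() ? newest.get() : "Unknown"));
    }
}
```

The single-argument form of reduce returns an Optional because an empty Stream has no minimum or maximum to report.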

The output from running the above code is shown next:

===========================================================
= Oldest and Youngest via reduce
===========================================================
Oldest movie was released in 1980
Youngest movie was released in 2010

Conclusion

JDK 8 Streams introduce a powerful mechanism for working with Collections. This post has focused on the readability and conciseness that working against Streams brings as compared to working against Collections directly, but Streams offer potential performance benefits as well. This post has attempted to use common collections handling idioms as examples of the conciseness that Streams bring to Java. Along the way, some key concepts associated with using JDK 8 Streams have also been discussed. The most challenging parts of using JDK 8 Streams are getting used to new concepts and new syntax (such as lambda expressions and method references), but these are quickly learned after playing with a couple of examples. A Java developer with even light experience with the concepts and syntax can explore the Stream API's methods to find a much longer list of operations that can be executed against Streams (and hence against the collections underlying those Streams) than is illustrated in this post.

Additional Resources

The purpose of this post was to provide a light first look at JDK 8 streams based on simple but fairly common collections manipulation examples. For a deeper dive into JDK 8 streams and for more ideas on how JDK 8 streams make Collections manipulation easier, see the following articles: