java8collectors

The document provides an overview of Java 8's Collectors, which are used to collect elements from a Stream into various data structures. It details several predefined collectors such as toList, toSet, toMap, and others, explaining their usage and behavior, especially in handling duplicates and custom implementations. Additionally, it covers how to create custom collectors by implementing the Collector interface, with an example of an ImmutableSet collector.

Uploaded by

aravindhkumar311
Copyright: © All Rights Reserved

This document covers Java 8’s Collectors, which are used at the final step of
processing a Stream.

The Stream.collect() Method
============================
Stream.collect() is one of the Java 8 Stream API’s terminal methods. It allows us to
perform mutable fold operations (repackaging elements into data structures, applying
additional logic, concatenating them, etc.) on the data elements held in a Stream
instance.

The strategy for this operation is provided by an implementation of the Collector
interface.
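
To make the idea of a mutable fold concrete, the following minimal sketch uses the three-argument overload of Stream.collect(), which takes the supplier, accumulator, and combiner directly, and then does the same thing with a predefined Collector. The class name CollectBasics and the sample data are illustrative assumptions, not part of the original text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectBasics {

    // Spells out the three steps of a mutable reduction by hand.
    static List<String> collectManually() {
        return Stream.of("a", "bb", "ccc")
            .collect(ArrayList::new,      // supplier: creates the empty container
                     ArrayList::add,      // accumulator: folds one element into it
                     ArrayList::addAll);  // combiner: merges two partial containers
    }

    // The same result, with the strategy packaged as a Collector.
    static List<String> collectWithCollector() {
        return Stream.of("a", "bb", "ccc")
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collectManually());       // [a, bb, ccc]
        System.out.println(collectWithCollector());  // [a, bb, ccc]
    }
}
```

The combiner only comes into play when the stream is processed in parallel and partial results have to be merged.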

Collectors
===========
All predefined implementations can be found in the Collectors class. It’s common
practice to use the following static import with them to improve readability:

import static java.util.stream.Collectors.*;

or to import only the collectors you need:

import static java.util.stream.Collectors.toList;
import static java.util.stream.Collectors.toMap;
import static java.util.stream.Collectors.toSet;

In the following examples we will be reusing the following list:

List<String> givenList = Arrays.asList("a", "bb", "ccc", "dd");

Collectors.toList(): the toList collector collects all Stream elements into a List
instance.

Collectors.toSet(): the toSet collector collects all Stream elements into a Set
instance. Because a Set holds no duplicates, equal elements are collapsed into one.
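
As a small illustration of the difference between the two, the sketch below collects the same input both ways. The class name ToListToSetExample is hypothetical, and the sample list is a variation of the one above with one duplicate added on purpose.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class ToListToSetExample {

    // Same data as the running example, plus a duplicate "bb".
    static final List<String> GIVEN = Arrays.asList("a", "bb", "ccc", "dd", "bb");

    static List<String> asList() {
        return GIVEN.stream().collect(Collectors.toList());
    }

    static Set<String> asSet() {
        return GIVEN.stream().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(asList().size()); // 5: the duplicate "bb" is kept
        System.out.println(asSet().size());  // 4: the duplicate "bb" is dropped
    }
}
```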

Collectors.toCollection():
============================
When using the toSet and toList collectors, you can’t make any assumptions about the
implementations they return.
If you want to use a specific implementation, you need to use the toCollection
collector and supply the collection of your choice.

Let’s create a Stream instance representing a sequence of elements and collect them
into a LinkedList instance:

List<String> givenList = Arrays.asList("a", "bb", "ccc", "dd");
List<String> result = givenList.stream()
    .collect(toCollection(LinkedList::new));

Notice that this will not work with immutable collections, because the collector has
to add elements to the collection it is given. In such a case, you would need to
either write a custom Collector implementation or use collectingAndThen.
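
Any mutable collection with a no-argument constructor can be supplied in the same way. As a sketch, the hypothetical helper below (class and method names are illustrative) collects into a TreeSet, so the result is both de-duplicated and sorted:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

class ToCollectionExample {

    // Collects into a TreeSet, which sorts its elements and drops duplicates.
    static TreeSet<String> sortedUnique(List<String> input) {
        return input.stream()
            .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("dd", "a", "ccc", "bb", "a");
        System.out.println(sortedUnique(input)); // [a, bb, ccc, dd]
    }
}
```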

Collectors.toMap():
====================
The toMap collector can be used to collect Stream elements into a Map instance. To do
this, we need to provide two functions:

keyMapper
valueMapper

keyMapper will be used for extracting a Map key from a Stream element, and
valueMapper will be used for extracting the value associated with that key.

Let’s collect those elements into a Map that stores strings as keys and their
lengths as values:
List<String> givenList = Arrays.asList("a", "bb", "ccc", "dd");
Map<String, Integer> result = givenList.stream()
    .collect(toMap(Function.identity(), String::length));

Function.identity() is just a shortcut for defining a function that accepts and
returns the same value.

What happens if our collection contains duplicate elements? Contrary to toSet,
toMap doesn’t silently filter duplicates.
That is understandable: how should it figure out which value to pick for this key?

List<String> listWithDuplicates = Arrays.asList("a", "bb", "c", "d", "bb");

// assertThatThrownBy comes from the AssertJ testing library
assertThatThrownBy(() -> {
    listWithDuplicates.stream().collect(toMap(Function.identity(), String::length));
}).isInstanceOf(IllegalStateException.class);

Note that toMap doesn’t even evaluate whether the values are also equal. If it sees
duplicate keys, it immediately throws an IllegalStateException.

In such cases of key collision, we should use toMap with another signature:

List<String> listWithDuplicates = Arrays.asList("a", "bb", "c", "d", "bb");

Map<String, Integer> result = listWithDuplicates.stream()
    .collect(toMap(Function.identity(), String::length, (item, identicalItem) -> item));

The third argument here is a BinaryOperator, where we can specify how we want
collisions to be handled.
In this case, we’ll just pick either of the two colliding values because we know
that the same strings will always have the same lengths, too.
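
The merge function does not have to discard one of the values; it can also combine them. As a sketch under that assumption, the hypothetical helper below counts how often each string occurs by mapping every element to 1 and summing on collision. It also uses the four-argument toMap overload to request a LinkedHashMap, so keys keep their encounter order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMergeExample {

    // Maps each string to its number of occurrences.
    static Map<String, Integer> countOccurrences(List<String> input) {
        return input.stream()
            .collect(Collectors.toMap(
                Function.identity(),   // key: the string itself
                s -> 1,                // value: one occurrence
                Integer::sum,          // on a duplicate key, add the counts
                LinkedHashMap::new));  // keep keys in encounter order
    }

    public static void main(String[] args) {
        List<String> listWithDuplicates = Arrays.asList("a", "bb", "c", "d", "bb");
        System.out.println(countOccurrences(listWithDuplicates)); // {a=1, bb=2, c=1, d=1}
    }
}
```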

Collectors.collectingAndThen() [Collect a Java Stream to an Immutable Collection]
==============================
collectingAndThen is a special collector that allows performing another action on
the result straight after collecting ends.

Let’s collect Stream elements into a List instance and then convert the result into
an ImmutableList instance (ImmutableList is the Guava class
com.google.common.collect.ImmutableList, not part of the JDK):

List<String> givenList = Arrays.asList("a", "bb", "ccc", "dd");

List<String> result = givenList.stream()
    .collect(collectingAndThen(toList(), ImmutableList::copyOf));
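
If Guava is not on the classpath, a similar effect can be sketched with the JDK alone by finishing with Collections.unmodifiableList. Note the assumption here: this yields an unmodifiable view of the collected list, not a Guava ImmutableList, and the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class UnmodifiableListExample {

    // Collects to a List, then wraps it so callers cannot modify it.
    static List<String> collectUnmodifiable(List<String> input) {
        return input.stream()
            .collect(Collectors.collectingAndThen(
                Collectors.toList(),
                Collections::unmodifiableList));
    }

    public static void main(String[] args) {
        List<String> result = collectUnmodifiable(Arrays.asList("a", "bb", "ccc", "dd"));
        System.out.println(result); // [a, bb, ccc, dd]
        try {
            result.add("e");
        } catch (UnsupportedOperationException e) {
            System.out.println("result cannot be modified");
        }
    }
}
```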

Collectors.joining():
=========================
The joining collector can be used for joining the elements of a Stream<String>.
We can join them together by doing:

List<String> givenList = Arrays.asList("a", "bb", "ccc", "dd");

String result = givenList.stream()
    .collect(joining());

which will result in:

"abbcccdd"

You can also specify a custom separator, prefix, and suffix:

String result = givenList.stream()
    .collect(joining(" "));

which will result in:

"a bb ccc dd"

or you can write:

String result = givenList.stream()
    .collect(joining(" ", "PRE-", "-POST"));

which will result in:

"PRE-a bb ccc dd-POST"

Collectors.counting():
=======================
counting is a simple collector that counts all Stream elements.

We can write:

Long result = givenList.stream()
    .collect(counting());

Collectors.summarizingDouble/Long/Int():
=========================================
summarizingDouble/Long/Int is a collector that returns a summary statistics object
(DoubleSummaryStatistics, LongSummaryStatistics, or IntSummaryStatistics) containing
statistical information about the numerical values extracted from the Stream
elements.

We can obtain information about string lengths by doing:

List<String> givenList = Arrays.asList("a", "bb", "ccc", "dd");

DoubleSummaryStatistics result = givenList.stream()
    .collect(summarizingDouble(String::length));
double max = result.getMax();
System.out.println(max); // 3.0
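
The statistics object exposes more than the maximum. The sketch below uses the Int variant, since string lengths are integers, and reads every value gathered in the single pass. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class LengthStatsExample {

    // Gathers count, sum, min, max and average of the string lengths in one pass.
    static IntSummaryStatistics lengthStats(List<String> input) {
        return input.stream()
            .collect(Collectors.summarizingInt(String::length));
    }

    public static void main(String[] args) {
        IntSummaryStatistics stats = lengthStats(Arrays.asList("a", "bb", "ccc", "dd"));
        System.out.println(stats.getCount());   // 4
        System.out.println(stats.getSum());     // 8
        System.out.println(stats.getMin());     // 1
        System.out.println(stats.getMax());     // 3
        System.out.println(stats.getAverage()); // 2.0
    }
}
```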

Collectors.averagingDouble/Long/Int():
========================================
averagingDouble/Long/Int is a collector that returns the average of the extracted
values.
We can get the average string length by doing:

Double result = givenList.stream()
    .collect(averagingDouble(String::length));

Collectors.summingDouble/Long/Int():
=====================================
summingDouble/Long/Int is a collector that returns the sum of the extracted
values.

We can get the sum of all string lengths by doing:

Double result = givenList.stream()
    .collect(summingDouble(String::length));
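
One detail worth seeing in code is what these numeric collectors return for an empty Stream: counting and summing yield zero, and averaging yields 0.0, rather than failing. The sketch below uses the Int variants; the class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class NumericCollectorsExample {

    static Long count(List<String> input) {
        return input.stream().collect(Collectors.counting());
    }

    static Integer totalLength(List<String> input) {
        return input.stream().collect(Collectors.summingInt(String::length));
    }

    static Double averageLength(List<String> input) {
        return input.stream().collect(Collectors.averagingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> given = Arrays.asList("a", "bb", "ccc", "dd");
        System.out.println(count(given));         // 4
        System.out.println(totalLength(given));   // 8
        System.out.println(averageLength(given)); // 2.0

        List<String> empty = Collections.emptyList();
        System.out.println(count(empty));         // 0
        System.out.println(totalLength(empty));   // 0
        System.out.println(averageLength(empty)); // 0.0
    }
}
```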

Collectors.maxBy()/minBy():
==============================
The maxBy/minBy collectors return the biggest/smallest element of a Stream
according to a provided Comparator instance.

We can pick the biggest element by doing:

Optional<String> result = givenList.stream()
    .collect(maxBy(Comparator.naturalOrder()));

Notice that the returned value is wrapped in an Optional instance. This forces users
to think about the empty-collection corner case.
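
To illustrate both the Comparator argument and the Optional result, the sketch below picks the longest string and shows what happens with an empty input. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class MaxByExample {

    // Returns the longest string, or an empty Optional if there is no element.
    static Optional<String> longest(List<String> input) {
        return input.stream()
            .collect(Collectors.maxBy(Comparator.comparingInt(String::length)));
    }

    public static void main(String[] args) {
        System.out.println(longest(Arrays.asList("a", "bb", "ccc", "dd"))); // Optional[ccc]
        System.out.println(longest(Collections.emptyList()));               // Optional.empty
    }
}
```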

Collectors.groupingBy():
=========================
The groupingBy collector is used for grouping objects by some property and storing
the results in a Map instance.

We can group strings by their length and store the grouping results in Set instances:

List<String> givenList = Arrays.asList("a", "bb", "ccc", "dd");

Map<Integer, Set<String>> result = givenList.stream()
    .collect(groupingBy(String::length, toSet())); // {1=[a], 2=[bb, dd], 3=[ccc]}

This will result in the following being true (assertThat comes from AssertJ;
newHashSet is assumed to be Guava’s Sets.newHashSet):

assertThat(result)
    .containsEntry(1, newHashSet("a"))
    .containsEntry(2, newHashSet("bb", "dd"))
    .containsEntry(3, newHashSet("ccc"));

Notice that the second argument of the groupingBy method is a Collector, and you are
free to use any Collector of your choice.
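
As a sketch of a different downstream collector, the hypothetical helper below counts the strings in each length group instead of storing them. It uses the three-argument groupingBy overload to request a TreeMap, so the keys come out sorted.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingByCountExample {

    // Groups strings by length and counts the members of each group.
    static Map<Integer, Long> countByLength(List<String> input) {
        return input.stream()
            .collect(Collectors.groupingBy(
                String::length,          // classifier: the grouping key
                TreeMap::new,            // map factory: keys sorted ascending
                Collectors.counting())); // downstream: count per group
    }

    public static void main(String[] args) {
        System.out.println(countByLength(Arrays.asList("a", "bb", "ccc", "dd"))); // {1=1, 2=2, 3=1}
    }
}
```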

Collectors.partitioningBy():
============================
partitioningBy is a specialized case of groupingBy that accepts a Predicate
instance and collects Stream elements into a Map instance that stores Boolean
values as keys and collections as values.
Under the “true” key, you can find a collection of elements matching the given
Predicate, and under the “false” key, you can find a collection of elements not
matching the given Predicate.

You can write:

Map<Boolean, List<String>> result = givenList.stream()
    .collect(partitioningBy(s -> s.length() > 2));

which results in a Map containing:

{false=["a", "bb", "dd"], true=["ccc"]}
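
Like groupingBy, partitioningBy has an overload that takes a downstream collector. The sketch below counts the elements on each side of the predicate; it also shows that both keys are present in the result even when the input is empty. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitioningByExample {

    // Counts how many strings are longer than the limit and how many are not.
    static Map<Boolean, Long> countLongerThan(List<String> input, int limit) {
        return input.stream()
            .collect(Collectors.partitioningBy(
                s -> s.length() > limit,
                Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countLongerThan(Arrays.asList("a", "bb", "ccc", "dd"), 2)); // {false=3, true=1}
        System.out.println(countLongerThan(Collections.emptyList(), 2));               // {false=0, true=0}
    }
}
```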

Custom Collectors
=====================
If you want to write your own Collector implementation, you need to implement the
Collector interface and specify its three generic parameters:

public interface Collector<T, A, R> {...}

T - the type of objects that will be available for collection,
A - the type of a mutable accumulator object,
R - the type of the final result.

Let’s write an example Collector for collecting elements into an ImmutableSet
instance (ImmutableSet is a Guava class). We start by specifying the right types:

public class ImmutableSetCollector<T>
    implements Collector<T, ImmutableSet.Builder<T>, ImmutableSet<T>> {...}

Since we need a mutable container to accumulate elements internally, we can’t use
ImmutableSet itself for this; we need some other mutable collection or any other
class that can temporarily accumulate objects for us.
In this case, we will go with an ImmutableSet.Builder, and now we need to
implement 5 methods:

Supplier<ImmutableSet.Builder<T>> supplier()
BiConsumer<ImmutableSet.Builder<T>, T> accumulator()
BinaryOperator<ImmutableSet.Builder<T>> combiner()
Function<ImmutableSet.Builder<T>, ImmutableSet<T>> finisher()
Set<Characteristics> characteristics()

The supplier() method returns a Supplier instance that generates an empty
accumulator instance, so, in this case, we can simply write:

@Override
public Supplier<ImmutableSet.Builder<T>> supplier() {
    return ImmutableSet::builder;
}

The accumulator() method returns a function that is used for adding a new element
to an existing accumulator object, so let’s just use the Builder’s add method:

@Override
public BiConsumer<ImmutableSet.Builder<T>, T> accumulator() {
    return ImmutableSet.Builder::add;
}

The combiner() method returns a function that is used for merging two accumulators
together:

@Override
public BinaryOperator<ImmutableSet.Builder<T>> combiner() {
    return (left, right) -> left.addAll(right.build());
}

The finisher() method returns a function that is used for converting an accumulator
to the final result type, so in this case, we will just use the Builder’s build
method:

@Override
public Function<ImmutableSet.Builder<T>, ImmutableSet<T>> finisher() {
    return ImmutableSet.Builder::build;
}

The characteristics() method is used to provide the Stream with some additional
information that will be used for internal optimizations. In this case, we do not
care about the order of elements in a Set, so we will use
Characteristics.UNORDERED. For more information on this subject, check the
Characteristics JavaDoc.

@Override
public Set<Characteristics> characteristics() {
    return Sets.immutableEnumSet(Characteristics.UNORDERED);
}

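Here is the complete implementation along with the usage:

import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;

import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

public class ImmutableSetCollector<T>
    implements Collector<T, ImmutableSet.Builder<T>, ImmutableSet<T>> {

    // Creates a new, empty builder to accumulate into.
    @Override
    public Supplier<ImmutableSet.Builder<T>> supplier() {
        return ImmutableSet::builder;
    }

    // Adds one element to the builder.
    @Override
    public BiConsumer<ImmutableSet.Builder<T>, T> accumulator() {
        return ImmutableSet.Builder::add;
    }

    // Merges two partial builders (used for parallel streams).
    @Override
    public BinaryOperator<ImmutableSet.Builder<T>> combiner() {
        return (left, right) -> left.addAll(right.build());
    }

    // Converts the builder into the final ImmutableSet.
    @Override
    public Function<ImmutableSet.Builder<T>, ImmutableSet<T>> finisher() {
        return ImmutableSet.Builder::build;
    }

    @Override
    public Set<Characteristics> characteristics() {
        return Sets.immutableEnumSet(Characteristics.UNORDERED);
    }

    // Factory method, intended to be statically imported by callers.
    public static <T> ImmutableSetCollector<T> toImmutableSet() {
        return new ImmutableSetCollector<>();
    }
}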

and here it is in action (with ImmutableSetCollector.toImmutableSet statically
imported):

List<String> givenList = Arrays.asList("a", "bb", "ccc", "dddd");

ImmutableSet<String> result = givenList.stream()
    .collect(toImmutableSet());
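
For simple cases, the same five pieces can also be passed to the Collector.of factory method instead of writing a class. The sketch below is a JDK-only variation, not the Guava-based collector above: it accumulates into a HashSet and finishes by wrapping it with Collections.unmodifiableSet. The class name and the method name toUnmodifiableHashSet are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collector;

class UnmodifiableSetCollector {

    // Builds a collector from its parts: supplier, accumulator, combiner, finisher.
    static <T> Collector<T, Set<T>, Set<T>> toUnmodifiableHashSet() {
        return Collector.of(
            HashSet::new,                          // supplier: empty mutable set
            Set::add,                              // accumulator: add one element
            (left, right) -> {                     // combiner: merge partial sets
                left.addAll(right);
                return left;
            },
            Collections::unmodifiableSet,          // finisher: wrap as read-only
            Collector.Characteristics.UNORDERED);  // element order is irrelevant
    }

    public static void main(String[] args) {
        Set<String> result = Arrays.asList("a", "bb", "ccc", "dddd").stream()
            .collect(toUnmodifiableHashSet());
        System.out.println(result.size()); // 4
    }
}
```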
