GS Collections and Java 8 Functional, Fluent, Friendly & Fun!

2014-09-29_JavaOne_GSC

TECHNOLOGY

DIVISION

GS Collections and Java 8


Functional, Fluent, Friendly & Fun!
GS.com/Engineering
Fall, 2014
Donald Raab
Craig Motlin

Agenda

Introductions
Lost and Found
Streams
The Iceberg

APIs
Fluency
Memory Efficiency
Method references are awesome

Framework Comparisons

What is GS Collections?

Open source Java collections framework developed in Goldman Sachs
In development since 2004
Hosted on GitHub w/ Apache 2.0 License
github.com/goldmansachs/gs-collections

GS Collections Kata
Internal training developed in 2007
Taught to > 1,500 GS Java developers
Hosted on GitHub w/ Apache 2.0 License
github.com/goldmansachs/gs-collections-kata

Paradise Lost

1997 - Smalltalk Best Practice Patterns (Kent Beck)

do:
select:
reject:
collect:
detect:
detect:ifNone:
inject:into:

(Dr. Seuss API)

2007 - Implementation Patterns (Kent Beck)

Map
List
Set

The collection iteration patterns disappeared
All that remained were the types

Paradise Found

detect
Pattern in Smalltalk-80:
list detect: [:each | each > 50].
Pattern in Classic Java:
for (Integer each : list)
    if (each > 50)
        return each;
return null;
Pattern in GS Collections w/ Lambdas:
list.detect(each -> each > 50);

select
Pattern in Smalltalk-80:
list select: [:each | each > 50].
Pattern in Classic Java:
List<Integer> result = new ArrayList<>();
for (Integer each : list)
    if (each > 50)
        result.add(each);
Pattern in GS Collections w/ Lambdas:
list.select(each -> each > 50);

reject
Pattern in Smalltalk-80:
list reject: [:each | each > 50].
Pattern in Classic Java:
List<Integer> result = new ArrayList<>();
for (Integer each : list)
    if (each <= 50)
        result.add(each);
Pattern in GS Collections w/ Lambdas:
list.reject(each -> each > 50);

any satisfy
Pattern in Smalltalk-80:
list anySatisfy: [:each | each > 50].
Pattern in Classic Java:
for (Integer each : list)
    if (each > 50)
        return true;
return false;
Pattern in GS Collections w/ Lambdas:
list.anySatisfy(each -> each > 50);

all satisfy
Pattern in Smalltalk-80:
list allSatisfy: [:each | each > 50].
Pattern in Classic Java:
for (Integer each : list)
    if (each <= 50)
        return false;
return true;
Pattern in GS Collections w/ Lambdas:
list.allSatisfy(each -> each > 50);

collect
Pattern in Smalltalk-80:
list collect: [:e | e printString].
Pattern in Classic Java:
List<String> result = new ArrayList<>();
for (Integer each : list)
    result.add(each.toString());
Pattern in GS Collections w/ Lambdas:
list.collect(Object::toString);

inject into
Pattern in Smalltalk-80:
list inject: 3 into: [:x :y | x + y].
Pattern in Classic Java:
int result = 3;
for (Integer each : list)
    result = result + each;
Pattern in GS Collections w/ Lambdas:
list.injectInto(3, Integer::sum);
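The classic Java fragments above are loop bodies rather than complete programs. A minimal runnable sketch of a few of them follows, using only the JDK; the class name `ClassicPatterns`, the method names, and the sample data are assumptions made for illustration and are not part of GS Collections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// JDK-only sketch of the "classic Java" patterns; all names here are illustrative.
final class ClassicPatterns {
    // detect: first element greater than the threshold, or null when none matches
    static Integer detect(List<Integer> list, int threshold) {
        for (Integer each : list) {
            if (each > threshold) {
                return each;
            }
        }
        return null;
    }

    // select: every element greater than the threshold
    static List<Integer> select(List<Integer> list, int threshold) {
        List<Integer> result = new ArrayList<>();
        for (Integer each : list) {
            if (each > threshold) {
                result.add(each);
            }
        }
        return result;
    }

    // reject: every element that is NOT greater than the threshold
    static List<Integer> reject(List<Integer> list, int threshold) {
        List<Integer> result = new ArrayList<>();
        for (Integer each : list) {
            if (each <= threshold) {
                result.add(each);
            }
        }
        return result;
    }

    // collect: transform each element into its string form
    static List<String> collect(List<Integer> list) {
        List<String> result = new ArrayList<>();
        for (Integer each : list) {
            result.add(each.toString());
        }
        return result;
    }

    // injectInto: fold the elements into a running sum that starts at initial
    static int injectInto(int initial, List<Integer> list) {
        int result = initial;
        for (Integer each : list) {
            result = result + each;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(10, 60, 40, 75);
        System.out.println(detect(list, 50));     // 60
        System.out.println(select(list, 50));     // [60, 75]
        System.out.println(reject(list, 50));     // [10, 40]
        System.out.println(collect(list));        // [10, 60, 40, 75]
        System.out.println(injectInto(3, list));  // 188
    }
}
```

Each helper needs its own loop and temporary collection, which is the boilerplate the lambda-based column removes.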

Lazy by any other name

detect / findAny
Java 8 Streams:
Integer result = list.stream()
    .filter(e -> e > 50).findFirst().orElse(null);
GS Collections LazyIterable:
Integer result = list.asLazy()
    .detectIfNone(e -> e > 50, () -> null);

select / filter
Java 8 Streams:
Stream<Integer> result =
    list.stream().filter(e -> e > 50);
GS Collections LazyIterable:
LazyIterable<Integer> result =
    list.asLazy().select(e -> e > 50);

reject / filter
Java 8 Streams:
Stream<Integer> result =
    list.stream().filter(e -> e <= 50);
GS Collections LazyIterable:
LazyIterable<Integer> result =
    list.asLazy().reject(e -> e > 50);

any satisfy / anyMatch
Java 8 Streams:
boolean any =
    list.stream().anyMatch(e -> e > 50);
GS Collections LazyIterable:
boolean any =
    list.asLazy().anySatisfy(e -> e > 50);

all satisfy / allMatch
Java 8 Streams:
boolean all =
    list.stream().allMatch(e -> e > 50);
GS Collections LazyIterable:
boolean all =
    list.asLazy().allSatisfy(e -> e > 50);

collect / map
Java 8 Streams:
Stream<String> result =
    list.stream().map(Object::toString);
GS Collections LazyIterable:
LazyIterable<String> result =
    list.asLazy().collect(Object::toString);

inject into / reduce
Java 8 Streams:
Integer result =
    list.stream().reduce(3, Integer::sum);
GS Collections LazyIterable:
Integer result =
    list.asLazy().injectInto(3, Integer::sum);
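What "lazy" means for the stream column can be observed by counting predicate calls. The following is a small JDK-only sketch; the class name `LazyDemo` and the counter are assumptions added for illustration. It shows that building a filtered stream evaluates nothing, and that a short-circuiting terminal operation only pulls as many elements as it needs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Illustrative sketch: counts how often the filter predicate actually runs.
final class LazyDemo {
    // Returns {evaluations before the terminal operation, evaluations after it}
    static int[] countEvaluations(List<Integer> list) {
        AtomicInteger calls = new AtomicInteger();
        Stream<Integer> pending = list.stream().filter(e -> {
            calls.incrementAndGet();
            return e > 50;
        });
        int before = calls.get();   // the pipeline is only described so far
        pending.findFirst();        // terminal operation pulls elements until the first match
        int after = calls.get();
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] counts = countEvaluations(Arrays.asList(10, 60, 40, 75));
        System.out.println(counts[0]);  // 0
        System.out.println(counts[1]);  // 2 (10 is rejected, 60 matches, the rest is never visited)
    }
}
```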

Eager vs. Lazy

detect / findAny
Java 8 Streams:
Integer result =
    list.stream().filter(e -> e > 50).findFirst().orElse(null);
Eager GS Collections:
Integer result =
    list.detect(e -> e > 50);

select / filter
Java 8 Streams:
List<Integer> result =
    list.stream().filter(e -> e > 50).collect(Collectors.toList());
Eager GS Collections:
MutableList<Integer> result =
    list.select(e -> e > 50);

reject / filter
Java 8 Streams:
List<Integer> result =
    list.stream().filter(e -> e <= 50).collect(Collectors.toList());
Eager GS Collections:
MutableList<Integer> result =
    list.reject(e -> e > 50);

any satisfy / anyMatch
Java 8 Streams:
boolean result =
    list.stream().anyMatch(e -> e > 50);
Eager GS Collections:
boolean any =
    list.anySatisfy(e -> e > 50);

all satisfy / allMatch
Java 8 Streams:
boolean result =
    list.stream().allMatch(e -> e > 50);
Eager GS Collections:
boolean all =
    list.allSatisfy(e -> e > 50);

collect / map
Java 8 Streams:
List<String> result =
    list.stream().map(Object::toString).collect(Collectors.toList());
Eager GS Collections:
MutableList<String> result =
    list.collect(Object::toString);

inject into / reduce
Java 8 Streams:
Integer result =
    list.stream().reduce(3, Integer::sum);
Eager GS Collections:
Integer result =
    list.injectInto(3, Integer::sum);
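The stream column can be run as-is against the JDK. A minimal sketch follows; the wrapper class `StreamPatterns` and the sample data are assumptions for illustration, and the GS Collections column is omitted because it needs the library on the classpath.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Runnable versions of the Java 8 Streams column; the wrapper class is illustrative.
final class StreamPatterns {
    static Integer detect(List<Integer> list) {
        return list.stream().filter(e -> e > 50).findFirst().orElse(null);
    }

    static List<Integer> select(List<Integer> list) {
        return list.stream().filter(e -> e > 50).collect(Collectors.toList());
    }

    static List<Integer> reject(List<Integer> list) {
        // Streams have no reject, so the predicate is inverted by hand
        return list.stream().filter(e -> e <= 50).collect(Collectors.toList());
    }

    static List<String> collect(List<Integer> list) {
        return list.stream().map(Object::toString).collect(Collectors.toList());
    }

    static Integer injectInto(List<Integer> list) {
        return list.stream().reduce(3, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(10, 60, 40, 75);
        System.out.println(detect(list));      // 60
        System.out.println(select(list));      // [60, 75]
        System.out.println(reject(list));      // [10, 40]
        System.out.println(collect(list));     // [10, 60, 40, 75]
        System.out.println(injectInto(list));  // 188
    }
}
```

Every eager result needs an explicit terminal `collect(Collectors.toList())`, which is the extra step the eager column avoids.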

Java 8 Streams

Great framework that provides a feature-rich functional API
Lazy by default
Supports serial and parallel iteration patterns
Support for three types of primitive streams
Extendable through Collector implementations
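The three primitive stream types in the JDK are `IntStream`, `LongStream`, and `DoubleStream`. A short sketch of each, plus a parallel variant, is shown below; the class name `PrimitiveStreams` and the helper methods are assumptions chosen for illustration.

```java
import java.util.stream.DoubleStream;
import java.util.stream.IntStream;
import java.util.stream.LongStream;

// Illustrative helpers, one per primitive stream type.
final class PrimitiveStreams {
    // IntStream: sum of squares 1^2 + ... + n^2 without boxing
    static int sumOfSquares(int n) {
        return IntStream.rangeClosed(1, n).map(i -> i * i).sum();
    }

    // Same pipeline, switched to parallel execution with a single call
    static int parallelSumOfSquares(int n) {
        return IntStream.rangeClosed(1, n).parallel().map(i -> i * i).sum();
    }

    // LongStream: n! as a reduction
    static long factorial(int n) {
        return LongStream.rangeClosed(1, n).reduce(1L, (a, b) -> a * b);
    }

    // DoubleStream: arithmetic mean, NaN for empty input
    static double mean(double... values) {
        return DoubleStream.of(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10));          // 385
        System.out.println(parallelSumOfSquares(10));  // 385
        System.out.println(factorial(10));             // 3628800
        System.out.println(mean(1, 2, 3, 4));          // 2.5
    }
}
```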

Java 8 Streams is the tip of an enormous iceberg


Iceberg dead ahead!

Eager iteration patterns on Collections
Covariant return types on collection protocols
New Collection Types
Bag, SortedBag, BiMap, Multimap

Memory Efficient Set and Map


Primitive containers
Immutable containers

Ice is Twice as Nice

                                      Java 8          GS Collections
Stream vs. LazyIterable Interfaces
Functional Interfaces                 46              298
Object Container Interfaces           11              75
Primitive Container Interfaces                        309
Stream vs. RichIterable API           47              109
Primitive Stream vs. Iterable API     48 x 3 = 144    38 x 8 = 304

More Iteration Patterns

flatCollect
partition
makeString / appendString
groupBy
aggregateBy
sumOf
sumBy

Futility of Utility

Utility
Easy to extend with new behaviors without breaking existing clients

API
Easy to discover new features
Easy to optimize
Easy to read from left to right
Return types are specific and easy to understand
Verb vs. gerund

Joining vs. MakeString

String joined = things.stream()
    .map(Object::toString)
    .collect(Collectors.joining(", "));

String joined =
    things.makeString(", ");

SummingInt vs. SumOfInt

int total = employees.stream().collect(
    Collectors.summingInt(Employee::getSalary));

long total =
    employees.sumOfInt(Employee::getSalary);
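The stream sides of the joining and summing comparisons can be tried with the JDK alone. In the sketch below, the `Employee` class with its `name` and `salary` fields, the class name `JoinAndSum`, and the sample data are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of Collectors.joining and Collectors.summingInt.
final class JoinAndSum {
    // Hypothetical domain class; the slides do not show its definition
    static final class Employee {
        private final String name;
        private final int salary;

        Employee(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }

        int getSalary() {
            return salary;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // Elements must be mapped to String before joining
    static String join(List<?> things) {
        return things.stream()
            .map(Object::toString)
            .collect(Collectors.joining(", "));
    }

    // summingInt accumulates into an int, so very large totals can overflow
    static int totalSalary(List<Employee> employees) {
        return employees.stream()
            .collect(Collectors.summingInt(Employee::getSalary));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
            new Employee("Ann", 100), new Employee("Bob", 150));
        System.out.println(join(employees));         // Ann, Bob
        System.out.println(totalSalary(employees));  // 250
    }
}
```

Note the result types in the slide above: the stream version yields an `int`, while `sumOfInt` is declared with a `long` result.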

GroupingBy vs. GroupBy

Map<Department, List<Employee>> byDept =
    employees.stream()
        .collect(Collectors.groupingBy(
            Employee::getDepartment));

Multimap<Department, Employee> byDept =
    employees.groupBy(Employee::getDepartment);

GroupingBy/SummingBy vs. SumBy

Map<Department, Integer> totalByDept =
    employees.stream()
        .collect(Collectors.groupingBy(
            Employee::getDepartment,
            Collectors.summingInt(Employee::getSalary)));

// Upcoming GS Collections 6.0
ObjectLongMap<Department> totalByDept =
    employees.sumByInt(
        Employee::getDepartment,
        Employee::getSalary);

PartitioningBy vs. Partition

Map<Boolean, List<Student>> passingFailing =
    students.stream()
        .collect(Collectors.partitioningBy(
            s -> s.getGrade() >= PASS_THRESHOLD));

PartitionList<Student> passingFailing =
    students.partition(
        s -> s.getGrade() >= PASS_THRESHOLD);
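The grouping, summing-by-group, and partitioning collectors from the last three comparisons can be exercised with a small JDK-only sketch. The `Employee` class, the use of a plain `String` for the department, the integer grades, and the class name `GroupingDemo` are assumptions made to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch of groupingBy, groupingBy + summingInt, and partitioningBy.
final class GroupingDemo {
    // Hypothetical domain class; department is a String here for brevity
    static final class Employee {
        private final String name;
        private final String department;
        private final int salary;

        Employee(String name, String department, int salary) {
            this.name = name;
            this.department = department;
            this.salary = salary;
        }

        String getName() { return name; }
        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    // Group employee names by department
    static Map<String, List<String>> namesByDept(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(
            Employee::getDepartment,
            Collectors.mapping(Employee::getName, Collectors.toList())));
    }

    // Sum salaries per department with a downstream collector
    static Map<String, Integer> totalByDept(List<Employee> employees) {
        return employees.stream().collect(Collectors.groupingBy(
            Employee::getDepartment,
            Collectors.summingInt(Employee::getSalary)));
    }

    // Split grades into passing (key true) and failing (key false)
    static Map<Boolean, List<Integer>> partitionGrades(List<Integer> grades, int passThreshold) {
        return grades.stream()
            .collect(Collectors.partitioningBy(g -> g >= passThreshold));
    }

    public static void main(String[] args) {
        List<Employee> employees = Arrays.asList(
            new Employee("Ann", "Eng", 100),
            new Employee("Bob", "Eng", 150),
            new Employee("Cid", "Ops", 90));
        System.out.println(namesByDept(employees).get("Eng"));   // [Ann, Bob]
        System.out.println(totalByDept(employees).get("Eng"));   // 250
        System.out.println(partitionGrades(Arrays.asList(55, 70, 40, 90), 60).get(true));  // [70, 90]
    }
}
```

With `partitioningBy`, callers look results up by `true` and `false` keys; the `PartitionList` return type in the slide above is a dedicated type for the same split.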

How do they stack up?


Anagram tutorial

http://docs.oracle.com/javase/tutorial/collections/algorithms/
Start with all words in the dictionary
Group them by their alphagrams
Alphagram contains sorted characters
alerts -> aelrst
stelar -> aelrst
Filter groups containing at least eight anagrams
Sort groups by number of anagrams (descending)
Print them in this format
11: [alerts, alters, artels, estral, laster, ratels, salter, slater, staler, stelar, talers]

Anagram tutorial

this.getWords()
    .stream()
    .collect(Collectors.groupingBy(Alphagram::new))
    .values()
    .stream()
    .filter(each -> each.size() >= SIZE_THRESHOLD)
    .sorted(Comparator.<List<?>>comparingInt(List::size).reversed())
    .map(each -> each.size() + ": " + each)
    .forEach(System.out::println);
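The stream pipeline above depends on `getWords()`, `Alphagram`, and `SIZE_THRESHOLD`, none of which are shown. A self-contained sketch of the same steps follows. It is an approximation under stated assumptions: the alphagram is a plain sorted `String` rather than an `Alphagram` class, the word list is a tiny hand-picked sample rather than a dictionary, and the threshold is passed as a parameter.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative JDK-only version of the anagram pipeline.
final class Anagrams {
    // Alphagram: the word's characters in sorted order, e.g. "alerts" -> "aelrst"
    static String alphagram(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Group by alphagram, keep large groups, sort by group size descending, format
    static List<String> report(List<String> words, int sizeThreshold) {
        return words.stream()
            .collect(Collectors.groupingBy(Anagrams::alphagram))
            .values()
            .stream()
            .filter(each -> each.size() >= sizeThreshold)
            .sorted(Comparator.<List<String>>comparingInt(List::size).reversed())
            .map(each -> each.size() + ": " + each)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
            "alerts", "alters", "stelar", "listen", "silent", "java");
        report(words, 2).forEach(System.out::println);
        // 3: [alerts, alters, stelar]
        // 2: [listen, silent]
    }
}
```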

Anagram tutorial

this.getWords()
    .groupBy(Alphagram::new)
    .multiValuesView()
    .select(each -> each.size() >= SIZE_THRESHOLD)
    .toSortedListBy(RichIterable::size)
    .asReversed()
    .collect(each -> each.size() + ": " + each)
    .each(System.out::println);

Type after each step:
groupBy: MutableListMultimap<Alphagram, String>
multiValuesView: RichIterable<RichIterable<String>>
select: RichIterable<RichIterable<String>>
toSortedListBy: MutableList<RichIterable<String>>
asReversed: LazyIterable<RichIterable<String>>
collect: LazyIterable<String>

Parallel Lazy Iteration

Stream<Address> addresses =
    people.parallelStream()
        .map(Person::getAddress);

ParallelListIterable<Address> addresses =
    people.asParallel(executor, batchSize)
        .collect(Person::getAddress);

http://www.infoq.com/presentations/java-streams-scala-parallel-collections
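The stream half of the parallel comparison can be run with the JDK alone. In this sketch the `Person` class, the use of a `String` for the address, and the class name `ParallelDemo` are assumptions for illustration. Unlike `asParallel(executor, batchSize)`, `parallelStream()` takes no executor or batch size argument; by default it runs on the common fork/join pool.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of a parallel map over a list.
final class ParallelDemo {
    // Hypothetical domain class; the address is a plain String here
    static final class Person {
        private final String name;
        private final String address;

        Person(String name, String address) {
            this.name = name;
            this.address = address;
        }

        String getAddress() {
            return address;
        }
    }

    // The mapping may run on several threads; collecting into a list
    // still preserves the encounter order of the source list
    static List<String> addresses(List<Person> people) {
        return people.parallelStream()
            .map(Person::getAddress)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("Ann", "1 Main St"),
            new Person("Bob", "2 High St"));
        System.out.println(addresses(people));  // [1 Main St, 2 High St]
    }
}
```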


Comparing Maps

[Chart: memory size (Mb) by number of elements for JDK HashMap, GSC UnifiedMap, and Trove THashMap]

Memory Optimizations

Entry holds key, value, next, and hash.
Better to put the keys and values in the backing array.
Uses half the memory on average.
But watch out for Map.entrySet().
Leaky abstraction
The assumption is that Maps are implemented as tables of Entry objects.
It's now O(n) instead of O(1).
Use forEachKeyValue() instead.
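The idea of keeping keys and values directly in the backing array can be sketched in a few dozen lines. The class below is a hypothetical illustration, not the UnifiedMap source: the name `FlatArrayMap`, the alternating key/value slot layout, the linear probing, and the resize policy are all assumptions of the sketch, and null keys are not supported.

```java
import java.util.function.BiConsumer;

// Hypothetical sketch: keys and values share one array, so no Entry objects exist.
final class FlatArrayMap<K, V> {
    private Object[] table = new Object[16];  // slot 2*i holds a key, slot 2*i + 1 its value
    private int size;

    private static int indexFor(Object key, int capacity) {
        return (key.hashCode() & 0x7fffffff) % capacity;
    }

    // Linear probing insert; returns true when the key was not present before
    private static boolean insert(Object[] target, Object key, Object value) {
        int capacity = target.length / 2;
        int i = indexFor(key, capacity);
        while (target[2 * i] != null && !target[2 * i].equals(key)) {
            i = (i + 1) % capacity;
        }
        boolean isNew = target[2 * i] == null;
        target[2 * i] = key;
        target[2 * i + 1] = value;
        return isNew;
    }

    public void put(K key, V value) {
        if ((size + 1) * 4 > table.length) {  // keep at most half of the key slots occupied
            Object[] bigger = new Object[table.length * 2];
            for (int i = 0; i < table.length; i += 2) {
                if (table[i] != null) {
                    insert(bigger, table[i], table[i + 1]);
                }
            }
            table = bigger;
        }
        if (insert(table, key, value)) {
            size++;
        }
    }

    @SuppressWarnings("unchecked")
    public V get(Object key) {
        int capacity = table.length / 2;
        int i = indexFor(key, capacity);
        while (table[2 * i] != null) {
            if (table[2 * i].equals(key)) {
                return (V) table[2 * i + 1];
            }
            i = (i + 1) % capacity;
        }
        return null;
    }

    // Walks the array directly; an entrySet() view would have to allocate
    // an Entry wrapper for every pair because none are stored
    @SuppressWarnings("unchecked")
    public void forEachKeyValue(BiConsumer<? super K, ? super V> action) {
        for (int i = 0; i < table.length; i += 2) {
            if (table[i] != null) {
                action.accept((K) table[i], (V) table[i + 1]);
            }
        }
    }

    public int size() {
        return size;
    }
}
```

A call such as `map.forEachKeyValue((k, v) -> System.out.println(k + "=" + v))` visits every pair without creating per-entry objects. In the JDK, `Map.forEach(BiConsumer)` offers a comparable callback style on standard maps.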

Comparing Sets

[Chart: memory size (Mb) by number of elements for JDK HashSet, GSC UnifiedSet, and Trove THashSet]

Memory Optimizations

HashSet is implemented by delegating to a HashMap.
Entries are still a waste of space.
Values in each (key, value) pair are a waste of space.
Uses 4x the memory on average.

Bad decisions from long ago

Save memory with Primitive Collections

[Chart: memory size (Mb) by number of elements for JDK ArrayList, GSC IntArrayList, and Trove TIntArrayList]

List<Integer> vs. IntList

Java has object and primitive arrays
Primitive arrays have no behaviors
Java does not have primitive Lists, Sets or Maps
Primitives must be boxed
Boxing is expensive
Reference + Header + alignment
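A primitive list avoids boxing by storing values in an `int[]` and adding behavior around it. The class below is a minimal hypothetical sketch of that idea; `SimpleIntList` and its methods are illustrative names and not the GS Collections `IntArrayList` API.

```java
import java.util.Arrays;

// Hypothetical sketch of an int-backed list: no Integer objects are created.
final class SimpleIntList {
    private int[] items = new int[8];
    private int size;

    // Appends a value, doubling the backing array when it is full
    void add(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return items[index];
    }

    int size() {
        return size;
    }

    // Behavior that a bare int[] does not offer; the total is a long to reduce overflow risk
    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += items[i];
        }
        return total;
    }

    public static void main(String[] args) {
        SimpleIntList list = new SimpleIntList();
        for (int i = 1; i <= 10; i++) {
            list.add(i);
        }
        System.out.println(list.size());  // 10
        System.out.println(list.get(9));  // 10
        System.out.println(list.sum());   // 55
    }
}
```

Each element costs one `int` slot in the array, whereas a `List<Integer>` stores a reference per element plus a separate `Integer` object with its own header and alignment padding.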


Lambdas and Method References

We upgraded the Kata (our training materials) from Java 7 to Java 8
Some anonymous inner classes converted easily into Method References

MutableList<String> customerCities =
    customers.collect(Customer::getCity);

Some we kept as lambdas

MutableList<Customer> customersFromLondon =
    customers.select(customer -> customer.livesIn("London"));

The method reference syntax is appealing
Can we write the select example with a method reference?

MutableList<Customer> customersFromLondon =
    customers.select(Customer::livesInLondon);

No one writes methods like this.

Now we use method references
We used to use constants

MutableList<String> customerCities =
    customers.collect(Customer.TO_CITY);

public static final Function<Customer, String> TO_CITY =
    new Function<Customer, String>() {
        public String valueOf(Customer customer) {
            return customer.getCity();
        }
    };

The select example would have created garbage

MutableList<Customer> customersFromLondon =
    customers.select(new Predicate<Customer>()
    {
        public boolean accept(Customer customer)
        {
            return customer.livesIn("London");
        }
    });

So we created selectWith(Predicate2) to avoid garbage

MutableList<Customer> customersFromLondon =
    customers.selectWith(Customer.LIVES_IN, "London");

public static final Predicate2<Customer, String> LIVES_IN =
    new Predicate2<Customer, String>()
    {
        public boolean accept(Customer customer, String city)
        {
            return customer.livesIn(city);
        }
    };

The *With() methods work perfectly with Method References

MutableList<Customer> customersFromLondon =
    customers.selectWith(Customer::livesIn, "London");

This increases the number of places we can use method references.
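Why a two-argument predicate opens the door to method references can be shown with the JDK's `BiPredicate`. The sketch below is a hypothetical stand-in, not the GS Collections implementation: the static `selectWith` helper, the `Customer` class, and the sample data are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

// Illustrative JDK-only stand-in for the selectWith pattern.
final class SelectWithDemo {
    // Hypothetical domain class modeled on the slides
    static final class Customer {
        private final String name;
        private final String city;

        Customer(String name, String city) {
            this.name = name;
            this.city = city;
        }

        String getName() {
            return name;
        }

        boolean livesIn(String city) {
            return this.city.equals(city);
        }
    }

    // The extra parameter is handed to the predicate on every call, so the
    // predicate itself captures nothing and can be a plain method reference
    static <T, P> List<T> selectWith(
            List<T> items, BiPredicate<? super T, ? super P> predicate, P parameter) {
        List<T> result = new ArrayList<>();
        for (T each : items) {
            if (predicate.test(each, parameter)) {
                result.add(each);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
            new Customer("Ann", "London"),
            new Customer("Bob", "Paris"),
            new Customer("Cid", "London"));
        // Customer::livesIn matches BiPredicate<Customer, String>: receiver plus one argument
        List<Customer> fromLondon = selectWith(customers, Customer::livesIn, "London");
        for (Customer each : fromLondon) {
            System.out.println(each.getName());
        }
        // Ann
        // Cid
    }
}
```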

Framework Comparisons

Frameworks compared: GS Collections, Java 8, Guava, Trove, Scala

Features compared:
Rich API
Interfaces
    GS Collections: Readable, Mutable, Immutable, FixedSize, Lazy
    Java 8: Mutable, Stream
    Guava: Mutable, Fluent
    Trove: Mutable
    Scala: Readable, Mutable, Immutable, Lazy
Optimized Set & Map (+Bag)
Immutable Collections
Primitive Collections (+Bag, +Immutable)
Multimaps (+Bag, +SortedBag) (+Linked) (Multimap trait)
Bags (Multisets)
BiMaps
Iteration Styles
    GS Collections: Eager/Lazy
    Java 8: Lazy
    Guava: Lazy
    Trove: Eager
    Scala: Eager/Lazy

Resources

GS Collections on GitHub
https://github.com/goldmansachs/gs-collections
https://github.com/goldmansachs/gs-collections/wiki
https://github.com/goldmansachs/gs-collections-kata

GS Collections Memory Benchmark
http://www.goldmansachs.com/gs-collections/presentations/GSC_Memory_Tests.pdf

NY JUG Presentation, May 2014
http://www.goldmansachs.com/gs-collections/presentations/2014_05_19_NY_Java_User_Group.pdf

Parallel-lazy Performance: Java 8 vs Scala vs GS Collections
http://www.infoq.com/presentations/java-streams-scala-parallel-collections

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.


Learn more at GS.com/Engineering

2014 Goldman Sachs. This presentation should not be relied upon or considered investment advice. Goldman Sachs does not warrant or guarantee to anyone the accuracy, completeness or efficacy of this presentation, and recipients should not rely on it except at their own risk. This presentation may not be forwarded or disclosed except with this disclaimer intact.
