Bingjing - Big Data Tools
Layered Architecture (Upper)
[Figure: layered software stack. NA = non-Apache projects. Green layers are Apache/commercial cloud (light) to HPC (darker) integration layers.]
Cross-Cutting Capabilities
Distributed Coordination: ZooKeeper, JGroups
Message Protocols: Thrift, Protobuf (NA)
Security & Privacy
Monitoring: Ambari, Ganglia, Nagios, Inca (NA)
Orchestration and workflow tools, including Oozie, ODE and Swift
Machine Learning Data Analytics Libraries, including CompLearn (NA)
High Level (Integrated) Systems for Data Processing
Hive (SQL on Hadoop), HCatalog (interfaces), Pig (procedural language), Shark (SQL on Spark, NA), MRQL (SQL on Hadoop, Hama, Spark), Impala (NA; Cloudera; SQL on HBase), Sawzall (log files; Google, NA)
Parallel Horizontally Scalable Data Processing (batch, stream and graph)
Hadoop (MapReduce), Spark (iterative MR), NA: Twister (iterative MR), Stratosphere, Tez (DAG), Hama (BSP), Storm, S4 (Yahoo), Samza (LinkedIn), Giraph (~Pregel), Pegasus on Hadoop (NA)
ABDS Inter-process Communication
Hadoop, Spark communications and reductions; Harp collectives (NA)
Pub/Sub Messaging: Netty (NA), ZeroMQ (NA), ActiveMQ, Qpid, Kafka
HPC Inter-process Communication
MPI (NA)
Layered Architecture (Lower)
NoSQL: General Graph: Neo4J, Yarcdata
NoSQL: TripleStore RDF SPARQL: Jena, Sesame, AllegroGraph, RYA (RDF on Accumulo)
File Management: iRODS (NA)
Data Transport: BitTorrent, HTTP, FTP, SSH, Globus Online (GridFTP)
ABDS Cluster Resource Management: Mesos, Yarn, Helix, Llama (Cloudera)
HPC Cluster Resource Management: Condor, Moab, Slurm, Torque (NA)
ABDS File Systems: HDFS, Swift, Ceph
User Level: FUSE (NA)
HPC File Systems: Gluster, Lustre, GPFS, GFFS (NA)
Popular implementations
MPICH (2001)
OpenMPI (2004)
http://www.open-mpi.org/
MapReduce Model
Google MapReduce (2004)
Jeffrey Dean et al. MapReduce: Simplified Data Processing on Large Clusters. OSDI 2004.
HaLoop (2010)
Yingyi Bu et al. HaLoop: Efficient Iterative Data Processing on Large Clusters. VLDB 2010.
http://code.google.com/p/haloop/
Programming model
Loop-Aware Task Scheduling
Caching and indexing for Loop-Invariant Data on local disk
Twister Programming Model
Main program's process space:
configureMaps()
configureReduce()
while(condition){
    runMapReduce(...)
    Combine() operation
    updateCondition()
} //end while
close()
Worker nodes: cacheable map/reduce tasks (Map(), Reduce()) with local disk
Iterations may scatter/broadcast <Key,Value> pairs directly and may merge data in shuffling
Communications/data transfers go via the pub-sub broker network & direct TCP
Main program may contain many MapReduce invocations or iterative MapReduce invocations
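The driver loop above can be imitated in a single process. The following is only a sketch under stated assumptions: it is not the Twister API, the class and method names are hypothetical, and 1-D k-means is chosen purely as an example workload. The point partitions stand in for data cached on worker nodes, the centroids stand in for the broadcast <Key,Value> pairs, and the loop condition is "some centroid moved by more than eps".

```java
import java.util.Arrays;

/** Toy, single-process imitation of an iterative MapReduce driver loop (1-D k-means). */
class IterativeMapReduceSketch {

    // "Map": assign every point of one cached partition to its nearest centroid
    // and emit partial results; partial[c] = {sum of points, number of points}.
    static double[][] map(double[] partition, double[] centroids) {
        double[][] partial = new double[centroids.length][2];
        for (double p : partition) {
            int best = 0;
            for (int c = 1; c < centroids.length; c++) {
                if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) {
                    best = c;
                }
            }
            partial[best][0] += p;
            partial[best][1] += 1;
        }
        return partial;
    }

    // "Reduce" and "Combine": merge the partial sums of all map tasks into new centroids.
    static double[] reduceAndCombine(double[][][] partials, double[] old) {
        double[] next = old.clone();
        for (int c = 0; c < old.length; c++) {
            double sum = 0, count = 0;
            for (double[][] partial : partials) {
                sum += partial[c][0];
                count += partial[c][1];
            }
            if (count > 0) {
                next[c] = sum / count;   // empty clusters keep their old centroid
            }
        }
        return next;
    }

    // The driver: while(condition) { runMapReduce; combine; updateCondition }.
    static double[] run(double[][] partitions, double[] centroids, double eps, int maxIter) {
        boolean changed = true;
        for (int iter = 0; changed && iter < maxIter; iter++) {
            double[][][] partials = new double[partitions.length][][];
            for (int i = 0; i < partitions.length; i++) {
                partials[i] = map(partitions[i], centroids);   // centroids act as broadcast data
            }
            double[] next = reduceAndCombine(partials, centroids);
            changed = false;
            for (int c = 0; c < next.length; c++) {
                if (Math.abs(next[c] - centroids[c]) > eps) {
                    changed = true;
                }
            }
            centroids = next;
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[][] partitions = { {1.0, 2.0, 9.0}, {3.0, 10.0, 11.0} };
        System.out.println(Arrays.toString(run(partitions, new double[] {0.0, 5.0}, 1e-9, 100)));
    }
}
```

The partitions never change between iterations, which is why caching them on the workers pays off; only the small centroid array has to travel in each round.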
DAG (Directed Acyclic Graph) Model
Dryad and DryadLINQ (2007)
Michael Isard et al. Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks. EuroSys 2007.
http://research.microsoft.com/en-us/collaboration/tools/dryad.aspx
Model Composition
Apache Spark (2010)
Matei Zaharia et al. Spark: Cluster Computing with Working Sets. HotCloud 2010.
Matei Zaharia et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. NSDI 2012.
http://spark.apache.org/
Resilient Distributed Dataset (RDD)
RDD operations
MapReduce-like parallel operations
DAG of execution stages and pipelined transformations
Simple collectives: broadcasting and aggregation
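As a local analogy only, the idea of pipelined transformations followed by an aggregation can be tried out with java.util.stream. This sketch does not use the Spark API and says nothing about fault tolerance or distribution; the class name and the call counter are hypothetical. It shows that declaring map and filter steps does no work until an aggregating action asks for a result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

/** Local analogy for lazily pipelined transformations plus an aggregating action. */
class LazyPipelineSketch {

    // Counts how often the map function actually runs (illustrative instrumentation).
    static final AtomicInteger MAP_CALLS = new AtomicInteger();

    // Transformations only: builds the pipeline, executes nothing yet.
    static Stream<Integer> oddSquares(List<Integer> data) {
        return data.stream()
                .map(x -> { MAP_CALLS.incrementAndGet(); return x * x; })
                .filter(x -> x % 2 == 1);
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = oddSquares(Arrays.asList(1, 2, 3, 4));
        System.out.println("map calls before the action: " + MAP_CALLS.get());
        int sum = pipeline.reduce(0, Integer::sum);   // the action triggers map and filter in one pass
        System.out.println("sum = " + sum + ", map calls after the action: " + MAP_CALLS.get());
    }
}
```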
Graph Processing with BSP model
Pregel (2010)
Grzegorz Malewicz et al. Pregel: A System for Large-Scale Graph Processing. SIGMOD 2010.
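@Override
public void compute(Vertex<IntWritable, FloatWritable, NullWritable> vertex,
    Iterable<FloatWritable> messages) throws IOException {
  // From superstep 1 on, recompute the rank from the contributions received as messages.
  if (getSuperstep() >= 1) {
    float sum = 0;
    for (FloatWritable message : messages) {
      sum += message.get();
    }
    vertex.getValue().set((0.15f / getTotalNumVertices()) + 0.85f * sum);
  }
  // Until the configured superstep count is reached, send an equal share of the rank along every out-edge.
  if (getSuperstep() < getConf().getInt(SUPERSTEP_COUNT, 0)) {
    sendMessageToAllEdges(vertex,
        new FloatWritable(vertex.getValue().get() / vertex.getNumEdges()));
  } else {
    vertex.voteToHalt();
  }
}
} // closes the enclosing computation class, whose declaration is not shown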
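To see how the vertex program above behaves across supersteps, its logic can be replayed without any graph framework. This is a minimal sketch under assumptions: the class name is hypothetical, every vertex starts with rank 1/n, every vertex has at least one out-edge, and the per-vertex message list is replaced by its sum, which is all the PageRank update reads.

```java
import java.util.Arrays;

/** Framework-free replay of the superstep logic of a PageRank vertex program. */
class BspPageRankSketch {

    // edges[v] lists the out-neighbours of vertex v (assumed non-empty for every v).
    static float[] run(int[][] edges, int supersteps) {
        int n = edges.length;
        float[] value = new float[n];
        Arrays.fill(value, 1.0f / n);          // assumed initial rank
        float[] inbox = new float[n];          // summed messages delivered to each vertex
        for (int step = 0; step <= supersteps; step++) {
            float[] outbox = new float[n];
            for (int v = 0; v < n; v++) {
                if (step >= 1) {
                    value[v] = 0.15f / n + 0.85f * inbox[v];
                }
                if (step < supersteps) {
                    float share = value[v] / edges[v].length;
                    for (int target : edges[v]) {
                        outbox[target] += share;   // "send message to all edges"
                    }
                }
                // otherwise the vertex votes to halt and sends nothing
            }
            inbox = outbox;                    // barrier: messages become visible next superstep
        }
        return value;
    }

    public static void main(String[] args) {
        int[][] cycle = { {1}, {2}, {0} };     // 0 -> 1 -> 2 -> 0
        System.out.println(Arrays.toString(run(cycle, 10)));
    }
}
```

On a directed cycle every vertex keeps a rank close to 1/n, which makes the example easy to check by hand.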
GraphLab (2010)
Data graph
Update functions and the scope
Sync operation (similar to aggregation in Pregel)
Data Graph
Vertex-cut vs. Edge-cut
PowerGraph (2012)
Joseph E. Gonzalez et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. OSDI 2012.
Gather, Apply, Scatter (GAS) model
GraphX (2013)
Reynold Xin et al. GraphX: A Resilient Distributed Graph System on Spark. GRADES (SIGMOD workshop) 2013.
https://amplab.cs.berkeley.edu/publication/graphx-grades/
[Figure: Edge-cut (Giraph model) compared with Vertex-cut (GAS model)]
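The difference between the two partitioning styles can be made concrete by counting what each one has to synchronise. The sketch below is illustrative only and is not taken from PowerGraph or GraphX; the class and method names are hypothetical. For an edge-cut it counts edges whose endpoints live on different machines, and for a vertex-cut it counts how many vertex replicas exist once edges are assigned to machines.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Counts the communication-relevant quantities of an edge-cut and a vertex-cut. */
class GraphCutSketch {

    // Edge-cut: vertices are placed on machines; an edge is cut when its endpoints differ.
    static int cutEdges(int[][] edges, int[] vertexMachine) {
        int cut = 0;
        for (int[] e : edges) {
            if (vertexMachine[e[0]] != vertexMachine[e[1]]) {
                cut++;
            }
        }
        return cut;
    }

    // Vertex-cut: edges are placed on machines; a vertex is replicated on every
    // machine that holds at least one of its edges. Isolated vertices count as zero.
    static int totalReplicas(int[][] edges, int[] edgeMachine, int numVertices) {
        List<Set<Integer>> machinesOf = new ArrayList<>();
        for (int v = 0; v < numVertices; v++) {
            machinesOf.add(new HashSet<>());
        }
        for (int i = 0; i < edges.length; i++) {
            machinesOf.get(edges[i][0]).add(edgeMachine[i]);
            machinesOf.get(edges[i][1]).add(edgeMachine[i]);
        }
        int replicas = 0;
        for (Set<Integer> machines : machinesOf) {
            replicas += machines.size();
        }
        return replicas;
    }

    public static void main(String[] args) {
        // Star graph: hub 0 connected to leaves 1..4, split over two machines.
        int[][] star = { {0, 1}, {0, 2}, {0, 3}, {0, 4} };
        System.out.println("cut edges: " + cutEdges(star, new int[] {0, 0, 0, 1, 1}));
        System.out.println("replicas:  " + totalReplicas(star, new int[] {0, 0, 1, 1}, 5));
    }
}
```

In the star example the edge-cut severs two of the hub's edges, while the vertex-cut only adds one extra copy of the hub, which is the kind of saving a vertex-cut aims for on graphs with high-degree vertices.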
To reduce communication overhead.
Option 1
Algorithmic message reduction
Fixed point-to-point communication pattern
Option 2
Collective communication optimization
Not considered by previous BSP model but well developed in MPI
Initial attempts in Twister and Spark on clouds
Mosharaf Chowdhury et al. Managing Data Transfers in Computer Clusters with Orchestra. SIGCOMM 2011.
Bingjing Zhang, Judy Qiu. High Performance Clustering of Social Images in a Map-Collective Programming Model. SOCC Poster 2013.
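One reason collective communication optimization matters can be shown with a simple round count. The sketch below compares a root that sends to every receiver in turn with a binomial-tree broadcast, a textbook scheme from the MPI world. It is an illustration under assumptions (every transfer takes one round, a node sends to one peer per round) and is not a description of the algorithm Twister, Spark or Orchestra actually use; the class and method names are hypothetical.

```java
/** Round counts for two broadcast schemes under a one-transfer-per-round model. */
class BroadcastRoundsSketch {

    // The root sends the data to each of the other nodes, one after another.
    static int sequentialRounds(int nodes) {
        return Math.max(0, nodes - 1);
    }

    // Binomial tree: in each round every node that already holds the data
    // forwards it to one node that does not, so the holder count doubles.
    static int binomialTreeRounds(int nodes) {
        int holders = 1;
        int rounds = 0;
        while (holders < nodes) {
            holders = Math.min(nodes, holders * 2);
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        for (int n : new int[] {8, 64, 150}) {
            System.out.println(n + " nodes: sequential " + sequentialRounds(n)
                    + " rounds, binomial tree " + binomialTreeRounds(n) + " rounds");
        }
    }
}
```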
Collective Model
Harp (2013)
https://github.com/jessezbj/harp-project
Hadoop Plugin (on Hadoop 1.2.1 and Hadoop 2.2.0)
Hierarchical data abstraction on arrays, key-values and graphs for easy programming expressiveness.
Collective communication model to support various communication operations on the data abstractions.
Caching with buffer management for memory allocation required by computation and communication.
BSP style parallelism
Fault tolerance with check-pointing
Harp Design
[Figure: Harp partition types: Array Partition <Array Type>, Edge Partition, Message Partition, Vertex Partition, KeyValue Partition]
Broadcast, Send
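// Only the master builds the centroid partitions and fills them from the centroid file named in the configuration.
if (this.isMaster()) {
    String cFile = conf.get(KMeansConstants.CFILE);
    Map<Integer, DoubleArray> cenDataMap = createCenDataMap(cParSize, rest, numCenPartitions,
        vectorSize, this.getResourcePool());
    loadCentroids(cenDataMap, vectorSize, cFile, conf);
    addPartitionMapToTable(cenDataMap, table);
}
// Every worker, master included, then takes part in the collective broadcast of the table.
arrTableBcast(table);
} // closes the enclosing method, whose declaration is not shown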
Broadcasting: Twister vs. MPI and Twister vs. MPJ
[Figure: two charts, each broadcasting 0.5~2GB of data. x-axis: Number of Nodes (1 to 150). Left chart series: Twister Bcast 500MB, 1GB, 2GB and MPI Bcast 500MB, 1GB, 2GB. Right chart series: Twister 0.5GB, 1GB and MPJ 0.5GB, 1GB.]
Topology-Awareness
[Figure: Hadoop vs. Harp on 24, 48 and 96 cores. x-axis: Problem Size (100m 500, 10m 5k, 1m 50k).]
K-means Clustering Parallel Efficiency
[Figure: two charts over Number of Nodes (8, 16, 32, 64, 128 nodes, 32 cores per node). The second chart plots Parallel Efficiency (Based On 8 Nodes and 256 Cores).]
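Parallel efficiency relative to a baseline run, such as the 8-node baseline named in the chart, is commonly computed as (T_base × N_base) / (T_n × N_n). The sketch below assumes that this standard definition is the one behind the chart, which the slides do not state. The class name is hypothetical and the timings in main are made-up placeholders, not measurements from the slides.

```java
/** Relative parallel efficiency against a baseline run (standard definition, assumed here). */
class ParallelEfficiencySketch {

    // (baseTime * baseNodes) / (time * nodes): 1.0 means perfect scaling from the baseline.
    static double relativeEfficiency(double baseTime, int baseNodes, double time, int nodes) {
        return (baseTime * baseNodes) / (time * nodes);
    }

    public static void main(String[] args) {
        double baseTime = 800.0;   // hypothetical seconds on 8 nodes
        double time64 = 125.0;     // hypothetical seconds on 64 nodes
        System.out.println(relativeEfficiency(baseTime, 8, time64, 64));
    }
}
```

A value below 1.0 means the larger run spends part of its extra resources on overhead such as communication rather than on useful computation.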
[Figure: Execution Time (Seconds) vs. Problem Size (100000, 200000, 300000). Data labels: 368.39, 1643.08, 2877.76.]
Machine Learning on Big Data
Mahout on Hadoop
https://mahout.apache.org/
MLlib on Spark
http://spark.apache.org/mllib/
GraphLab Toolkits
http://graphlab.org/projects/toolkits.html
GraphLab Computer Vision Toolkit
Query on Big Data
Query with procedural language