
Parallel programming is bound to become the main concern of software developers in the coming decades. Parallel skeletons are based on recurring patterns of computation and communication. A library called Quaff implements them in C++ and makes parallel application development easier.


Scientific Programming

Editors: Konstantin Läufer, [email protected]

Konrad Hinsen, [email protected]

Parallel Programming with Skeletons
By Joel Falcou

Parallel programming is bound to become the main concern of software developers in the coming decades.
Various models aim to solve this tension, trading efficiency for abstraction or vice versa, but how about
getting both?

Back at the dawn of scientific computing, parallel machines were revered as titans that few could approach and even fewer could tame. Today, after decades of progress in both hardware and software development, parallel computing is a mainstream technique for getting things done. However, as this progress spawned increasingly powerful machines, it spelled doom for many developers.

As Herb Sutter [1] stated, the free lunch is over for sequential programming: designers must take concurrency and parallel programming into account at every level of software design to avoid the dreaded "this application doesn't scale" result. For nonspecialists, this means that writing efficient code for such machines or groups of machines will become less trivial, as it usually involves dealing with low-level APIs such as OpenMP, POSIX threads, and the message-passing interface (MPI). However, years of experience have shown that using such frameworks is difficult and error-prone; deadlocks and other undesired behaviors make parallel software development very slow compared to the classic, sequential approach. As an alternative, software design patterns offer an easy way to build reusable, structured sequential software components. Similarly, various attempts have emerged to provide a structured framework to design and implement nontrivial parallel applications and deliver high performance.

In this article, I present such a programming model, Parallel Algorithmic Skeletons, along with a library called Quaff that implements it in C++ and makes parallel application development easier.

Parallel Skeletons in a Nutshell

The concept of parallel skeletons [2, 3] is based on the observation that many applications express parallelism in the form of a few recurring patterns of computation and communication. An example of such a pattern is a pipeline, which is a model of parallel function composition. Another such pattern is a data-parallel structure, in which a master process slices data along some dimension and distributes them to a pool of slave processing units.

Essentially, a skeleton is a pattern that occurs in a significant number of parallel applications; some skeletons are general, whereas others are specific to an application domain. Although design patterns are rather informal, skeletons have been formalized to the extent that we can express them as concrete language constructs or templates. Recently, commercial users have started to show interest in these methods. Google's MapReduce model for distributed data mining is, in fact, nothing but a specific parallel skeleton.

Every community or application domain has its own specific skeletons. In computer vision (the example that I use in this article), parallel skeletons mostly involve slicing and distributing regular data. Parallel exploration of a tree-like structure or a randomized walk through some iteration space are common in operational research applications. The main advantage of this parameterization of parallelism is that all low-level, architecture, or framework-dependent code is hidden from the user, who only has to write sequential code fragments and instantiate skeletons. The skeleton approach thus provides a decent level of organization.

Another interesting feature of skeletons is their ability to be nested. If we look at a skeleton as a function that takes functions as arguments and produces parallel code, then any instantiated skeleton is eligible to be another skeleton's argument. Skeletons are thus higher-order functions in the sense of functional programming; that is, functions whose parameters are themselves functions. To be able to perform the transformation from a combination of skeletons to a working parallelized program, we need an application model capable of representing an arbitrarily complex program as a set of functions [4].

Copublished by the IEEE CS and the AIP. 1521-9615/09/$25.00 © 2009 IEEE. Computing in Science & Engineering, May/June 2009.
As an example, consider an image-processing application for detecting edges in a video stream. We would build this application from four functions:

• load, which retrieves an image from the video stream;
• thresh, which applies a binary threshold to an image;
• edge, which extracts edges as lists of lines; and
• save, which saves the result to a file on disk.

The sequential version of our application is defined as the sequential composition of these four functions:

    A_s = sequence [load, thresh, edge, save],

where sequence represents the sequential composition skeleton.

We can introduce the first level of parallelism by noticing that we can run the four functions in parallel if we apply them to different elements of the data stream. Thus, while load is loading the ith image, we can apply thresh to the (i − 1)th image. In general, the four functions can work on images i_t, i_{t-1}, i_{t-2}, and i_{t-3} at the same time. This parallelization scheme, a pipeline, is a classic yet useful form of parallelism that we can choose as a skeleton. A first parallel version of A is thus

    A_1 = pipeline [load, thresh, edge, save].

We can express another level of parallelism by noticing that we can run the thresh function on different image slices in parallel. So, if we define two functions for slicing and merging images (slice and merge, respectively), we can express the application by using a new parallel construction, map, which slices an image, distributes the slices over a pool of slave processors, and merges the results of a function's parallel application (which is thresh in our example):

    A_2 = pipeline [load, map [slice, thresh, merge], edge, save].

We thus define the final parallel application in terms of skeleton nesting (map being nested inside a pipeline) and the list of sequential functions. As far as the developer is concerned, the parallelization is done because the skeleton implementation will handle all the low-level communication and marshaling details. Building parallel applications is simplified because the developer only needs to know the skeletons' operational semantics. Another advantage is that developers can reuse existing sequential functions directly because they don't need to know about the parallelization process.

The Quaff Library

Quaff is a skeleton-based parallel programming library whose main task is to rely on C++ template metaprogramming to reduce the overhead traditionally associated with object-oriented implementations of such libraries. The basic idea is to use the C++ template mechanism so that skeleton-based programs expand at compile time and generate new C++ MPI code to be compiled and executed at runtime. This code generation totally removes the overhead associated with runtime polymorphism and function forwarding. To do this, developers transform the skeleton structure extracted from the application definition into a list of executable instructions for a given parallel architecture, as Figure 1 shows.

Figure 1. Quaff code generation process. The compile-time system analyzes C++ code using skeleton constructors to build a process network. This network is then used to generate the message-passing interface (MPI) code for compilation. (Diagram stages: skeleton tree generation, process network production, C+MPI code generation.)

Other research describes this process and its associated specific language [5]. Whereas languages such as MetaOCaml or Template Haskell natively support such constructions, C++ requires the use (and abuse) of template metaprogramming and operator overloading.

Let's examine the Quaff interface and which skeletons it supports.

The Quaff Programming Model

Figure 2 presents a simple algorithm parallelized with Quaff. In this example, we want to apply the function comp to a vector in a parallel way. The actual code is split into four parts, starting with the definition of the user functions. The only limitation is the argument ordering (input first, output last), which is a requirement to enable Quaff to determine how data should be transferred between processes. The library provides communication support for all standard C and C++ types and some standard template library containers, such as vector or list, thus limiting the marshaling code we must write to support custom types. Next, the user initializes the parallel execution environment via the initialize function. From this point on, we can evaluate and run skeleton expressions on the underlying MPI-enabled parallel machine. The application itself is the combination of skeleton constructors passed to run.


    #include <quaff/quaff.hpp>
    #include <vector>

    using namespace quaff;

    // User functions: input argument first, output argument last.
    void load(std::vector<float>& d);
    void comp(std::vector<float> const& d, std::vector<float>& r);
    void save(std::vector<float> const& r);

    int main(int argc, const char* argv[])
    {
        initialize(argc, argv);        // start the parallel environment

        run( ( seq(load)               // load the data
             , map<16>(seq(comp))      // apply comp over 16 processors
             , seq(save)               // save the gathered results
             ) );
        finalize();                    // shut the environment down
    }

Figure 2. Quaff sample code. In this example, we want to apply the function comp to a vector in a parallel way.

In this example, we first load data from a file, distribute this data over processors using the map skeleton, perform the computation, and gather the results, which are then saved back on disk. This sample code shows the explicit call to the MAP and SEQ skeleton constructors and the use of the comma operator as the CHAIN skeleton constructor. Note that we can parameterize skeleton constructors via additional information; for instance, map takes an additional template parameter that describes over how many processors the data will be distributed. Finally, the finalize function shuts down the parallel execution environment. Again, note that the process of structuring the communication and generating MPI primitive calls is mostly done at compile time thanks to metaprogramming, leading to a very small overhead (a few percent) compared to a manual implementation of the same application.

Supported Skeletons

Because we usually define skeletons by generalizing the parallelization patterns that arise from the implementation of specific application classes, no single standard list of skeletons exists. Quaff supports a small subset of skeletons that the community agrees on so far:

• The SEQ skeleton encapsulates user-defined sequential functions to use them as parameters of other skeletons.
• The CHAIN skeleton calls other skeletons in sequence.
• The PARDO skeleton supports ad hoc parallelism; it simply spawns parallel processes with no defined communication scheme.
• The PIPELINE skeleton is functionally equivalent to parallel function composition.
• The FARM skeleton models irregular, asynchronous data parallelism in which a master process dynamically distributes inputs to a pool of slave processors using some parameterizable heuristic.
• The MAP skeleton models regular data parallelism that divides the input data into subsets on which a given function is applied. The MAP skeleton then merges back the subset results to produce the total output.

All these skeletons are directly usable in Quaff via the corresponding function. Some of them are also mapped to operators to make the description of parallel applications easier; for instance, the CHAIN, PIPE, and PARDO skeletons are respectively mapped to the comma operator, the bitwise or, and the bitwise and operators.

Application to Computer Vision

To demonstrate Quaff's expressiveness and efficiency, we can parallelize a realistic application from the domain of computer vision. Computer vision features various complex, time-consuming algorithms and is a field of choice for parallelization. The application I've chosen performs object detection and tracking in a stereoscopic video stream using a probabilistic algorithm. In this approach, detecting and tracking is done by computing the a posteriori probability density of a state vector representing the tracked object from a series of observations following a standard Bayesian inference procedure. To solve such a problem, we can either analytically solve the Chapman-Kolmogorov prediction equation or use a Monte Carlo method. The particle filter algorithm models such a probabilistic procedure with a Markov process by estimating the probability density by a weighted, discrete set of observations inside the observation universe [6].

In our case, we want to track a pedestrian in a 3D world. To do so, we try to estimate the probability density of a set of particles containing the pedestrian's 3D position and 3D velocity vector. Those particles also have an evolution model that represents how we can model a pedestrian particle between frames. No pedestrian can run faster than the speed of sound, thus capping the velocity vector's norm, and no pedestrian can teleport between frames, thus providing a continuity equation on position.

Our implementation of the particle filter algorithm uses four functions:

• generate builds the particle distribution from the last iteration results,
• measure extracts features from the video stream to evaluate each particle's interest score by using an image descriptor,
• sample resamples the particle set by replicating particles with large weight and trimming particles with small weight, and
• estimate computes the particle set's average to get the current frame estimation.

This algorithm's parallelization is based on distributing the particle set over the available processors and then computing the estimated position of the object on the root node, which also runs the application's GUI parts.

Figure 3 presents the Quaff listing for this application, and Figure 4 shows a sample execution of our 3D tracking application. The upper part of the figure shows the two video streams and the projection of the particle distribution, whereas the lower part shows the estimated pedestrian's 3D path projected on the ground plane.

    #include <quaff/quaff.hpp>
    #include <vector>
    using namespace quaff;

    #define NPROC 16
    // particles and image are application types not shown in the listing.
    typedef std::vector<particles> data_t;

    void gui();
    void generate(data_t&);
    void measure(image const&, data_t const&, data_t&);
    void sample(data_t const&, data_t&);
    particles estimate(data_t const&);
    void update_gui(particles const&);

    int main(int argc, const char* argv[])
    {
        initialize(argc, argv);

        run( seq(gui)                             // GUI handling
           & ( map<NPROC>( ( seq(generate)        // per-slice particle work
                           , seq(measure)
                           , seq(sample)
                           )
                         )
             , seq(estimate)                      // estimation on the merged set
             , seq(update_gui)
             )
           );
        finalize();
    }

Figure 3. Quaff code for the particle filter application. After starting up the parallel environment, the application is split into two parts: the GUI handling and the actual computation code that follows the algorithm.
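The sample and estimate steps described above can be sketched in a few lines of sequential C++. This is an illustration under stated assumptions, not the code behind Figure 3: the Particle struct holds a single 1D position instead of a 3D position and velocity, the function signatures differ from the declarations in the listing, and systematic resampling is one common choice among several (the article does not say which scheme the application uses). The offset argument replaces the random draw a real filter would make so that the behavior is reproducible.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle {        // hypothetical 1D state, for illustration only
    double position;
    double weight;
};

// estimate: weighted average of the particle set.
double estimate(const std::vector<Particle>& ps) {
    double sum = 0.0, total = 0.0;
    for (const Particle& p : ps) {
        sum += p.weight * p.position;
        total += p.weight;
    }
    assert(total > 0.0);
    return sum / total;
}

// sample: systematic resampling. Particles with large weight are replicated,
// particles with small weight tend to disappear. offset must lie in [0, 1).
std::vector<Particle> sample(const std::vector<Particle>& ps, double offset) {
    assert(!ps.empty() && offset >= 0.0 && offset < 1.0);
    double total = 0.0;
    for (const Particle& p : ps) total += p.weight;
    assert(total > 0.0);

    const std::size_t n = ps.size();
    const double step = total / static_cast<double>(n);
    std::vector<Particle> out;
    out.reserve(n);

    double cumulative = ps[0].weight;   // running sum of weights up to particle i
    std::size_t i = 0;
    for (std::size_t k = 0; k < n; ++k) {
        const double target = (static_cast<double>(k) + offset) * step;
        while (cumulative < target && i + 1 < n) cumulative += ps[++i].weight;
        out.push_back({ps[i].position, 1.0});   // survivors restart with equal weight
    }
    return out;
}
```

For three particles at positions 0, 1, and 2 with weights 0.1, 0.7, and 0.2 and an offset of 0.5, the resampled set holds positions 1, 1, and 2: the heavy particle is duplicated and the lightest one is dropped. In the parallel version described in the article, this kind of work is what each processor would perform on its slice of the particle set.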

Figure 4. Sample tracking session. The top of the screen shows the right and left image provided by the cameras; a subset of the particle set and the estimated position are projected (the yellow and red boxes). The lower part shows a bird's-eye view of the reconstructed pedestrian path. (Panel labels: left image, right image, particles distribution projection, estimated 3D path.)

Other Parallel Skeleton Libraries

Several implementations of the parallel skeleton paradigm are available for various programming languages. They show that the skeleton paradigm is indeed easy to integrate into mainstream languages.

The Eskel Library by Murray Cole represents a concrete attempt to embed the skeleton-based parallel programming method into the mainstream of parallel programming. It offers various skeletal parallel programming constructs that resemble MPI primitives and are directly usable in C code. However, Eskel's low-level API requires the programmer to take care of internal implementation details. Eskel and a lot of seminal information about skeletons are available at http://homepages.inf.ed.ac.uk/abenoit1/eSkel/.

Herbert Kuchen based MUESLI, the C++ Münster Skeleton Library, on a platform-independent structure. The main idea is to generate a process topology from the construction of various skeleton classes and use a distributed container to handle data transmission. This polymorphic C++ skeleton library offers a high level of abstraction but stays close to a language that's familiar to many developers. Moreover, the C++ binding for higher-order functions and polymorphic calls ensures that the library is type safe. The main problem is that the overhead due to dynamic polymorphism is rather high (between 20 and 110 percent for rather simple applications). A prototype is available at www.wi.uni-muenster.de/PI/forschung/Skeletons/index.php.

Finally, Lithium is a Java skeleton library for clusters and grid-like networks of machines whose underlying execution model is a macro data-flow one. The choice of Java is motivated by the fact that it provides an easy-to-use, platform-independent, object-oriented language. Lithium also spawned ASSIST, a skeleton-based framework for grid computing. Lithium is available at www.di.unipi.it/marcod/Lithium/.

Parallel programming is bound to become an everyday problem for a large part of the scientific computing community. These developers will have to struggle with increasingly complex architectures and will need tools to overcome these difficulties. Parallel skeletons could offer a solution to several problems. Because it's an easy-to-use and efficient way of building parallel applications, this model can solve many of the common issues associated with parallel programming, such as handling communications and synchronization. Parallel skeletons won't be the universal solution for everyone, but when they're applicable, they provide a convenient way to describe computational problems and solve them efficiently.

Table 1. Timing results for various particle sets.

    Processors  Measure      Np = 1,000   Np = 2,000    Np = 5,000     Np = 10,000
    P = 1       Time         672.30 ms    1,986.19 ms   11,783.58 ms   21,142.71 ms
    P = 2       Time         409.94 ms    1,115.84 ms   6,074.01 ms    10,898.30 ms
                Speedup      1.64         1.78          1.87           1.94
                Efficiency   82%          89%           93.5%          97%
    P = 4       Time         263.65 ms    711.90 ms     3,527.45 ms    6,110.61 ms
                Speedup      2.55         2.79          3.22           3.46
                Efficiency   63.7%        79.5%         80.5%          96.5%
    P = 6       Time         187.79 ms    467.34 ms     1,975.37 ms    3,645.29 ms
                Speedup      3.58         4.25          5.75           5.80
                Efficiency   59.7%        70.8%         95.8%          96.7%
    P = 12      Time         162.00 ms    291.23 ms     1,046.86 ms    1,828.95 ms
                Speedup      4.15         6.82          10.85          11.56
                Efficiency   34.6%        56.8%         90.4%          96.3%
    P = 28      Time         108.61 ms    170.05 ms     512.79 ms      786.27 ms
                Speedup      6.19         11.68         22.15          26.89
                Efficiency   22.1%        41.7%         79.1%          96.1%
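The speedup and efficiency rows of Table 1 appear to follow the usual definitions, which the article does not spell out: speedup is the single-processor time divided by the time on P processors, and efficiency is speedup divided by P. The short sketch below assumes those definitions; the function names are illustrative.

```cpp
#include <cassert>

// Speedup S = T1 / TP: how many times faster the run on P processors is
// than the run on one processor.
double speedup(double t1_ms, double tp_ms) {
    assert(t1_ms > 0.0 && tp_ms > 0.0);
    return t1_ms / tp_ms;
}

// Efficiency E = S / P: the fraction of ideal linear speedup actually achieved.
double efficiency(double t1_ms, double tp_ms, int processors) {
    assert(processors > 0);
    return speedup(t1_ms, tp_ms) / processors;
}
```

Taking the Np = 1,000 column, speedup(672.30, 409.94) is about 1.64 and efficiency(672.30, 409.94, 2) is about 0.82, which matches the P = 2 entries. The same column at P = 28 gives an efficiency near 0.22, while the Np = 10,000 column stays close to 0.96: larger particle sets amortize the communication cost better.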


The Quaff library shows that users can operate such a model in a mainstream language with a high level of abstraction and without efficiency loss. Currently, Quaff targets clusters, multicore machines, and the Cell processor, with only small differences in the interface. The goal is to develop skeleton-based applications on heterogeneous platforms with a single source code, for example on architectures such as the IBM RoadRunner cluster of Cell-enhanced multicore cluster nodes.

References

1. H. Sutter and J. Larus, "The Free Lunch Is Over: A Fundamental Turn toward Concurrency in Software," Dr. Dobb's J., vol. 30, no. 3, 2005; www.ddj.com/web-development/184405990.
2. M. Cole, Algorithmic Skeletons: Structured Management of Parallel Computation, MIT Press, 1989.
3. D.B. Skillicorn, "Architecture-Independent Parallel Computation," Computer, vol. 23, no. 12, 1990, pp. 38–50.
4. M. Aldinucci and M. Danelutto, "Stream Parallel Skeleton Optimization," Proc. 11th Int'l Conf. Parallel and Distributed Computing and Systems, ACTA Press, 1999, pp. 955–962.
5. J. Sérot and J. Falcou, "Functional Meta-Programming for Parallel Skeletons," Proc. Int'l Conf. Computational Science, Springer-Verlag, 2008, pp. 154–163.
6. J. Falcou et al., "Real Time Parallel Implementation of a Particle Filter Based Visual Tracking," Workshop on Computation Intensive Methods for Computer Vision, 2006, CD-ROM.

Joel Falcou is an assistant professor at the University Paris-Sud and researcher at the Laboratoire de Recherche d'Informatique in Orsay, France. His work focuses on investigating high-level programming models for parallel architectures (present and future) and providing efficient implementation of such models using high-performance language features. Contact him at [email protected].

William E. Clark
No ratings yet
C# Algorithms for New Programmers: A Practical Guide with Examples
From Everand
C# Algorithms for New Programmers: A Practical Guide with Examples
William E. Clark
No ratings yet
Concurrency in C++: Writing High-Performance Multithreaded Code
From Everand
Concurrency in C++: Writing High-Performance Multithreaded Code
Robert Johnson
No ratings yet
Cilk Programming and Algorithms: Definitive Reference for Developers and Engineers
From Everand
Cilk Programming and Algorithms: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
C# OOP Step by Step: A Practical Guide with Examples
From Everand
C# OOP Step by Step: A Practical Guide with Examples
William E. Clark
No ratings yet
OpenCL Programming and Architecture: Definitive Reference for Developers and Engineers
From Everand
OpenCL Programming and Architecture: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
OpenACC Programming Essentials: Definitive Reference for Developers and Engineers
From Everand
OpenACC Programming Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Cocoa Development Essentials: Definitive Reference for Developers and Engineers
From Everand
Cocoa Development Essentials: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Essential Apache Beam: Definitive Reference for Developers and Engineers
From Everand
Essential Apache Beam: Definitive Reference for Developers and Engineers
Richard Johnson
No ratings yet
Mastering Kubernetes
From Everand
Mastering Kubernetes
Manish Soni
No ratings yet
Kafka Up and Running for Network DevOps: Set Your Network Data in Motion
From Everand
Kafka Up and Running for Network DevOps: Set Your Network Data in Motion
Eric Chou
No ratings yet
Edge Cloud Operations: A Systems Approach
From Everand
Edge Cloud Operations: A Systems Approach
Larry L Peterson
No ratings yet
C# for Beginners: Learn in 24 Hours
From Everand
C# for Beginners: Learn in 24 Hours
Alex Nordeen
No ratings yet



Computing in Science & Engineering


Table 1. Timing results for various particle sets.

Speedup 2.55 2.79 3.22 3.46


3. D.B. Skillicorn, “Architecture-Independent

Joel Falcou is an assistant professor at the
