J. M. Orduña
Universidad de Valencia – Spain
Introduction to HPC technologies
Workshop on High Performance Computing for Next Generation Sequencing (HPC4NGS)
Valencia, Spain
May 21, 2012
Overview
- Introduction
- History
- Hardware
- Software
What is High Performance Computing?
- There is no clear definition. It could be:
Computing on high-performance computers
Using computational facilities substantially more powerful
than current desktop computers
- My opinion
Problems with very large computational and memory
requirements
that a «standard» workstation cannot handle
efficiently
Speed is the key
Similar Concepts
► Parallel computing
Computing on parallel computers
► Supercomputing
Computing on TOP500 machines
Examples
When Do We Need High Performance
Computing?
► Case 1
To perform time-consuming operations in less time or before a
tighter deadline.
►I am a bioinformatics engineer.
►I need to run DNA-matching programs.
►I’d rather have the result in 5 minutes than in 5 days, so that I
can say sooner whether the studied DNA sequence contains significant
mutations.
When Do We Need High Performance
Computing?
► Case 3
To perform a high number of operations per second
►I am an engineer at Amazon.com
►My web server gets 1,000 hits per second
►I’d like my web server and my databases to
handle 1,000 transactions per second so that
customers do not experience long delays
Amazon does “process” several GBytes of data per second
What Does High Performance
Computing Study?
► It includes the following subjects
Hardware
- Computer Architecture
- Network Connections
Software
- Programming paradigms
- Languages
- Middleware
History
► Initially, all computers were designed for HPC
Design special hardware
► Special processor
► Special memory
► Efficient architecture
Design special instruction sets
Difficult to use
► Machine by machine
► Not portable
Mostly for research or military use
Nowadays
► Specially designed hardware is not as good as using more
processors
Cheap processors
Cheap memory
Fast network connection
Working together (clusters)
► Shared memory and distributed memory machines have become
mainstream
►Manycore architectures: GPUs used for computing (GPGPU)
► High performance computing is almost synonymous with parallel
computing
Nowadays
► The hardware speed is limited
Processor clock rate
► Too much heat
► Power consumption
►Multicore processors
Parallel Computer Architectures
Shared address space
No load balancing issues
I/O is straightforward
All memory areas are equally accessible to all processors
Processor-to-processor data transfers are done using shared
areas in memory
Limited scalability
Industry standard (OpenMP)
Parallel Computer Architectures
Distributed address space
Processor-to-processor data transfers are done over the internal
interconnection network
Imposed memory locality
High scalability
Load balancing issues
I/O difficult
Detailed view
Massively Parallel Architecture: GPU
(Figure from the CUDA Programming Guide 4.0)
Hybrid
► A cross between shared-memory and distributed-memory computers
Other alternatives
With the growth of:
PC adoption
Internet technologies
High performance interconnection networks
more and more idle computing power is available
Grid computing and cloud computing may be valid
alternatives
Grid Computing
Cloud computing
End users access cloud-based applications through a web browser or a
lightweight desktop or mobile app, while the business software and
data are stored on servers at a remote location.
Security/privacy is a big issue
Top Ten Most Powerful Computers
(http://www.top500.org)
•Computer - Type indicated by manufacturer or vendor
•Rmax - Maximal LINPACK performance achieved
•Rpeak - Theoretical peak performance
(Rmax and Rpeak values are in Tflops)
The K Computer is the first supercomputer to achieve a performance
level of 10 Petaflop/s, or 10 quadrillion calculations per second.
Top 500 Computers: Architectures (http://www.top500.org)
Top 500 Computers: Operating systems (http://www.top500.org)
Top 500 Computers: Interconnects (http://www.top500.org)
Top 500 Computers: Vendors (http://www.top500.org)
Top 500 Computers (http://www.top500.org)
Parallel Programming
► All top 500 machines have more than one
processor
► High Performance Computing is almost synonymous with
Parallel Computing these days
► To use these machines, parallel programming is
a must
Decompose the computation into many pieces
Assign these pieces to different processors
Parallel Programming
► Two components of parallel programming
Computation
Communication
Programming Standard Interfaces
► OpenMP (http://openmp.org) for shared memory
architectures.
►Message Passing Interface (MPI) (http://www.mpi-forum.org/)
for distributed memory architectures
►GPGPU computing
- CUDA (Compute Unified Device Architecture) for GPU
programming on NVIDIA cards.
(http://www.nvidia.com/object/cuda_home_new.html)
- OpenCL (http://www.khronos.org/opencl/), an open standard API for
developing parallel processing applications on a variety of CPUs, as
well as AMD/ATI and NVIDIA GPUs
OpenMP
► OpenMP (http://openmp.org)
- Based upon the existence of multiple threads in the shared memory programming
paradigm. A shared memory process consists of multiple threads.
- OpenMP is an explicit (not automatic) programming model, offering the programmer
full control over parallelization.
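As a minimal sketch of the explicit model described above, the loop below sums 1..n with one OpenMP directive. The function name parallel_sum is hypothetical, chosen only for this illustration. The reduction clause gives each thread a private copy of sum and combines the copies when the loop ends. Built without OpenMP support, the directive is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

/* Hypothetical helper for illustration: sum the integers 1..n.
 * The programmer states explicitly that the loop is parallel and
 * that sum is a reduction variable. */
int parallel_sum(int n) {
    assert(n >= 0);
    int sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++)
        sum += i;           /* each thread adds into its private copy */
    return sum;             /* private copies were combined at loop end */
}
```

Calling parallel_sum(10) returns 55. With GCC, OpenMP is typically enabled with the -fopenmp flag.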
Message Passing Interface (MPI)
►Data are moved from the address space of one process
to that of another process through cooperative operations
on each process.
►Also provides collective operations, remote-memory access
operations, dynamic process creation, and parallel I/O.
►Implementations
LAM http://www.lam-mpi.org/
MPICH http://www.mcs.anl.gov/mpi/mpich/
► Language bindings: C, C++, Fortran
CUDA programming model
► PURPOSE
►Development of application software that transparently scales
its parallelism to leverage the increasing number of processor
cores
► Maintain a low learning curve for programmers familiar with
standard programming languages such as C.
► Three key abstractions that are simply exposed to the
programmer as a minimal set of language extensions:
► Hierarchy of thread groups
►Hierarchy of shared memories
►Barrier synchronization
CUDA programming model
► Partition the problem into coarse sub-problems that can be solved
independently in parallel by blocks of threads
► Partition each sub-problem (block) into finer pieces that can be
solved cooperatively in parallel by all threads within the block.
► Automatic scalability
(Figure from the CUDA Programming Guide 4.0)
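The two-level decomposition can be sketched in plain C as a serial emulation. This is not CUDA code: the loops stand in for blocks and threads that a GPU would run in parallel, and the names block_sum, grid_sum and THREADS_PER_BLOCK are hypothetical. It only shows how a global index is built from a block index and a thread index, and why the number of blocks can grow with the problem size.

```c
#include <assert.h>

#define THREADS_PER_BLOCK 4   /* hypothetical block size for this sketch */

/* One "block": its threads cooperate on one slice of the input.
 * Each emulated thread handles a single element. */
static int block_sum(const int *data, int n, int block_id) {
    int partial = 0;   /* stands in for a value kept in per-block shared memory */
    for (int thread_id = 0; thread_id < THREADS_PER_BLOCK; thread_id++) {
        int i = block_id * THREADS_PER_BLOCK + thread_id;   /* global index */
        if (i < n)     /* the last block may be only partly used */
            partial += data[i];
    }
    return partial;
}

/* The "grid": blocks do not depend on each other, so they could run
 * in any order, or simultaneously, on however many cores exist. */
int grid_sum(const int *data, int n) {
    assert(n >= 0);
    int num_blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    int total = 0;
    for (int block_id = 0; block_id < num_blocks; block_id++)
        total += block_sum(data, n, block_id);
    return total;
}
```

For an array holding 1 to 10, grid_sum uses three blocks (4 + 4 + 2 elements) and returns 55.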
A Simple Example
► Add the numbers 1 to 10
1+2+3+4+5+6+7+8+9+10
► Use a single CPU computer
for (i = 1; i <= 10; i++)
    sum = sum + i;
Suppose each addition takes 1 second
The expression above has 9 additions: 9 seconds
A Simple Example
► Use two processors
► Solution one
1+2+3+4+5 (processor 1)
6+7+8+9+10 (processor 2)
► Solution two
1+3+5+7+9 (processor 1)
2+4+6+8+10 (processor 2)
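The two solutions correspond to what are commonly called a block partition and a cyclic partition. A minimal sketch, assuming processes are numbered from 0 and using hypothetical function names, shows how each process can work out its own share from its id alone:

```c
#include <assert.h>

/* Solution one (block partition): process pid of np adds one
 * contiguous chunk of 1..n. pid counts from 0 in this sketch. */
int block_partial_sum(int pid, int np, int n) {
    assert(np > 0 && pid >= 0 && pid < np);
    int chunk = (n + np - 1) / np;       /* ceiling of n / np */
    int first = pid * chunk + 1;
    int last = first + chunk - 1;
    if (last > n)
        last = n;                        /* the last chunk may be shorter */
    int sum = 0;
    for (int i = first; i <= last; i++)
        sum += i;
    return sum;
}

/* Solution two (cyclic partition): process pid adds every np-th
 * number, starting at pid + 1. */
int cyclic_partial_sum(int pid, int np, int n) {
    assert(np > 0 && pid >= 0 && pid < np);
    int sum = 0;
    for (int i = pid + 1; i <= n; i += np)
        sum += i;
    return sum;
}
```

With np = 2 and n = 10, the block version yields 15 and 40, and the cyclic version yields 25 and 30. Both pairs add up to 55.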
A Simple Example
► Computation
Each processor computes
independently and simultaneously
Processor 1 computes 1+3+5+7+9
Processor 2 computes 2+4+6+8+10
for (i = pid; i <= 10; i = i+2)    (pid is 1 or 2)
    sum = sum + i;
► Communication
Processor 2 sends its value sum to processor 1
Processor 1 receives the value as sum2
► Computation
Processor 1 computes
sum = sum + sum2
► 5 operations + message passing + 1 sum (>= 6 seconds)
► Saves at most 3 seconds
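The timing argument can be written out as a short worked equation. Here t_comm, the message-passing time in seconds, is a symbol introduced for this sketch. The operation counts are the ones given on the slides (9 seconds on one processor; 5 operations plus 1 final sum on two).

```latex
\begin{align*}
T_{\text{serial}}   &= 9 \\
T_{\text{parallel}} &= 5 + t_{\text{comm}} + 1 = 6 + t_{\text{comm}} \\
T_{\text{serial}} - T_{\text{parallel}} &= 3 - t_{\text{comm}} \le 3 \\
S = \frac{T_{\text{serial}}}{T_{\text{parallel}}} &= \frac{9}{6 + t_{\text{comm}}} \le 1.5
\end{align*}
```

So two processors give a speedup of at most 1.5 in this example, and less as communication time grows.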
Code Sample
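#include <stdio.h>
#include <mpi.h>

/* Run with exactly 2 processes, e.g. mpirun -np 2 ./sum */
int main (int argc, char *argv[]) {
    int pid, i, sum = 0, sum2, tag = 1;

    MPI_Init (&argc, &argv);
    MPI_Comm_rank (MPI_COMM_WORLD, &pid);   /* pid is 0 or 1 */

    /* Rank 0 adds 1+3+5+7+9, rank 1 adds 2+4+6+8+10 */
    for (i = pid + 1; i <= 10; i = i + 2)
        sum = sum + i;

    if (pid == 1)   /* rank 1 sends its partial sum to rank 0 */
        MPI_Send (&sum, 1, MPI_INT, 0, tag, MPI_COMM_WORLD);
    if (pid == 0) { /* rank 0 receives it and prints the total */
        MPI_Recv (&sum2, 1, MPI_INT, 1, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        sum = sum + sum2;
        printf ("The sum value = %d\n", sum);
    }

    MPI_Finalize ();
    return 0;
}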