
BG External Presentation January 2002

The Blue Gene project aims to build a petaflop supercomputer to simulate protein folding through a $100M, 5-year effort. Its goals are to advance biomolecular simulation and computer design for extremely large-scale systems. The Blue Gene/L architecture is a massively parallel, low-power supercomputer that scales to 180 TF. It uses a 3D torus network of compute chips with embedded DRAM and supports scientific and emerging commercial workloads through its software stack. The follow-on Blue Gene petaflop system (Blue Gene/P) aims to be the first petaflop supercomputer, furthering protein science and computer architecture research.


Blue Gene

Project Update
January 2002
The Blue Gene Project

In December 1999, IBM Research announced a 5-year, $100M (US) effort to build a petaflop-scale supercomputer to attack problems such as protein folding.
The Blue Gene project has two primary goals:
Advance the state of the art of biomolecular simulation.
Advance the state of the art in computer design and
software for extremely large scale systems.
In November 2001, a partnership with Lawrence
Livermore National Laboratory was announced.
Cellular System Architecture
Problem: Changing Workload Paradigms
Data-Intensive Workloads drive emerging commercial opportunities.
Power consumption, cooling, and size are increasingly important to users.

Drivers: Data-Intensive Computing & Technology


Growth in traditional unstructured data (text, images, graphs, pictures, ...)
Enormous increase in complex data types (video, audio, 3D images, ...)
Increased dependency on optimized data management, analysis, search and
distribution.

Research Opportunity: A new way to build servers


Increase total system performance per cubic ft., per watt & per dollar by two
orders of magnitude.
Minimize development time and expense via common/reusable cores.
Distribute workloads by placing compute power in memory and IO
subsystems.
Focus on Key Issues: Power Consumption, System Management, Compute Density,
Autonomic Operation, Scalability, RAS.
Cellular System Projects
Super Dense Server (SDS)
Uses industry standard components
Linux Web Hosting Environments
Cluster-in-a-box using 10s-100s of low power processors
Workload specific functionality per node with shared storage system

Blue Gene / L System


Target scientific and emerging commercial apps.
Collaboration with LLNL for a ~200 TF system for S&TC workloads.
Extended distributed programming model.
Scalable from small to large systems (2048 processors/rack)
Leverages high speed interconnect and system-on-a-chip technologies.

Blue Gene / Petaflop Supercomputer


Science and research project
Explore the limits of protein science, computer science & computer arch.
Applications - molecular dynamics, complex simulations
Supporting research: Distributed systems management, self-healing, problem
determination, fast initialization, visualization, system monitoring, packaging, etc.
Supercomputer Peak Performance

[Chart: Peak speed in flops (log scale, 1E+2 to 1E+16) vs. year introduced (1940-2010); doubling time = 1.5 yr. The trend runs from ENIAC and UNIVAC (vacuum tubes), through IBM 701, 704, 7090 and Stretch (transistors), CDC 6600 and 7600 (ICs), CDC STAR-100 and CRAY-1 (vectors), Cyber 205, X-MP2, X-MP4 and Y-MP8 (parallel vectors), CRAY-2, i860, Delta, Paragon, CM-5, NWT and CP-PACS (MPPs), to ASCI Red, Blue Pacific and ASCI White, with Blue Gene/L and Blue Gene/P near 1E+15-1E+16 flops.]
Supercomputing Landscape
[Chart: performance vs. architecture, from specific-purpose to general-purpose machines. QCDSP (0.6 TFLOP) and QCDOC (CMOS7SF, 20 TFLOP) sit at the specific end; ASCI-Blue (PPC 604, 3.3 TFLOP), ASCI White (PPC 630, 10 TFLOP) and ASCI-Q (30 TFLOP) at the general end; BG/L (CU-11, 180 TFLOP), BG/C (CU-11, 1000 TFLOP) and BG/P (CU-08, 1000 TFLOP) sit at the top of the performance axis.]

Blue Gene Project components

Two cellular computing architectures


Blue Gene/L
Blue Gene/C (formerly Cyclops)
Software stack
Kernels, host, middleware, simulators, OS
Self healing, autonomic computing
Application program
Molecular dynamics application software
Partnerships, external advisory board
Blue Gene/L

Massively parallel architecture applicable to a wide class of problems.

Third generation of low-power, high-performance supercomputer


QCDSP (600GF based on Texas Instruments DSP C31)
Gordon Bell Prize for Most Cost Effective Supercomputer in '98
QCDOC (20TF based on IBM System-on-a-Chip)
Collaboration between Columbia University and IBM Research
Blue Gene/L (180/360 TF)
Processor architecture included in the optimization

Outstanding price/performance


Partnership between IBM, ASCI-Trilab and Universities
Initial focus on numerically intensive scientific problems
Blue Gene/L System Design

Packaging hierarchy (CU-11 technology):
Chip (2 processors: two PowerPC 440 cores, one serving I/O; 4 MB EDRAM): 2.8/5.6 GF/s
Board (8 chips, 2x2x2): 22.4/44.8 GF/s, 2.08 GB
Rack (128 boards, 8x8x16): 2.9/5.7 TF/s, 266 GB
System (64 cabinets, 32x32x64): 180/360 TF/s, 16 TB
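
The peak and memory totals above follow from straightforward multiplication up the packaging hierarchy. The short C sketch below is not from the deck; it simply re-derives the board, rack and system figures from the quoted per-chip 2.8/5.6 GF/s and per-board 2.08 GB.

#include <stdio.h>

/* Back-of-the-envelope check of the Blue Gene/L packaging hierarchy,
 * using the per-chip and per-board figures quoted on the slide. */
int main(void)
{
    const double chip_gflops_lo = 2.8, chip_gflops_hi = 5.6; /* per chip     */
    const double board_mem_gb   = 2.08;                      /* per board    */
    const int chips_per_board   = 8;                         /* 2x2x2        */
    const int boards_per_rack   = 128;                       /* 8x8x16 chips */
    const int racks_per_system  = 64;                        /* 32x32x64     */

    double board_gf  = chip_gflops_lo * chips_per_board;     /* ~22.4 GF/s   */
    double rack_tf   = board_gf * boards_per_rack / 1e3;     /* ~2.9 TF/s    */
    double system_tf = rack_tf * racks_per_system;           /* ~183 TF/s    */
    double system_tb = board_mem_gb * boards_per_rack
                       * racks_per_system / 1024.0;          /* ~16.6 TB     */

    printf("board : %6.1f GF/s\n", board_gf);
    printf("rack  : %6.2f TF/s (%6.1f GB)\n",
           rack_tf, board_mem_gb * boards_per_rack);
    printf("system: %6.1f TF/s (%5.1f TB); x2 in the higher mode = %6.1f TF/s\n",
           system_tf, system_tb,
           chip_gflops_hi * chips_per_board * boards_per_rack
           * racks_per_system / 1e3);
    return 0;
}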
Blue Gene/L - The Networks
65536 nodes interconnected with three integrated networks

3 Dimensional Torus
Virtual cut-through hardware routing to maximize efficiency
2.8 Gb/s on all 12 node links (total of 4.2 GB/s per node)
Communication backbone
134 TB/s total torus interconnect bandwidth

Global Tree
One-to-all or all-to-all broadcast functionality
Arithmetic operations implemented in the tree
~1.4 GB/s of bandwidth from any node to all other nodes
Latency of tree traversal less than 1 μs

Ethernet
Incorporated into every node ASIC
Disk I/O
Host control, booting and diagnostics
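
The quoted torus figures are consistent with simple link counting. A minimal C check (assuming the 12 links per node are the 6 torus neighbours counted once per direction, and that the quoted 134 TB/s comes from converting GB/s to TB/s by dividing by 1024):

#include <stdio.h>

/* Sanity check of the quoted torus bandwidth figures.
 * Assumption: each node has 6 neighbours in the 3D torus, i.e. 12
 * unidirectional links of 2.8 Gb/s each; the aggregate counts each
 * physical link once per direction (6 outgoing links per node). */
int main(void)
{
    const int    nodes          = 65536;  /* 32 x 32 x 64                */
    const double link_gbps      = 2.8;    /* per unidirectional link     */
    const int    links_per_node = 12;     /* 6 neighbours x 2 directions */

    double per_node_GBps  = links_per_node * link_gbps / 8.0;        /* 4.2 GB/s */
    double aggregate_GBps = nodes * (links_per_node / 2) * link_gbps / 8.0;

    printf("per node : %4.1f GB/s\n", per_node_GBps);
    printf("aggregate: %.1f GB/s (= %.1f TB/s dividing by 1024;\n"
           "           the slide quotes 134 TB/s)\n",
           aggregate_GBps, aggregate_GBps / 1024.0);
    return 0;
}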
Blue Gene/L System Software

Software environment includes:


High performance Scientific Kernel
MPI-2 subset, defined by users (see the sketch below)
Math libraries subset, defined by users
Compiler support for DFPU (C, C++, Fortran)
Parallel file system
System management
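
As an illustration only (not code from the project), the programming model this stack targets is plain C/Fortran with a subset of MPI; a minimal sketch using only standard MPI calls:

#include <mpi.h>
#include <stdio.h>

/* Minimal sketch of the programming model the BG/L stack targets:
 * plain C with a subset of MPI. Not Blue Gene-specific code; it uses
 * only standard MPI calls (MPI_Init, MPI_Comm_rank/size, MPI_Reduce). */
int main(int argc, char **argv)
{
    int rank, size;
    double local, global = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank contributes a partial result (here just its rank). */
    local = (double)rank;

    /* Combine partial results; on BG/L-class hardware such reductions are
     * the kind of arithmetic operation the global tree network supports. */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of ranks over %d processes = %g\n", size, global);

    MPI_Finalize();
    return 0;
}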

External Collaborations: Boston University, Caltech, Columbia University,
Oak Ridge National Labs, San Diego Supercomputing Center, Universidad
Politecnica de Valencia, University of Edinburgh, University of Maryland,
Texas A&M, Tech. Univ. of Vienna, ...
BG/L - Operating Environment

[Diagram: the 64K compute nodes each run an application over MPI, C/C++ and F95 runtimes, math libraries, and kernel services. They connect through a 1 Tb/s network to ~1K I/O nodes, which provide file I/O, checkpointing, graphics, and kernel services. The I/O nodes attach to RAID servers and to the host (system console, I/O & RAS network) over 1 Gb/s Ethernet.]
Blue Gene Science

Explore simulation methodologies, analysis tools, and biological systems.
Thermodynamic and kinetic studies of peptide systems
The free energy landscape for β-hairpin folding in explicit
water, R. Zhou et al. (to appear in PNAS)
Solvation free energy/partition coefficient calculations for
peptides in solution using a variety of force fields for
comparison with experiment
Explore scientific and technical computing applications
in other domains such as climate, materials science, ...
Drivers for Application

Aggressive machine architecture requires:


small memory footprint
fine-grained concurrency in application decomposition
exploration of reusable frameworks for application
development
Biological science program requires:
multiple force field support
architectural support for algorithmic and methodological
investigations
framework to support migration of analysis modules from
externally hosted application to the parallel core
BG/L Application to ASCI algorithms

Algorithm                                                   Scientific   Mapping to BG/L
                                                            Importance   Architecture
Ab Initio Molecular Dynamics in biology and
  materials science (JEEP)                                  A            A
Three dimensional dislocation dynamics
  (MICRO3D and PARANOID)                                    A            A
Molecular Dynamics (MDCASK code)                            A            A
Kinetic Monte Carlo (BIGMAC)                                A            C
Computational Gene Discovery (MPGSS)                        A            B
Turbulence: Rayleigh-Taylor instability (MIRANDA)           A            A
Shock Turbulence (AMRH)                                     A            A
Turbulence and Instability Modeling (sPPM benchmark code)   B            A
Hydrodynamic Instability (ALE hydro)                        A            A
Blue Matter - a Molecular Dynamics Code
Separate MD program into three subpackages (offload
function to host where possible):
MD core engine (massively parallel, minimal in size)
Setup programs to set up force field assignments, etc.
Analysis Tools to analyze MD trajectories, etc
Multiple Force Field Support
CHARMM force field (done)
OPLS-AA force field (done)
AMBER force field (done)
Polarizable Force Field (desired)
Potential Parallelization Strategies (an atom-based decomposition is sketched after this list)
Interaction-based
Volume-based
Atom-based
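
To make the last option concrete, here is a hypothetical C sketch (not Blue Matter code) of an atom-based decomposition: each of P workers owns a contiguous block of atoms and evaluates the forces on its own atoms against all others, so per-rank work is roughly N^2/P pair interactions.

#include <stdio.h>

#define N_ATOMS  4000   /* e.g. a solvated beta-hairpin system */
#define N_RANKS  512    /* workers in the partition            */

/* Hypothetical illustration of an atom-based decomposition: rank r is
 * responsible for the forces on atoms [start, end). Each rank still
 * loops over all N_ATOMS partners; interaction- or volume-based schemes
 * redistribute the pair loop or the simulation box instead. */
static void atom_range(int rank, int *start, int *end)
{
    int base = N_ATOMS / N_RANKS, extra = N_ATOMS % N_RANKS;
    *start = rank * base + (rank < extra ? rank : extra);
    *end   = *start + base + (rank < extra ? 1 : 0);
}

int main(void)
{
    long total_pairs = 0;
    for (int rank = 0; rank < N_RANKS; rank++) {
        int start, end;
        atom_range(rank, &start, &end);
        long pairs = (long)(end - start) * (N_ATOMS - 1); /* owned x all others */
        total_pairs += pairs;
        if (rank < 2 || rank == N_RANKS - 1)
            printf("rank %3d owns atoms [%4d,%4d): %ld pair evaluations\n",
                   rank, start, end, pairs);
    }
    printf("total pair evaluations per step: %ld\n", total_pairs);
    return 0;
}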
Time Scales for Protein Folding Phenomena

Phenomenon              System/size w/ solvent              Time scale   Time step count
beta hairpin kinetics   β-hairpin / 4,000 atoms             5 μs         10^9
peptide thermo.         α-helix, β-hairpin / 4,000 atoms    0.1-1 μs     10^8
protein thermo.         60-100 res. / 20-30,000 atoms       1-10 μs      10^9
protein kinetics        60-100 res. / 20-30,000 atoms       500 μs       10^11
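
The step counts in the table are just the physical time scale divided by the MD integration time step. A quick C check, assuming a ~5 fs step (a typical all-atom value; the slide does not state the step size):

#include <stdio.h>

/* Check of the table's step counts: steps = time scale / time step.
 * The 5 fs step is an assumption; the slide gives only the time
 * scales and the resulting step counts. */
int main(void)
{
    const double fs = 1e-15, us = 1e-6;
    const double dt = 5.0 * fs;

    struct { const char *name; double span; } rows[] = {
        { "beta-hairpin kinetics (5 us)",    5.0 * us },
        { "peptide thermodynamics (1 us)",   1.0 * us },
        { "protein thermodynamics (10 us)", 10.0 * us },
        { "protein kinetics (500 us)",     500.0 * us },
    };

    for (int i = 0; i < 4; i++)
        printf("%-32s ~%.0e steps\n", rows[i].name, rows[i].span / dt);
    return 0;
}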
Simulation Capacity

[Chart: simulation capacity in time steps/month (log scale, 1E+6 to 1E+13) vs. system size in atoms (1,000 to 100,000), for four configurations: 1 rack of Power3 ('01), a 512-node BG/L partition (2H03), a 40 x 512-node BG/L partition (4Q04), and a 1,000,000 GFLOP/s machine (2H06).]
Data Volumes (assuming every time step is written out)

[Chart: data volume in bytes/month (log scale, 1E+12 to 1E+18) vs. system size in atoms (1E+3 to 1E+5), for the same four configurations: 1 rack of Power3, a 512-node BG/L partition, a 40 x 512-node BG/L partition, and a 1,000,000 GFLOP/s machine.]
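
To first order these curves are (time steps/month) x (atoms) x (bytes stored per atom per frame). A rough C sketch, assuming 24 bytes/atom per frame (three double-precision coordinates; the actual record size is not given on the slide) and purely illustrative step rates:

#include <stdio.h>

/* Rough model behind the data-volume chart: if every time step is
 * written out, bytes/month = steps/month x atoms x bytes_per_atom.
 * The 24 bytes/atom figure and the step rates below are assumptions
 * for illustration, not values from the slide. */
int main(void)
{
    const double bytes_per_atom    = 3 * 8;                  /* x, y, z doubles */
    const double steps_per_month[] = { 1e8, 1e10, 1e12 };    /* illustrative    */
    const double atoms[]           = { 1e3, 1e4, 1e5 };

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            printf("%8.0e steps/month, %6.0e atoms -> %8.1e bytes/month\n",
                   steps_per_month[i], atoms[j],
                   steps_per_month[i] * atoms[j] * bytes_per_atom);
    return 0;
}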
External Scientific Interactions

First Blue Gene Protein Science workshop held at San Diego
Supercomputer Center, March 2001
Second Blue Gene Protein Science workshop to be held
at the Maxwell Institute, U. of Edinburgh, in March 2002
Collaborations with ORNL, Columbia, UPenn, Maryland,
Stanford, ETH-Zurich, ...
Blue Gene seminar series has hosted over 25 speakers
at the T.J. Watson Research Center
Blue Gene Applications Advisory Board formed with 15
members from the external scientific and HPC
communities.
Blue Gene
Project Update
January 2002
