VSS-NumericalLibraries

High Performance Numerical Libraries

Sathish Vadhiyar
Gaussian Elimination - Review
Version 1
for i = 1 to n-1                       … for each column i, zero it out below the
                                         diagonal by adding multiples of row i to later rows
    for j = i+1 to n                   … for each row j below row i
        … add a multiple of row i to row j
        for k = i to n
            A(j, k) = A(j, k) - A(j, i)/A(i, i) * A(i, k)

[Figure: snapshot at step i - the first i-1 columns are already zero below the
diagonal; pivot A(i, i); rows j below i, columns k from i onward are updated]
Gaussian Elimination - Review
Version 2 - Remove A(j, i)/A(i, i) from the inner loop
for i = 1 to n-1                       … for each column i
    for j = i+1 to n                   … for each row j below row i
        m = A(j, i) / A(i, i)
        for k = i to n
            A(j, k) = A(j, k) - m * A(i, k)
Gaussian Elimination - Review
Version 3 - Don't compute what we already know
for i = 1 to n-1                       … for each column i
    for j = i+1 to n                   … for each row j below row i
        m = A(j, i) / A(i, i)
        for k = i+1 to n
            A(j, k) = A(j, k) - m * A(i, k)
Gaussian Elimination - Review
Version 4 - Store the multipliers m below the diagonal
for i = 1 to n-1                       … for each column i
    for j = i+1 to n                   … for each row j below row i
        A(j, i) = A(j, i) / A(i, i)
        for k = i+1 to n
            A(j, k) = A(j, k) - A(j, i) * A(i, k)
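Version 4 is exactly an in-place LU factorization: the multipliers stored below the diagonal form L, and what remains on and above the diagonal is U. A minimal NumPy sketch of the slide's loops (my illustration, not code from the slides; no pivoting, so every pivot must be nonzero):

```python
import numpy as np

def lu_in_place(A):
    """Version 4: overwrite A with the multipliers L (strictly below
    the diagonal) and U (on and above the diagonal). No pivoting."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for i in range(n - 1):
        # store the multipliers A(j, i) / A(i, i) below the diagonal
        A[i+1:, i] /= A[i, i]
        # update the trailing submatrix: A(j, k) -= A(j, i) * A(i, k)
        A[i+1:, i+1:] -= np.outer(A[i+1:, i], A[i, i+1:])
    return A
```

Splitting the result back into L (unit diagonal) and U and multiplying them reproduces the original matrix, which is a quick sanity check on the loop bounds.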
GE - Runtime
 Divisions
1 + 2 + 3 + … + (n-1) = n²/2 (approx.)

 Multiplications / subtractions
1² + 2² + 3² + … + (n-1)² = n³/3 - n²/2 (approx.)

 Total
2n³/3
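These counts follow directly from the Version 4 loop bounds: step i performs n-i divisions and (n-i)² updates, each one multiply plus one subtract. Counting the loop iterations explicitly (a check I added, not from the slides) confirms the n²/2 and 2n³/3 estimates:

```python
def ge_op_counts(n):
    """Count the arithmetic in the Version 4 loops explicitly."""
    divisions = 0
    mult_subs = 0  # each inner-loop update is one multiply + one subtract
    for i in range(1, n):                # i = 1 .. n-1
        for j in range(i + 1, n + 1):    # j = i+1 .. n
            divisions += 1
            for k in range(i + 1, n + 1):  # k = i+1 .. n
                mult_subs += 2
    return divisions, mult_subs
```

For n = 40 this gives exactly n(n-1)/2 = 780 divisions, and the multiply/subtract total lands within a few percent of 2n³/3.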
Parallel GE
 1st step - 1-D block partitioning: the n columns are divided into
contiguous blocks among the p processors
1D block partitioning - Steps
1. Divisions
n²/2

2. Broadcast of the multipliers at each step
(n-1)log p + (n-2)log p + … + log p < n² log p

3. Multiplications and subtractions
(n-1)·n/p + (n-2)·n/p + … + 1·1 = n³/p (approx.)

Runtime:
< n²/2 + n² log p + n³/p
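The three steps above can be sketched as a sequential simulation of the 1-D block-column algorithm (my illustration, assuming p divides n; the broadcast appears only as a comment, since a real implementation would use MPI):

```python
import numpy as np

def owner_1d_block(j, n, p):
    """Owner of column j under 1-D block partitioning:
    p processors each hold a contiguous block of n/p columns."""
    return j // (n // p)

def ge_1d_block_simulated(A, p):
    """At step i the owner of column i computes the multipliers
    (the divisions), 'broadcasts' them, and every processor updates
    the columns it owns (the multiplications and subtractions)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for i in range(n - 1):
        # owner of column i computes and stores the multipliers
        m = A[i+1:, i] / A[i, i]
        A[i+1:, i] = m
        # ... broadcast of m to the other processors (~log p) goes here ...
        for proc in range(p):
            for j in range(i + 1, n):
                if owner_1d_block(j, n, p) == proc:
                    A[i+1:, j] -= m * A[i, j]
    return A
```

The simulation produces the same factorization as the sequential Version 4, which is the point: only the ownership of the update work changes.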
2-D block
 To speed up the divisions, partition the matrix across a P x Q grid of
processors

[Figure: matrix distributed over a P x Q processor grid; pivot A(i, i) sits
in one grid cell, rows j and columns k of the update span grid rows/columns]
2D block partitioning - Steps
1. Broadcast of A(k, k)
log Q

2. Divisions
n²/Q (approx.)

3. Broadcast of multipliers
(n-1)/Q · log P + (n-2)/Q · log P + … = n²/Q · log P (approx.)

4. Multiplications and subtractions
n³/(PQ) (approx.)
Problem with block partitioning for GE
 Once a processor's block of columns is finished, that processor
remains idle for the rest of the execution
 Solution? -
Onto cyclic
 The block partitioning algorithms waste processor cycles - there is
no load balancing throughout the algorithm
 Onto cyclic:

0 1 2 3 0 1 2 3 0 1 2 3 0   (columns dealt round-robin to processors 0..3)

cyclic: load balance
1-D block-cyclic: load balance and block operations, but the column
factorization is still serial
2-D block-cyclic: has everything
Block cyclic
 Having blocks in a processor enables block-based operations (block
matrix multiply etc.)
 Block-based operations lead to high performance
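The cyclic and block-cyclic layouts above reduce to simple owner functions. A sketch (my formulation; the 2-D variant matches the style of distribution used by libraries such as ScaLAPACK):

```python
def owner_1d_block_cyclic(j, b, p):
    """Processor owning column j when columns are dealt out in
    blocks of b, round-robin over p processors. b = 1 is plain cyclic."""
    return (j // b) % p

def owner_2d_block_cyclic(i, j, b, P, Q):
    """Grid coordinates of the processor owning entry (i, j) on a
    P x Q processor grid with b x b blocks (2-D block-cyclic)."""
    return ((i // b) % P, (j // b) % Q)
```

With b = 1 and p = 4, the 1-D mapping over 13 columns reproduces the slide's 0123012301230 pattern; larger b keeps the load balance while giving each processor contiguous blocks to operate on.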
GE: Miscellaneous
GE with Partial Pivoting
 1D block-column partitioning: which is better, column pivoting or row
pivoting?
• Column pivoting does not involve any extra steps, since the pivot search
and exchange are done locally on each processor. O(n-i-1)
• The exchange information is passed to the other processors by
piggybacking on the multiplier broadcast
• Row pivoting involves a distributed search and exchange - O(n/P) + O(log P)

 2D block partitioning: the pivot search can be restricted to a limited
number of columns
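For reference, the serial kernel that the parallel variants distribute: at step i, search the current column below the diagonal for the largest magnitude entry and swap that row up before eliminating. A NumPy sketch (my illustration, not from the slides):

```python
import numpy as np

def lu_partial_pivot(A):
    """In-place LU with partial pivoting. Returns the factored matrix
    (multipliers below the diagonal, U on/above) and the list piv,
    where row r of the factorization came from original row piv[r]."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    piv = list(range(n))
    for i in range(n - 1):
        # pivot search over the n-i entries of column i below the diagonal
        p = i + int(np.argmax(np.abs(A[i:, i])))
        if p != i:
            A[[i, p]] = A[[p, i]]          # row exchange
            piv[i], piv[p] = piv[p], piv[i]
        A[i+1:, i] /= A[i, i]
        A[i+1:, i+1:] -= np.outer(A[i+1:, i], A[i, i+1:])
    return A, piv
```

The invariant L·U = P·A (with P the accumulated row permutation) makes this easy to verify, and it works even when the original matrix has a zero in the (1,1) position, where the unpivoted versions would fail.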
Sparse Iterative Methods
Iterative & Direct methods - Pros and Cons
 Iterative methods give only approximate results
 Convergence cannot always be predicted
 But there is absolutely no fill-in
Parallel Jacobi, Gauss-Seidel, SOR
 For problems with grid structure (1-D, 2-D etc.), Jacobi is easily
parallelizable, since every update uses only values from the previous
iteration
 Gauss-Seidel and SOR need the most recent values, hence the ordering
of updates imposes sequencing among processors
 But Gauss-Seidel and SOR can be parallelized using red-black
(checkerboard) ordering
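Why Jacobi parallelizes trivially is visible in the code: every point in a sweep reads only the previous iterate, so all updates are independent. A sketch for the 1-D Poisson problem -u'' = b with zero boundary values (my example, not from the slides):

```python
import numpy as np

def jacobi_poisson_1d(b, sweeps):
    """Jacobi sweeps for the tridiagonal system (2, -1, -1) u = b.
    u_new[i] depends only on the old iterate u, so every interior
    point in a sweep could be updated in parallel."""
    n = len(b)
    u = np.zeros(n)
    for _ in range(sweeps):
        u_new = np.empty_like(u)
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0     # boundary value 0
            right = u[i + 1] if i < n - 1 else 0.0
            u_new[i] = 0.5 * (left + right + b[i])
        u = u_new
    return u
```

Gauss-Seidel would instead read u_new[i-1] in the same sweep, which is exactly the sequencing problem the red-black ordering below removes.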
2D Grid example (natural ordering: left to right, bottom to top)

13 14 15 16

 9 10 11 12

 5  6  7  8

 1  2  3  4
Red-Black Ordering
 Color alternate nodes in each dimension red and black (checkerboard)
 Number the red nodes first and then the black nodes
 All red nodes can be updated simultaneously, followed by simultaneous
updates of all black nodes
2D Grid example - Red-Black Ordering

15  7 16  8

 5 13  6 14

11  3 12  4

 1  9  2 10

In general, reordering can affect convergence
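The renumbering above follows mechanically from the two rules on the previous slide. A sketch that generates it for any grid size (my implementation of those rules):

```python
def red_black_numbering(rows, cols):
    """Map natural node numbers (left to right, bottom to top,
    starting at 1) to red-black numbers: red nodes, where
    (row + col) is even, come first, then black nodes, each
    group kept in natural order."""
    red, black = [], []
    for r in range(rows):
        for c in range(cols):
            natural = r * cols + c + 1
            (red if (r + c) % 2 == 0 else black).append(natural)
    return {node: k for k, node in enumerate(red + black, start=1)}
```

For a 4 x 4 grid this reproduces the slide exactly: the bottom row 1 2 3 4 becomes 1 9 2 10, and the top row 13 14 15 16 becomes 15 7 16 8.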
Graph Coloring
 In general, multi-color graph coloring gives an ordering for parallel
computation of Gauss-Seidel and SOR