This document discusses parallelizing the computation of the 1D unsteady heat equation using MPI. It describes discretizing and implementing the problem across multiple processors. Various algorithms are tested, including one-way communication with MPI_Send and MPI_Recv, two-way communication with MPI_Sendrecv, and non-blocking communication. Results show the non-blocking approach achieves the best speedup, up to a factor of 2.5x for 4 processors on large problem sizes, due to overlapping communication and computation.


PARALLEL COMPUTING: UNSTEADY HEAT EQUATION 1D WITH MPI

Ipung Pramono

INTRODUCTION

There are many heat-related phenomena in the world.

Workflow: physical problem -> mathematical model -> model and simulation -> computation.

INTRODUCTION (CONT.)

Simplified problem: assume a 1D rod.

[Figure: a 1D rod, insulated along its sides, with prescribed temperatures T at the left and right ends (values not shown).]

DISCRETIZATION

1D heat equation: dT/dt = alpha * d^2T/dx^2.

[Figure: space-time grid of temperatures T(i, t) for nodes i = 1..n at successive time levels t1, t2, ...; the value at node i and time level j+1 is computed from nodes i-1, i, i+1 at level j.]
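
A minimal sketch of the explicit update implied by this discretization (the names T, Tnew, alpha, dt, dx and n are assumptions, not the author's exact code):

/* Explicit (FTCS) step: T[0..n+1] holds the current temperatures,
   including the two boundary/ghost values. */
double r = alpha * dt / (dx * dx);
for (int i = 1; i <= n; i++)
    Tnew[i] = T[i] + r * (T[i+1] - 2.0 * T[i] + T[i-1]);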

IMPLEMENTATION

[Figure: domain decomposition. The spatial nodes are split into contiguous blocks, one block per processor (Proc.1, Proc.2, Proc.3, Proc.4); each processor updates its local nodes i = 1..n for time level j+1 and exchanges boundary values with its neighbors.]
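
As a rough illustration of this block decomposition (a sketch only; N, n, left and right are assumed names), each of the p processors owns a contiguous block of roughly N/p nodes:

/* Block decomposition of N global nodes over p processors (sketch).
   my_rank and p come from MPI_Comm_rank / MPI_Comm_size. */
int n = N / p;                /* local nodes on this processor      */
if (my_rank == p - 1)
    n += N % p;               /* last rank absorbs the remainder    */
int left  = my_rank - 1;      /* rank of the left neighbor, if any  */
int right = my_rank + 1;      /* rank of the right neighbor, if any */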

APPLICATION IN THE PROGRAM

/* Exchange ghost values with the neighbors (Algo-1, blocking). */

if (my_rank > 0)      /* send first local value to the left neighbor */
    MPI_Send(&T[1], 1, MPI_DOUBLE, my_rank-1, tag, MPI_COMM_WORLD);

if (my_rank < p-1)    /* receive right ghost value from the right neighbor */
    MPI_Recv(&T[n+1], 1, MPI_DOUBLE, my_rank+1, tag, MPI_COMM_WORLD, &status);

if (my_rank < p-1)    /* send last local value to the right neighbor */
    MPI_Send(&T[n], 1, MPI_DOUBLE, my_rank+1, tag, MPI_COMM_WORLD);

if (my_rank > 0)      /* receive left ghost value from the left neighbor */
    MPI_Recv(&T[0], 1, MPI_DOUBLE, my_rank-1, tag, MPI_COMM_WORLD, &status);
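
Putting these calls into the time loop, one possible shape of the blocking (Algo-1) solver is sketched below; the coefficient r, the arrays T and Tnew, and the loop bounds are assumptions, not the author's exact code:

for (int t = 0; t < nsteps; t++) {
    /* exchange ghost values with the neighboring processors */
    if (my_rank > 0)
        MPI_Send(&T[1], 1, MPI_DOUBLE, my_rank-1, tag, MPI_COMM_WORLD);
    if (my_rank < p-1)
        MPI_Recv(&T[n+1], 1, MPI_DOUBLE, my_rank+1, tag, MPI_COMM_WORLD, &status);
    if (my_rank < p-1)
        MPI_Send(&T[n], 1, MPI_DOUBLE, my_rank+1, tag, MPI_COMM_WORLD);
    if (my_rank > 0)
        MPI_Recv(&T[0], 1, MPI_DOUBLE, my_rank-1, tag, MPI_COMM_WORLD, &status);

    /* explicit update of the local nodes, then copy back */
    for (int i = 1; i <= n; i++)
        Tnew[i] = T[i] + r * (T[i+1] - 2.0 * T[i] + T[i-1]);
    for (int i = 1; i <= n; i++)
        T[i] = Tnew[i];
}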

RESULT (INTEL i7 IN THE LAB)

Algo-1 (blocking MPI_Send / MPI_Recv):

No  Nodes  Sequential (s)  2 procs (s)  Speedup  3 procs (s)  Speedup  4 procs (s)  Speedup
1   120    0.004658        0.010802     0.431    0.013614     0.342    0.009988     0.466
2   600    0.019435        0.020224     0.961    0.012305     1.579    0.011569     1.680
3   1200   0.033949        0.026332     1.289    0.019505     1.741    0.021559     1.575
4   1800   0.04262         0.029984     1.421    0.028406     1.500    0.022755     1.873
5   3000   0.067064        0.04664      1.438    0.036598     1.832    0.031644     2.119
6   6000   0.112408        0.070857     1.586    0.059541     1.888    0.046107     2.438

ANALYSIS

For a small problem size, for example 120 nodes, the sequential method is faster than the parallel one because communication time dominates the computation time: when too many processors are involved, they spend more time transferring data to their neighbors than computing the next step.

If two neighboring processors both start with MPI_Recv and continue with MPI_Send, a deadlock arises because neither can return from MPI_Recv.
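
For illustration only (a hypothetical ordering, not the code shown earlier): if every rank posted its receives before its sends, neighboring ranks would block inside MPI_Recv waiting for messages that are never sent:

/* Hypothetical deadlocking order: all ranks wait in MPI_Recv first,
   so no rank ever reaches its MPI_Send. */
if (my_rank < p-1)
    MPI_Recv(&T[n+1], 1, MPI_DOUBLE, my_rank+1, tag, MPI_COMM_WORLD, &status);
if (my_rank > 0)
    MPI_Recv(&T[0], 1, MPI_DOUBLE, my_rank-1, tag, MPI_COMM_WORLD, &status);
if (my_rank > 0)
    MPI_Send(&T[1], 1, MPI_DOUBLE, my_rank-1, tag, MPI_COMM_WORLD);
if (my_rank < p-1)
    MPI_Send(&T[n], 1, MPI_DOUBLE, my_rank+1, tag, MPI_COMM_WORLD);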

OPTIMIZE

The MPI_Sendrecv command is designed to handle one-to-one data exchange in a single call (Algo-2).

Use non-blocking MPI: a more important motivation for non-blocking MPI is to exploit the possibility of overlapping communication and computation, i.e. hiding the communication overhead (non-blocking).

OPTIMIZE (CONT.)

Use MPI_Sendrecv:

/* Combined exchange with the right neighbor: send T[n] (tag 2) and
   receive the ghost value T[n+1] (tag 1) in one call. */
MPI_Sendrecv(&T[n], 1, MPI_DOUBLE, my_rank+1, 2,
             &T[n+1], 1, MPI_DOUBLE, my_rank+1, 1,
             MPI_COMM_WORLD, &status);

Use non-blocking MPI:

tag = 1;
MPI_Isend(&T[1], 1, MPI_DOUBLE, my_rank-1, tag, MPI_COMM_WORLD, &request[0]);    /* to the left neighbor    */
MPI_Irecv(&T[n+1], 1, MPI_DOUBLE, my_rank+1, tag, MPI_COMM_WORLD, &request[1]);  /* from the right neighbor */
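
A fuller sketch of how the non-blocking exchange can overlap with computation (the request array of size 4, the counter nreq, and the coefficient r are assumed names; the tags follow the convention of the calls above):

/* Post all sends/receives, update the interior nodes while the
   messages are in flight, then wait and update the two edge nodes. */
int nreq = 0;
if (my_rank > 0) {
    MPI_Isend(&T[1], 1, MPI_DOUBLE, my_rank-1, 1, MPI_COMM_WORLD, &request[nreq++]);
    MPI_Irecv(&T[0], 1, MPI_DOUBLE, my_rank-1, 2, MPI_COMM_WORLD, &request[nreq++]);
}
if (my_rank < p-1) {
    MPI_Isend(&T[n], 1, MPI_DOUBLE, my_rank+1, 2, MPI_COMM_WORLD, &request[nreq++]);
    MPI_Irecv(&T[n+1], 1, MPI_DOUBLE, my_rank+1, 1, MPI_COMM_WORLD, &request[nreq++]);
}
for (int i = 2; i <= n-1; i++)                    /* interior nodes only */
    Tnew[i] = T[i] + r * (T[i+1] - 2.0 * T[i] + T[i-1]);
MPI_Waitall(nreq, request, MPI_STATUSES_IGNORE);  /* ghost values ready  */
Tnew[1] = T[1] + r * (T[2]   - 2.0 * T[1] + T[0]);
Tnew[n] = T[n] + r * (T[n+1] - 2.0 * T[n] + T[n-1]);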

RESULT

Table I: Algo-1 (blocking MPI_Send / MPI_Recv)

No  Nodes  Sequential (s)  2 procs (s)  Speedup  3 procs (s)  Speedup  4 procs (s)  Speedup
1   600    0.019435        0.020224     0.961    0.012305     1.579    0.011569     1.680
2   1200   0.033949        0.026332     1.289    0.019505     1.741    0.021559     1.575
3   1800   0.04262         0.029984     1.421    0.028406     1.500    0.022755     1.873
4   3000   0.067064        0.04664      1.438    0.036598     1.832    0.031644     2.119
5   6000   0.112408        0.070857     1.586    0.059541     1.888    0.046107     2.438

Table II: Algo-2 (MPI_Sendrecv)

No  Nodes  Sequential (s)  2 procs (s)  Speedup  3 procs (s)  Speedup  4 procs (s)  Speedup
1   600    0.019435        0.015873     1.224    0.021008     0.925    0.026304     0.739
2   1200   0.033949        0.028509     1.191    0.024182     1.404    0.016937     2.004
3   1800   0.04262         0.033802     1.261    0.033681     1.265    0.023062     1.848
4   3000   0.067064        0.045894     1.461    0.039973     1.678    0.030843     2.174
5   6000   0.112408        0.065463     1.717    0.0558505    2.013    0.045793     2.455

Table III: non-blocking MPI

No  Nodes  Sequential (s)  2 procs (s)  Speedup  3 procs (s)  Speedup  4 procs (s)  Speedup
1   600    0.019435        0.020165     0.964    0.009701     2.003    0.009228     2.106
2   1200   0.033949        0.029884     1.136    0.022478     1.510    0.018715     1.814
3   1800   0.04262         0.031941     1.334    0.031863     1.338    0.021508     1.982
4   3000   0.067064        0.044221     1.517    0.037001     1.812    0.030798     2.178
5   6000   0.112408        0.067822     1.657    0.050361     2.232    0.04424      2.541

RESULT (FOR 2 PROCESSORS)

[Figure: speedup versus number of nodes (0 to 7000) for algo-1, algo-2, and non-blocking with 2 processors.]

RESULT (FOR 3 PROCESSORS)

[Figure: speedup versus number of nodes (0 to 7000) for algo-1, algo-2, and non-blocking with 3 processors.]

RESULT (FOR 4 PROCESSORS)

[Figure: speedup versus number of nodes (0 to 7000) for algo-1, algo-2, and non-blocking with 4 processors.]

CONCLUSION

Non-blocking MPI gives the best performance for the parallel computation of the 1D unsteady heat equation.
