A parallel algorithm for the solution of partial differential equations on clustered workstations
Feras A. Mahmoud, Mohammad H. Al-Towaiq
Jordan University of Science and Technology, Department of Mathematics and Statistics, P.O. Box 3030, Irbid 22110, Jordan
Abstract
In this paper we propose a parallel algorithm for the solution of partial differential equations over a rectangular domain using the Crank-Nicholson method in cooperation with the DuFort-Frankel method, and apply it to a model problem, namely, the heat conduction equation. One of the well known parallel techniques for solving partial differential equations in a cluster computing environment is the domain decomposition technique. Using this technique, the whole domain is decomposed into subdomains, each of which has its own boundaries, called the interface points. Parallelization is realized by approximating the interface values using the unconditionally stable DuFort-Frankel explicit scheme; these values then serve as Neumann boundary conditions for the Crank-Nicholson implicit scheme in the subdomains. The numerical results show that our algorithm is more accurate than the algorithm based on the forward explicit method for approximating the values of the interface points, especially when a small number of time steps is used. Moreover, the numerical results show that increasing the number of processors used in the cluster yields an increase in the algorithm's speedup.
© 2007 Published by Elsevier Inc.
Keywords: Heat conduction equation; Parallel computing; Domain decomposition; Crank-Nicholson; DuFort-Frankel
1. Introduction
The mathematical formulation of most problems in science and engineering involving rates of change with respect to two or more independent variables, usually representing time, length or angle, leads either to a partial differential equation (PDE) or to a set of such equations. Among the numerical approximation methods for solving PDEs, those employing finite differences are more frequently used and more universally applicable than any others. These numerical methods often require a large number of computations, which motivates us to explore parallel methods for solving PDEs.
Finite difference solutions for PDEs can be found either explicitly or implicitly. The explicit method is easy to implement on parallel computers, but it has severe stability conditions; that is, in order to attain reasonable accuracy, the space step must be small, which necessarily forces the time step to be small too. The implicit method does not have these stability conditions, but instead a global linear system of equations needs to be solved at each time step, which is not easy to implement in parallel.

Applied Mathematics and Computation 200 (2008) 178-188. doi:10.1016/j.amc.2007.11.013
* Corresponding author. E-mail addresses: [email protected] (F.A. Mahmoud), [email protected] (M.H. Al-Towaiq).
Domain decomposition is a method widely used for solving time-dependent PDEs and a powerful tool for devising parallel PDE methods. A conventional approach [1] to parallelizing the implicit scheme is to apply domain decomposition based on preconditioning methods to the problem arising from the semidiscretization at each time step. In [2] it is proved that the preconditioned system is well conditioned when the time step is small; nevertheless, a small step size is not always desirable in situations where implicit schemes become necessary. If the original domain is decomposed into a set of non-overlapping subdomains, then the PDEs defined in different subdomains can be solved on different processors concurrently. This often requires numerical boundary conditions at the interface points between subdomains. Since these interface points are not part of the original model of the problem, we have to generate them numerically. One way to generate these numerical boundary conditions is to use the solutions from the previous time step to calculate the solution at the next time step. This is often referred to as time lagging [3]. A modified approximation scheme of mixed type was proposed by Kuznetsov [4], where the standard second order implicit scheme is used inside each subdomain, while the explicit Euler scheme is applied to obtain the interface values on the new time level. Once the interface values are available, the global problem is fully decoupled and can thus be computed in parallel. In [5] Dawson proposed a similar hybrid scheme, where instead of using the same spacing as for the interior points, where the implicit scheme is applied, a larger spacing is used at each interface point, where the explicit scheme is applied. In [1] Du, Mu and Wu proposed two new parallel finite difference methods for parabolic PDEs, focusing on a one-dimensional heat equation on the spatial interval [0, 1] as an example. For the computation on the subdomain interfaces, the first method uses a high-order scheme, while the second uses a multistep explicit scheme. They studied the stability and error analysis of the two new schemes, and addressed the parallel efficiency of these schemes. In [6] Zhang and Wan presented some new techniques for designing finite difference domain decomposition algorithms for the heat equation. The basic idea is to define the finite difference schemes at the interface grid points with smaller time steps by Saul'yev's asymmetric schemes.
In this paper, we propose a parallel finite difference scheme for solving PDEs. For simplicity, we consider as a model the heat conduction equation

∂u/∂t = s² ∂²u/∂x².   (1)

The parallel difference scheme is based on both the Crank-Nicholson (CN) implicit and DuFort-Frankel (DF) explicit schemes. In this procedure, the values of the interface points of each subdomain are calculated using the DF explicit scheme, and these values then serve as Neumann boundary conditions for the CN implicit scheme in the subdomains. The rest of the paper is organized as follows. In Section 2, we present a detailed description of the proposed algorithm. The stability of our parallel algorithm is given in Section 3. Numerical results and performance analysis are presented in Section 4. Finally, we conclude this paper in Section 5.
2. DF-CN parallel algorithm

Consider the heat conduction equation

∂u/∂t = s² ∂²u/∂x²   for 0 < x < ℓ and 0 < t < T,   (2)

with initial condition

u(x, 0) = f(x)   for 0 ≤ x ≤ ℓ,

and boundary conditions

u(0, t) = c₁,   u(ℓ, t) = c₂   for 0 ≤ t ≤ T.
We partition the space into n subintervals and the time into m subintervals, and we define the space step h = ℓ/n and the time step k = T/m. Therefore, the domain is discretized uniformly. For designing the parallel algorithm we begin by choosing primitive tasks, identifying data communication patterns among them, and looking for ways to agglomerate tasks.

Using the domain decomposition technique, the n − 1 interior points of the space are divided fairly among p processors; that is, p divides the number of space subintervals n. Denote the grid points by u(x_i, t_j) = u(ih, jk) = u_{i,j}, where i = 0, 1, …, n and j = 1, 2, …, m. The interface points then correspond to i = n/p, 2n/p, …, (p − 1)n/p, and the boundary points correspond to i = 0 and i = n. Each processor is responsible for computing n/p + 1 points, where each two neighboring processors share only one interface point at each time step, and each of them computes this interface point concurrently.
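The index bookkeeping above can be sketched in a few lines of pure Python (the helper names are ours, not from the paper; it assumes, as in the text, that p divides n):

```python
# Sketch of the grid partitioning described above. Processor q owns grid
# points q*(n/p) .. (q+1)*(n/p), so neighbors share exactly one interface point.
def partition(n, p):
    """Return (interface_indices, per-processor point blocks)."""
    assert n % p == 0, "p must divide the number of space subintervals n"
    w = n // p                                          # subintervals per processor
    interfaces = [q * w for q in range(1, p)]           # i = n/p, 2n/p, ..., (p-1)n/p
    blocks = [list(range(q * w, (q + 1) * w + 1)) for q in range(p)]  # n/p + 1 points each
    return interfaces, blocks

interfaces, blocks = partition(12, 4)
print(interfaces)                 # [3, 6, 9]
print([len(b) for b in blocks])   # [4, 4, 4, 4]  -> n/p + 1 points per processor
```

Note that adjacent blocks overlap in exactly one index, which is the shared interface point computed concurrently by both neighbors.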
Let

r = s²k/h²;

then at any time step j = 1, 2, …, m, using the approximation w_{i,j} for u_{i,j} in (2), the DF explicit scheme

(1 + 2r) w_{i,j+1} = 2r (w_{i+1,j} + w_{i−1,j}) + (1 − 2r) w_{i,j−1}   (3)

is applied at the interface points, whereas the CN implicit scheme

(2 + 2r) w_{i,j+1} − r (w_{i+1,j+1} + w_{i−1,j+1}) = (2 − 2r) w_{i,j} + r (w_{i+1,j} + w_{i−1,j})   (4)

is used to compute the interior points of each subdomain.
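As a quick consistency check on formulas (3) and (4), the following pure-Python sketch (ours, not the authors' code) plugs a known exact solution of u_t = s²u_xx into both difference formulas; the residuals should be small for small h and k. The step sizes and sample point are illustrative.

```python
import math

# Plug u(x,t) = exp(-s^2 pi^2 t) sin(pi x), an exact solution of
# u_t = s^2 u_xx, into formulas (3) and (4) and measure the residuals.
s = 1.0
h, k = 0.05, 0.001
r = s * s * k / (h * h)

def u(x, t):
    return math.exp(-s * s * math.pi ** 2 * t) * math.sin(math.pi * x)

x, t = 0.3, 0.1

# DuFort-Frankel (3): (1+2r) w_{i,j+1} = 2r (w_{i+1,j} + w_{i-1,j}) + (1-2r) w_{i,j-1}
res_df = ((1 + 2 * r) * u(x, t + k)
          - 2 * r * (u(x + h, t) + u(x - h, t))
          - (1 - 2 * r) * u(x, t - k))

# Crank-Nicholson (4):
# (2+2r) w_{i,j+1} - r (w_{i+1,j+1} + w_{i-1,j+1}) = (2-2r) w_{i,j} + r (w_{i+1,j} + w_{i-1,j})
res_cn = ((2 + 2 * r) * u(x, t + k)
          - r * (u(x + h, t + k) + u(x - h, t + k))
          - (2 - 2 * r) * u(x, t)
          - r * (u(x + h, t) + u(x - h, t)))

print(abs(res_df), abs(res_cn))  # both residuals are small
```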
Fig. 1 depicts the communications needed to compute the solution at time j + 1 given the solutions at times j and j − 1. Processor q is responsible for computing w_{i,j+1} implicitly using the CN difference scheme (4). It can compute the values of the gray cells (interior points) without any communication. However, it cannot compute the values of these gray cells until it has computed the values of the black cells (interface points). Processor q can compute the values of the black cells explicitly, using the DF scheme (3), only if it gets values from the neighboring processors. Fig. 1b shows how processor q exchanges values with the neighboring processors q − 1 and q + 1. After these values are received, the black cells can be computed. The parallel program allocates two extra points for processor q at each time step (the dotted cells); the values received from the neighboring processors are stored in these memory locations, which are called ghost points. During the iteration that computes row j + 1, each processor sends each of its neighbors the appropriate border values from row j and receives its neighbors' row j border values in turn. After the values have been received into the ghost points, every processor can compute all of its row j + 1 interior values using the CN scheme (4).
Fig. 1. Ghost points simplify parallel finite difference programs. (a) When computing row j + 1, processor q has the data values it needs to fill in the gray cells, but it needs values from neighboring processors to fill in the black cells. (b) Every processor sends its edge values to its neighbors. Every processor receives incoming values into ghost points.

3. Stability of the DF-CN algorithm

The DF-CN parallel algorithm is stable for all values of r > 0 if and only if both the DF explicit scheme and the CN implicit scheme are stable for all values of r > 0. When only one processor is used, the fully CN implicit scheme is applied and the algorithm is unconditionally stable [7]. Using two or more processors in the DF-CN algorithm leads us to approximate the values of the interface points by the DF explicit scheme; these interface values then serve as Neumann boundary conditions for the CN implicit scheme in the subdomains. So, we have to prove that the CN implicit scheme is unconditionally stable when boundary conditions of Neumann type are used for the heat conduction equation (2).
3.1. Neumann boundary conditions

Consider a thin rod of length ℓ that is thermally insulated along its length and radiates heat from the end x = 0. Then the boundary condition at x = 0 is given by

∂u/∂x = g₁ (u(0, t) − v₁).

A negative sign is associated with the normal derivative here because the outward normal to the rod at this end is in the negative direction of the x-axis. On the other hand, the boundary condition at x = ℓ is given by

∂u/∂x = −g₂ (u(ℓ, t) − v₂),

with a positive sign in the normal derivative because the outward normal to the rod at this end is in the same direction as the x-axis. Note that g₁, g₂, v₁, v₂ are constants, in which g₁ and g₂ are nonnegative. Hence, instead of the boundary conditions in (2), the boundary conditions have the form

∂u(0, t)/∂x = g₁ (u(0, t) − v₁),   ∂u(ℓ, t)/∂x = −g₂ (u(ℓ, t) − v₂)   for 0 ≤ t ≤ T.
If we wish to represent ∂u/∂x more accurately at x = 0 and x = ℓ by the central difference formula

∂u(x, t)/∂x ≈ [u(x + h, t) − u(x − h, t)] / (2h),   (5)

it is necessary to introduce the fictitious temperatures w_{−1,j} and w_{n+1,j} at the external mesh points (−h, jk) and (ℓ + h, jk), respectively, by imagining the rod to be extended very slightly. Then the boundary conditions can be represented by

w_{−1,j} = w_{1,j} − 2hg₁ (w_{0,j} − v₁),   (6)
w_{n+1,j} = w_{n−1,j} − 2hg₂ (w_{n,j} − v₂).   (7)
The temperatures w_{−1,j} and w_{n+1,j} are unknown, which necessitates another two equations. Specifically, for the CN formula at i = 0 and i = n, we have

(2 + 2r) w_{0,j+1} − r (w_{1,j+1} + w_{−1,j+1}) = (2 − 2r) w_{0,j} + r (w_{1,j} + w_{−1,j}),   (8)

and

(2 + 2r) w_{n,j+1} − r (w_{n+1,j+1} + w_{n−1,j+1}) = (2 − 2r) w_{n,j} + r (w_{n+1,j} + w_{n−1,j}).   (9)
By substituting (6) and (7) into (8) and (9), respectively, the resulting formulae are

(2 + 2r(1 + hg₁)) w_{0,j+1} − 2r w_{1,j+1} = (2 − 2r(1 + hg₁)) w_{0,j} + 2r w_{1,j} + 4rhg₁v₁,   (10)

and

(2 + 2r(1 + hg₂)) w_{n,j+1} − 2r w_{n−1,j+1} = (2 − 2r(1 + hg₂)) w_{n,j} + 2r w_{n−1,j} + 4rhg₂v₂.   (11)
At each time step we have to solve an (n + 1) × (n + 1) tridiagonal system of linear equations, using the Thomas algorithm [8], which is represented in matrix form as follows:

A w^{j+1} = B w^j + c,   for each j = 0, 1, 2, …,   (12)

where

w^j = (w_{0,j}, w_{1,j}, …, w_{n,j})^t,

and the matrices A and B and the vector c are given by

A =
⎡ 2 + 2(1 + hg₁)r   −2r                                  ⎤
⎢ −r                2 + 2r   −r                          ⎥
⎢         ⋱            ⋱         ⋱                       ⎥
⎢                   −r       2 + 2r   −r                 ⎥
⎣                            −2r      2 + 2(1 + hg₂)r    ⎦,

B =
⎡ 2 − 2(1 + hg₁)r   2r                                   ⎤
⎢ r                 2 − 2r   r                           ⎥
⎢         ⋱            ⋱         ⋱                       ⎥
⎢                   r        2 − 2r   r                  ⎥
⎣                            2r       2 − 2(1 + hg₂)r    ⎦,

and

c = (4hv₁g₁r, 0, …, 0, 4hv₂g₂r)^t.
As g₁, g₂, h and r are all nonnegative, the matrix A in (12) is strictly diagonally dominant [9]. Since a strictly diagonally dominant matrix is nonsingular [9], there is a unique solution to the tridiagonal linear system (12), given by

w^{j+1} = A^{−1}B w^j + A^{−1}c.   (13)
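Since each time step of (12) is a tridiagonal solve, the Thomas algorithm performs it in O(n) operations. A minimal pure-Python sketch of one step (the helper names are ours; it assumes the diagonals of A and B and the vector c exactly as written above):

```python
# One time step of (12): form rhs = B w^j + c, then solve A w^{j+1} = rhs
# with the Thomas algorithm (forward elimination + back substitution).
def step(w, n, r, h, g1, g2, v1, v2):
    # Diagonals of A; the first and last rows are the Neumann-modified rows (10), (11).
    a  = [0.0] + [-r] * (n - 1) + [-2 * r]                 # subdiagonal (a[0] unused)
    b  = ([2 + 2 * (1 + h * g1) * r] + [2 + 2 * r] * (n - 1)
          + [2 + 2 * (1 + h * g2) * r])                    # main diagonal
    cc = [-2 * r] + [-r] * (n - 1) + [0.0]                 # superdiagonal (cc[n] unused)
    # Right-hand side B w + c, row by row.
    rhs = [0.0] * (n + 1)
    rhs[0] = (2 - 2 * (1 + h * g1) * r) * w[0] + 2 * r * w[1] + 4 * h * v1 * g1 * r
    for i in range(1, n):
        rhs[i] = r * w[i - 1] + (2 - 2 * r) * w[i] + r * w[i + 1]
    rhs[n] = 2 * r * w[n - 1] + (2 - 2 * (1 + h * g2) * r) * w[n] + 4 * h * v2 * g2 * r
    # Thomas algorithm.
    for i in range(1, n + 1):
        m = a[i] / b[i - 1]
        b[i] -= m * cc[i - 1]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * (n + 1)
    x[n] = rhs[n] / b[n]
    for i in range(n - 1, -1, -1):
        x[i] = (rhs[i] - cc[i] * x[i + 1]) / b[i]
    return x

# Example call with arbitrary illustrative parameters.
print(step([0.0] * 9, 8, 1.0, 0.1, 1.0, 2.0, 0.5, 0.25))
```

The solve is well defined because, as noted above, A is strictly diagonally dominant, which also makes the Thomas algorithm stable without pivoting.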
To examine the stability of the CN difference scheme (12), let us assume that an error

e^0 = (e^0_0, e^0_1, …, e^0_n)^t

is made in representing the initial data

w^0 = (w_{0,0}, w_{1,0}, …, w_{n,0})^t.

So the initial vector is actually w^0 + e^0, and hence

w^1 = A^{−1}B (w^0 + e^0) + A^{−1}c = A^{−1}B w^0 + A^{−1}c + A^{−1}B e^0,
w^2 = A^{−1}B w^1 + A^{−1}c = (A^{−1}B)² w^0 + A^{−1}B A^{−1}c + A^{−1}c + (A^{−1}B)² e^0,
  ⋮
w^k = A^{−1}B w^{k−1} + A^{−1}c = (A^{−1}B)^k w^0 + Σ_{i=0}^{k−1} (A^{−1}B)^i A^{−1}c + (A^{−1}B)^k e^0.

Hence, at the kth time step, the error in w^k due to e^0 is (A^{−1}B)^k e^0. In order for this error not to be magnified in the successive steps, we want

‖(A^{−1}B)^k e^0‖ ≤ ‖e^0‖

for all values of k. Therefore, we must have

‖(A^{−1}B)^k‖ ≤ 1,
which requires that

ρ((A^{−1}B)^k) = [ρ(A^{−1}B)]^k ≤ 1,

where ρ(·) denotes the spectral radius of a matrix [9]. The CN difference scheme (12) is therefore stable only when the modulus of every eigenvalue of the matrix A^{−1}B does not exceed one. Since the matrix B can be written as B = 4I − A, we have

A^{−1}B = 4A^{−1} − I.

Therefore, the method is stable only when

|4/μ − 1| ≤ 1,

where μ is an eigenvalue of A. This is equivalent to μ ≥ 2. Since g₁, g₂, h and r are all nonnegative, an application of Gerschgorin's circle theorem [9] to the matrix A in (12) shows that all its eigenvalues are at least 2 for any value of r ≥ 0. Hence, the CN difference scheme (12) is unconditionally stable. Note that the local truncation error of the CN implicit scheme is O(k² + h²); see [9].
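The Gerschgorin step can be illustrated numerically. The sketch below (ours, with illustrative parameter values) computes the left endpoint of every Gerschgorin disc of A; since all of them are at least 2, every eigenvalue μ of A satisfies μ ≥ 2 and hence |4/μ − 1| ≤ 1.

```python
# Left endpoints (center minus radius) of the Gerschgorin discs of the
# (n+1)x(n+1) matrix A in (12); the interior rows give exactly 2 + 2r - 2r = 2.
def gerschgorin_lower_bounds(n, r, h, g1, g2):
    rows = [(2 + 2 * (1 + h * g1) * r, 2 * r)]      # row 0: (center, radius)
    rows += [(2 + 2 * r, 2 * r)] * (n - 1)          # interior rows
    rows.append((2 + 2 * (1 + h * g2) * r, 2 * r))  # row n
    return [center - radius for center, radius in rows]

for r in (0.1, 1.0, 10.0, 73.863):   # 73.863 is the value of r quoted in Section 4
    print(r, min(gerschgorin_lower_bounds(50, r, 0.01, 1.0, 2.0)))
```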
3.2. Stability analysis of the DF method

The stability of the DF scheme can be investigated by writing the DF formula (3) in the matrix form

w^{j+1} = (2r/(1 + 2r)) A w^j + ((1 − 2r)/(1 + 2r)) w^{j−1} + c,   (14)

where

w^j = (w_{1,j}, w_{2,j}, …, w_{n−1,j})^t,   c = (1/(1 + 2r)) (2rc₁, 0, …, 0, 2rc₂)^t,

and A is the (n − 1) × (n − 1) tridiagonal matrix

A =
⎡ 0  1            ⎤
⎢ 1  0  1         ⎥
⎢    ⋱  ⋱  ⋱     ⎥
⎢       1  0  1   ⎥
⎣          1  0   ⎦.
Let

v^j = ⎡ w^j     ⎤
      ⎣ w^{j−1} ⎦;

then Eq. (14) can be written as

⎡ w^{j+1} ⎤   ⎡ (2r/(1+2r)) A   ((1−2r)/(1+2r)) I ⎤ ⎡ w^j     ⎤   ⎡ c ⎤
⎢         ⎥ = ⎢                                   ⎥ ⎢         ⎥ + ⎢   ⎥,
⎣ w^j     ⎦   ⎣ I               0                 ⎦ ⎣ w^{j−1} ⎦   ⎣ 0 ⎦

where I is the identity matrix of size n − 1. Therefore,

v^{j+1} = P v^j + d,   (15)

where

P = ⎡ (2r/(1+2r)) A   ((1−2r)/(1+2r)) I ⎤   and   d = ⎡ c ⎤
    ⎣ I               0                 ⎦             ⎣ 0 ⎦.
This technique reduces a three-level difference scheme to a two-level one. The DF scheme (15) is unconditionally stable when each eigenvalue of P has modulus less than or equal to 1. The following two theorems are useful for the stability analysis of difference schemes with three or more time levels, and they are easy to use. The proofs of these theorems can be found in [7].
Theorem 1. If the matrix A can be written in the block form

A =
⎡ A₁₁  A₁₂  ⋯  A₁ₙ ⎤
⎢ A₂₁  A₂₂  ⋯  A₂ₙ ⎥
⎢  ⋮    ⋮        ⋮  ⎥
⎣ Aₙ₁  Aₙ₂  ⋯  Aₙₙ ⎦,

where each A_{ij} is an m × m matrix and all the A_{ij} share a common set of m linearly independent eigenvectors, then the eigenvalues of A are given by the eigenvalues of the n × n matrices

⎡ λ_k^{(11)}  λ_k^{(12)}  ⋯  λ_k^{(1n)} ⎤
⎢ λ_k^{(21)}  λ_k^{(22)}  ⋯  λ_k^{(2n)} ⎥
⎢     ⋮                          ⋮      ⎥
⎣ λ_k^{(n1)}  λ_k^{(n2)}  ⋯  λ_k^{(nn)} ⎦,   k = 1, 2, …, m,

where λ_k^{(ij)} is the kth eigenvalue of A_{ij} corresponding to the kth eigenvector g_k common to all the A_{ij}.
Theorem 2. The eigenvalues of the n × n tridiagonal matrix

A =
⎡ a  b            ⎤
⎢ c  a  b         ⎥
⎢    ⋱  ⋱  ⋱     ⎥
⎢       c  a  b   ⎥
⎣          c  a   ⎦

are

λ_k = a + 2√(bc) cos(kπ/(n + 1)),   k = 1, 2, …, n.
The matrix A from (14) is tridiagonal with a = 0 and b = c = 1. So, by Theorem 2, the matrix A has the n − 1 distinct eigenvalues

λ_k = 2 cos(kπ/n),   k = 1, 2, …, n − 1,

and thus it has n − 1 linearly independent eigenvectors g_i, i = 1, 2, …, n − 1. Since the matrix I has n − 1 eigenvalues all equal to 1, it also has n − 1 linearly independent eigenvectors, which may be taken to be the g_i, i = 1, 2, …, n − 1. Hence, by Theorem 1, the eigenvalues μ of P are the eigenvalues of the 2 × 2 matrices

⎡ (2r/(1+2r)) λ_k   (1−2r)/(1+2r) ⎤
⎣ 1                 0             ⎦,

where λ_k is the kth eigenvalue of A. The values of μ can be computed by evaluating

det ⎡ (2r/(1+2r)) λ_k − μ   (1−2r)/(1+2r) ⎤ = 0,
    ⎣ 1                     −μ            ⎦

which gives

μ² − (2r/(1 + 2r)) λ_k μ − (1 − 2r)/(1 + 2r) = 0.

Therefore,

μ = [ 2r cos(kπ/n) ± √(1 − 4r² sin²(kπ/n)) ] / (1 + 2r),
and there are two cases: if 0 ≤ 1 − 4r² sin²(kπ/n) < 1, then

|μ| < (2r + 1)/(1 + 2r) = 1,

and if 1 − 4r² sin²(kπ/n) < 0, then

|μ|² = [ (2r cos(kπ/n))² + 4r² sin²(kπ/n) − 1 ] / (1 + 2r)² = (4r² − 1)/(4r² + 4r + 1) < 1.
Therefore the DF explicit difference scheme (3) is unconditionally stable for all values of r > 0. The local truncation error of this scheme is O(k² + h² + k²/h²); see [9]. Successive refinement of the values of h and k may generate a finite difference solution that is stable but that converges to the solution of a different PDE. For example, in the DF explicit difference scheme, if both h and k tend to zero at the same rate, the ratio k/h is constant, and we then solve a modified PDE rather than the original Eq. (2). The DF scheme is consistent only if k tends to zero faster than h.
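The eigenvalue bound derived above is easy to verify numerically. The sketch below (ours, with illustrative values of r and n) solves the quadratic μ² − (2r/(1+2r))λ_k μ − (1−2r)/(1+2r) = 0 for every λ_k = 2 cos(kπ/n) and tracks the largest |μ|.

```python
import cmath, math

# Largest |mu| over all eigenvalues of P, via the 2x2 reduction above.
def df_spectral_radius(r, n):
    worst = 0.0
    for k in range(1, n):
        lam = 2.0 * math.cos(k * math.pi / n)
        b = -2.0 * r * lam / (1 + 2.0 * r)       # quadratic mu^2 + b mu + c = 0
        c = -(1.0 - 2.0 * r) / (1 + 2.0 * r)
        disc = cmath.sqrt(b * b - 4 * c)         # complex sqrt covers both cases
        worst = max(worst, abs((-b + disc) / 2), abs((-b - disc) / 2))
    return worst

for r in (0.1, 0.5, 1.0, 10.0, 73.863):
    print(r, df_spectral_radius(r, 40))   # never exceeds 1
```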
4. Numerical results and performance analysis

In this section, we consider a heat conduction equation and use the DF-CN algorithm to approximate its solution. The example is implemented on the academic cluster built in the Department of Computer Science at Jordan University of Science and Technology. This cluster contains 1 management node and 18 Linux (kernel 2.4.20.8, RedHat 9) workstations connected as a star network, each of which has a single IBM Pentium IV running at 2.4 GHz with 512 KB cache, 512 MB of memory and 40 GB of disk space. These hosts are connected together by fast Ethernet, a 1 GB switch and an optical interconnection switch. We use the Message Passing Interface (MPI) with MPICH version 1.5.2 as the message passing library throughout the implementations. Barrier synchronization and blocking point-to-point communication are used. The graphs reported in the figures represent the average speedup and efficiency over many runs of the DF-CN algorithm.
Example. Consider the heat conduction equation

∂u/∂t = (4/π²) ∂²u/∂x²   for 0 < x < 4 and t > 0,

with boundary conditions

u(0, t) = u(4, t) = 0,   t > 0,

and initial condition

u(x, 0) = sin(πx/4) (1 + 2 cos(πx/4)),   0 ≤ x ≤ 4.

The exact solution of this problem is

u(x, t) = e^{−t} sin(πx/2) + e^{−t/4} sin(πx/4).

The solution at t = 0.01 is approximated using the proposed DF-CN algorithm with several values of h when k = 1 × 10⁻⁶. Figs. 2-4 show the execution time, speedup and efficiency of the DF-CN parallel difference scheme corresponding to several values of n when 1, 2, 4, 8, 12, 16 and 18 processors are used.
It is clear from Fig. 3 that the speedup of the DF-CN algorithm is not ideal, i.e., it is not linear in the number of processors. This is because the per-processor problem size decreases as the number of processors increases, which makes the communication time dominant in comparison with the computation time. Also, the high speed of the processors used in implementing our algorithm affects the parallel execution time; that is, small problem sizes take little execution time to perform a given calculation. However, in the DF-CN algorithm, as the problem size n increases, so does the height of the speedup curve. Also, for a fixed number of processors, speedup is an increasing function of the problem size.

Fig. 2. The execution time of the DF-CN algorithm using several values of h with k = 1 × 10⁻⁶ at t = 0.01.

Fig. 3. Speedup of the DF-CN parallel algorithm using several values of h with k = 1 × 10⁻⁶ at t = 0.01.
For a problem of fixed size, the efficiency of a parallel computation typically decreases as the number of processors increases; see Fig. 4. Since parallel communication increases with the number of processors, the way to maintain efficiency when increasing the number of processors is to increase the size of the problem being solved. The proposed algorithm assumes that the data structures we manipulate fit in primary memory, so the maximum problem size we can solve is limited by the amount of primary memory available. Also, in the DF-CN parallel algorithm, as the problem size n increases, the height of the efficiency curve increases.

The DF-CN parallel algorithm is accurate. For example, when n = 54,000, the value of r is approximately 73.863 and the maximum error is 5.035 × 10⁻⁹, while with this choice of n, if the forward difference scheme (see [7,9]) is used to approximate the values of the interface points instead of the DF scheme, the approximate solution diverges.
To analyze the performance, let ν represent the time needed to compute an interior point using the Thomas algorithm. Using a single processor to update the n − 1 points requires time (n − 1)ν. Because the algorithm has m time steps, the total expected execution time of the sequential algorithm is

t_s = m (n − 1) ν.   (16)

To compute the parallel execution time using p processors, suppose that each of them is responsible for an equal-sized portion containing n/p + 1 points: two boundaries and, in general, n/p − 1 interior points. The boundary points are computed by the DF explicit scheme, while the interior points are computed using the Thomas algorithm. If ω represents the time needed to compute a boundary point, the parallel computation time for each iteration is

t_p,comp = (n/p − 1) ν + 2ω.

However, the parallel algorithm involves communication that the sequential algorithm does not. In general, each processor must send values to its two neighboring processors and receive two values from them. If φ represents the time needed for a processor to send (receive) a value to (from) another processor, the necessary communications increase the parallel execution time by 2φ per iteration. Therefore,

t_p,comm = 2φ.

Combining the computation time with the communication time, the overall parallel execution time for all m iterations of the algorithm is

t_p = m (t_p,comp + t_p,comm) = m [ (n/p − 1) ν + 2ω + 2φ ].   (17)

Fig. 4. Efficiency of the DF-CN parallel algorithm using several values of h with k = 1 × 10⁻⁶ at t = 0.01.
The speedup relative to the sequential algorithm is

S = t_s / t_p = (n − 1) ν / [ (n/p − 1) ν + 2ω + 2φ ],   (18)

and the parallel efficiency is given by

E = S / p = (n − 1) ν / [ (n − p) ν + 2pω + 2pφ ].   (19)
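The performance model (16)-(19) can be evaluated directly. The sketch below (ours) uses illustrative, assumed constants for ν (interior-point time), ω (interface-point time) and φ (per-message communication time); it shows the behavior described above, with speedup growing in p while efficiency falls off as communication becomes relatively more expensive.

```python
# Evaluate the cost model of Section 4 for assumed machine constants.
def model(n, m, p, nu, omega, phi):
    t_s = m * (n - 1) * nu                               # Eq. (16)
    t_p = m * ((n / p - 1) * nu + 2 * omega + 2 * phi)   # Eq. (17)
    S = t_s / t_p                                        # Eq. (18)
    E = S / p                                            # Eq. (19)
    return t_s, t_p, S, E

for p in (1, 2, 4, 8, 16):
    _, _, S, E = model(n=54000, m=10000, p=p, nu=1e-7, omega=1e-7, phi=5e-5)
    print(p, round(S, 2), round(E, 3))
```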
5. Conclusion

In this paper, the DF-CN parallel algorithm has been discussed in depth. This algorithm uses the DF explicit scheme to approximate the solution at the interior boundaries between subdomains. For the remaining points in each subdomain, the algorithm uses the CN implicit scheme. This scheme has no stability constraint, which prevents the algorithm from drifting to worse approximations.

A numerical example is given for the proposed DF-CN parallel algorithm. From the numerical results we conclude that the DF-CN algorithm is recommended when a small number of time steps is used. A small number of time steps decreases the inter-processor communications, which decreases the communication time of this parallel algorithm. This algorithm gave more accurate results than the parallel algorithm that uses the forward difference scheme to approximate the values of the interface points, especially when a small number of time steps is used.

Furthermore, in the DF-CN parallel algorithm, as the problem size n increases, so does the height of the speedup curve. Also, for a fixed number of processors, speedup is an increasing function of the problem size. Moreover, the efficiency of the DF-CN algorithm typically decreases as the number of processors increases. The way to maintain efficiency when increasing the number of processors is to increase the size of the problem being solved. Unfortunately, the maximum problem size we can solve is limited by the amount of primary memory available.
References

[1] Q. Du, M. Mu, Z.N. Wu, Efficient parallel algorithms for parabolic problems, SIAM J. Numer. Anal. 30 (2001) 1469-1487.
[2] X. Cai, Additive Schwarz algorithms for parabolic convection-diffusion equations, Numer. Math. 50 (1991) 41-52.
[3] W. Rivera-Gallego, Stability analysis of numerical boundary conditions in domain decomposition algorithms, Appl. Math. Comput. 137 (2003) 375-385.
[4] Y.A. Kuznetsov, New algorithms for approximate realization of implicit difference schemes, Soviet J. Numer. Anal. Math. Model. 3 (1988) 99-114.
[5] C.N. Dawson, Q. Du, T.F. Dupont, A finite difference domain decomposition algorithm for numerical solution of the heat equation, Math. Comput. 57 (1991) 63-71.
[6] B.-L. Zhang, Z.-S. Wan, New techniques in designing finite-difference domain decomposition algorithm for the heat equation, Comput. Math. Appl. 45 (2003) 1695-1705.
[7] G.D. Smith, Numerical Solution of Partial Differential Equations: Finite Difference Methods, Oxford University Press, London, 1978.
[8] G.E. Karniadakis, R.M. Kirby II, Parallel Scientific Computing in C++ and MPI, Cambridge University Press, Cambridge, 2003.
[9] R.L. Burden, J.D. Faires, Numerical Analysis, seventh ed., Brooks/Cole, 2001.