


Solving the Euler Equations on Graphics
Processing Units

Trond Runar Hagen^{1,2}, Knut-Andreas Lie^{1,2}, and Jostein R. Natvig^{1,2}

1 SINTEF, Dept. Applied Math., P.O. Box 124 Blindern, N-0314 Oslo, Norway
2 Centre of Mathematics for Applications (CMA), University of Oslo, Norway
{trr,knl,jrn}@sintef.no, http://www.sintef.no/gpgpu

Abstract. The paper describes how one can use commodity graphics
cards (GPUs) as a high-performance parallel computer to simulate the
dynamics of ideal gases in two and three spatial dimensions. The dy-
namics is described by the Euler equations, and numerical approxima-
tions are computed using state-of-the-art high-resolution finite-volume
schemes. These schemes are based upon an explicit time discretisation
and are therefore ideal candidates for parallel implementation.

1 Introduction
Conservation of physical quantities is a fundamental physical principle that is
often used to derive models in the natural sciences. In this paper we will study
one such model, the Euler equations describing the dynamics of an ideal gas
based on conservation laws for mass, momentum, and energy. In three spatial
dimensions the Euler equations read
         
\[
\begin{pmatrix} \rho \\ \rho u \\ \rho v \\ \rho w \\ E \end{pmatrix}_{t}
+ \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ \rho u w \\ u(E+p) \end{pmatrix}_{x}
+ \begin{pmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ \rho v w \\ v(E+p) \end{pmatrix}_{y}
+ \begin{pmatrix} \rho w \\ \rho u w \\ \rho v w \\ \rho w^2 + p \\ w(E+p) \end{pmatrix}_{z}
= \begin{pmatrix} 0 \\ 0 \\ 0 \\ g\rho \\ g\rho w \end{pmatrix}.   (1)
\]

Here ρ denotes the density, (u, v, w) the velocity vector, p the pressure, g the
acceleration of gravity, and E the total energy (kinetic plus internal energy)
given by \(E = \rho(u^2 + v^2 + w^2)/2 + p/(\gamma - 1)\). In all computations we use γ = 1.4.
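For illustration, the closure relations above translate directly into code. The following C helpers are a minimal sketch (the struct layout and function names are illustrative, not taken from the paper's implementation) of converting between the primitive variables (ρ, u, v, w, p) and the conserved quantities in (1) with γ = 1.4:

```c
#define GAMMA 1.4

/* Conserved variables per cell: density, momentum components, total energy. */
typedef struct { double rho, rhou, rhov, rhow, E; } Conserved;

/* Total energy from primitive variables: kinetic plus internal energy. */
static double total_energy(double rho, double u, double v, double w, double p)
{
    return 0.5 * rho * (u*u + v*v + w*w) + p / (GAMMA - 1.0);
}

/* Pressure recovered from the conserved variables. */
static double pressure(const Conserved *q)
{
    double u = q->rhou / q->rho, v = q->rhov / q->rho, w = q->rhow / q->rho;
    return (GAMMA - 1.0) * (q->E - 0.5 * q->rho * (u*u + v*v + w*w));
}
```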
The Euler equations are one particular example of a large class of equations
called hyperbolic systems of conservation laws, which can be written on the form

\[ Q_t + F(Q)_x + G(Q)_y + H(Q)_z = S(Q).   (2) \]

This class of PDEs exhibits very singular behaviour and admits various kinds of
discontinuous and nonlinear waves, such as shocks, rarefactions, phase bound-
aries, fluid and material interfaces, etc. Resolving propagating discontinuities
accurately is a difficult task, to which a lot of research has been devoted in the
last 2–3 decades. Today, a successful numerical method will typically be of the
high-resolution type (see e.g., [3]) and be able to accurately capture discontin-
uous waves and at the same time offer high-order resolution of smooth parts of
the solution.
Modern high-resolution methods for nonstationary problems are typically
based upon explicit temporal discretisation. In explicit methods there is no cou-
pling between unknowns in different grid cells, and one therefore avoids the use
of linear system solvers, which is a typical bottleneck in many fluid dynamics
algorithms. High-resolution methods are therefore relatively easy to parallelise,
using e.g., domain decomposition. In this paper we will discuss parallel imple-
mentation of gas-dynamics simulations on commodity graphics cards (GPUs)
residing in recent desktop computers and workstations. The idea of using GPUs
for numerical simulation is far from new—cf. e.g., http://www.gpgpu.org—but
except for our previous work [1] on shallow water waves, this is the first paper
to consider a GPU implementation of high-resolution schemes for models on the
form (2).
From a computational point-of-view, a modern GPU can be considered as
a single-instruction, multiple-data processor capable of parallel processing of
floating-point numbers. Whereas an Intel Pentium 4 CPU has a theoretical per-
formance of at most 15 Gflops, performance numbers as high as 165 Gflops have
been reported for the NVIDIA GeForce 7800 cards. The key to this unrivalled
processing power is the fact that current GPUs contain up to 24 parallel pipelines
that each are capable of processing vectors of length four simultaneously. By
exploiting this amazing computational power for 2D and 3D gas-dynamics sim-
ulations, we observe speedup factors of order 10–20 on a single workstation.

2 Numerical Methods

To solve the Euler equations in two and three dimensions, we will use a family of
semi-discrete finite-volume schemes on a regular Cartesian grid and seek approximations to (2) in terms of the cell-averages \(Q_{ijk} = \frac{1}{|\Omega_{ijk}|}\int_{\Omega_{ijk}} Q \, dV\). Integrating (2) over \(\Omega_{ijk}\), we obtain an evolution equation for the cell-averages

\[
\frac{dQ_{ijk}}{dt} = -\bigl(F_{i+1/2,jk} - F_{i-1/2,jk}\bigr) - \bigl(G_{i,j+1/2,k} - G_{i,j-1/2,k}\bigr) - \bigl(H_{ij,k+1/2} - H_{ij,k-1/2}\bigr) + S_{ijk},   (3)
\]

where \(F_{i\pm1/2,jk}\) denote the flux over the surfaces with normal along the x-axis, etc. The fluxes are approximated using a standard Gaussian quadrature (fourth order, tensor-product rule):

\[
F_{i+1/2,jk}(t) = \frac{1}{|\Omega_{ijk}|} \int_{y_{j-1/2}}^{y_{j+1/2}} \int_{z_{k-1/2}}^{z_{k+1/2}} F\bigl(Q(x_{i+1/2}, y, z, t)\bigr)\, dy\, dz
\approx \frac{1}{4\Delta x} \sum_{n,m \in \{-1,1\}} F\Bigl(Q\bigl(x_{i+1/2},\, y_j + n\tfrac{\Delta y}{2\sqrt{3}},\, z_k + m\tfrac{\Delta z}{2\sqrt{3}},\, t\bigr)\Bigr).   (4)
\]
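As a concrete reading of (4), the face-averaged flux is obtained by evaluating the point-wise flux at the four tensor-product Gauss points of the face and scaling by 1/(4Δx). The C sketch below assumes hypothetical helpers `left_value`, `right_value` (the reconstructed one-sided states on the face, cf. (5) below) and `numerical_flux` (e.g. (7) below); none of these names are from the paper.

```c
#include <math.h>

typedef struct { double q[5]; } State;

/* Hypothetical callbacks: one-sided reconstructed states on the face
   x = x_{i+1/2} at a given (y,z), and the numerical flux between them. */
State left_value(int iface, int j, int k, double y, double z);   /* from cell i   */
State right_value(int iface, int j, int k, double y, double z);  /* from cell i+1 */
State numerical_flux(State QL, State QR);                        /* e.g. (7)      */

/* Face-averaged flux F_{i+1/2,jk} from (4), including the 1/(4*dx) prefactor. */
static State face_flux_x(int iface, int j, int k, double yj, double zk,
                         double dx, double dy, double dz)
{
    const double oy = dy / (2.0 * sqrt(3.0));   /* Gauss offsets ±Δy/(2√3) */
    const double oz = dz / (2.0 * sqrt(3.0));   /* Gauss offsets ±Δz/(2√3) */
    State sum = {{0.0, 0.0, 0.0, 0.0, 0.0}};
    for (int n = -1; n <= 1; n += 2)
        for (int m = -1; m <= 1; m += 2) {
            double y = yj + n*oy, z = zk + m*oz;
            State f = numerical_flux(left_value(iface, j, k, y, z),
                                     right_value(iface, j, k, y, z));
            for (int c = 0; c < 5; ++c)
                sum.q[c] += f.q[c];
        }
    for (int c = 0; c < 5; ++c)
        sum.q[c] /= 4.0 * dx;
    return sum;
}
```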

To evaluate the integrand, we need to reconstruct a continuously defined func-


tion from the cell-averages. To this end, we will use a function that is piecewise
continuous inside each grid cell. In a first-order method, one would use a piece-
wise constant function. To obtain second-order (on smooth solutions), we use a
piecewise linear reconstruction for each component in Q:

\[
\hat{Q}_{ijk}(x,y,z) = Q_{ijk}
+ L\bigl(D_x^{+} Q_{ijk}, D_x^{-} Q_{ijk}\bigr)\,\frac{x - x_i}{\Delta x}
+ L\bigl(D_y^{+} Q_{ijk}, D_y^{-} Q_{ijk}\bigr)\,\frac{y - y_j}{\Delta y}
+ L\bigl(D_z^{+} Q_{ijk}, D_z^{-} Q_{ijk}\bigr)\,\frac{z - z_k}{\Delta z},   (5)
\]

where \(D_x^{\pm} Q_{ijk} = \pm(Q_{i\pm1,jk} - Q_{ijk})\), etc. The so-called limiter L is a nonlinear function of the forward and backward differences, whose purpose is to prevent the creation of overshoots at local extrema. Here we use the family of generalised minmod limiters

\[
L(a,b) = \mathrm{MM}\bigl(\theta a, \tfrac{1}{2}(a+b), \theta b\bigr), \qquad
\mathrm{MM}(z_1,\dots,z_n) = \begin{cases}
\min_i z_i, & z_i > 0 \ \forall i,\\
\max_i z_i, & z_i < 0 \ \forall i,\\
0, & \text{otherwise}.
\end{cases}   (6)
\]
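A direct C transcription of the limiter (6), used component-wise to compute the slopes in (5), reads as follows (the function names are ours, not the paper's):

```c
/* Minmod of three arguments, cf. (6): the argument of smallest magnitude if
   all three have the same sign, and zero otherwise. */
static double minmod3(double a, double b, double c)
{
    if (a > 0.0 && b > 0.0 && c > 0.0)
        return (a < b) ? ((a < c) ? a : c) : ((b < c) ? b : c);
    if (a < 0.0 && b < 0.0 && c < 0.0)
        return (a > b) ? ((a > c) ? a : c) : ((b > c) ? b : c);
    return 0.0;
}

/* Limited slope L(a,b) = MM(theta*a, (a+b)/2, theta*b), where a and b are the
   forward and backward differences D+ and D- of the cell averages. */
static double limited_slope(double a, double b, double theta)
{
    return minmod3(theta * a, 0.5 * (a + b), theta * b);
}
```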

The reconstruction \(\hat{Q}(x,y,z)\) is discontinuous across all cell-interfaces, and thus gives a left-sided and a right-sided point value, \(Q^L\) and \(Q^R\), at each integration point in (4). To evaluate the flux across the interface at each integration point, we use the central-upwind flux [2]

\[
\mathcal{F}(Q^L, Q^R) = \frac{a^+ F(Q^L) - a^- F(Q^R)}{a^+ - a^-} + \frac{a^+ a^-}{a^+ - a^-}\bigl(Q^R - Q^L\bigr),
\qquad
\begin{aligned}
a^+ &= \max\bigl(0, \lambda^+(Q^L), \lambda^+(Q^R)\bigr),\\
a^- &= \min\bigl(0, \lambda^-(Q^L), \lambda^-(Q^R)\bigr),
\end{aligned}   (7)
\]

where \(\lambda^{\pm}(Q)\) are the slow and fast eigenvalues of dF/dQ, given analytically as \(u \pm \sqrt{\gamma p/\rho}\).
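In code, the central-upwind flux (7) in the x-direction amounts to a few lines per component. The following C sketch (our own helper names, not the paper's GPU kernels) uses the analytic x-flux from (1) and the eigenvalue bounds u ± √(γp/ρ):

```c
#include <math.h>

#define GAMMA 1.4

typedef struct { double q[5]; } State;   /* (rho, rho*u, rho*v, rho*w, E) */

/* Pressure from the conserved variables. */
static double pres(State Q)
{
    return (GAMMA - 1.0) * (Q.q[4]
        - 0.5 * (Q.q[1]*Q.q[1] + Q.q[2]*Q.q[2] + Q.q[3]*Q.q[3]) / Q.q[0]);
}

/* Analytic x-directional flux F(Q) from (1). */
static State euler_flux_x(State Q)
{
    double u = Q.q[1] / Q.q[0], p = pres(Q);
    State F = {{ Q.q[1], Q.q[1]*u + p, Q.q[2]*u, Q.q[3]*u, u*(Q.q[4] + p) }};
    return F;
}

/* Central-upwind flux (7) at one integration point on an x-face. */
static State central_upwind_x(State QL, State QR)
{
    double uL = QL.q[1]/QL.q[0], cL = sqrt(GAMMA * pres(QL) / QL.q[0]);
    double uR = QR.q[1]/QR.q[0], cR = sqrt(GAMMA * pres(QR) / QR.q[0]);
    double ap = fmax(0.0, fmax(uL + cL, uR + cR));   /* a+ */
    double am = fmin(0.0, fmin(uL - cL, uR - cR));   /* a- */
    State FL = euler_flux_x(QL), FR = euler_flux_x(QR), F;
    for (int c = 0; c < 5; ++c)
        F.q[c] = (ap*FL.q[c] - am*FR.q[c]) / (ap - am)
               + ap*am / (ap - am) * (QR.q[c] - QL.q[c]);
    return F;
}
```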
Finally, we need to specify how to integrate the ODEs (3) for the cell-averages. To this end, we use a second-order TVD Runge–Kutta method [5]:

\[
Q^{(1)}_{ij} = Q^n_{ij} + \Delta t\, R_{ij}(Q^n), \qquad
Q^{n+1}_{ij} = \tfrac{1}{2} Q^n_{ij} + \tfrac{1}{2}\bigl(Q^{(1)}_{ij} + \Delta t\, R_{ij}(Q^{(1)})\bigr),   (8)
\]

where \(R_{ij}\) denotes the right-hand side of (3). The time step is restricted by a CFL condition, which states that disturbances can travel at most one half grid cell each time step, i.e., \(\max(a^+, -a^-)\,\Delta t \le \Delta x/2\), and similarly in the y- and z-directions.
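Putting the pieces together, one full time step then consists of a CFL-limited Δt followed by the two stages of (8). The C outline below treats the solution as one flat array of cell averages; `compute_rhs` and `max_wave_speed` are hypothetical helpers, assumed to assemble the right-hand side of (3) and the largest value of max(a+, −a−) over the grid.

```c
#include <stddef.h>

/* Hypothetical helpers (not from the paper): compute_rhs assembles the flux
   differences and source term of (3); max_wave_speed returns the global
   maximum of max(a+, -a-) over all cell interfaces. */
void   compute_rhs(const double *Q, double *R, size_t n);
double max_wave_speed(const double *Q, size_t n);

/* One second-order TVD Runge-Kutta step (8); n is the total number of
   unknowns (cells times components), Q1 and R are work arrays of length n. */
void rk2_step(double *Q, double *Q1, double *R, size_t n, double dx)
{
    /* CFL condition: waves may travel at most half a grid cell per step. */
    double dt = 0.5 * dx / max_wave_speed(Q, n);

    compute_rhs(Q, R, n);                     /* first stage: Q1 = Q + dt*R(Q) */
    for (size_t i = 0; i < n; ++i)
        Q1[i] = Q[i] + dt * R[i];

    compute_rhs(Q1, R, n);                    /* second stage, cf. (8)         */
    for (size_t i = 0; i < n; ++i)
        Q[i] = 0.5 * Q[i] + 0.5 * (Q1[i] + dt * R[i]);
}
```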

3 GPU Implementation
The data-driven programming model of GPUs is quite different from the instruc-
tion-driven programming model most people are used to on a CPU. On a CPU,
[Figure omitted.] Fig. 1. Flow chart for the GPU implementation of the semi-discrete finite-volume scheme. Gray boxes are executed on the GPU and white boxes on the CPU. The chart shows: make initial data; start a new time step; find the maximal global eigenvalues and the step length from the stability condition; then the Runge–Kutta substep iteration (set boundary conditions, reconstruct point values, evaluate edge fluxes, compute the Runge–Kutta substep, repeat until the last substep); once all time steps are completed, shaders produce the graphics output.

a computer program for the algorithm in Section 2 would consist of a set of


arrays and the processing is performed as series of loops that march through
all cells to compute reconstructions, integrate fluxes, compute flux differences
and evolve the ODEs, etc. On the GPU, each grid cell is associated with a pixel
(fragment) in an off-screen frame buffer. The data stream (cell-averages, fluxes, etc.) is given as textures, and the computation is invoked by rendering geometry to a frame buffer.
The data stream is processed by a series of kernels (fragment shaders, in graphics
terminology) using the fragment-processing capabilities in the rendering pipeline.
Writing each computational kernel using Cg (or GLSL) is straightforward for
any computational scientist capable of writing C/C++. However, setting up the
graphics pipeline requires some familiarity with computer graphics (in our case,
OpenGL).
The flow chart for the simulation algorithm is given in Figure 1. Worth noting
is the computation of the maximum eigenvalues to determine the time step.
Finding the maximum is implemented using an 'all-reduce' operation utilising
the depth buffer combined with a read-back to the CPU; see e.g., [1]. In each
of the two Runge–Kutta steps, four basic operations are performed. First we
set the boundary data. Then we compute the reconstruction using (5) and (6).
The most computationally intensive step is the evaluation of edge fluxes and
computation of the source term. Before this calculation can start, the time step
∆t must be passed to the shader by the CPU. Finally, the step is completed by
adding fluxes and the source term to the cell averages. To complete a full time step, the sequence of operations in the Runge–Kutta box is performed twice.
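A schematic host-side driver, written as C pseudocode with hypothetical names for the render passes (each pass binds one fragment shader and draws a grid-sized quad so that the kernel runs once per cell), may help clarify the order of operations in Figure 1:

```c
/* Hypothetical host-side driver; the pass functions and reduce operation are
   illustrative names, not the paper's actual API. */
void  boundary_pass(void);               /* set boundary conditions            */
void  reconstruction_pass(void);         /* evaluate (5)-(6) per cell          */
void  flux_and_source_pass(float dt);    /* evaluate (4), (7) and S(Q)         */
void  rk_substep_pass(int substep);      /* add fluxes/source to cell averages */
float reduce_max_eigenvalue(void);       /* depth-buffer all-reduce + readback */

void simulate(int nsteps, float dx)
{
    for (int step = 0; step < nsteps; ++step) {
        /* The CPU determines the step length from the global maximum eigenvalue. */
        float dt = 0.5f * dx / reduce_max_eigenvalue();

        for (int substep = 0; substep < 2; ++substep) {  /* two RK stages, cf. (8) */
            boundary_pass();
            reconstruction_pass();
            flux_and_source_pass(dt);    /* dt is passed to the shader by the CPU */
            rk_substep_pass(substep);
        }
    }
}
```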

For simulations in 2D, the vectors with cell-averages, slopes, fluxes, etc. have
length four and can each be fitted in a single texture RGBA-element. In 3D, the
vectors have length five and do not fit in a single RGBA-element. We therefore
chose to split each vector into two three-component textures. Notice that this makes it possible to add up to three extra quantities to the vector of unknowns (e.g., to represent two or more gases with different γ's) at a very low computational cost, since the GPU processes 4-vectors simultaneously.
In 2D, the Cartesian grid is simply embedded in a rectangle. In 3D, the grid
(padded with ghost-cells to represent the boundary) is unfolded in the z-direction
and the 3D arrays of vectors are mapped onto larger 2D textures. Memory limi-
tations on the GPU will restrict the sizes of the grids one can process in a single
batch. To be able to run highly resolved simulations in 3D, we will therefore use
a domain decomposition approach, in which the domain is divided into smaller
rectangular blocks that can be handled separately. The algorithm is straight-
forward: Each subdomain is extended with one grid-layer of overlap into the
neighbouring subdomains for each time-step to be carried out. The initial data
is passed by the CPU to the GPU, which performs a given number of time-steps
as described above. The result is then read back to the CPU, where it is inserted
into the corresponding subdomain in the global solution on the next (global)
time level. Since passing of initial data and read-back of computational results
can be performed asynchronously between the CPU and the GPU, the perfor-
mance reduction due to stalls on the GPU will be insignificant. Moreover, this
algorithm easily extends to multiple CPU-GPU configurations by implementing
some kind of message passing and control on the CPU-side.
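A rough host-side sketch of this multi-block strategy in C is given below; the block layout, helper names, and synchronisation details are assumptions for illustration and not taken from the paper. The essential points are that each block is padded with one overlap layer per time step to be carried out, and that upload and read-back can overlap with GPU computation.

```c
/* Hypothetical multi-block driver for large 3D grids. */
typedef struct { int i0, j0, k0, ni, nj, nk; } Block;

/* Copy a block plus 'overlap' ghost layers from the old global solution. */
void extract_block(const double *global_old, double *local, Block b, int overlap);
void gpu_upload(const double *local, Block b);     /* may run asynchronously  */
void gpu_run_timesteps(int nsteps);                /* the scheme of Section 2 */
void gpu_readback(double *local, Block b);         /* may run asynchronously  */
/* Write the block interior into the global solution at the next global level. */
void insert_block_interior(const double *local, double *global_new, Block b);

void advance_global(const double *global_old, double *global_new, double *scratch,
                    const Block *blocks, int nblocks, int nsteps)
{
    for (int b = 0; b < nblocks; ++b) {
        extract_block(global_old, scratch, blocks[b], nsteps);  /* one layer per step */
        gpu_upload(scratch, blocks[b]);
        gpu_run_timesteps(nsteps);
        gpu_readback(scratch, blocks[b]);
        insert_block_interior(scratch, global_new, blocks[b]);
    }
}
```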

4 Numerical Examples

In the following we present a few numerical examples to assess the computational


efficiency of our GPU implementation. To this end, we compare runtimes on two
NVIDIA GeForce graphics cards (6800 Ultra and 7800 GTX) with runtimes on
two different CPUs (a 2.8 GHz Intel Xeon CPU and an AMD Athlon X2 4400+,
respectively). The timings are averaged over all timesteps and do not include any
preprocessing. The CPU reference codes are implemented in C, using a design
that has evolved during 6–7 years of research on high-resolution schemes. High
computational efficiency has been ensured by carefully minimising the number
of arithmetic operations, optimal ordering of loops, use of temporary storage,
replacing divisions by multiplications whenever possible, etc. Apart from that,
our CPU codes contain no hardware-specific hand-optimisation, but rather rely
on general compiler optimisation; icc -O3 -ipo -xP (version 8.1) for the Intel
CPU and gcc -O3 for the AMD. To ensure a fair comparison, we have used the
same design choices for the GPU implementations, trying to retain a one-to-one
correspondence of statements in the CPU and GPU computational kernels.
Another important question is accuracy. The numerical methods considered
in the paper are stable and the accuracy will therefore not deteriorate signifi-
cantly due to rounding errors. In fact, all our tests indicate that the difference

[Figure omitted.] Fig. 2. Emulated Schlieren images of a shock-bubble interaction at t = 0.0, 0.1, 0.2, and 0.3.

Table 1. Runtime per time step in seconds and speedup factor for two CPUs versus two GPUs for the shock-bubble problem run on a grid with N × N cells, for bilinear (upper part) and CWENO reconstruction (lower part).

     N   Intel Xeon   GF 6800   speedup    AMD X2    GF 7800   speedup
   128    4.37e-2     3.70e-3    11.8      1.88e-2   1.38e-3    13.6
   256    1.74e-1     8.69e-3    20.0      1.08e-1   4.37e-3    24.7
   512    6.90e-1     3.32e-2    20.8      2.95e-1   1.72e-2    17.1
  1024    2.95e-0     1.48e-1    19.9      1.26e-0   7.62e-2    16.5
   128    1.05e-1     1.22e-2     8.6      7.90e-2   4.60e-3    17.2
   256    4.20e-1     4.99e-2     8.4      3.45e-1   1.74e-2    19.8
   512    1.67e-0     1.78e-1     9.4      1.03e-0   6.86e-2    15.0
  1024    6.67e-0     7.14e-1     9.3      4.32e-0   2.99e-1    14.4

between single precision (GPU) and double precision (CPU) results is of the order \(\epsilon_s\), where \(\epsilon_s = 1.192 \cdot 10^{-7}\) is the smallest number such that \(1 + \epsilon_s - 1 > 0\) in single precision. In other words, for the applications considered herein, the
discretisation errors dominate errors due to lack of precision.
Example 1 (2D Shock-Bubble Interaction). In this example we consider the in-
teraction of a planar 2.95 Mach shock in air with a circular region of low density.
The gas is initially at rest and has unit density and pressure. Inside a circle of
radius 0.2 centred at (0.4, 0.5) the density is 0.1. The incoming shock-wave starts
at x = 0 and has a post-shock pressure p = 10.0. Figure 2 shows the evolution
of the bubble in terms of emulated Schlieren images (density gradients depicted
using a nonlinear graymap) as described by the 2D Euler equations, i.e., (1) with
g = w ≡ 0.
Table 1 reports a comparison of average runtime per time step for GPU
versus CPU implementations of the high-resolution scheme using either the bi-
linear reconstruction in (5) and (6), or the third-order CWENO reconstruction
[4]. The corresponding schemes thus have second-order accuracy in time and
second and third order accuracy in space, respectively. For the bilinear recon-
struction, the resulting speedup factors of order 20 and 15 for the GeForce 6800
and 7800, respectively, are quite amazing since we did not try to optimise the
GPU implementation apart from the obvious use of vector operations whenever
appropriate.
The CWENO reconstruction is quite complicated, and we have observed on
various Intel CPUs that icc gives significantly faster code than gcc. We therefore

[Figure omitted.] Fig. 3. The Rayleigh–Taylor instability at times t = 0.5, 0.6, and 0.7.

Table 2. Runtime per time step in seconds and speedup factor for CPU versus GPU for the 3D Rayleigh–Taylor instability run on a grid with N × N × N cells.

     N    AMD X2    GF 7800   speedup
    49    5.23e-1   4.16e-2    12.6
    64    1.14e-0   8.20e-2    13.9
    81    1.98e-0   1.72e-1    11.5

expect the CWENO code to be suboptimal on the AMD CPU. Moreover, due to
the large number of temporary registers required in the CWENO reconstruction,
we had to split the computation of edge fluxes in two passes on the GPU: one
for the F -fluxes and one for the G-fluxes. This introduces extra computations
compared with the CPU code and also extra render target switches and texture
fetches. This reduces the theoretical speedup by between 25 and 50%, as can be seen in the lower half of Table 1 by comparing the 6800 card with the Intel CPU using the icc compiler.

Example 2 (3D Rayleigh–Taylor Instability). In the next example we simulate a


Rayleigh–Taylor instability, which arises when a layer of heavier fluid is placed
on top of a lighter fluid and the heavier fluid is accelerated downwards by gravity.
Similar phenomena occur more generally when a light fluid is accelerated towards
a heavy fluid. In the simulation, we consider the domain \([-1/6, 1/6]^2 \times [0.2, 0.8]\) with gravitational acceleration g = 0.1 in the z-direction. The lower fluid has unit density and the upper fluid density ρ = 2.0. Initially the two fluids are at rest, in hydrostatic balance, and separated by an interface located at \(z = 1/2 + 0.01\cos\bigl(6\pi \min(\sqrt{x^2 + y^2},\, 1/6)\bigr)\). Reflective boundary conditions are assumed on
all exterior boundaries. Figure 3 shows the evolution of the instability.
Table 2 reports a comparison of average runtime per time step for a GPU
versus a CPU implementation for the high-resolution scheme with the trilinear
reconstruction in (5) and (6). Compared with the 2D simulation, the speedup
is reduced. There are two factors contributing to the reduced speedup: (i)
cache misses on the GPU due to lookup in the unfolded 3D texture, and (ii)
use of only three out of four vector components in all basic arithmetic oper-
ations. For the 2D solver, all texture fetches are to neighbouring texel loca-
tions, whereas the access in the z-direction introduces non-local texture fetches.
Similarly, the 2D solver uses a single four-component texture to represent the
conserved quantities, whereas the 3D solver needs to use two three-component
textures: (ρ, ρu, ρv, ρw, E), plus a passive tracer to distinguish the gases.

5 Concluding Remarks
In this paper we have demonstrated the application of GPUs as high-performance
computational engines for compressible gas dynamics simulations in 2D and
3D. Unlike many other fluid dynamics algorithms, the current high-resolution
schemes do not involve any linear system solvers, which may be a performance
bottleneck in many GPU/parallel implementations. Instead, the schemes are
based upon explicit temporal discretisation, for which each cell can be updated
independent of the others. This makes the schemes perfect candidates for parallel
implementation. Moreover, a high number of arithmetic operations per memory
fetch makes these algorithms ideal for high performance data-stream based com-
puter architectures and results in fairly amazing speedup numbers.
The possibility of using a simple domain-decomposition algorithm to handle
large simulation models makes it attractive to explore future use of clusters of
CPU–GPU nodes for this type of simulation. The communication needed between
different nodes is low compared with the computations performed on each GPU,
and communication bandwidth is therefore not expected to be a major issue.

Acknowledgement
The research is funded by the Research Council of Norway under grant numbers
158911/I30 (Hagen and Lie) and 139144/431 (Natvig).

References
1. Hagen, T.R., Hjelmervik, J.M., Lie, K.-A., Natvig, J.R., Henriksen, M.O.: Visual simulation of shallow-water waves. Simul. Model. Pract. Theory 13 (2005) 716–726.
2. Kurganov, A., Noelle, S., Petrova, G.: Semidiscrete central-upwind schemes for hyperbolic conservation laws and Hamilton–Jacobi equations. SIAM J. Sci. Comput. 23(3) (2001) 707–740.
3. LeVeque, R.: Finite volume methods for hyperbolic problems. Cambridge Texts in Applied Mathematics, Cambridge University Press, Cambridge, 2002.
4. Levy, D., Puppo, G., Russo, G.: Compact central WENO schemes for multidimensional conservation laws. SIAM J. Sci. Comput. 22(2) (2000) 656–672.
5. Shu, C.-W.: Total-variation-diminishing time discretisations. SIAM J. Sci. Stat. Comput. 9 (1988) 1073–1084.
