Agard CFD PDF
AGARD
ADVISORY GROUP FOR AEROSPACE RESEARCH & DEVELOPMENT
7 RUE ANCELLE, 92200 NEUILLY-SUR-SEINE, FRANCE
Papers presented and discussions recorded at the 77th Fluid Dynamics Panel Symposium
held in Seville, Spain, 2-5 October 1995.
NORTH ATLANTIC TREATY ORGANIZATION
The Mission of AGARD
According to its Charter, the mission of AGARD is to bring together the leading personalities of the NATO nations in the
fields of science and technology relating to aerospace for the following purposes:
- Recommending effective ways for the member nations to use their research and development capabilities for the
common benefit of the NATO community;
- Providing scientific and technical advice and assistance to the Military Committee in the field of aerospace research
and development (with particular regard to its military application);
- Continuously stimulating advances in the aerospace sciences relevant to strengthening the common defence posture;
- Improving the co-operation among member nations in aerospace research and development;
- Providing assistance to member nations for the purpose of increasing their scientific and technical potential;
- Rendering scientific and technical assistance, as requested, to other NATO bodies and to member nations in
connection with research and development problems in the aerospace field.
The highest authority within AGARD is the National Delegates Board consisting of officially appointed senior
representatives from each member nation. The mission of AGARD is carried out through the Panels which are composed of
experts appointed by the National Delegates, the Consultant and Exchange Programme and the Aerospace Applications
Studies Programme. The results of AGARD work are reported to the member nations and the NATO Authorities through the
AGARD series of publications of which this is one.
Participation in AGARD activities is by invitation only and is normally limited to citizens of the NATO nations.
ISBN 92-836-0026-6
Progress and Challenges in CFD Methods
and Algorithms
(AGARD CP-578)
Executive Summary
Computational Fluid Dynamics (CFD) now plays an essential role in the design of aerospace vehicles.
The ability of numerical methods to accurately simulate complex external and internal aerodynamic
flows is crucial to the success of these methods in the design process, and for airplanes leads to
improved performance, agility and maneuverability.
In the last decade, considerable progress has been made in the development of numerical methods
related to CFD. As a result, various promising CFD schemes and algorithms have been developed.
However, they are not currently used in industrial codes. At the same time, new developments in
computer hardware and architectures have led to significant advances in parallel computing and
multiprocessing. These topics, which are considered likely to constitute pacing items and new
challenges in CFD in the near future, formed the framework for the program for this Symposium.
The following subjects were addressed: parallel computing, advanced spatial discretization techniques,
unstructured, hybrid and overlapping grids, adaptive meshes, fast implicit and iterative solvers, large
eddy and direct numerical simulations of turbulent flows, chemically reacting flows and unsteady
aerodynamics. Interesting and new aspects of techniques involving these subjects were discussed,
substantiating their extended potential and improved capabilities. Several important directions of
research such as aerodynamic shape optimization and multidisciplinary analysis and design were
identified, which should be the subject of intensive advanced research in the near future.
The Symposium provided a very valuable opportunity for exchange of information about recent
developments and achievements. It can, therefore, be expected to significantly contribute to future
important progress in the advancement of numerical techniques used in the design of aerospace vehicles
and other flying objects.
Jean-André Essers
Programme Committee Chairman
Progress Achieved and Challenges in CFD Methods
and Algorithms
(AGARD CP-578)
Synthèse (translated)
Computational aerodynamics (CFD) now plays an essential role in the design of aerospace vehicles.
The ability of numerical methods to simulate complex internal and external aerodynamic flows
accurately is essential to the success of these methods in the design process, and for aircraft it
improves performance, agility and manoeuvrability.
Over the last decade, considerable progress has been made in the development of numerical methods
related to CFD. As a result, various promising CFD algorithms and methods have been developed.
However, they have not been integrated into industrial codes. At the same time, new developments in
computer hardware and architectures have enabled appreciable advances in parallel computing and
multiprocessing. These topics, which are considered likely to constitute the milestones and new
challenges of CFD in the near future, formed the framework of the programme of this symposium.
The following subjects were examined: parallel computing, advanced spatial discretization
techniques, unstructured, hybrid and overlapping grids, adaptive meshes, fast implicit and iterative
solvers, large eddy simulation and direct numerical simulation of turbulent flows, chemically
reacting flows and unsteady aerodynamics.
Relevant discussions took place on new and interesting aspects of techniques relating to these
subjects, confirming their extended potential and improved capabilities. Several important
directions for research, such as aerodynamic shape optimization and multidisciplinary analysis and
design, were identified as requiring intensive advanced research in the near future.
The symposium provided an invaluable opportunity to exchange information on recent achievements and
developments. It should therefore make a significant contribution to future important progress in
the advancement of numerical techniques for the design of aerospace vehicles and other flying
objects.
Jean-André Essers
Programme Committee Chairman
Contents
Page
Synthèse iv
Reference
KEYNOTE SESSION
Chairman: J.A. Essers
The Present Status, Challenges, and Future Developments in Computational Fluid Dynamics 1
by A. Jameson (Invited)
A Parallel Spectral Multi-Domain Solver Suitable for DNS and LES Numerical Simulation 5
of Incompressible Flows
by A. Pinelli and A. Vacca
On Improving Parallelism in the Transonic Unsteady Rotor Navier Stokes (TURNS) Code 6
by A.M. Wissink, A.S. Lyrintzis and R.C. Strawn
Un Schéma Cinétique d'Ordre 2 Préservant les Positivités pour les Equations d'Euler 10
Compressibles sur Maillages non Structurés Auto-Adaptatifs
by Ph. Villedieu, J.L. Estivalezes and J.J. Hylkema
A Meshless Technique for Computer Analysis of High Speed Flows 11
by T. Fischer, E. Oñate and S. Idelsohn
Cancelled 12
Numerical Simulation of Internal and External Gas Dynamic Flows on Structured and 13
Unstructured Adaptive Grids
by U.G. Pirumov, I.E. Ivanov and I.A. Kryukov
An Investigation of the Effects of the Artificial Dissipation Terms in a Modern TVD Scheme 14
on the Solution of a Viscous Flow Problem
by R.D. Briggs and S. Shahpar
Cancelled 15
A Flux Filter Scheme Applied to the Euler and Navier Stokes Equations 16
by A. Vinckier, J. Jacobsen and S. Wagner
A PCG/E-B-E Iteration for High Order and Fast Solution of 3-D Navier-Stokes Equations 19
by A. Rüstem Aslan, Ü. Gülçat and A. Mısırlıoğlu
Améliorations Récentes du Code de Calcul d'Ecoulements Compressibles FLU3M 29
by L. Cambier, D. Darracq, M. Gazaix, Ph. Guillen, Ch. Jouet and L. Le Toullec
General Discussion GD
Recent Publications of
the Fluid Dynamics Panel
AGARDOGRAPHS (AG)
Computational Aerodynamics Based on the Euler Equations
AGARD AG-325, September 1994
Scale Effects on Aircraft and Weapon Aerodynamics
AGARD AG-323, July 1994
Design and Testing of High-Performance Parachutes
AGARD AG-319, November 1991
Experimental Techniques in the Field of Low Density Aerodynamics
AGARD AG-318 (E), April 1991
Techniques expérimentales liées à l'aérodynamique à basse densité
AGARD AG-318 (FR), April 1990
A Survey of Measurements and Measuring Techniques in Rapidly Distorted Compressible Turbulent Boundary Layers
AGARD AG-315, May 1989
Reynolds Number Effects in Transonic Flows
AGARD AG-303, December 1988
REPORTS (R)
Parallel Computing in CFD
AGARD R-807, Special Course Notes, October 1995
Optimum Design Methods for Aerodynamics
AGARD R-803, Special Course Notes, November 1994
Missile Aerodynamics
AGARD R-804, Special Course Notes, May 1994
Progress in Transition Modelling
AGARD R-793, Special Course Notes, April 1994
Shock-Wave/Boundary-Layer Interactions in Supersonic and Hypersonic Flows
AGARD R-792, Special Course Notes, August 1993
Unstructured Grid Methods for Advection Dominated Flows
AGARD R-787, Special Course Notes, May 1992
Skin Friction Drag Reduction
AGARD R-786, Special Course Notes, March 1992
Engineering Methods in Aerodynamic Analysis and Design of Aircraft
AGARD R-783, Special Course Notes, January 1992
Aircraft Dynamics at High Angles of Attack: Experiments and Modelling
AGARD R-776, Special Course Notes, March 1991
ADVISORY REPORTS (AR)
Aerodynamics of 3-D Aircraft Afterbodies
AGARD AR-318, Report of WG17, September 1995
A Selection of Experimental Test Cases for the Validation of CFD Codes
AGARD AR-303, Vols. I and II, Report of WG-14, August 1994
Quality Assessment for Wind Tunnel Testing
AGARD AR-304, Report of WG-15, July 1994
Air Intakes of High Speed Vehicles
AGARD AR-270, Report of WG13, September 1991
Appraisal of the Suitability of Turbulence Models in Flow Calculations
AGARD AR-291, Technical Status Review, July 1991
Rotary-Balance Testing for Aircraft Dynamics
AGARD AR-265, Report of WG11, December 1990
Calculation of 3D Separated Turbulent Flows in Boundary Layer Limit
AGARD AR-255, Report of WG10, May 1990
Adaptive Wind Tunnel Walls: Technology and Applications
AGARD AR-269, Report of WG12, April 1990
CONFERENCE PROCEEDINGS (CP)
Aerodynamics of Store Integration and Separation
AGARD CP-570, February 1996
Aerodynamics and Aeroacoustics of Rotorcraft
AGARD CP-552, August 1995
Application of Direct and Large Eddy Simulation of Transition and Turbulence
AGARD CP-551, December 1994
Wall Interference, Support Interference, and Flow Field Measurements
AGARD CP-535, July 1994
Computational and Experimental Assessment of Jets in Cross Flow
AGARD CP-534, November 1993
High-Lift System Aerodynamics
AGARD CP-515, September 1993
Theoretical and Experimental Methods in Hypersonic Flows
AGARD CP-514, April 1993
Aerodynamic Engine/Airframe Integration for High Performance Aircraft and Missiles
AGARD CP-498, September 1992
Effects of Adverse Weather on Aerodynamics
AGARD CP-496, December 1991
Manoeuvring Aerodynamics
AGARD CP-497, November 1991
Vortex Flow Aerodynamics
AGARD CP-494, July 1991
Missile Aerodynamics
AGARD CP-493, October 1990
Aerodynamics of Combat Aircraft Controls and of Ground Effects
AGARD CP-465, April 1990
Computational Methods for Aerodynamic Design (Inverse) and Optimization
AGARD CP-463, March 1990
Applications of Mesh Generation to Complex 3-D Configurations
AGARD CP-464, March 1990
Fluid Dynamics of Three-Dimensional Turbulent Shear Flows and Transition
AGARD CP-438, April 1989
Validation of Computational Fluid Dynamics
AGARD CP-437, December 1988
Aerodynamic Data Accuracy and Quality: Requirements and Capabilities in Wind Tunnel Testing
AGARD CP-429, July 1988
Aerodynamics of Hypersonic Lifting Vehicles
AGARD CP-428, November 1987
Aerodynamic and Related Hydrodynamic Studies Using Water Facilities
AGARD CP-413, June 1987
Applications of Computational Fluid Dynamics in Aeronautics
AGARD CP-412, November 1986
Fluid Dynamics Panel
Chairman: M. C. DUJARRIC
Future Launchers Office
ESA Headquarters
8-10 rue Mario Nikis
75015 Paris - France

Deputy Chairman: Professor C. CIRAY
Aeronautical Eng. Department
Middle East Technical Univ.
Inonu Bulvari PK:06531
Ankara - Turkey
PROGRAMME COMMITTEE
Technical Evaluation Report
AGARD Fluid Dynamics Panel Symposium on
"Progress and Challenges in CFD Methods and Algorithms"
N. Kroll
Institute of Design Aerodynamics
DLR, 38108 Braunschweig
Lilienthalplatz 7, Germany
which are not yet currently used in industrial codes. At the same time, new developments in
computer hardware and architectures have led to significant advances in parallel computing
and multiprocessing."

It must also be stated that despite the recent advances CFD still suffers from deficiencies in
accuracy, robustness and efficiency for complex applications, such as ...

... algorithms
- Implicit and iterative methods for Euler and Navier-Stokes equations, fast iterative solvers
  (multi-grid, Krylov subspace techniques)
- Numerical techniques for parallel computing and multiprocessing
... and addressed several issues in algorithm design. In particular, a unified approach to design
accurate and efficient shock capturing algorithms was presented. Some examples of state-of-the-art
calculations, which can be performed in an industrial environment, were given. Jameson pointed out
that beside the transition to more sophisticated algorithms, the present challenge is to extend the
effective use of CFD techniques to more complex applications. As key problems, he identified
turbulent flows at Reynolds numbers associated with full scale flight, chemically reacting flows,
combustion and unsteady flows. Furthermore, multidisciplinary analysis, aerodynamic shape
optimization and in the long run multidisciplinary optimization were designated as important future
target areas of CFD. In his presentation, Jameson outlined a very promising technique for efficient
three-dimensional shape optimization based on control theory. He demonstrated a successful design of
a swept wing with very low wave drag within 40 design iterations. In this example, the flow was
modeled by the Euler equations. He mentioned that with this technique, even in the case of
three-dimensional flows, the computational requirements are so moderate that the calculations can be
performed with workstations such as the IBM RISC 6000 series.

In summary, the invited paper delivered by Jameson gave a precise outline of the scope of the
symposium and the expected outcome of the meeting.

In his presentation [2], Rubbert focused his remarks on challenges and pacing items in CFD that
extend beyond the technical ones. He pointed out that the key to developing better airplanes or
better CFD is the same, namely to analyze, understand and improve the processes by which airplanes
or CFD are created. Rubbert called the process by which CFD capabilities are created the research
engine. Such a research engine involves industry, academia and government, and the three components
interact with each other as a system. In the past this system functioned quite well, but in his
opinion, it has been almost disconnected from the customers of CFD research, namely the practicing
design engineers. Impressive results of research have been achieved, but they were not necessarily
applicable by industry. The paper pointed out many principal characteristics and attributes which an
improved, properly functioning research engine should have. The leading principles are customer
focus and customer satisfaction. Two further key factors were identified which will pace the change
of the research engine. The first is a two-way, more intensive communication between the research
community and the engineering community in industry. The second is a modification of the evaluation
system of the research work towards more industrial applicability. As stated by Rubbert, this is the
responsibility of the money givers who inhabit the research engine.

In summary, Rubbert's paper performed a general critical assessment of today's system of research
and its stage of change. His observations represent the pragmatic point of view of industry, from
which the interest in researchers' basic scientific findings is less emphasized. This paper makes
CFD researchers sensitive to industrial needs, but some specific views of the aeronautical industry
on the status of CFD and future requirements would have been desirable.

The paper by Knight [3] presented an overview of parallel computing in computational fluid dynamics.
In the first part of the paper the basics of parallel computing were addressed, including the
introduction of the distinct levels of parallelism, the classification of parallel computer
architectures and the description of the two basic programming paradigms, namely message passing and
data parallelism. The second part focused on several key issues in the context of code development
for parallel computing. Dynamic load balancing and scalability were identified as critical issues
for complex CFD applications carried out on massively parallel computers. Furthermore, a major
concern of parallel computing is portability. Here, Knight discussed current research activities,
including the development of message passing standards (e.g. PVM, MPI) and data parallel programming
language standards (e.g. HPF). In his presentation, Knight pointed out that in the U.S. the
aerospace industry has taken a leading role in the application of parallel computing to practical
analysis and design. In the past few years several major aerospace corporations have developed
extensive networks of workstations for routine applications. Several examples were given in the
paper. Knight's presentation provided a basic introduction into the field of parallel computing. The
fundamental terminology was explained and all critical issues were discussed. Therefore, the paper
was very helpful for the understanding and assessment of the following technical papers which dealt
with parallelization. Unfortunately, the paper did not discuss the potentials and limitations of
high performance parallel computers to tackle large scale applications or new challenges in CFD.
Results were only presented for networks of loosely coupled workstations.

Although papers on grid generation techniques were explicitly not encouraged by the call for papers,
an invited paper on status and progress of both structured and unstructured grid generation for
complex configurations
might have been desirable. The turnaround time and accuracy of the numerical simulation of
industrial applications very often depend on the capability of the available grid generation
procedure. Therefore, for the critical assessment of numerical algorithms using structured,
unstructured or hybrid meshes, the capabilities and limits of the underlying grid generation
techniques have to be taken into account.

2.2 Parallel Computing

Parallel computing is an important means to cut down turnaround time and computational costs of
large scale applications. Furthermore, it is believed that the exploitation of massively parallel
computing is the key to tackle new grand challenges in CFD, such as multidisciplinary analysis and
optimization. In the last several years a wide variety of parallel architectures have become
available which differ in the design of the CPUs (vector versus RISC processor), the memory
organization (e.g. shared versus distributed memory) and the communication system (hardware and
software). For the future some of the vendors promise substantial increase of computational power in
both memory and CPU. One of the main issues in parallel computing is the design of numerical
algorithms which efficiently exploit the capabilities of the parallel hardware. Especially in the
case of distributed memory machines, this is a non-trivial task. The important aspects in designing
parallel algorithms for these architectures are partitioning of data and computation among the
processors, communication at the internal boundaries, load balancing and overhead due to
communication and extra computations. Simpler algorithms, such as explicit schemes, parallelize
quite easily and they lead to high performance on current parallel computers. However, due to their
poor convergence rates they are overall much less efficient than implicit schemes, although the
latter ones generally perform far below the peak of the parallel machines due to the more intensive
and more global communication involved. The adjustment and further development of more sophisticated
algorithms such as multigrid and domain decomposition methods on parallel architectures are very
promising. In contrast to explicit schemes, they provide global distribution of information, however
in a much more efficient way than traditional implicit schemes. Further research in this direction
is needed in order to efficiently exploit the capabilities of parallel computers.

In this symposium, papers [4], [5], [6], [7], [21], [U] and [36] dealt mainly with parallel
computing and covered various aspects thereof. The paper by Eisfeld et al. [4] stressed the issue of
portability. They described the portable parallelization of a state-of-the-art block-structured
multigrid solver for industrial CFD applications. Portability is achieved through the use of a
message passing based high level communication library. This library supports any operation which is
necessary in parallel mode and involves communication between different processes. Performance
measurements on a large variety of computers of different architectures demonstrated the
comprehensive portability of the code. Applications included inviscid computations for a generic
aircraft consisting of wing/body/pylon/engine and viscous calculations for a wing-body configuration
on a computational mesh with 6.6 million points. The paper showed that the complexity of today's
problems in applied aerodynamics can be tackled with parallel computers. It also revealed the
necessity for an automatic and effective load balancing tool that allows the mapping of an initial
block structure to a higher number of processors than the given blocks. Details on the parallel
efficiency of the multigrid method used in the applications were not given.

The papers by Wissink et al. [6], Dias d'Almeida et al. [7] and Badcock et al. [36] focused on the
parallel implementation of implicit Euler/Navier-Stokes solvers. In [6] for example, two
modifications of the well known implicit LU-SGS scheme (Lower-Upper Symmetric Gauss-Seidel) were
presented. The first replaces the Gauss-Seidel sweep in LU-SGS with a Jacobi-like sweep which only
requires nearest neighbor communication and is therefore easy to parallelize. The second one is a
hybrid approach that couples the global Jacobi type communication with the more efficient
Gauss-Seidel sweep on each subdomain. In both strategies multiple sweeps are required in each
subdomain in order to maintain the convergence behavior of the baseline LU-SGS method. Both
strategies have been investigated in detail with respect to parallel speed-up, convergence rate and
computational efficiency. Inviscid calculations for 3-D hovering helicopter blades demonstrated that
the hybrid strategy is a promising implicit scheme for parallel computers with a smaller number of
powerful processors.

The presentations delivered by Pinelli et al. [5] and Streng et al. [21] addressed the
parallelization of algorithms for DNS and LES. In [21] the various aspects of the parallel
implementation of a typical higher order DNS solver based on domain decomposition were discussed.
The intrinsic or algorithmic efficiency has been defined (see also [4]), which deals with the
parallelizability of a given algorithm, regardless of the machine. Based on some analysis, the
authors showed that due to extra floating point operations at inner block boundaries
the algorithmic efficiency decreases rapidly as the spatial discretization increases, that is, as
the corresponding stencil size grows. Test calculations on different parallel architectures
indicated that the machine efficiency is even considerably lower than the algorithmic efficiency.
Furthermore, the paper reported on first experience that has been gained for the implementation of
the DNS solver on a SGI Power Challenge Array (4 nodes each comprised of a 16 CPU shared memory
parallel machine) using a combination of fine-grained (shared memory) and coarse-grained parallelism
(explicit message passing). The results were very promising; however, this parallelization strategy
needs further investigation.

The paper by Sibilla and Vitaletti [E] did not show any parallel computations, but it addressed
several important aspects of multiblock-structured grid algorithms in a parallel computing
environment. As in [4] the management of data communication between adjacent blocks is provided by a
parallel library (PARAGRID) which ensures that the same average values are assigned to all replicas
of the same boundary node owned by different blocks. The paper discussed the influence of block
subdivision on accuracy and efficiency within the framework of a multigrid scheme. The solution
algorithm has been modified in order to account for the presence of locally unstructured topologies
at block boundaries (singular points). For some test cases it could be demonstrated that the
convergence of the numerical method could only be ensured with this modification.

In conclusion, most of the papers focused on some specific algorithmic aspects of parallel
computing. Effort was essentially put in adjusting sequential algorithms rather than developing new
parallel schemes. Only a few large scale CFD applications have been presented demonstrating the
capabilities and limits of parallel architectures for industrial CFD applications. One of the main
challenges for parallel complex applications is the load balanced partitioning of the flow domain,
which is essential for obtaining optimal machine performance. These important issues were hardly
addressed in the conference.

2.3 Advanced Spatial Discretization Schemes

Although in the last decade extensive research has been ongoing towards the development of accurate
Euler and Navier-Stokes solvers, the improvement of spatial discretization schemes is still a major
concern in CFD. Suitable discretization schemes are expected to offer certain properties. These are
conservation, at least second order accuracy in smooth flow regions and sharp resolution of
discontinuities and viscous shear layers. High resolution of all physical phenomena is required on a
computational mesh with a minimum number of grid points. Furthermore, the spatial discretization
should support a robust and efficient time integration. Recently, substantial progress has been made
in this area and many different promising approaches for the improved discretization of the Euler
and Navier-Stokes equations are known in the literature. Among these are improved shock capturing
algorithms based on flux difference and flux vector splitting, multidimensional upwinding, residual
distribution schemes and kinetic flux splitting. These methods have been investigated in detail for
one- and two-dimensional flows. Very often, however, their superiority to conventional methods has
only been demonstrated for simple test cases. Therefore, the key issue remains the manifestation of
the improved abilities of the advanced methods for relevant 2-D and 3-D viscous flows around more
complex geometries.

At the symposium several papers [9], [10], [12], [14], [15], [16] and [17] were specifically devoted
to improvements of the spatial discretization of Euler/Navier-Stokes solvers. The paper by Delanaye
et al. [9] presented the development of a new quadratic reconstruction finite volume scheme for
unstructured polygonal meshes. The most frequently employed linear cell reconstruction of the flow
variables requires sufficiently regular grids for second order accuracy and it results in a first
order scheme for irregular meshes. In contrast to this, the proposed quadratic reconstruction
provides a full second order scheme even for very irregular meshes. In order to avoid spurious
oscillations in the vicinity of discontinuities, the quadratic reconstruction is switched to a
monotone constant one with the help of a properly defined discontinuity detector. The method is
designed to deal with adaptive unstructured grids consisting of cells with an arbitrary number of
edges. Time integration is performed by an implicit scheme based on Newton-Krylov techniques. The
efficiency and high accuracy of the numerical method were demonstrated for various 2-D inviscid and
viscous laminar computations including test cases with locally distorted mesh. However, the method
needs to be carefully investigated for 3-D complex geometries and turbulent flows at high Reynolds
numbers where highly irregular meshes are expected. Furthermore, the sensitivity of the quadratic
reconstruction with respect to implementation of wall boundary conditions has to be investigated.

The paper delivered by Villedieu et al. [10] presented a second order scheme based on kinetic flux
splitting. The main feature of this approach is that under a CFL like
condition density and energy can be proved to remain nonnegative. This makes the method very
attractive for the computation of flow fields with near vacuum conditions, such as flows around
hypersonic vehicles. Promising results were shown for 2-D supersonic and hypersonic inviscid flows
in comparison with the classical Roe flux difference split scheme. First 3-D results for a wing
alone application were presented which do not yet allow the assessment of the approach for 3-D more
complex applications. Furthermore, detailed results on the convergence behavior of the method were
missing.

... extension of these schemes to Euler/Navier-Stokes equations is straightforward provided that a
conservative linearization can be found. This can easily be achieved for triangular meshes, whereas
for quadrilateral meshes it is more difficult and still subject of ongoing research. The paper
presented various numerical examples for 2-D flows demonstrating the ability of the residual
decomposition approach. In particular, the results indicate the improved resolution of flow
discontinuities which are not aligned with mesh lines. Unfortunately, the issue of accurate
prediction of turbulent viscous flow was not addressed in the paper. Furthermore, no 3-D results
were shown. The residual decomposition schemes have been successfully combined with implicit methods
and solution adaptive techniques.
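
The nonnegativity result cited above for the kinetic flux splitting scheme of [10] rests on a
CFL-like restriction. The kinetic scheme itself is not reproduced here. As a rough illustration of
the same kind of argument, the following Python sketch uses a much simpler stand-in: the first-order
upwind scheme for scalar linear advection on a periodic grid. For a CFL number between 0 and 1 each
updated value is a convex combination of two nonnegative neighbours and therefore stays nonnegative;
above 1 that guarantee is lost. The function names, the test profile and the CFL values are
illustrative assumptions and are not taken from the paper.

```python
def upwind_step(u, cfl):
    """One first-order upwind step for u_t + a*u_x = 0 with a > 0, periodic grid.

    cfl = a * dt / dx. For 0 <= cfl <= 1 the update
    (1 - cfl) * u[i] + cfl * u[i-1] is a convex combination of old values.
    """
    n = len(u)
    # u[i - 1] with i == 0 wraps to the last cell, which gives periodicity.
    return [(1.0 - cfl) * u[i] + cfl * u[i - 1] for i in range(n)]


def min_after_steps(u0, cfl, steps):
    """Advance `steps` times and return the smallest value seen at any step."""
    u = list(u0)
    lowest = min(u)
    for _ in range(steps):
        u = upwind_step(u, cfl)
        lowest = min(lowest, min(u))
    return lowest


# Hypothetical test data: a density-like pulse surrounded by vacuum (zeros).
u0 = [0.0] * 10 + [1.0] * 5 + [0.0] * 10

# CFL-like condition satisfied: values never drop below zero.
stable_min = min_after_steps(u0, cfl=0.8, steps=50)

# Condition violated: negative values appear next to the pulse.
unstable_min = min_after_steps(u0, cfl=1.5, steps=50)

print("min with cfl=0.8:", stable_min)
print("min with cfl=1.5:", unstable_min)
```

The sketch only demonstrates the convexity argument for a scalar model problem; the corresponding
proof for density and energy in the Euler equations is the subject of the paper itself.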
studies with respect to grid fineness and grid regularity computations gained at RockweU Science Center over
for transonic 2-D and 3-D viscous flows. It should be the past several years. One of the most important lessons
clarified whether with these new concepts substantial
progress can be made towards accurate drag prediction
they have learned from many 3-D applications is the
fact that in spite of all the advances that have been made
.
li
The paper by Becker et al. [24] addressed adaptive grid refinement for block structured solvers. In this concept, locally refined mesh blocks are patched into the existing mesh. The additional fine subblocks are connected with the original mesh via the multigrid technique. The level of local truncation error is used as the error indicator. Following the idea of Brandt, truncation error estimates can be extracted directly from the multigrid cycle. So far, the refinement procedure is set up outside the flow solver. First results presented for 2-D and 3-D inviscid and viscous test cases show the feasibility of the strategy of subblock refinement. However, considerably more work is required to establish a fully automatic, robust and accurate adaption method.

Van der Vegt et al. [20] presented a hexahedron based grid adaption procedure. The method uses the discontinuous Galerkin finite volume formulation with local grid enrichment. A directional grid adaption is employed which allows subdividing of cells independently in each of the three local grid directions. This anisotropic grid refinement is expected to be more efficient in capturing local flow phenomena than isotropic refinement, since many flow features are one-dimensional. The sensor uses primitive variables and is constructed such that it prevents regions with discontinuities from constantly dominating the local grid refinement procedure. The capability of the adaptive method was demonstrated by calculations of the inviscid transonic flow around a generic delta wing. From the authors' viewpoint, the hexahedral based adaptive solver is a good candidate for large eddy simulations (LES), because it offers the opportunity to accurately capture viscous sublayers with successively finer grids through local grid refinement. LES results, however, were not shown.

Grid adaptive procedures based on point redistribution were discussed in the papers [12], [27], [33]. This technique was mainly used in the framework of moving grids for unsteady calculations.

In conclusion, various adaptive strategies were presented. The important issue of developing a suitable indicator for adaption was addressed. Various error estimators have been proposed and successfully applied to inviscid flows. However, further research is needed to establish efficient and robust adaptive methods for viscous flows.

2.6 Fast Implicit and Iterative Solvers

As numerical flow simulations pave their way into the practical aerodynamic design process, the need for efficient algorithms to solve the spatially discretized Euler/Navier-Stokes equations has become very obvious. Many solvers still used in current aerospace development programs exhibit slow convergence towards the desired steady state solutions, which leads to high computer costs and long turn around times. Consequently, there is a substantial amount of research work focused on methods for convergence acceleration. Promising approaches are the multigrid time-stepping technique and the Newton iteration with fast iterative solvers. In structured codes multigrid techniques based on explicit multistage schemes are widely used and they have been proved to yield good convergence rates for many practical applications. However, for the numerical simulation of high Reynolds number flows, the convergence of the standard multigrid schemes slows down considerably. This is due to the stiffness of the numerical problem, which is introduced through the high aspect ratio cells required for the efficient solution of such flow fields. Therefore, one of the key issues concerning algorithmic development is the design of appropriate multigrid components, such as smoothing and grid transfer operators, which efficiently tackle high aspect ratio cells.

Interest in fast iterative methods has been mainly motivated by unstructured solvers. It was shown that coupling Newton's method with iterative solvers for the inner iteration is an effective approach for solving the large systems of nonlinear equations arising from the discretization of the Euler and Navier-Stokes equations. An interesting feature of Newton's method is its ability to provide superlinear asymptotic convergence. On the other hand, efficient iterative schemes based on Newton's iteration require excessive memory allocations for three-dimensional applications. Therefore, strategies have to be developed which eliminate the large storage requirements but still retain the favorable convergence characteristics of Newton's method.

The paper by Pulliam et al. [19] gave an excellent overview of the potentialities and drawbacks of Newton's method applied to CFD solvers. For practical reasons, in each Newton iteration the large block banded matrix is solved by an iterative matrix solution method. In particular, the paper addressed the class of Krylov subspace methods known as GMRES. It presented practical aspects and implementation issues of these methods. The main components of the Newton-GMRES approach, such as evaluation of the Jacobian, matrix-vector multiply and matrix preconditioning, were discussed with respect to global convergence behavior, memory requirements and accuracy. Trade-offs between full Newton and approximate Newton and other pertinent approximations were investigated. The Newton-GMRES solver
was analyzed in the framework of a structured and an unstructured 2-D Navier-Stokes code. In both cases very promising results were shown. Calculations with similar methods were also carried out in papers [9], [15]. It can be concluded that optimal strategies which ensure favorable convergence characteristics will lead to excessive memory requirements. No 3-D calculations with Newton-Krylov subspace techniques were presented at the conference.

The paper by Cambier et al. [a] proposed a new implicit algorithm called DDLU factorization. Compared to the classical ADI factorization, this scheme enables a reduction in both CPU time and memory. The new implicit technique was applied to a 3-D supersonic test case on a relatively coarse mesh. For a comprehensive assessment of this technique further test calculations are required.

The paper by Merkle et al. [18] was devoted to convergence acceleration of the Navier-Stokes equations through a time-derivative preconditioning of the governing equations. Using physical arguments, a generalized preconditioner was developed, ensuring convergence characteristics which are independent of the Mach number. The uniform convergence was demonstrated for a variety of applications covering a wide range of Mach numbers. In many low speed cases, the preconditioned system showed a much improved convergence rate while having no detrimental effects in regimes where the original method already worked efficiently. So, preconditioning of the governing equations may offer the possibility to develop an efficient unified flow solver for the whole Mach number regime. Further research is required to establish this approach.

At the conference none of the papers devoted to convergence acceleration addressed the key problem of computing realistic Reynolds number flows. These flows require computational meshes with very high aspect ratio or irregular cells, leading to very stiff discrete equations. The development of numerical strategies to overcome the stiffness and to ensure fast convergence in these flow situations is one of the grand challenges in algorithmic research.

2.7 Turbulent Flows, LES/DNS

The key problem of accurate numerical simulation of complex flows is the description of transition and turbulence. Currently, in all industrially relevant calculations, the Reynolds averaged Navier-Stokes equations are solved, in which only statistically stationary flow is calculated and the effects of turbulence are modelled by a so-called turbulence model. However, in many cases the quality of the solution may strongly depend on the turbulence model used in the calculation, and at best questionable results may be obtained for more complex flow phenomena such as massive flow separation. The rapid increase of computer resources motivated the research on direct numerical simulation (DNS) or large eddy simulation (LES) of turbulent flows. In the case of DNS, the unsteady Navier-Stokes equations are solved directly. No turbulence model is required since all scales and turbulence motions present are resolved numerically. Due to the excessive computer resources required even for simple geometries, this simulation technique is out of the question for practical applications. However, it provides a very important methodology for turbulence research. In contrast to DNS, the large eddy simulation of turbulent flows resolves only the large scale structure of the turbulence, while the effects of smaller eddies are described by a statistical subgrid model. As the resolution of the fine scale turbulence motion is not required, far fewer grid points are needed, making LES feasible for practical problems at relevant Reynolds numbers in the near future. On the other hand, in order to ensure improved results compared to the solution of the Reynolds averaged Navier-Stokes equations (RANS), besides the establishment of a suitable subgrid model, accurate resolution of the viscous sub-layers in the near wall regions is needed. This substantially increases the number of grid points for LES compared to RANS solvers. Furthermore, since time accurate solutions are calculated in the framework of LES, significant further development of the classical CFD methods is needed. In addition to the validation of a subgrid model, more sophisticated algorithms such as adaptive grids, higher order discretizations, efficient unsteady solvers and parallel computing have to be made available for LES before this technique can be used as a tool for flow simulations.

Numerical aspects of DNS and LES were addressed by papers [5], [20], [21] and [22]. As already mentioned above, the papers [5] and [21] were devoted to the exploitation of parallel computers, whereas paper [20] presented a grid adaption method specially designed for LES. The focus of Comte et al. [22] was the investigation of subgrid scale models within a classical explicit finite difference method. The aim of the presentation was to show some examples of what can be achieved with today's supercomputers and standard codes using eddy-viscosity models.

In summary, from the papers delivered at the conference it is very difficult to estimate whether a large eddy simulation for a practical problem, such as a clean wing at a relevant Reynolds number, will become feasible in the near future. In order to reach that goal, significant research work on both algorithms and subgrid scale models is needed. A few preliminary approaches for algorithmic improvement were shown at the conference.

2.8 Chemically Reacting Flows

The effective use of CFD for viscous hypersonic reacting flows is one of the present challenges. In the past, substantial effort has been devoted to this research area and key requirements for efficient solution algorithms have been identified [30]. These are sharp capturing of strong shocks, robustness in regions of strong flow expansion, high resolution of viscous regions, efficient treatment of adverse grid and flow situations in the case of complex 3-D geometries, and effective integration of stiff equations introduced by the large chemical source terms.

The two conference papers devoted to reacting flows addressed these algorithmic issues. The paper by Radespiel et al. [30] reviewed recent progress made with flux vector splitting methods to ensure high resolution and robustness for hypersonic viscous reacting flow simulations. Two promising approaches recently published in the literature were discussed and compared. Both schemes use scalar dissipation functions and their conceptual differences appear in the resolution of shock waves. Implementation details and recommendations for their effective use for viscous flows were given. Furthermore, the capabilities of the multigrid method based on explicit multistage time-stepping schemes were investigated for reacting flows. A number of modest modifications of the standard multigrid method successfully used for subsonic and transonic flow problems were reported in order to ensure fast convergence for high Mach number flows with strong shocks. The stiffness of the equations introduced by the large chemical source terms is removed by a point implicit treatment. Various computations for different complex flow problems were presented. They impressively demonstrate that with the reported algorithmic improvements converged flow solutions for reacting flows over complex 3-D configurations are now feasible.

The paper delivered by Coquel et al. [31] focused on the extension of a hybrid upwind splitting method to non-equilibrium flows. Based on the experience that the classical Van Leer flux vector scheme is not suitable for viscous calculations and the Roe type flux difference solvers are not robust for hypersonic flows, a new upwind approach was presented which basically combines the distinct flux vector and flux difference splitting concepts while retaining their interesting features. The proposed method is a combination of the Van Leer scheme and the Osher scheme with some modifications and extensions. The ability of the new method to resolve viscous hypersonic reacting flows was illustrated by various results including internal and external flow configurations. The time integration is performed by an unfactored implicit scheme, which in the current implementation leads to a somewhat slow convergence rate and needs to be improved for further applications.

In conclusion, the two papers on reacting flows covered the key issues for developing efficient numerical tools for the simulation of complex flows. Very promising results were presented, illustrating that effective calculations in terms of both accuracy and efficiency for complex configurations are now feasible.

2.9 Unsteady Flows

For steady flows, substantial CFD capability has been achieved over the last two decades and Euler/Navier-Stokes solvers are intensively used in aerodynamic design. In contrast, although some isolated unsteady flow calculations have been carried out for various classes of problems, numerical simulation of unsteady flow fields based on the Euler/Navier-Stokes equations is certainly not routine for industrial applications, due to the excessive computational effort involved in these calculations. From the algorithmic point of view, new innovative concepts are required which substantially cut down the costs of time accurate simulations. This is especially important for viscous flow calculations, where a very fine mesh near the wall is required to resolve the boundary layer. Issues that are central to unsteady CFD are the use of efficient implicit time integration with favorable stability and accuracy characteristics, moving grids, adaptive grids with local grid refinement/coarsening and parallel computing. Moreover, for aeroelastic applications efficient coupling strategies are required.

Time accurate calculations have been addressed by several papers (e.g. [27], [32], [33], [34], [35]). The paper by Pentaris et al. [32] focused on the solution of the unsteady incompressible 2-D Navier-Stokes equations using a projection methodology developed for collocated grids. Standard numerical schemes, such as approximate factorization techniques, were employed. The numerical results presented for some test cases were encouraging; however, no remarks on the efficiency of the method were given.
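The point implicit treatment of stiff source terms mentioned under reacting flows, and the call for implicit time integration in unsteady CFD, rest on the same stability argument. The following is a minimal sketch on a scalar model problem, not the scheme of any paper reviewed here: the relaxation source S(u) = -k (u - u_eq), the rate k, the time step and all function names are assumptions chosen for illustration.

```python
# Scalar model problem du/dt = S(u) with a stiff relaxation source
# S(u) = -k * (u - u_eq). The rate k, time step dt and all names are
# illustrative assumptions, not values taken from the reviewed papers.

def source(u, k, u_eq):
    return -k * (u - u_eq)

def explicit_step(u, dt, k, u_eq):
    """Forward Euler: the source is evaluated at the old time level."""
    return u + dt * source(u, k, u_eq)

def point_implicit_step(u, dt, k, u_eq):
    """Point implicit: linearize the source about the old state,
    (1 - dt * dS/du) * du = dt * S(u), and solve locally for du."""
    dS_du = -k
    du = dt * source(u, k, u_eq) / (1.0 - dt * dS_du)
    return u + du

def integrate(step, u0, dt, k, u_eq, n_steps):
    u = u0
    for _ in range(n_steps):
        u = step(u, dt, k, u_eq)
    return u

# k * dt = 100, far beyond the forward Euler stability limit k * dt <= 2
k, u_eq, dt, n_steps = 1.0e4, 0.0, 1.0e-2, 10
u_explicit = integrate(explicit_step, 1.0, dt, k, u_eq, n_steps)
u_implicit = integrate(point_implicit_step, 1.0, dt, k, u_eq, n_steps)
print("explicit:      ", u_explicit)
print("point implicit:", u_implicit)
```

With these assumed values the explicit update amplifies the solution by a factor of magnitude 99 per step, while the point implicit update damps it by 1/101 per step and relaxes towards u_eq. In a flow solver the same linearization would be applied cell by cell to the chemical source Jacobian, which is why no global matrix solve is needed for this part of the update.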
The paper delivered by Allen [33] was devoted to grid adaption for unsteady inviscid airfoil flows. The solution adaptive grids are generated by a new transfinite interpolation technique. An interesting approach was presented, in which adaption is performed by adapting the interpolation parameters instead of the physical grid positions. For unsteady calculations, the adaption is performed gradually by imposing a so-called adaption velocity onto each grid point. The grid interpolation strategy was shown to be well suited for structured moving grids. It is very flexible and requires only little CPU time. Steady and unsteady airfoil computations were presented illustrating the improved results from the adapted mesh. For the validation, an upwind Euler solver with the dual-time implicit approach was used, which is considerably more efficient than the basic explicit solver. The paper focused on two-dimensional inviscid applications, so that the flexibility and efficiency of the proposed grid adaption strategy are still to be verified for both viscous and three-dimensional flows. A time-varying grid technique was also presented by [34]. Here, the time integration was carried out with a second order implicit scheme.

A more sophisticated moving grid technique was presented by Jones et al. [27], with the goal of computing aircraft store trajectories. The technique is based on fully unstructured or hybrid meshes. It was pointed out that the geometric conservation law has to be satisfied within the framework of moving grids in order to guarantee consistent results. So far, only two-dimensional unsteady results have been achieved.

The paper by Ruiz-Calavera et al. [35] addressed parametric studies of a time accurate Euler code for oscillating wings. A rather standard central scheme with artificial dissipation and an explicit multistage time stepping scheme was used. Effects of grid density and artificial viscosity on the time accurate solutions were discussed, showing the expected behavior. The code has been implemented on a powerful parallel computer, namely the Numerical Wind Tunnel of NAL in Japan. It was demonstrated that parallel computing is a necessary ingredient for effective three-dimensional unsteady flow calculations.

In summary, a few central issues for unsteady computations were discussed by the conference papers. However, no major progress in the development of algorithms for efficient three-dimensional time accurate calculations was presented.

3. CONCLUDING REMARKS

In chapter 2 each specific subject of the meeting has already been fully commented, so that only general concluding remarks are given here.

In the evaluator's opinion the theme of the symposium, "Progress and Challenges in CFD Methods and Algorithms", was too encompassing and too ambitious for a 3 1/2 day long AGARD conference. Many papers of great interest and high technical standard were delivered. They addressed specific challenges in CFD, proposed new methods or modifications to known methodologies and presented smaller or larger progress. On the other hand, however, quite a large number of papers of lower quality were presented, which either did not focus on current key issues of algorithmic research or mainly reinvented well known results. Probably this situation is very similar to all other large CFD conferences. But measured against the ambitious theme of this symposium, it has to be clearly stated that in many areas the Seville conference did not reflect the actual status of CFD and its recent progress. Considering Jameson's excellent survey paper, it is obvious that several important algorithmic developments and recent improvements were not addressed. For example, no paper was devoted to aerodynamic shape optimization and multidisciplinary analysis, topics which are increasingly important for future CFD applications in industry. Furthermore, in some areas such as unstructured grids and adaptive schemes, CFD is much further developed than reported at the conference. Since many leading experts, especially those from the U.S., did not contribute to the conference, it is hard to expect that the high demands of the symposium could be met.

Nevertheless, several important directions of algorithmic research were addressed which are expected to improve the capability of CFD for complex applications in the industrial environment. These included parallel computing, advanced discretization techniques, fast iterative solvers and powerful acceleration techniques, adaptive schemes and flexible strategies for discretizing the computational domain. Interesting and new aspects of these techniques were discussed, substantiating their extended potentials and improved abilities. In most cases, however, the superiority of the more sophisticated methods to the well established standard schemes was only demonstrated for simplified test problems, for which the classical methods also perform quite well. Very often results were shown for 2-D inviscid and laminar viscous flows. Three-dimensional calculations were restricted in most cases to inviscid flows or simplified geometries. Only a few more realistic calculations were presented. To make a step forward, it is very important to apply the advanced methodologies to those problems, [...]
In conclusion, considerable research work is still needed to establish CFD as an effective tool in the aerodynamic design process. The most important, but probably also the most limiting factor, is turbulence modelling, a subject which was outside the scope of this symposium. With respect to algorithms, further development and improvement remain essential but have to be directed towards the real challenges in CFD, which include:
- accurate viscous flow simulation at relevant Reynolds numbers
- effective treatment of complex configurations, such as a complete aircraft
- efficient simulation of more complex flows with multiple space and time scales, such as unsteady flows or reacting flows
- large eddy simulation for practical applications
- aerodynamic shape optimization
- multidisciplinary analysis and design

The Seville symposium was a step in the right direction. For some topics, it showed some good promise but there is still considerable work to be done to meet the challenges of industrial CFD. The symposium provided a valuable forum for exchange of information about recent developments and achievements.

4. LITERATURE

1. Jameson, A., "Present Status, Challenges and Future Developments in CFD".
2. Rubbert, P., "CFD Research in the Changing U.S. Aeronautical Industry".
3. Knight, D., "Parallel Computing in CFD".
8. Ramakrishnan, S.V., Szema, K.Y., Chen, C.L., Shankar, V.V., Chakravarthy, S.R., "Experiments with Unstructured Grid Computations".
9. Delanaye, M., Geuzaine, Ph., Essers, J.A., "A Second Order Accurate Finite Volume Scheme Solving Euler and Navier-Stokes Equations on Adaptive Unstructured Grids with an Implicit Convergence Acceleration Procedure".
10. Villedieu, Ph., Estivalezes, J.L., "A New Positivity Preserving Second Order Accurate Kinetic Scheme for the Multidimensional Euler Equations on Unstructured Meshes".
11. Onate, E., [...], J., "Meshless Techniques for Computer Analysis of High Speed Flows".
12. Pirumov, U.G., Kryukov, I.A., Ivanov, I.E., "Numerical Simulation of Internal and External Gas Dynamic Flows on Structured and Unstructured Adaptive Grids".
13. Briggs, R.D., Shahpar, S., "An Investigation of the Effects of the Artificial Dissipation Term in a Modern TVD Scheme on the Solution of a Viscous Flow Problem".
14. Vinckier, A., Jacobsen, J., Wagner, S., "Multidimensional Upwinding with Flux Filters for the Euler and Navier-Stokes Equations".
15. Paillère, H., Carette, J.C., Issman, E., Van der Weide, E., Deconinck, H., Degrez, G., "Implicit Multidimensional Upwind Residual Distribution Schemes on Adaptive Meshes".
16. Van Ransbeeck, P., Hirsch, Ch., "Multidimensional Upwind Dissipation for 2-D/3-D Euler/Navier-Stokes Applications".
17. Aslan, A.R., Gulcat, U., Misirlioglu, A., "A PCG/E-B-E Iteration for High Order and Fast Solution of 3-D Navier-Stokes Equations".
18. Merkle, C.L., Venkateswaran, S., "Convergence Acceleration of Navier-Stokes Computations Through Time Derivative Preconditioning".
19. Pulliam, T.H., Rogers, S., Barth, T., "Practical Aspects of Krylov Subspace Based Iterative Methods in CFD".
21. Streng, M., Kuerten, H., Broeze, J., Geurts, B., "Parallel Algorithms for DNS of Compressible Flow".
22. Comte, P., "A Straightforward 3-D Multi-Block Unsteady Navier-Stokes Solver for Direct and Large Eddy Simulations of Transitional and Turbulent Compressible Flows".
23. Orszag, S.A., Qian, Y.H., Succi, S., "Applications of Lattice Boltzmann Methods to Fluid Mechanics".
24. Becker, K., Rill, S., "Structured Adaptive Sub-Block Refinement for 3-D Flows".
31. [...] Décentrement Hybrides pour la Simulation d'Ecoulements en Déséquilibre Thermique et Chimique".
32. Pentaris, A., Tsangaris, S., "A Projection Methodology for the Simulation of Unsteady Incompressible Viscous Flows Using the Approximate Factorization Technique".
33. Allen, C.B., "An Implicit Upwind Scheme with Grid Adaption for Unsteady Euler Aerofoil Flows".
35. Ruiz-Calavera, L.P., Hirose, N., "Parametric Studies of a Time Accurate Finite Volume Euler Code in the N.W.T. Parallel Computer".
36. Badcock, K.J., Richards, B.E., "Parallel Implicit Upwind Methods for the Aerodynamics of Aerospace Vehicles".
1. SUMMARY

This paper presents a perspective on computational fluid dynamics as a tool for aircraft design. It addresses the requirements for effective industrial use, and trade-offs between modelling accuracy and computational costs. Issues in algorithm design are discussed in detail, together with a unified approach to the design of shock capturing algorithms. Finally, the paper discusses the use of techniques drawn from control theory to determine optimal aerodynamic shapes. In the future multidisciplinary analysis and optimization should be combined to provide an integrated numerical design environment.

Despite the advances that have been made, CFD is still not being exploited as effectively as one would like in the design process. This is partly due to the long set-up and high costs, both human and computational, of complex flow simulations. The essential requirements for industrial use are:

1. assured accuracy
2. acceptable computational and human costs
3. fast turn around.

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
3. THE COMPLEXITY OF FLUID FLOW AND MATHEMATICAL MODELLING

3.1 The Hierarchy of Mathematical Models

Many critical phenomena of fluid flow, such as shock waves and turbulence, are essentially non-linear. They also exhibit extreme disparities of scales. While the actual thickness of a shock wave is of the order of a mean free path of the gas particles, on a macroscopic scale its thickness is essentially zero. In turbulent flow energy is transferred from large scale motions to progressively smaller eddies until the scale becomes so small that the motion is dissipated by viscosity. The ratio of the length scale of the global flow to that of the smallest persisting eddies is of the order Re^(3/4), where Re is the Reynolds number, typically in the range of 30 million for an aircraft. In order to resolve such scales in all three space directions a computational grid with the order of Re^(9/4) cells would be required. This is beyond the range of any current or foreseeable computer. Consequently mathematical models with varying degrees of simplification have to be introduced in order to make computational simulation of flow feasible, and to produce viable and cost-effective methods.

Figure 1 (supplied by Pradeep Raj) indicates a hierarchy of models at different levels of simplification which have proved useful in practice. Efficient flight is generally achieved by the use of smooth and streamlined shapes which avoid flow separation and minimize viscous effects, with the consequence that useful predictions can be made using inviscid models. Inviscid calculations with boundary layer corrections can provide quite accurate predictions of lift and drag when the flow remains attached, but iteration between the inviscid outer solution and the inner boundary layer solution becomes increasingly difficult with the onset of separation. Procedures for solving the full viscous equations are likely to be needed for the simulation of arbitrary complex separated flows, which may occur at high angles of attack or with bluff bodies. In order to treat flows at high Reynolds numbers, one is generally forced to estimate turbulent effects by Reynolds averaging of the fluctuating components. This requires the introduction of a turbulence model. As the available computing power increases one may also aspire to large eddy simulation (LES) in which the larger scale eddies are directly calculated, while the influence of turbulence at scales smaller than the mesh interval is represented by a subgrid scale model.

[...] potential flow or Euler solutions for an airfoil can be accurately calculated on a mesh with 160 cells around the section, and 32 cells normal to the section. Using multigrid techniques 10 to 25 cycles are enough to obtain a converged result. Consequently airfoil calculations can be performed in seconds on a Cray YMP, and can also be performed on 486-class personal computers. Correspondingly accurate three-dimensional inviscid calculations can be performed for a wing on a mesh, say with 192x32x48 = 294,912 cells, in about 5 minutes on a single processor Cray YMP, or less than a minute with eight processors, or in 1 or 2 hours on a workstation such as a Hewlett Packard 735 or an IBM 560 model.

Viscous simulations at high Reynolds numbers require [...] Careful two-dimensional studies [...] boundary layer, in addition to 32 intervals between the boundary layer and the far field, leading to a total of 64 intervals. In order to prevent degradations in accuracy and convergence due to excessively large aspect ratios (in excess of 1,000) in the surface mesh cells, the chordwise resolution must also be increased to 512 intervals. Reasonably accurate solutions can be obtained in a 512x64 mesh in 100 multigrid cycles. Translated to three dimensions, this would imply the need for meshes with 5-10 million cells (for example, 512x64x256 = 8,388,608 cells as shown in Figure 2). When simulations are performed on less fine meshes with, say, 500,000 to 1 million cells, it is very hard to avoid mesh dependency in the solutions as well as sensitivity to the turbulence model.
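The mesh sizes and scale ratios quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes the standard estimate that the ratio of the largest to the smallest turbulent length scale grows as Re^(3/4), so that a three-dimensional grid resolving all scales needs of the order of Re^(9/4) cells. The function and variable names are illustrative only.

```python
# Back-of-envelope check of the mesh sizes and scale ratios quoted in the text.
# The Reynolds number and mesh dimensions are the values given there; the
# Re^(3/4) scaling is the assumed estimate stated in the lead-in.

def dns_scale_estimates(reynolds):
    """Return (length-scale ratio ~ Re^(3/4), 3-D cell count ~ Re^(9/4))."""
    ratio = reynolds ** 0.75
    cells = ratio ** 3
    return ratio, cells

def cell_count(*intervals):
    """Product of the number of mesh intervals in each direction."""
    total = 1
    for n in intervals:
        total *= n
    return total

ratio, dns_cells = dns_scale_estimates(30e6)  # Re of about 30 million
airfoil_inviscid = cell_count(160, 32)        # 2-D Euler / potential flow
wing_inviscid = cell_count(192, 32, 48)       # 3-D inviscid wing
airfoil_viscous = cell_count(512, 64)         # 2-D viscous
wing_viscous = cell_count(512, 64, 256)       # 3-D viscous wing

print(f"length-scale ratio ~ {ratio:.2e}")
print(f"cells to resolve all scales ~ {dns_cells:.2e}")
print(f"inviscid airfoil: {airfoil_inviscid:,} cells")
print(f"inviscid wing:    {wing_inviscid:,} cells")
print(f"viscous airfoil:  {airfoil_viscous:,} cells")
print(f"viscous wing:     {wing_viscous:,} cells")
```

The roughly ten orders of magnitude between the viscous wing mesh and the all-scales estimate is the gap that turbulence modelling is asked to bridge.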
a single eddy, the mesh interval should then be 1/50 of The selection of sufficiently accurate mathematical mod-
the boundary layer thickness. Moreover, since the eddies els and a jud ent of their cost-effectiveness ultimately
are three-dimensional, the same mesh interval should be used in all three directions. Now, if the boundary layer thickness is of the order of 0.01 of the chord length, 5,000 intervals will be needed in the chordwise direction, and for a wing with an aspect ratio of 10, 50,000 intervals will be needed in the spanwise direction. Thus, of the order of 50 × 5,000 × 50,000 or 12.5 billion mesh points would be needed in the boundary layer. If the time dependent behavior of the eddies is to be fully resolved using time steps on the order of the time for a wave to pass through a mesh interval, and one allows for a total time equal to the time required for waves to travel three times the length of the chord, of the order of 15,000 time steps would be needed. Performance beyond the teraflop ($10^{12}$ operations per second) will be needed to attempt calculations of this nature, which also have an information content far beyond what is needed for engineering analysis and design. The designer does not need to know the details of the eddies in the boundary layer. The primary purpose of such calculations is to improve the prediction of averaged quantities such as skin friction, and the prediction of global behavior such as the onset of separation. The main current use of Navier-Stokes and large eddy simulations is to gain an improved insight into the physics of turbulent flow, which may in turn lead to the development of more comprehensive and reliable turbulence models.

[...] Aircraft and spacecraft designs normally pass through the three phases of conceptual design, preliminary design, and detailed design. Correspondingly, the appropriate CFD models will vary in complexity. In the conceptual and preliminary design phases, the emphasis will be on relatively simple models which can give results with very rapid turn-around and low computer costs, in order to evaluate alternative configurations and [...] could be placed on numerical predictions has forced the extensive use of wind tunnel testing at an early stage of the design. This practice was very expensive. The limited number of models that could be fabricated also limited the range of design variations that could be evaluated. It can be anticipated that in the future, the role of wind tunnel testing in the design process will be more one of verification. Experimental research to improve our understanding of the physics of complex flows will continue, however, to play a vital role.
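The resolution estimate for large eddy simulation quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name and its parameters are hypothetical, and the figure of 50 mesh intervals across the boundary layer is an assumption inferred from the leading factor in the product 50 × 5,000 × 50,000.

```python
def les_resolution(layer_thickness=0.01, intervals_across_layer=50,
                   aspect_ratio=10, chords_travelled=3):
    """Mesh points and time steps for the boundary-layer estimate.

    Lengths are measured in chords. The value intervals_across_layer=50 is
    an assumption inferred from the product quoted in the text.
    """
    # The same mesh interval is used in all three directions.
    mesh_interval = layer_thickness / intervals_across_layer
    chordwise = round(1.0 / mesh_interval)      # intervals along one chord
    spanwise = chordwise * aspect_ratio         # the span is aspect_ratio chords
    mesh_points = intervals_across_layer * chordwise * spanwise
    # One time step per mesh interval crossed by a wave.
    time_steps = chords_travelled * chordwise
    return chordwise, spanwise, mesh_points, time_steps


chordwise, spanwise, mesh_points, time_steps = les_resolution()
print(chordwise, spanwise, mesh_points, time_steps)
```

With the default arguments this gives 5,000 chordwise and 50,000 spanwise intervals, 12.5 billion mesh points, and 15,000 time steps, in agreement with the figures in the text.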
3.3 Turbulence Modelling

It is doubtful whether a universally valid turbulence model, capable of describing all complex flows, could be devised [52]. Algebraic models [30, 9] have proved fairly satisfactory for the calculation of attached and slightly separated wing flows. These models rely on the boundary layer concept, usually incorporating separate formulas for the inner and outer layers, and they require an estimate of a length scale which depends on the thickness of the boundary layer. The estimation of this quantity by a search for a maximum of the vorticity times a distance to the wall, as in the Baldwin-Lomax model, can lead to ambiguities in internal flows, and also in complex vortical flows over slender bodies and highly swept or delta wings [40, 115]. The Johnson-King model [88], which allows for non-equilibrium effects through the introduction of an ordinary differential equation for the maximum shear stress, has improved the prediction of flows with shock induced separation [148, 91].

Closure models depending on the solution of transport equations are widely accepted for industrial applications. These models eliminate the need to estimate a length scale by detecting the edge of the boundary layer. Eddy viscosity models typically use two equations for the turbulent kinetic energy $k$ and the dissipation rate $\epsilon$, or a pair of equivalent quantities [89, 178, 160, 1, 121, 35]. Models of this type generally tend to present difficulties in the region very close to the wall. They also tend to be badly conditioned for numerical solution. The $k$-$l$ model [154] is designed to alleviate this problem by taking advantage of the linear behaviour of the length scale $l$ near the wall. In an alternative approach to the design of models which are more amenable to numerical solution, new models requiring the solution of one transport equation have recently been introduced [10, 159]. The performance of the algebraic models remains competitive for wing flows, but the one- and two-equation models show promise for broader classes of flows. In order to achieve greater universality, research is also being pursued on more complex Reynolds stress transport models, which require the solution of a larger number of transport equations.

Another direction of research is the attempt to devise more rational models via renormalization group (RNG) theory [182, 155]. Both algebraic and two-equation $k$-$\epsilon$ models devised by this approach have shown promising results [116].

4. CFD ALGORITHMS

4.1 Difficulties of Flow Simulation

The computational simulation of fluid flow presents a number of severe challenges for algorithm design. At the level of inviscid modeling, the inherent nonlinearity of the fluid flow equations leads to the formation of singularities such as shock waves and contact discontinuities. Moreover, the geometric configurations of interest are extremely complex, and generally contain sharp edges which lead to the shedding of vortex sheets. Extreme gradients near stagnation points or wing tips may also lead to numerical errors that can have global influence. Numerically generated entropy may be convected from the leading edge, for example, causing the formation of a numerically induced boundary layer which can lead to separation. The need to treat exterior domains of infinite extent is also a source of difficulty. Boundary conditions imposed at artificial outer boundaries may cause reflected waves which significantly interfere with the flow. When viscous effects are also included in the simulation, the extreme difference of the scales in the viscous boundary layer and the outer flow, which is essentially inviscid, is another source of difficulty, forcing the use of meshes with extreme variations in the mesh intervals. For these reasons, CFD has been a driving force for the development of numerical algorithms.

4.2 Structured and Unstructured Meshes

The algorithm designer faces a number of critical decisions. The first choice that must be made is the nature of the mesh used to divide the flow field into discrete subdomains. The discretization procedure must allow for the treatment of complex configurations. The principal alternatives are Cartesian meshes, body-fitted curvilinear meshes, and unstructured tetrahedral meshes. Each of these approaches has advantages which have led to their use. The Cartesian mesh minimizes the complexity of the algorithm at interior points and facilitates the use of high order discretization procedures, at the expense of greater complexity, and possibly a loss of accuracy, in the treatment of boundary conditions at curved surfaces. This difficulty may be alleviated by using mesh refinement procedures near the surface. With their aid, schemes which use Cartesian meshes have recently been developed to treat very complex configurations [120, 149, 22, 94].

Body-fitted meshes have been widely used and are particularly well suited to the treatment of viscous flow because they readily allow the mesh to be compressed near the body surface. With this approach, the problem of mesh generation itself has proved to be a major pacing
item. The most commonly used procedures are algebraic transformations [7, 44, 46, 156], methods based on the solution of elliptic equations, pioneered by Thompson [170, 171, 157, 158], and methods based on the solution of hyperbolic equations marching out from the body [161]. In order to treat very complex configurations it generally proves expedient to use a multiblock [177, 150] procedure, with separately generated meshes in each block, which may then be patched at block faces, or allowed to overlap, as in the Chimera scheme [19, 20]. While a number of interactive software systems for grid generation have been developed, such as EAGLE, GRIDGEN, and ICEM, the generation of a satisfactory grid for a very complex configuration may require months of effort.

[...] the discretization procedure for the equations of fluid flow, which can be expressed as differential conservation laws. In the Cartesian tensor notation, let $x_i$ be the coordinates, $p$, $\rho$, $T$, and $E$ the pressure, density, temperature, and total energy, and $u_i$ the velocity components. Using the convention that summation over $j = 1, 2, 3$ is implied by a repeated subscript $j$, each conservation equation has the form

$$\frac{\partial w}{\partial t} + \frac{\partial f_j}{\partial x_j} = 0. \qquad (1)$$

For the mass equation

$$w = \rho, \quad f_j = \rho u_j.$$

For the $i$ momentum equation

$$w_i = \rho u_i, \quad f_{ij} = \rho u_i u_j + p\,\delta_{ij} - \sigma_{ij},$$

where $\sigma_{ij}$ is the viscous stress tensor. For the energy equation

$$w = \rho E, \quad f_j = (\rho E + p)\,u_j - \sigma_{jk} u_k - \kappa \frac{\partial T}{\partial x_j},$$

where $\kappa$ is the coefficient of heat conduction. The pressure is related to the density and energy by the equation of state

$$p = (\gamma - 1)\,\rho \left(E - \tfrac{1}{2} u_i u_i\right). \qquad (2)$$

[...] equations. In the finite volume method [112], the discretization is accomplished by dividing the domain of the flow into a large number of small subdomains, and applying the conservation laws in the integral form

$$\frac{\partial}{\partial t} \int_{\Omega} w \, dV + \int_{\partial\Omega} f \cdot dS = 0.$$

Here $f$ is the flux appearing in equation (1) and $dS$ is the directed surface element of the boundary $\partial\Omega$ of the domain $\Omega$. The use of the integral form has the advantage [...]

Figure 3 Structured and Unstructured Discretizations. 3a: Cell Centered Scheme. 3b: Vertex Scheme.

[...]

$$\frac{d}{dt}(wV) + \sum_{\text{faces}} f \cdot S = 0, \qquad (4)$$

where $V$ is the cell volume, and $f$ is now a numerical estimate of the flux vector through each face. $f$ may be evaluated from values of the flow variables in the cells separated by each face, using upwind biasing to allow for the directions of wave propagation. With hexahedral cells, equation (4) is very similar to a finite difference scheme in curvilinear coordinates. Under a transformation to curvilinear coordinates $\xi_j$, equation (1) becomes [...]
4.4 Non-oscillatory Shock Capturing Schemes

4.4.1 Local Extremum Diminishing (LED) Schemes

The discretization procedures which have been described in the last section lead to nondissipative approximations to the Euler equations. Dissipative terms may be needed for two reasons. The first is the possibility of undamped oscillatory modes. The second reason is the need for the clean capture of shock waves and contact discontinuities without undesirable oscillations. An extreme overshoot could result in a negative value of an inherently positive quantity such as the pressure or density. The next sections summarize a unified approach to the construction of nonoscillatory schemes via the introduction of controlled diffusive and antidiffusive terms. This is the line adhered to in the author's own work.

The development of non-oscillatory schemes has been a prime focus of algorithm research for compressible flow. Consider a general semi-discrete scheme of the form

$$\frac{dv_j}{dt} = \sum_{k \ne j} c_{jk} (v_k - v_j). \qquad (7)$$

A maximum cannot increase and a minimum cannot decrease if the coefficients $c_{jk}$ are non-negative, since at a [...]

[...] Following the pioneering work of Godunov [51], a variety of dissipative and upwind schemes designed to have good shock capturing properties have been developed during the past two decades [162, 23, 98, 100, 146, 130, 56, 129, 166, 5, 68, 183, 62, 180, 13, 12, 11]. If the one-dimensional scalar conservation law

$$\frac{\partial v}{\partial t} + \frac{\partial}{\partial x} f(v) = 0 \qquad (9)$$

is represented by a three point scheme

$$\frac{dv_j}{dt} = c^{+}_{j+\frac12} (v_{j+1} - v_j) + c^{-}_{j-\frac12} (v_{j-1} - v_j),$$

the scheme is LED if

$$c^{+}_{j+\frac12} \ge 0, \quad c^{-}_{j-\frac12} \ge 0. \qquad (10)$$

A conservative semidiscrete approximation to the one-dimensional conservation law can be derived by subdividing the line into cells. Then the evolution of the value $v_j$ in the $j$th cell is given by

$$\Delta x \frac{dv_j}{dt} + h_{j+\frac12} - h_{j-\frac12} = 0,$$
where $h_{j+\frac12}$ is an estimate of the flux between cells $j$ and $j + 1$. The simplest estimate is the arithmetic average $(f_{j+1} + f_j)/2$, but this leads to a scheme that does not satisfy the positivity conditions. To correct this, one may add a dissipative term and set

$$h_{j+\frac12} = \tfrac12 (f_{j+1} + f_j) - \alpha_{j+\frac12} (v_{j+1} - v_j).$$

In order to estimate the required value of the coefficient $\alpha_{j+\frac12}$, let $a_{j+\frac12}$ be a numerical estimate of the wave speed $\partial f / \partial v$,

$$a_{j+\frac12} = \begin{cases} \dfrac{f_{j+1} - f_j}{v_{j+1} - v_j} & \text{if } v_{j+1} \ne v_j \\[2mm] \left. \dfrac{\partial f}{\partial v} \right|_{v = v_j} & \text{if } v_{j+1} = v_j. \end{cases}$$

Then

$$h_{j+\frac12} - h_{j-\frac12} = \left( \tfrac12 a_{j+\frac12} - \alpha_{j+\frac12} \right) \Delta v_{j+\frac12} + \left( \tfrac12 a_{j-\frac12} + \alpha_{j-\frac12} \right) \Delta v_{j-\frac12},$$

where

$$\Delta v_{j+\frac12} = v_{j+1} - v_j,$$

and the LED condition (10) is satisfied if

$$\alpha_{j+\frac12} \ge \tfrac12 \left| a_{j+\frac12} \right|.$$

If one takes

$$\alpha_{j+\frac12} = \tfrac12 \left| a_{j+\frac12} \right|$$

one obtains the first order upwind scheme

$$h_{j+\frac12} = \begin{cases} f_j & \text{if } a_{j+\frac12} > 0 \\ f_{j+1} & \text{if } a_{j+\frac12} < 0. \end{cases}$$

This is the least diffusive first order scheme which satisfies the LED condition. In this sense upwinding is a natural approach to the construction of non-oscillatory schemes. It may be noted that the successful treatment of transonic potential flow also involved the use of upwind biasing. This was first introduced by Murman and Cole to treat the transonic small disturbance equation [123].

Another important requirement of discrete schemes is that they should exclude nonphysical solutions which do not satisfy appropriate entropy conditions [95], which require the convergence of characteristics towards admissible discontinuities. This places more stringent bounds on the minimum level of numerical viscosity [113, 169, 128, 131]. In the case that the numerical flux function is strictly convex, Aiso has recently proved [2] that it is sufficient that [...]

4.4.3 High Resolution Switched Schemes: Jameson-Schmidt-Turkel (JST) Scheme

Higher order non-oscillatory schemes can be derived by introducing anti-diffusive terms in a controlled manner. An early attempt to produce a high resolution scheme by this approach is the Jameson-Schmidt-Turkel (JST) scheme [85]. Suppose that anti-diffusive terms are introduced by subtracting neighboring differences to produce a third order diffusive flux

$$d_{j+\frac12} = \alpha_{j+\frac12} \left\{ \Delta v_{j+\frac12} - \tfrac12 \left( \Delta v_{j+\frac32} + \Delta v_{j-\frac12} \right) \right\}, \qquad (15)$$

[...] The idea is to use variable coefficients $\epsilon^{(2)}_{j+\frac12}$ and $\epsilon^{(4)}_{j+\frac12}$ which produce a low level of diffusion in regions where the solution is smooth, but prevent oscillations near discontinuities. If $\epsilon^{(2)}_{j+\frac12}$ is constructed so that it is of order $\Delta x^2$ where the solution is smooth, while $\epsilon^{(4)}_{j+\frac12}$ is of order unity, both terms in $d_{j+\frac12}$ will be of order $\Delta x^3$.

The JST scheme has proved very effective in practice in numerous calculations of complex steady flows, and conditions under which it could be a total variation diminishing (TVD) scheme have been examined by Swanson and Turkel [165]. An alternative statement of sufficient conditions on the coefficients $\epsilon^{(2)}_{j+\frac12}$ and $\epsilon^{(4)}_{j+\frac12}$ for the JST scheme to be LED is as follows:

Theorem 1 (Positivity of the JST scheme)
Suppose that whenever either $v_{j+1}$ or $v_j$ is an extremum the coefficients of the JST scheme satisfy

$$\epsilon^{(2)}_{j+\frac12} \ge \tfrac12 \left| a_{j+\frac12} \right|, \quad \epsilon^{(4)}_{j+\frac12} = 0.$$

Then the JST scheme is local extremum diminishing (LED).

In order to construct $\epsilon^{(2)}_{j+\frac12}$ and $\epsilon^{(4)}_{j+\frac12}$ with the desired properties define

$$R(u, v) = \begin{cases} \left| \dfrac{u - v}{|u| + |v|} \right|^{q} & \text{if } u \ne 0 \text{ or } v \ne 0 \\[2mm] 0 & \text{if } u = v = 0, \end{cases} \qquad (18)$$
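The first order upwind construction described above, with the numerical flux $h_{j+\frac12} = \tfrac12(f_{j+1} + f_j) - \alpha_{j+\frac12}(v_{j+1} - v_j)$ and $\alpha_{j+\frac12} = \tfrac12 |a_{j+\frac12}|$, can be tried on a simple case. The following is a minimal sketch, not the author's code: the choice of Burgers' flux, the periodic mesh, the forward Euler time stepping and all function names are assumptions made for illustration. Setting `scale=0.0` recovers the arithmetic average flux, which violates the positivity conditions.

```python
def wave_speed(f, dfdv, vl, vr):
    """Numerical wave speed a_{j+1/2}: secant slope, or f'(v) for equal states."""
    if vr != vl:
        return (f(vr) - f(vl)) / (vr - vl)
    return dfdv(vl)


def numerical_flux(f, dfdv, vl, vr, scale):
    """h_{j+1/2} = (f_{j+1} + f_j)/2 - alpha (v_{j+1} - v_j), alpha = scale*|a|/2."""
    a = wave_speed(f, dfdv, vl, vr)
    alpha = 0.5 * scale * abs(a)
    return 0.5 * (f(vr) + f(vl)) - alpha * (vr - vl)


def advance(v, f, dfdv, dt_over_dx, steps, scale=1.0):
    """Forward Euler steps on a periodic mesh (assumed here for simplicity)."""
    v = list(v)
    n = len(v)
    for _ in range(steps):
        # h[j] holds the flux between cells j and j+1; h[-1] wraps around.
        h = [numerical_flux(f, dfdv, v[j], v[(j + 1) % n], scale) for j in range(n)]
        v = [v[j] - dt_over_dx * (h[j] - h[j - 1]) for j in range(n)]
    return v


def burgers(v):
    return 0.5 * v * v


def burgers_speed(v):
    return v


pulse = [1.0 if 10 <= j < 20 else 0.0 for j in range(40)]
upwind = advance(pulse, burgers, burgers_speed, 0.4, 30, scale=1.0)
central = advance(pulse, burgers, burgers_speed, 0.4, 1, scale=0.0)
print(min(upwind), max(upwind), min(central), max(central))
```

For this data the upwind solution stays within the initial bounds 0 and 1, while a single step with the arithmetic average flux already produces an overshoot and an undershoot next to the discontinuities.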
[...] and

$$\epsilon^{(4)}_{j+\frac12} = \alpha_{j+\frac12} \left( 1 - Q_{j+\frac12} \right).$$

This formulation thus unifies the JST and SLIP schemes.

4.4.5 Essentially Local Extremum Diminishing (ELED) Scheme with Soft Limiter

The limiters defined by the formula (22) have the disadvantage that they are active at a smooth extremum, reducing the local accuracy of the scheme to first order. In order to prevent this, the SLIP scheme can be relaxed to give an essentially local extremum diminishing (ELED) scheme which is second order accurate at smooth extrema by the introduction of a threshold in the limited average. Therefore redefine $D(u, v)$ as

$$D(u, v) = 1 - \left| \frac{u - v}{\max\left( |u| + |v|, \; \epsilon \Delta x^{r} \right)} \right|^{q}, \qquad (23)$$

where $r = \tfrac32$, $q \ge 2$. This reduces to the previous definition if $|u| + |v| > \epsilon \Delta x^{r}$.

In any region where the solution is smooth, $\Delta v_{j+\frac32} - \Delta v_{j-\frac12}$ is of order $\Delta x^2$. In fact if there is a smooth extremum in the neighborhood of $v_j$ or $v_{j+1}$, a Taylor series expansion indicates that $\Delta v_{j+\frac32}$, $\Delta v_{j+\frac12}$ and $\Delta v_{j-\frac12}$ are each individually of order $\Delta x^2$, since $\frac{dv}{dx} = 0$ at the extremum. It may be verified that second order accuracy is preserved at a smooth extremum if $q \ge 2$. On the other hand the limiter acts in the usual way if $|\Delta v_{j+\frac32}|$ or $|\Delta v_{j-\frac12}| > \epsilon \Delta x^{r}$, and it may also be verified that in the limit $\Delta x \to 0$ local maxima are non increasing and local minima are non decreasing [79]. Thus the scheme is essentially local extremum diminishing (ELED).

The effect of the "soft limiter" is not only to improve the accuracy: the introduction of a threshold below which extrema of small amplitude are accepted also usually results in a faster rate of convergence to a steady state, and decreases the likelihood of limit cycles in which the limiter interacts unfavorably with the corrections produced by the updating scheme. In a scheme recently proposed by Venkatakrishnan a threshold is introduced precisely for this purpose [174].

4.4.6 Upstream Limited Positive (USLIP) Schemes

By adding the anti-diffusive correction purely from the upstream side one may derive a family of upstream limited positive (USLIP) schemes. Corresponding to the original SLIP scheme defined by equation (20), a USLIP scheme is obtained by setting [...]

[...] To assure the correct sign to satisfy the LED criterion the flux limiter must now satisfy the additional constraint that $\phi(r) \le 2$.

The USLIP formulation is essentially equivalent to standard upwind schemes [130, 166]. Both the SLIP and USLIP constructions can be implemented on unstructured meshes [75, 79]. The anti-diffusive terms are then calculated by taking the scalar product of the vectors defining an edge with the gradient in the adjacent upstream and downstream cells.

4.4.7 Systems of Conservation Laws: Flux Splitting and Flux-Difference Splitting

Steger and Warming [162] first showed how to generalize the concept of upwinding to the system of conservation laws

$$\frac{\partial w}{\partial t} + \frac{\partial}{\partial x} f(w) = 0$$

by the concept of flux splitting. Suppose that the flux is split as $f = f^{+} + f^{-}$ where $\frac{\partial f^{+}}{\partial w}$ and $\frac{\partial f^{-}}{\partial w}$ have positive and negative eigenvalues. Then the first order upwind scheme is produced by taking the numerical flux to be

$$h_{j+\frac12} = f^{+}_{j} + f^{-}_{j+1}.$$

This can be expressed in viscosity form as

$$h_{j+\frac12} = \tfrac12 (f_{j+1} + f_j) - d_{j+\frac12},$$

where the diffusive flux is

$$d_{j+\frac12} = \tfrac12 \Delta (f^{+} - f^{-})_{j+\frac12}.$$

Roe derived the alternative formulation of flux difference splitting [146] by distributing the corrections due to the flux difference in each interval upwind and downwind to obtain

$$\Delta x \frac{dw_j}{dt} + (f_{j+1} - f_j)^{-} + (f_j - f_{j-1})^{+} = 0,$$

where now the flux difference $f_{j+1} - f_j$ is split. The corresponding diffusive flux is

$$d_{j+\frac12} = \tfrac12 \left( (f_{j+1} - f_j)^{+} - (f_{j+1} - f_j)^{-} \right).$$
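The equivalence between the split numerical flux $f^{+}_{j} + f^{-}_{j+1}$ and its viscosity form can be checked on the simplest possible case. The sketch below assumes a scalar linear flux $f = a v$ with the split $f^{\pm} = \tfrac12 (a \pm |a|) v$; this model problem and the function names are illustrative assumptions, not taken from the paper.

```python
def split(a, v):
    """Split f = a*v into right-running (f+) and left-running (f-) parts."""
    return max(a, 0.0) * v, min(a, 0.0) * v


def upwind_flux(a, vl, vr):
    """h_{j+1/2} = f+_j + f-_{j+1}."""
    f_plus_left, _ = split(a, vl)
    _, f_minus_right = split(a, vr)
    return f_plus_left + f_minus_right


def viscosity_form(a, vl, vr):
    """h_{j+1/2} = (f_{j+1} + f_j)/2 - d, with d = (1/2) * Delta(f+ - f-)."""
    fl_plus, fl_minus = split(a, vl)
    fr_plus, fr_minus = split(a, vr)
    d = 0.5 * ((fr_plus - fr_minus) - (fl_plus - fl_minus))
    return 0.5 * (a * vr + a * vl) - d


for speed in (2.0, -2.0):
    print(speed, upwind_flux(speed, 1.0, 3.0), viscosity_form(speed, 1.0, 3.0))
```

For this scalar case the diffusive flux reduces to $\tfrac12 |a| (v_{j+1} - v_j)$, which is the first order upwind diffusion of the scalar scheme.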
$A_{j+\frac12}$ may be calculated by substituting the weighted averages

$$u = \frac{\sqrt{\rho_{j+1}}\, u_{j+1} + \sqrt{\rho_j}\, u_j}{\sqrt{\rho_{j+1}} + \sqrt{\rho_j}}, \quad H = \frac{\sqrt{\rho_{j+1}}\, H_{j+1} + \sqrt{\rho_j}\, H_j}{\sqrt{\rho_{j+1}} + \sqrt{\rho_j}}$$

into the standard formulas for the Jacobian matrix $A = \frac{\partial f}{\partial w}$. A splitting according to characteristic fields is now obtained by decomposing $A_{j+\frac12}$ as

$$A_{j+\frac12} = T \Lambda T^{-1}, \qquad (28)$$

where the columns of $T$ are the eigenvectors of $A_{j+\frac12}$, and $\Lambda$ is a diagonal matrix of the eigenvalues. Now the corresponding diffusive flux is

$$d_{j+\frac12} = \tfrac12 \left| A_{j+\frac12} \right| (w_{j+1} - w_j),$$

where

$$\left| A_{j+\frac12} \right| = T |\Lambda| T^{-1}$$

and $|\Lambda|$ is the diagonal matrix containing the absolute values of the eigenvalues.

4.4.8 Alternative Splittings

Characteristic splitting has the advantages that it introduces the minimum amount of diffusion to exclude the growth of local extrema of the characteristic variables, and that with the Roe linearization it allows a discrete shock structure with a single interior point. To reduce the computational complexity one may replace $|A|$ by $\alpha I$, where if $\alpha$ is at least equal to the spectral radius $\max |\lambda(A)|$, the positivity conditions will still be satisfied. Then the first order scheme simply has the scalar diffusive flux

$$d_{j+\frac12} = \tfrac12 \alpha_{j+\frac12} \Delta w_{j+\frac12}. \qquad (29)$$

[...] differences of the state and flux vectors

$$d_{j+\frac12} = \tfrac12 \alpha^{*}_{j+\frac12} c\, (w_{j+1} - w_j) + \tfrac12 \beta_{j+\frac12} (f_{j+1} - f_j), \qquad (30)$$

where the factor $c$ is included in the first term to make $\alpha^{*}_{j+\frac12}$ and $\beta_{j+\frac12}$ dimensionless. Schemes of this class are fully upwind in supersonic flow if one takes $\alpha^{*}_{j+\frac12} = 0$ and $\beta_{j+\frac12} = \mathrm{sign}(M)$ when the absolute value of the Mach [...]

[...] Thus these schemes are closely related to schemes which introduce separate splittings of the convective and pressure terms, such as the wave-particle scheme [141, 8], the advection upwind splitting method (AUSM) [106, 176], and the convective upwind and split pressure (CUSP) schemes [76].

In order to examine the shock capturing properties of these various schemes, consider the general case of a first order diffusive flux of the form

$$d_{j+\frac12} = \tfrac12 \alpha_{j+\frac12} B_{j+\frac12} (w_{j+1} - w_j), \qquad (34)$$

where the matrix $B_{j+\frac12}$ determines the properties of the scheme and the scaling factor $\alpha_{j+\frac12}$ is included for convenience. All the previous schemes can be obtained by representing $B_{j+\frac12}$ as a polynomial in the matrix $A_{j+\frac12}$ defined by equation (26). Schemes of this class were considered by Van Leer [99]. According to the Cayley-Hamilton theorem, a matrix satisfies its own characteristic equation. Therefore the third and higher powers of $A$ can be eliminated, and there is no loss of generality in limiting $B_{j+\frac12}$ to a polynomial of degree 2,

$$B_{j+\frac12} = \alpha_0 I + \alpha_1 A_{j+\frac12} + \alpha_2 A^{2}_{j+\frac12}. \qquad (35)$$

The characteristic upwind scheme for which $B_{j+\frac12} = \left| A_{j+\frac12} \right|$ is obtained by substituting $A_{j+\frac12} = T \Lambda T^{-1}$, $A^{2}_{j+\frac12} = T \Lambda^{2} T^{-1}$. Then $\alpha_0$, $\alpha_1$, and $\alpha_2$ are determined from the three equations

$$\alpha_0 + \alpha_1 \lambda_k + \alpha_2 \lambda_k^{2} = |\lambda_k|, \quad k = 1, 2, 3.$$

The same representation remains valid for three dimensional flow because $A_{j+\frac12}$ still has only three distinct eigenvalues $u$, $u + c$, $u - c$.

4.4.9 Analysis of Stationary Discrete Shocks
[...] cases shows that a discrete shock structure with a single interior point is supported by artificial diffusion that satisfies the two conditions that [...] the relationship

$$\alpha^{*} c = (1 + \beta)(c - u), \quad 0 < u < c.$$

Thus there is a one parameter family of schemes which support the ideal shock structure. The term $\beta (f_R - f_A)$ contributes to the diffusion of the convective terms. Allowing for the split (31), the total effective coefficient of convective diffusion is $\alpha c = \alpha^{*} c + \beta \bar{u}$. A CUSP scheme with low numerical diffusion is then obtained by taking $\alpha = |M|$, leading to the coefficients illustrated in figure 5.

[...] the corresponding error in the temperature may lead to a wrong prediction of associated effects such as chemical reactions.

The eigenvalues of $A_h$ are $u$, $\lambda^{+}$ and $\lambda^{-}$ where [...]

Now both CUSP and characteristic schemes which preserve constant stagnation enthalpy in steady flow can be constructed from the modified Jacobian matrix $A_h$ [80]. These schemes also produce a discrete shock structure with one interior point in steady flow. Then one arrives at four variations with this property, which can conveniently be distinguished as the E- and H-CUSP schemes, and the E- and H-characteristic schemes.
[...]rections [60]. They also show promising results in calculations of nozzles with multiply reflected oblique shocks.

4.5.1 High Order Godunov Schemes, and Kinetic Flux Splitting

A substantial body of current research is directed toward the implementation of truly multi-dimensional upwind schemes [59, 135, 101]. Reference [132] provides a thorough review of recent developments in this field. Some of the most impressive simulations of time dependent flows with strong shock waves have been achieved with higher order Godunov schemes [180]. In these schemes the average value in each cell is updated by applying the integral conservation law using interface fluxes predicted from the exact or approximate solution of a Riemann problem between adjacent cells. A higher order estimate of the solution is then reconstructed from the cell averages, and slope limiters are applied to the reconstruction. An example is the class of essentially non-oscillatory (ENO) schemes, which can attain a very high order of accuracy at the cost of a substantial increase in computational complexity [32, 153, 151, 152]. Methods based on reconstruction can also be implemented on unstructured meshes [13, 12]. Recently there has been an increasing interest in kinetic flux splitting schemes, which use solutions of the Boltzmann equation or the BGK equation to predict the interface fluxes [42, 36, 45, 136, 181].

The discretization of the viscous terms of the Navier Stokes equations requires an approximation to the velocity derivatives $\frac{\partial u_i}{\partial x_j}$ in order to calculate the tensor $\sigma_{ij}$, defined by equation (3). Then the viscous terms may be included in the flux balance (4). In order to evaluate the derivatives one may apply the Gauss formula to a control volume $V$ with the boundary $S$

$$\int_V \frac{\partial u_i}{\partial x_j} \, dv = \int_S u_i n_j \, dS,$$

where $n_j$ is the outward normal. For a tetrahedral or hexahedral cell this gives

$$\frac{\partial u_i}{\partial x_j} = \frac{1}{\mathrm{vol}} \sum_{\text{faces}} \bar{u}_i \, n_j \, S, \qquad (38)$$

where $\bar{u}_i$ is an estimate of the average of $u_i$ over the face. If $u$ varies linearly over a tetrahedral cell this is exact. Alternatively, assuming a local transformation to computational coordinates $\xi_j$, one may apply the chain rule

$$\frac{\partial u}{\partial x} = \left[ \frac{\partial u}{\partial \xi} \right] \left[ \frac{\partial \xi}{\partial x} \right] = \frac{\partial u}{\partial \xi} \left[ \frac{\partial x}{\partial \xi} \right]^{-1}. \qquad (39)$$

Here the transformation derivatives $\frac{\partial x_i}{\partial \xi_j}$ can be evaluated by the same finite difference formulas as the velocity derivatives $\frac{\partial u_i}{\partial \xi_j}$. In this case $\frac{\partial u}{\partial x}$ is exact if $u$ is a linearly varying function.

[...] from a control volume centered on each face, using formulas (38) or (39) [144]. This is computationally expensive because the number of faces is much larger than the number of cells. In a hexahedral mesh with a large number of vertices the number of faces approaches three times the number of cells.

This motivates the introduction of dual meshes for the evaluation of the velocity derivatives and the flux balance as sketched in figure 6. The figure shows both cell-centered and cell-vertex schemes. The dual mesh connects cell centers of the primary mesh. If there is a kink in the primary mesh, the dual cells should be formed by assembling contiguous fractions of the neighboring primary cells. On smooth meshes comparable results are obtained by either of these formulations [114, 115, 107]. If the mesh has a kink the cell-vertex scheme has the advantage that the derivatives $\frac{\partial u_i}{\partial x_j}$ are calculated in the interior of a regular cell, with no loss of accuracy.

Figure 6 Viscous discretizations for cell-centered and cell-vertex algorithms. 6a: Cell-centered scheme. $\sigma_{ij}$ evaluated at vertices of the primary mesh. 6b: Cell-vertex scheme. $\sigma_{ij}$ evaluated at cell centers of the primary mesh.

A desirable property is that a linearly varying velocity distribution, as in a Couette flow, should produce a constant stress and hence an exact stress balance. This property is not necessarily satisfied in general by finite difference or finite volume schemes on curvilinear meshes. The characterization k-exact has been proposed for schemes that are exact for polynomials of degree k. The cell-vertex finite volume scheme is linearly exact if the derivatives are evaluated by equation (39), since then $\frac{\partial u_i}{\partial x_j}$ is exactly evaluated as a constant, leading to constant viscous stresses $\sigma_{ij}$, and an exact viscous stress balance. This remains true when there is a kink in the mesh, because the summation of constant stresses over the faces of the kinked control volume sketched in figure 6 still yields a perfect balance. The use of equation (39) to evaluate $\frac{\partial u_i}{\partial x_j}$, however, requires the additional calculation or storage of the nine metric quantities $\frac{\partial \xi_i}{\partial x_j}$ in each cell, whereas equation (38) can be evaluated from the same face areas that are used for the flux balance.

In the case of an unstructured mesh, the weak form (6) leads to a natural discretization with linear elements, in
which the piecewise linear approximation yields a constant stress in each cell. This method yields a representation which is globally correct when averaged over the cells, a result that can be proved by energy estimates for elliptic problems [164]. It should be noted, however, that it yields formulas that are not necessarily locally consistent with the differential equations, if Taylor series expansions are substituted for the solution at the vertices appearing in the local stencil. Figure 7 illustrates the discretization of the Laplacian $u_{xx} + u_{yy}$ which is obtained with linear elements. It shows a particular triangulation such that the approximation is locally consistent with $u_{xx} + 3u_{yy}$. Thus the use of an irregular triangulation in the boundary layer may significantly degrade the accuracy.

Figure 7 Example of discretization of $u_{xx} + u_{yy}$ on a triangular mesh. The discretization is locally equivalent to an approximation of $u_{xx} + 3u_{yy}$.

4.7 Time Stepping Schemes

If the space discretization procedure is implemented separately, it leads to a set of coupled ordinary differential equations, which can be written in the form

$$\frac{dw}{dt} + R(w) = 0,$$

where $w$ is the vector of the flow variables at the mesh points, and $R(w)$ is the vector of the residuals, consisting of the flux balances defined by the space discretization scheme, together with the added dissipative terms. If the objective is simply to reach the steady state and details [...]

[...] $R(w^{n+1})$. The resulting equation [...] can be linearized as [...]

If one sets $\mu = 1$ and lets $\Delta t \to \infty$ this reduces to the Newton iteration which has been successfully used in two-dimensional calculations [173, 50]. In the three-dimensional case with, say, an $N \times N \times N$ mesh, the bandwidth of the matrix that must be inverted is of order $N^2$. Direct inversion requires a number of operations proportional to the number of unknowns multiplied by the square of the bandwidth, that is $N^3 \times (N^2)^2$, of the order of $N^7$. This is prohibitive, and forces recourse to either an approximate factorization method or an iterative solution method.

Alternating direction methods, which introduce factors corresponding to each coordinate, are widely used for structured meshes [17, 137]. They cannot be implemented on unstructured tetrahedral meshes that do not contain identifiable mesh directions, although other decompositions are possible [109]. If one chooses to adopt the iterative solution technique, the principal alternatives are variants of the Gauss-Seidel and Jacobi methods. A symmetric Gauss-Seidel method with one iteration per time step is essentially equivalent to an approximate lower-upper (LU) factorization of the implicit scheme [86, 125, 31, 184]. On the other hand, the Jacobi method with a fixed number of iterations per time step reduces to a multistage explicit scheme, belonging to the general class of Runge-Kutta schemes [33]. Schemes of this type have proved very effective for a wide variety of problems, and they have the advantage that they can be applied equally easily on both structured and unstructured meshes [84, 67, 69, 145].

If one reduces the linear model problem corresponding to (40) to an ordinary differential equation by substituting a Fourier mode $\hat{w} = e^{ipx_j}$, the resulting Fourier symbol has an imaginary part proportional to the wave speed, and a negative real part proportional to the diffusion. Thus the time stepping scheme should have a stability region which contains substantial intervals of both the negative real axis and the imaginary axis. To achieve this it pays to treat the convective and dissipative terms in a distinct fashion. Thus the residual is split as

$$R(w) = Q(w) + D(w),$$

where $Q(w)$ is the convective part and $D(w)$ the dissipative part. Denote the time level $n \Delta t$ by a superscript $n$. Then the multistage time stepping scheme is formulated as
~ ~ ~~~~
chosen to increase the stability interval along the has to be transferred back to grid k - 1 with the aid of
negatlve real axis. ..
- . ovtimized
an intemolationoveratorL .7 . .,k .. With uroverlv
These Schemes do not fall the mework coefficikntsmultiitagetimestepping schemes can be very
of Runge-Kutta schemes, and they have much larger sta- effiC*ent,fiven of the 'd P'JCess. A W-cYcle of
bility regions [69]. live schemes which have been found the type illustrated In F@ure(lfProves to be a Particularly
to be particularl effective are tabulated below. The first
is a four-stage scieme with two evaluations of dissipation.
Its coefficients are
Applied to the linear differential equation

$$\frac{dw}{dt} = \alpha w$$

the schemes with $k = 1, 2$ are stable for all $\alpha \Delta t$ in the left half plane (A-stable). Dahlquist has shown that A-stable linear multi-step schemes are at best second order accurate [38]. Gear, however, has shown that the schemes with $k \le 6$ are stiffly stable [49], and one of the higher order schemes may offer a better compromise between accuracy and stability, depending on the application.

Equation (40) is now treated as a modified steady state problem to be solved by a multigrid scheme using variable local time steps in a fictitious time $t^{*}$. For example, in the case $k = 2$ one solves

$$\frac{\partial w}{\partial t^{*}} = R^{*}(w),$$

where [...] and the last two terms are treated as fixed source terms. The first term shifts the Fourier symbol of the equivalent model problem to the left in the complex plane. While this promotes stability, it may also require a limit to be imposed on the magnitude of the local time step $\Delta t^{*}$ relative to that of the implicit time step $\Delta t$. This may be relieved by a point-implicit modification of the multi-stage scheme [119]. In the case of problems with moving boundaries the equations must be modified to allow for movement and deformation of the mesh.

This method has proved effective for the calculation of unsteady flows that might be associated with wing flutter and also in the calculation of unsteady incompressible flows [18]. It has the advantage that it can be added as an option to a computer program which uses an explicit multigrid scheme, allowing it to be used for the efficient calculation of both steady and unsteady flows.

[...] the different coordinate directions. The need to resolve the boundary layer generally forces the introduction of mesh cells with very high aspect ratios near the boundary, and these can lead to a severe reduction in the rate of convergence to a steady state. Pierce has recently obtained impressive results using diagonal and block-Jacobi preconditioners which include the mesh intervals [133].

An alternative approach has recently been proposed by Ta'asan [168], in which the equations are written in a canonical form which separates the equations describing acoustic waves from those describing convection. In terms of the velocity components $u$, $v$ and the vorticity $\omega$, temperature $T$, entropy $s$ and total enthalpy $H$, the equations describing steady two-dimensional flow can be [...] where [...]

$$Q = u \frac{\partial}{\partial x} + v \frac{\partial}{\partial y}.$$

Here the first two equations describe an elliptic system if the flow is subsonic, while the remaining equations are convective. Now separately optimized multigrid procedures are used to solve the two sets of equations, which are essentially decoupled.
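The A-stability statement above for the schemes with $k = 1, 2$ can be checked numerically on the model equation $dw/dt = \alpha w$. The sketch below assumes that the schemes in question are the standard backward difference formulas, which is consistent with the references to Dahlquist and Gear but is not spelled out in the surviving text; the function names and the sample points are hypothetical.

```python
import cmath


def amplification_roots(z, k):
    """Roots of the characteristic polynomial of the k-step backward
    difference formula applied to dw/dt = alpha*w, with z = alpha*dt."""
    if k == 1:
        # (1 - z) zeta - 1 = 0
        return [1.0 / (1.0 - z)]
    if k == 2:
        # (3/2 - z) zeta^2 - 2 zeta + 1/2 = 0
        a, b, c = 1.5 - z, -2.0, 0.5
        disc = cmath.sqrt(b * b - 4.0 * a * c)
        return [(-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)]
    raise ValueError("only k = 1 and k = 2 are handled in this sketch")


def largest_root(k, reals, imags):
    """Largest root modulus over sample points z = -x + iy in the left half plane."""
    return max(abs(root)
               for x in reals for y in imags
               for root in amplification_roots(complex(-x, y), k))


reals = [0.0, 0.01, 0.1, 1.0, 10.0, 100.0]
imags = [-100.0, -10.0, -1.0, -0.1, 0.0, 0.1, 1.0, 10.0, 100.0]
print(largest_root(1, reals, imags), largest_root(2, reals, imags))
```

No root exceeds unit modulus at any of the sampled points, and the bound is attained only at $z = 0$. A finite sample of this kind illustrates the property; it does not prove it.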
such as local solution gradients may be used. Alternatively, the discretization error may be estimated by comparing quantities calculated with two mesh widths, say on the current mesh and a coarser mesh with double the mesh interval. Procedures of this kind may also be used to provide a posteriori estimates of the error once the calculation is completed.

This kind of local adaptive control can also be applied to the local order of a finite element method to produce a p-refinement method, where p represents the order of the polynomial basis functions. Finally, both h- and p-refinement can be combined to produce an h-p method in which h and p are locally optimized to yield a solution [...]

[...] results which confirm the properties of the algorithms which have been reviewed in the last section. These have been drawn from the work of the author and his associates. They also illustrate the kind of calculation which can be performed in an industrial environment, where rapid turnaround is important to allow the quick assessment of design changes, and computational costs must be limited.

5.1 One-dimensional shock

In order to verify the discrete structure of stationary shocks, calculations were performed for a one-dimensional problem with initial data containing left and right states compatible with the Rankine-Hugoniot conditions. An intermediate state consisting of the arithmetic average of the left and right states was introduced at a single cell in the center of the domain. With this intermediate state the system is not in equilibrium, and the time [...]

5.2 Euler calculations for Airfoils and Wings

The results of transonic flow calculations for two well known airfoils, the RAE 2822 and the NACA 0012, are presented in figures (22-25). The H-CUSP scheme was again used. The limiter defined by equation (23) was used with $q=3$. The 5 stage time stepping scheme (42) was augmented by the multigrid scheme described in section 4.2 to accelerate convergence to a steady state. The equations were discretized on meshes with O-topology extending out to a radius of about 100 chords. In each case the calculations were performed on a sequence of successively finer meshes from 40x8 to 320x64 cells, while the [...] of a lifting airfoil. The convergence histories show the mean rate of change of the density, and also the total number of supersonic points in the flow field, which provides a useful measure of the global convergence of transonic flow calculations such as these. In each case the convergence history is shown for 100 cycles, while the pressure distribution is displayed after a sufficient number of cycles for its convergence. The pressure distribution of the RAE 2822 airfoil converged in only 25 cycles. Convergence was slower for the NACA 0012 airfoil. In the case of flow at Mach .8 and 1.25° angle of attack, additional cycles were needed to damp out a wave downstream of the weak shock wave on the lower surface.

As a further check on accuracy the drag coefficient should be zero in subsonic flow, or in shock free transonic flow. Table 2 shows the computed drag coefficient on a sequence of three meshes for three examples. The first two are subsonic flows over the RAE 2822 and NACA 0012 airfoils at Mach .5 and 3° angle of attack. The third is the [...]
information for the aerodynamic design could be obtained with a relatively inexpensive computational model.

[Table columns: No. of Nodes | Seconds/Cycle | Speedup]

Figure 9: Comparison of Experimental and Computed Drag Rise Curve for the YF-23 (Supplied by R. J. Busch Jr.)
[...] aerodynamic efficiency.

The simplest approach to optimization is to define the geometry through a set of design parameters, which may, for example, be the weights $\alpha_i$ applied to a set of shape functions $b_i(x)$ so that the shape is represented as

$$f(x) = \sum_i \alpha_i b_i(x).$$

Then a cost function $I$ is selected which might, for example, be the drag coefficient or the lift to drag ratio, and $I$ is regarded as a function of the parameters $\alpha_i$. The sensitivities $\partial I/\partial \alpha_i$ may now be estimated by making a small variation $\delta\alpha_i$ in each design parameter in turn and recalculating the flow to obtain the change in $I$. Then

$$\frac{\partial I}{\partial \alpha_i} \approx \frac{I(\alpha_i + \delta\alpha_i) - I(\alpha_i)}{\delta\alpha_i}.$$

[...] each step. The main disadvantage of this approach is the need for a number of flow calculations proportional to the number of design variables to estimate the gradient. The computational costs can thus become prohibitive as the number of design variables is increased.

An alternative approach is to cast the design problem as a search for the shape that will generate the desired pressure distribution. This approach recognizes that the designer usually has an idea of the kind of pressure distribution that will lead to the desired performance. Thus, it is useful to consider the inverse problem of calculating the shape that will lead to a given pressure distribution. The method has the advantage that only one flow solution is required to obtain the desired design. Unfortunately, a physically realizable shape may not necessarily exist, unless the pressure distribution satisfies certain constraints. Thus the problem must be very carefully formulated, otherwise it may be ill posed.
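The cost of estimating sensitivities by perturbing one design parameter at a time can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the sine bumps used as shape functions, the class `ToyCost` (a mean squared distance to a target shape, standing in for a complete flow calculation), and the function `fd_sensitivities`. The point is only that the number of cost evaluations is one more than the number of design parameters.

```python
import math

def shape(alpha, x):
    """Shape f(x) = sum_i alpha_i * b_i(x); the sine bumps b_i are illustrative only."""
    return sum(a * math.sin((i + 1) * math.pi * x) for i, a in enumerate(alpha))

class ToyCost:
    """Hypothetical cost I(alpha): mean squared distance to a target shape.

    Each call stands in for one complete flow calculation, so calls are counted.
    """
    def __init__(self, target_alpha, n_points=101):
        self.xs = [j / (n_points - 1) for j in range(n_points)]
        self.target = [shape(target_alpha, x) for x in self.xs]
        self.calls = 0

    def __call__(self, alpha):
        self.calls += 1
        errors = [shape(alpha, x) - t for x, t in zip(self.xs, self.target)]
        return sum(e * e for e in errors) / len(errors)

def fd_sensitivities(cost, alpha, delta=1e-6):
    """Estimate dI/dalpha_i by perturbing each design parameter in turn."""
    base = cost(alpha)  # one evaluation for the unperturbed design
    grad = []
    for i in range(len(alpha)):
        perturbed = list(alpha)
        perturbed[i] += delta
        grad.append((cost(perturbed) - base) / delta)  # one evaluation per parameter
    return grad

if __name__ == "__main__":
    cost = ToyCost(target_alpha=[0.10, -0.05, 0.02])
    grad = fd_sensitivities(cost, [0.0, 0.0, 0.0])
    print(grad, "cost evaluations:", cost.calls)
```

With three design parameters the sketch uses four cost evaluations. For a design described by thousands of parameters, each evaluation being a flow solution, the same count is what makes the approach expensive.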
The gradient vector $\partial I/\partial\alpha$ may now be used to determine a direction of improvement. The simplest procedure is to make a step in the negative gradient direction by setting

$$\alpha^{n+1} = \alpha^n - \lambda\frac{\partial I}{\partial \alpha},$$

[...]

The difficulty that the target pressure may be unattainable may be circumvented by treating the inverse problem as a special case of the optimization problem, with a cost function which measures the error in the solution of the inverse problem. For example, if $p_d$ is the desired surface pressure, one may take the cost function to be an integral over the body surface of the square of the pressure error,
equations of the flowfield are introduced as a constraint in such a way that the final expression for the gradient does not require reevaluation of the flowfield. In order to achieve this $\delta w$ must be eliminated from (43). Suppose that the governing equation $R$ which expresses the dependence of $w$ and $\mathcal{F}$ within the flowfield domain $D$ can be written as

$$R(w, \mathcal{F}) = 0. \qquad (44)$$

Then $\delta w$ is determined from the equation

$$\delta R = \left[\frac{\partial R}{\partial w}\right]\delta w + \left[\frac{\partial R}{\partial \mathcal{F}}\right]\delta\mathcal{F} = 0. \qquad (45)$$

Next, introducing a Lagrange Multiplier $\psi$, we have

$$\delta I = \frac{\partial I^T}{\partial w}\delta w + \frac{\partial I^T}{\partial \mathcal{F}}\delta\mathcal{F} - \psi^T\left(\left[\frac{\partial R}{\partial w}\right]\delta w + \left[\frac{\partial R}{\partial \mathcal{F}}\right]\delta\mathcal{F}\right) = \left\{\frac{\partial I^T}{\partial w} - \psi^T\left[\frac{\partial R}{\partial w}\right]\right\}\delta w + \left\{\frac{\partial I^T}{\partial \mathcal{F}} - \psi^T\left[\frac{\partial R}{\partial \mathcal{F}}\right]\right\}\delta\mathcal{F}.$$

Choosing $\psi$ to satisfy the adjoint equation

$$\left[\frac{\partial R}{\partial w}\right]^T\psi = \frac{\partial I}{\partial w} \qquad (46)$$

the first term is eliminated, and we find that

$$\delta I = G\,\delta\mathcal{F}, \qquad (47)$$

where

$$G = \frac{\partial I^T}{\partial \mathcal{F}} - \psi^T\left[\frac{\partial R}{\partial \mathcal{F}}\right].$$

The advantage is that (47) is independent of $\delta w$, with the result that the gradient of $I$ with respect to an arbitrary number of design variables can be determined without the need for additional flow-field evaluations. In the case that (44) is a partial differential equation, the adjoint equation (46) is also a partial differential equation and appropriate boundary conditions must be determined.

[...] adjoint partial differential equation. If these equations are solved exactly they can provide an exact gradient of the inexact cost function which results from the discretization of the flow equations. On the other hand any consistent discretization of the adjoint partial differential equation will yield the exact gradient in the limit as the mesh is refined. The trade-off between the complexity of the adjoint discretization, the accuracy of the resulting estimate of the gradient, and its impact on the computational cost to approach an optimum solution is a subject of ongoing research.

The true optimum shape belongs to an infinitely dimensional space of design parameters. One motivation for developing the theory for the partial differential equations of the flow is to provide an indication in principle of how such a solution could be approached if sufficient computational resources were available. Another motivation is that it highlights the possibility of generating ill posed formulations of the problem. For example, if one attempts to calculate the sensitivity of the pressure at a particular location to changes in the boundary shape, there is the possibility that a shape modification could cause a shock wave to pass over that location. Then the sensitivity could become unbounded. The movement of the shock, however, is continuous as the shape changes. Therefore a quantity such as the drag coefficient, which is determined by integrating the pressure over the surface, also depends continuously on the shape. The adjoint equation allows the sensitivity of the drag coefficient to be determined without the explicit evaluation of pressure sensitivities which would be ill posed.

The discrete adjoint equations, whether they are derived directly or by discretization of the adjoint partial differential equation, are linear. Therefore they could be solved by direct numerical inversion. The cost of direct inversion can become prohibitive, however, as the mesh is refined, and it becomes more efficient to use iterative solution methods. Moreover, because of the similarity of the adjoint equations to the flow equations, the same iterative methods which have been proved to be efficient for the solution of the flow equations are efficient for the solution of the adjoint equations.
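The adjoint construction can be checked on a small algebraic example. The sketch below assumes a hypothetical linear constraint $R(w, \mathcal{F}) = Aw - B\mathcal{F} = 0$ and a cost $I = \frac{1}{2}\|w - w_t\|^2$ that does not depend explicitly on $\mathcal{F}$; the matrices `A`, `B`, the target `W_TARGET` and all function names are illustrative and do not come from the paper. Under these assumptions one "flow" solve plus one adjoint solve yields the whole gradient, which is then compared with one-sided finite differences.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Hypothetical linear model: R(w, F) = A w - B F = 0, so dR/dw = A and dR/dF = -B.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
B = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]]
W_TARGET = [1.0, 0.0, -1.0]

def flow_solve(F):
    """State w satisfying the constraint for a given design F."""
    return solve(A, matvec(B, F))

def cost(F):
    w = flow_solve(F)
    return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, W_TARGET))

def adjoint_gradient(F):
    w = flow_solve(F)                                   # one state solution
    dI_dw = [wi - ti for wi, ti in zip(w, W_TARGET)]
    psi = solve(transpose(A), dI_dw)                    # adjoint: (dR/dw)^T psi = dI/dw
    # G = dI/dF^T - psi^T dR/dF, with dI/dF = 0 and dR/dF = -B, so G = B^T psi.
    return matvec(transpose(B), psi)

def fd_gradient(F, delta=1e-6):
    base = cost(F)
    grad = []
    for i in range(len(F)):
        perturbed = list(F)
        perturbed[i] += delta
        grad.append((cost(perturbed) - base) / delta)   # one extra solve per variable
    return grad

if __name__ == "__main__":
    F0 = [0.3, -0.2]
    print(adjoint_gradient(F0), fd_gradient(F0))
```

The adjoint route needs two linear solves regardless of the length of `F`, whereas the finite difference route needs one solve per design variable in addition to the baseline.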
After making a step in the negative gradient direction, the gradient can be recalculated and the process repeated to follow a path of steepest descent until a minimum is reached. In order to avoid violating constraints, such as a minimum acceptable wing thickness, the gradient may be projected into the allowable subspace within which the constraints are satisfied. In this way one can devise procedures which must necessarily converge at least to a local minimum, and which can be accelerated by the use [...]

The control theory formulation for optimal aerodynamic design has proved effective in a variety of applications [73, 57, 144]. The adjoint equations have also been used by Ta'asan, Kuruvila and Salas [167], who have implemented a one shot approach in which the constraint represented by the flow equations is only required to be satisfied by the final converged solution, and computational costs are also reduced by applying multigrid techniques to the geometry modifications as well as the solution of the flow and adjoint equations. Pironneau has studied the use of [...]
The elements of $Q$ are the cofactors of $K$, and in a finite volume discretization they are just the face areas of the computational cells projected in the $x_1$, $x_2$, and $x_3$ directions. Also introduce scaled contravariant velocity components

$$U_i = Q_{ij}u_j.$$

The transformed equations can now be written as

$$\frac{\partial W}{\partial t} + \frac{\partial F_i}{\partial \xi_i} = 0,$$

where

$$W = Jw$$

and [...]

$$U_2 = 0 \text{ on } B_W. \qquad (49)$$

At the far field boundary $B_F$, conditions are specified for incoming waves, as in the two-dimensional case, while outgoing waves are determined by the solution.

The weak form of the Euler equations for steady flow can be written as [...]

The weak form of the equation for $\delta w$ in the steady state becomes [...] where

$$\delta F_i = C_i\,\delta w + \delta Q_{ij}f_j,$$

which should hold for any differentiable test function $\phi$. This equation may be added to the variation in the cost function, which may now be written as [...] (54)
$$\delta I = \iint G(\xi,\zeta)\,\delta S(\xi,\zeta)\,d\xi\,d\zeta,$$

where the gradient $G(\xi,\zeta)$ is obtained by evaluating the integrals in equation (57). Thus to reduce $I$ we can choose

$$\delta S = -\lambda G,$$

where $\lambda$ is sufficiently small and non-negative. In order to impose a thickness constraint we can define a baseline surface $S_0(\xi,\zeta)$ below which $S(\xi,\zeta)$ is not allowed to fall. Now we take $\lambda = \lambda(\xi,\zeta)$ as a non-negative function such that

$$S(\xi,\zeta) + \delta S(\xi,\zeta) \ge S_0(\xi,\zeta). \qquad (58)$$

Then the constraint is satisfied, while

$$\delta I = -\iint \lambda G^2\,d\xi\,d\zeta \le 0.$$

The costate solution $\psi$ is a legitimate test function for the weak form of the flow equations only if it is differentiable. Smoothness should also be preserved in the redesigned shape. It is therefore crucially important to introduce appropriate smoothing procedures. In order to avoid discontinuities in the adjoint boundary condition which would be caused by the appearance of shock waves, the cost function for the target pressure may be modified [...] and the smooth quantity $Z$ replaces $p - p_d$ in the adjoint boundary condition.

[...]

$$d = -\lambda B^T G, \qquad \delta S = Bd = -\lambda BB^T G,$$

where $\lambda$ is sufficiently small and positive. The coefficients of $B$ can be renormalized to produce unit row sums. With a uniform mesh spacing in the computational domain this formula is equivalent to the use of a gradient modified by two passes of the explicit smoothing procedure [...] with a similar smoothing procedure in the k discretization. Implicit smoothing may also be used. The smoothing equation

$$\bar{G} - \frac{\partial}{\partial \xi}\epsilon\frac{\partial \bar{G}}{\partial \xi} = G$$

[...] If one sets $\delta S = -\lambda\bar{G}$, then to first order the change in the cost [...]

The method has been used to carry out a study of swept wing designs which might be appropriate for long range transport aircraft. Since three dimensional calculations require substantial computational resources, it is extremely important for the practical implementation of the method to use fast solution algorithms for the flow and the
adjoint equations. In this case the author's FLO87 computer program has been used as the basis of the design method. FLO87 solves the three dimensional Euler equations with a cell-centered finite volume scheme, and uses residual averaging and multigrid acceleration to obtain very rapid steady state solutions, usually in 25 to 50 multigrid cycles [66, 70]. Upwind biasing is used to produce non-oscillatory solutions, and assure the clean capture of shock waves. This is introduced through the addition of carefully controlled numerical diffusion terms, with a magnitude of order $\Delta x^3$ in smooth parts of the flow. The adjoint equations are treated in the same way as the flow equations. The fluxes are first estimated by central differences, and then modified by downwind biasing through numerical diffusive terms which are supplied by the same subroutines that were used for the flow equations.

The study has been focussed on wings designed for cruising at Mach .85, with lift coefficients in the range of .5 to .55. In every case, the wing planform was fixed while the sections were free to be changed arbitrarily by the design method, with a restriction on the minimum thickness. The [...] 0.6. This section, which has a thickness to chord ratio of 9.5 percent, was used at the tip. Similar sections with an increased thickness were used inboard. The variation of thickness was non-linear with a more rapid increase near the root, where the thickness to chord ratio of the basic section was multiplied by a factor of 1.47. The inboard sections were rotated upwards to give the initial wing 3.5 degrees twist from root to tip. The two-dimensional pressure distribution of the basic wing section at its design point was introduced as a target pressure distribution uniformly across the span. This target is presumably not realizable, but serves to favor the establishment of a relatively benign pressure distribution. The total inviscid drag coefficient, due to the combination of vortex and shock wave drag, was also included in the cost function. Since the main objective of the study was to minimize the drag, the target pressure distribution was reset after every fourth design cycle to a distribution derived by smoothing the existing pressure distribution. This allows the scheme more freedom to make changes which reduce drag. The calculations were performed with the lift coefficient forced to approach a fixed value by adjusting the angle of attack every fifth iteration of the flow solution. It was found that the computational costs can be reduced by using only 15 multigrid cycles in each flow solution, and in each adjoint solution. Although this is not enough for full convergence, it proves sufficient to provide a shape modification which leads to an improvement.

Figures 27 and 28 show a wing which was designed for a lift coefficient of .50 at Mach .85. In order to prevent the final wing from becoming too thin the threshold $S_0(\xi,\zeta)$ was set at three quarters of the height of the bump $S(\xi,\zeta)$ defining the initial wing. This calculation was performed on a mesh with 192 intervals in the $\xi$ direction wrapping around the wing, 32 intervals in the normal $\eta$ direction and 48 intervals in the spanwise $\zeta$ direction, giving a total of 294912 cells. The wing was specified by 33 sections, each with 128 points, giving a total of 4224 design variables. The plots show the initial wing geometry and pressure distribution, and the modified geometry and pressure distribution after 40 design cycles. The total inviscid drag coefficient was reduced from 0.0210 to 0.0112. The initial design exhibits a very strong shock wave in the inboard region. It can be seen that this is completely eliminated, leaving a very weak shock wave in the outboard region. To verify the solution, the final geometry was analyzed with another method, using the computer program FLO67. This program uses a cell-vertex formulation, and has recently been modified to incorporate a local extremum diminishing algorithm with a very low level of numerical diffusion [74]. When run to full convergence it was found that a better estimate of the drag coefficient of the redesigned wing is 0.0094 at Mach 0.85 with a lift coefficient of 0.5, giving a lift to drag ratio of 53. The results from FLO67 for the initial and final wings are illustrated in Figures 29 and 30. A calculation at Mach 0.500 shows a drag coefficient of 0.0087 for a lift coefficient of 0.5. Since in this case the flow is entirely subsonic, this provides an estimate of the vortex drag for this planform and lift distribution, which is just what one obtains from the standard formula for induced drag, $C_D = C_L^2/(\epsilon\pi AR)$, with an aspect ratio $AR = 9$, and an efficiency factor $\epsilon = 0.97$. Thus the design method has reduced the shock wave drag coefficient to about 0.0007 at a lift coefficient of 0.5. Figure 31 shows the result of an analysis for an off design point with the Mach number increased to .86 with the same lift coefficient of .5. This results in a flat-topped pressure distribution terminating with a weak shock of nearly uniform strength across the whole span. The drag coefficient is .0097. The penalty of .0003 is so small that this might be a preferred cruising condition.

A second wing was designed in exactly the same manner as the first, starting from the same initial geometry and with the same constraints, to give a lift coefficient of .55 at Mach .85. This produces stronger shock waves and is therefore a more severe test of the method. In this case the total inviscid drag coefficient was reduced from 0.0243 to 0.0134 in 40 design cycles. Again the performance of the final design was verified by a calculation with FLO67, and when the result was fully converged the drag coefficient was found to be 0.0115. A subsonic calculation at Mach .500 shows a drag coefficient of 0.0107 for a lift coefficient of 0.55. Thus in this case the shock wave drag coefficient is about 0.0008. For a representative transport aircraft the parasite drag coefficient of the wing due to skin friction is about 0.0045. Also the fuselage drag coefficient is about 0.0050, the nacelle drag coefficient is about 0.0015, the empennage drag coefficient is about 0.0020, and excrescence drag coefficient is about 0.0010. This would give a total drag coefficient $C_D = 0.0255$ for a lift coefficient of 0.55, corresponding to a lift to drag ratio $L/D = 21.6$. This would be a substantial improvement over the values obtained by currently flying transport aircraft.

6.5 Optimization of Complex Configurations

In order to treat more complex configurations one can use a numerical grid generation procedure to produce a body fitted mesh for the initial geometry, and then modify the mesh in subsequent design cycles by an analytic perturbation formula. In the two-dimensional case, for example, with computational coordinates $\xi$, $\eta$, let the boundary displacement at $\eta = 0$ be $\delta x(\xi)$, $\delta y(\xi)$. Then the mesh points along the radial coordinate lines $\xi$ = constant can be replaced by [...] yielding [...]

Such a procedure has been implemented by J. Reuther for the three-dimensional Euler equations, and applied to the optimization of wing-body configurations [14!].

It is also possible to show that in the continuous limit the field integral in equation (57) can be eliminated. Let
the change in the coordinates $x_i$ at fixed $\xi$ be $\delta x_i(\xi)$. Then, using the fact that the fluxes $f_i(w)$ satisfy the flow equation (48), it is possible to show by a direct calculation [...] Now [...]

[...] be significantly improved by innovative concepts, such as the idea of time inclining. It can be anticipated that interdisciplinary applications in which CFD is coupled with the computational analysis of other properties of the design will play an increasingly important role. These applications may include structural, thermal and electro-[...]
[4] J.J. Alonso, L. Martinelli, and A. Jameson. Multigrid unsteady Navier-Stokes calculations with aeroelastic applications. AIAA paper 95-0048, AIAA 33rd Aerospace Sciences Meeting, Reno, Nevada, January 1995.
[5] B.K. Anderson, J.L. Thomas, and B. Van Leer. A comparison of flux vector splittings for the Euler equations. AIAA Paper 85-0122, Reno, NV, January 1985.
[6] W.K. Anderson, J.L. Thomas, and D.L. Whitfield. Multigrid acceleration of the flux split Euler equations. AIAA Paper 86-0274, AIAA 24th Aerospace Sciences Meeting, Reno, January 1986.
[7] T.J. Baker. Mesh generation by a sequence of transformations. Appl. Num. Math., 2:515-528, 1986.
[8] N. Balakrishnan and S.M. Deshpande. New upwind schemes with wave-particle splitting for inviscid compressible flows. Report 91 FM 12, Indian Institute of Science, 1991.
[9] B. Baldwin and H. Lomax. Thin layer approximation and algebraic model for separated turbulent flow. AIAA Paper 78-257, 1978.
[10] B.S. Baldwin and T.J. Barth. A one-equation turbulence transport model for high Reynolds number wall-bounded flows. AIAA Paper 91-0610, AIAA 29th Aerospace Sciences Meeting, Reno, NV, January 1991.
[11] T.J. Barth. Aspects of unstructured grids and finite volume solvers for the Euler and Navier Stokes equations. In von Karman Institute for Fluid Dynamics Lecture Series Notes 1994-05, 1994.
[12] T.J. Barth and P.O. Frederickson. Higher order solution of the Euler equations on unstructured grids using quadratic reconstruction. AIAA paper 90-0013, January 1990.
[13] T.J. Barth and D.C. Jespersen. The design and application of upwind schemes on unstructured meshes. AIAA paper 89-0366, AIAA 27th Aerospace Sciences Meeting, Reno, Nevada, January 1989.
[14] J.T. Batina. Implicit flux-split Euler schemes for unsteady aerodynamic analysis involving unstructured dynamic meshes. AIAA paper 90-0836, April [...]
[20] J.A. Benek, T.L. Donegan, and N.E. Suhs. Extended Chimera grid embedding scheme with applications to viscous flows. AIAA Paper 87-1126, AIAA 8th Computational Fluid Dynamics Conference, Honolulu, HI, 1987.
[21] M. Berger and A. Jameson. Automatic adaptive grid refinement for the Euler equations. AIAA Journal, 23:561-568, 1985.
[22] M. Berger and R.J. LeVeque. An adaptive Cartesian mesh algorithm for the Euler equations in arbitrary geometries. AIAA Paper 89-1830, 1989.
[23] J.P. Boris and D.L. Book. Flux corrected transport, 1 SHASTA, a fluid transport algorithm that works. J. Comp. Phys., 11:38-69, 1975.
[24] A. Brandt. Multi-level adaptive solutions to boundary value problems. Math. Comp., 31:333-390, 1977.
[25] M.O. Bristeau, R. Glowinski, J. Periaux, P. Perrier, O. Pironneau, and C. Poirier. On the numerical solution of nonlinear problems in fluid dynamics by least squares and finite element methods (II), application to transonic flow simulations. Comp. Meth. Appl. Mech. and Eng., 51:363-394, 1985.
[26] R.J. Busch, Jr. Computational fluid dynamics in the design of the Northrop/McDonnell Douglas YF-23 ATF prototype. AIAA paper 91-1629, AIAA 21st Fluid Dynamics, Plasmadynamics & Lasers Conference, Honolulu, Hawaii, 1991.
[27] C. Canuto, M.Y. Hussaini, A. Quarteroni, and D.A. Zang. Spectral Methods in Fluid Dynamics. Springer-Verlag, 1987.
[28] M.H. Carpenter, D. Gottlieb, and S. Abarbanel. Time-stable boundary conditions for finite-difference schemes solving hyperbolic systems: Methodology and application to high order compact schemes. ICASE Report 93-9, Hampton, VA, March 1993.
[29] D.A. Caughey. A diagonal implicit multigrid algorithm for the Euler equations. AIAA Paper 87-453, 25th Aerospace Sciences Meeting, Reno, January 1987.
[30] T. Cebeci and A.M.O. Smith. Analysis of Turbulent Boundary Layers. Academic Press, 1974.
[36] J.P. Croisille and P. Villedieu. Kinetic flux splitting schemes for hypersonic flows. In M. Napolitano and F. Sabetta, editors, Proc. 13th International Congress on Numerical Methods in Fluid Dynamics, pages 310-313, Rome, July 1992. Springer.
[37] R.M. Cummings, Y.M. Rizk, L.B. Schiff, and N.M. Chaderjian. Navier-Stokes predictions for the F-18 wing and fuselage at large incidence. J. of Aircraft, 29:565-574, 1992.
[38] G. Dahlquist. A special stability problem for linear multistep methods. BIT, 3:27-43, 1963.
[39] J.F. Dannenhoffer and J.R. Baron. Robust grid adaptation for complex transonic flows. AIAA Paper 86-0495, AIAA 24th Aerospace Sciences Meeting, Reno, January 1986.
[40] D. Degani and L. Schiff. Computation of turbulent supersonic flows around pointed bodies having crossflow separation. J. Comp. Phys., 66:173-196, 1986.
[41] B. Delaunay. Sur la sphère vide. Bull. Acad. Science USSR VII: Class Scil, Mat. Nat., pages 793-800, 1934.
[42] S.M. Deshpande. On the Maxwellian distribution, symmetric form and entropy conservation for the Euler equations. NASA TP 2583, 1986.
[43] A. Eberle. A finite volume method for calculating transonic potential flow around wings from the minimum pressure integral. NASA TM 75324, 1978. Translated from MBB UFE 1407(0).
[44] P.R. Eiseman. A multi-surface method of coordinate generation. J. Comp. Phys., 33:118-150, 1979.
[46] L.E. Eriksson. Generation of boundary-conforming grids around wing-body configurations using transfinite interpolation. AIAA Journal, 20:1313-1320, 1982.
[...] Dynamics Conference, San Diego, CA, June 1995.
[48] R.P. Fedorenko. The speed of convergence of one iterative process. USSR Comp. Math. and Math. Phys., 4:227-235, 1964.
[49] C.W. Gear. The numerical integration of stiff ordinary differential equations. Report 221, University of Illinois Department of Computer Science, 1967.
[50] M. Giles, M. Drela, and W.T. Thompkins. Newton solution of direct and inverse transonic Euler equations. AIAA Paper 85-1530, Cincinnati, 1985.
[51] S.K. Godunov. A difference method for the numerical calculation of discontinuous solutions of hydrodynamic equations. Mat. Sbornik, 47:271-306, 1959. Translated as JPRS 7225 by U.S. Dept. of Commerce, 1960.
[52] M.H. Ha. The impact of turbulence modelling on the numerical prediction of flows. In M. Napolitano and F. Sabetta, editors, Proc. of the 13th International Conference on Numerical Methods in Fluid Dynamics, pages 27-46, Rome, Italy, July 1992. Springer Verlag, 1993.
[53] W. Hackbusch. On the multi-grid method applied to difference equations. Computing, 20:291-306, 1978.
[54] M. Hafez, J.C. South, and E.M. Murman. Artificial compressibility method for numerical solutions of the transonic full potential equation. AIAA Journal, 17:838-844, 1979.
[55] M.G. Hall. Cell vertex multigrid schemes for solution of the Euler equations. In Proc. IMA Conference on Numerical Methods for Fluid Dynamics, Reading, April 1985.
[56] A. Harten. High resolution schemes for hyperbolic conservation laws. J. Comp. Phys., 49:357-393, 1983.
[57] P.W. Hemker and S.P. Spekreijse. Multigrid solution of the steady Euler equations. In Proc. Oberwolfach Meeting on Multigrid Methods, December 1984.
[58] J.L. Hess and A.M.O. Smith. Calculation of non-lifting potential flow about arbitrary three dimensional bodies. Douglas Aircraft Report ES 40622, 1962.
[59] C. Hirsch, C. Lacol, and H. Deconinck. Convection algorithms based on a diagonalization procedure for the multi-dimensional Euler equations. In Proc. AIAA 8th Computational Fluid Dynamics Conference, pages 667-676, Hawaii, June 1987. AIAA Paper 87-1163.
[60] C. Hirsch and P. Van Ransbeeck. Multi-dimensional upwinding and artificial dissipation. Technical report. Published in Frontiers of Computational Fluid Dynamics 1994, D.A. Caughey and M.M. Hafez, editors, Wiley, pp. 597-626.
[61] D.G. Holmes and S.H. Lamson. Adaptive triangular meshes for compressible flow solutions. In Proceedings First International Conference on Numerical Grid Generation in Computational Fluid Dynamics, pages 413-424, Landshut, FRG, July 1986.
[62] T.J.R. Hughes, L.P. Franca, and M. Mallet. A new finite element formulation for computational fluid dynamics, I, Symmetric forms of the compressible Euler and Navier-Stokes equations and the second law of thermodynamics. Comp. Meth. Appl. Mech. and Eng., 59:223-231, 1986.
[63] A. Jameson. Iterative solution of transonic flows over airfoils and wings, including flows at Mach 1. Comm. on Pure and Appl. Math., 27:283-309, 1974.
[64] A. Jameson. Transonic potential flow calculations in conservation form. In Proc. AIAA 2nd Computational Fluid Dynamics Conference, pages 148-161, Hartford, 1975.
[65] A. Jameson. Solution of the Euler equations by a multigrid method. Appl. Math. and Comp., 13:327-356, 1983.
[66] A. Jameson. Solution of the Euler equations for two dimensional transonic flow by a multigrid method. Appl. Math. Comp., 13:327-356, 1983.
[67] A. Jameson. Multigrid algorithms for compressible flow calculations. In Second European Conference on Multigrid Methods, Cologne, October 1985. Princeton University Report MAE 1743.
[69] A. Jameson. Transonic flow calculations for aircraft. In F. Brezzi, editor, Lecture Notes in Mathematics, Numerical Methods in Fluid Dynamics, pages 156-242. Springer Verlag, 1985.
[70] A. Jameson. Multigrid algorithms for compressible flow calculations. In W. Hackbusch and U. Trottenberg, editors, Lecture Notes in Mathematics, Vol. 1228, pages 166-201. Proceedings of the 2nd European Conference on Multigrid Methods, Cologne, 1985, Springer-Verlag, 1986.
[71] A. Jameson. A vertex based multigrid algorithm for three-dimensional compressible flow calculations. In T.E. Tezduar and T.J.R. Hughes, editors, Numerical Methods for Compressible Flow - Finite Difference, Element and Volume Techniques, 1986. ASME Publication AMD 78.
[72] A. Jameson. Aerodynamic design via control theory. J. Sci. Comp., 3:233-260, 1988.
[73] A. Jameson. Automatic design of transonic airfoils to reduce the shock induced pressure drag. In Proceedings of the 31st Israel Annual Conference on Aviation and Aeronautics, Tel Aviv, pages 5-17, February 1990.
[74] A. Jameson. Time dependent calculations using multigrid, with applications to unsteady flows past airfoils and wings. AIAA paper 91-1596, AIAA 10th Computational Fluid Dynamics Conference, Honolulu, Hawaii, June 1991.
[75] A. Jameson. Artificial diffusion, upwind biasing, limiters and their effect on accuracy and multigrid convergence in transonic and hypersonic flows. AIAA Paper 93-3359, AIAA 11th Computational Fluid Dynamics Conference, Orlando, FL, July 1993.
[76] A. Jameson. Artificial diffusion, upwind biasing, limiters and their effect on accuracy and multigrid convergence in transonic and hypersonic flow. AIAA paper 93-3359, AIAA 11th Computational Fluid Dynamics Conference, Orlando, Florida, July 1993.
[77] A. Jameson. Optimum aerodynamic design via boundary control. Technical report, AGARD FDP/Von Karman Institute Special Course on Optimum Design Methods in Aerodynamics, Brussels, April 1994.
[78] A. Jameson. MAE Technical Report 2050, Princeton University, Princeton, New Jersey, October 1995.
[79] A. Jameson. Analysis and design of numerical schemes for gas dynamics 1, artificial diffusion, upwind biasing, limiters and their effect on multigrid convergence. Int. J. of Comp. Fluid Dyn., 4:171-218, 1995.
[80] A. Jameson. Analysis and design of numerical schemes for gas dynamics 2, artificial diffusion and discrete shock structure. Int. J. of Comp. Fluid Dyn., To Appear.
[81] A. Jameson and T.J. Baker. Improvements to the aircraft Euler method. AIAA Paper 87-0452, AIAA 25th Aerospace Sciences Meeting, Reno, [...]
[...] grids. In S. McCormick, editor, Multigrid Methods, Theory, Applications and Supercomputing. Lecture Notes in Pure and Applied Mathematics, volume 110, pages 413-430, April 1987.
[84] A. Jameson, W. Schmidt, and E. Turkel. Numerical solution of the Euler equations by finite volume methods using Runge-Kutta time stepping schemes. AIAA Paper 81-1259, 1981.
[85] A. Jameson, W. Schmidt, and E. Turkel. Numerical solutions of the Euler equations by finite volume methods with Runge-Kutta time stepping schemes. AIAA paper 81-1259, January 1981.
[86] A. Jameson and E. Turkel. Implicit schemes and LU decompositions. Math. Comp., 37:385-397, 1981.
[87] M. Jayaram and A. Jameson. Multigrid solution of the Navier-Stokes equations for flow over wings. AIAA paper 88-0705, AIAA 26th Aerospace Sciences Meeting, Reno, Nevada, January 1988.
[88] D. Johnson and L. King. A mathematically simple turbulence closure model for attached and separated turbulent boundary layers. AIAA Journal, 23:1684-1692, 1985.
[89] W.P. Jones and B.E. Launder. The calculation of low-Reynolds-number phenomena with a two-equation model of turbulence. Int. J. of Heat Tran., 16:1119-1130, 1973.
[90] W.H. Jou. Boeing Memorandum AERO-B113B-L92-018, September 1992. To Joseph Shang.
[91] T.J. Kao, T.Y. Su, and N.J. Yu. Navier-Stokes calculations for transport wing-body configurations with nacelles and struts. AIAA Paper 93-2945, AIAA 24th Fluid Dynamics Conference, Orlando, July 1993.
[92] G.E. Karniadakis and S.A. Orszag. Nodes, modes and flow codes. Physics Today, pages 34-42, March 1993.
[93] M.H. Lallemand and A. Dervieux. A multigrid finite-element method for solving the two-dimensional Euler equations. In S.F. McCormick, editor, Proceedings of the Third Copper Mountain Conference on Multigrid Methods, Lecture Notes in Pure and Applied Mathematics, pages 337-363, Copper Mountain, April 1987.
[94] A.M. Landsberg, J.P. Boris, W. Sandberg, and T.R. Young. Naval ship superstructure design: Complex three-dimensional flows using an efficient, parallel method. High Performance Computing 1993: Grand Challenges in Computer Simulation, 1993.
[95] P.D. Lax. Hyperbolic systems of conservation laws. SIAM Regional Series on Appl. Math., II, 1973.
[97] P.D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure. Appl. Math., 13:217-237, 1960.
[98] B. Van Leer. Towards the ultimate conservative difference scheme. II. Monotonicity and conservation combined in a second order scheme. J. Comp. [...]
January 1987. Phys., 14:361-370,1974,
1-27
[99] B. Van Leer. Towards the ultimate conservative [ 1151 L. Mdnelli, A. Jameson, +d E. Malfa. Numerical
difference scheme. ID upstream-centered finite- simulation of three-dimennonal vortex flows over
difference schemes for ideal compressible flow. J. delta wing configurations. In M. Napolitano and
Comp. Phys., 23963-275,1975. E Solbetta, editors, Pmc. 13th International Con-
frence on Numerical Methoh in Fluid Dynamics,
ages 534-538. Rome, Italy, July 1992. Springer
eerlag, 1993.
[116] L. Martinelli and V. Yakhot. Y G - b F e d ,turbu-
lence transport approximations w~thapphcauons to
transonic flows. AIAA Paper 89-1950, AIAA 9th
Com utational Fluid Dynamics Conference, Buf-
[ 1011 B. Van Leer. P r o ~ s &multi+imension_al-up-
s f a l o , b , June 1989.
wind differencing. In M. Napolltano and F. Sol-
betta, editors, Pmc. 13th International Conference [I171 D.J. Mavri lis and A. Jameson, Multifid solu-
on Numerical Methotds in Fluid D f u z m y jages tion of the%avier-Stokes equauons on tnangular
1-26, Rome, July 1992. !jpnnger erlag 19 3 meshes. AIAA Journal, 28(8):1415-1425, August
1990.
[lo21 B. Van Leer, W. T. Lee, and P. L. Roe. Charac-
teristic tlme stepping or local preconditionin of [118] D.J. Mawiplis and L. Martinelli. Multigrid solu-
the Euler equauons. AIAA aper 91 1552, A h A tion of compressible turbulentflow on unstructured
loth Corn utational Fluid &namicE Conference, meshes using a two- uation model. AIAA Paper
Honolulu,%awaii, June 1991. 91-0237, January 1 9 3 .
[ 1031 D. Lefebre, J.Peraire, and K.Mor an. Fhte ele- [119] N. D. Melson, M. D. Sanehilt, and H. L. Atkins.
ment least squares solutions of the,
E! uler equations lime-accurate Navier-Stokes calculauons w~th
using linear and quadrauc ap roxunauons. Int. J. multigrid acceleration. In Pmceedings of the Sixth
Comp. FluidDynamics, 1:l-$3. 1993. Copper Mountain Conference on Multigrid Meth-
ods, Copper Mountam, Apnl1993.
[ 1041 S.K.Lele. Compact finite difference schemes with
tral-like resoluuon. CTR Manuscnpt 107,
?KO.
[lo51 J.L. Lions.
[121] E Menter. Zonal twc- uation k w turbulence
models for aerod namicYows. ~ I A A Pcper 93-
[106] M-S. Liou and C.J. Steffen. A new flux splitting 2906, AIAA 2 4 d Fluid D y n m c s Meehng, Or-
scheme. J. Comp. Phys., 107:23-39,1993. lando, July 1993.
[ 1221 E.M. Murman. Analysis of embedded shock waves
[lo7 E Liu and A. Jameson. Mu!tigrid Navier-Stokes calculated by relaxation methods. AIM Journal,
calculationsfor three-dimensional cascades. AlAA 12626433.1974.
aper 92-0190, AIAA 30th Aeros ace Sciences
beeting, Reno, Nevada, January 1982. [ 1231 E.M. Murman and J.D. Cole. Calculation of plane
steady transonic flows. AIAA Journal, 9:114-121.
[ 1081 R. Lohner and D. Marfin. An implicit linelet-based 1971.
solver for incompressible flows. AIAA a er 92
0668, AIAA 30th Aerospace Sciences?&etinL [ 1241 R.H. Ni. A multiple grid scheme for solving the Eu-
Reno,W.January 1992. lerequations. AlAA Journal,201565-1571,1982.
[lo91 R. Lohner, K. Morgan, and,]. Peraire. Improved [ 1251 S. 0bayas.F and K. Kuwakara. LU factorization
adaptive rehement strateges for the finite ele- of an imphcit scheme for the compressible Navier-
ment aerod namic conligurations. AIAA Paper Stokes uations. AIAA Paper 84-1670, AIAA
86-0499, Al%A 24th Aerospace Sciences Meeting, 17th F l 8 Dynamics and Plasma Dynamics Con-
Reno, January 1986. ference, Snowmass.June 1984.
[I101 R. Lohner, K. Morgan, J. Peraire, and O.C. [126] J.T. Oden. L. Demkowicz, T. Liszka, and
Zienkiewicz. F~mteelement methods for high W.Rachowicz. h- ad tive finite e!ement meth-
s ed flows. In Pmc. ALAA 7th Computational ods for compressibye an? incompressibleflows. In
Xid Dynamics conference. Cincinnati, OH, S. L. Venneri A. K.Noor, editor, Pmceedings of
1985. AIAAPaper85-1531. the Symposium on Computational Technology on
Flight Vehicles, a es 523-534, Washington, D.C.,
[l111 R. Liihner and P. Parikh. Generation of three.- November l d & r g a m o n .
dimensional unstructured grids by the advancing [127] S. Orszag and D. Gottlieh. Numerical analysis of
front method. AIAA Paper 88-0515. Reno, W , spectral methods. SIAM Regional Series on Appl.
January 1988. Math., 26,1977.
[ 1121 R.W. MacCormack ,and A.J. ,Paullay. . Com uta 11281 S. Osher. , Riemann solvers, fhe entro condi-
tionalefiiciency acheved by m e sphmng of E n i i tion, and difference approxunatlons.S l A r J . Num.
differenceoperators.AIAAPaper 72-154.1972. Anal., 121:217-235,-1984.
[ I 131 A. Majda and S: Osher. Numerical viscosity and [129] S. Osher and S. Chakravarthy., Hi h resolution
theentro condrtlon. C o r n onPureAppl. Math., schemes and the entropy condmon. h U J . Num
32:797-& 1979. Anal., 21:955-984,1984.
[114] L.Martine1liandA. Jameson.Validationofamulti- [130] S. Osher and E Solomon. Upwind difference
d method for the Reynolds averaged equauons. schemes for hyperbolic s stems of conservation
E 4 A paper88-0414.1988. laws. Math. Comp., 38:336-374, 1982.
1-28
U1311 S. Osher and E. Tadmor. On the convergence of [1481 C.L. Rumsey andYN. Vatsa. A corn arison of the
difference approximations to scalar conservation %
predictive capabilities of several tur ulence mod-
els using upwind and centered - difference com-
laws. Math. Comp., 5019-51, 1988.
puter codes. AIAA Paper 93-0192, AIAA 31st
[I321 H. Pailhe and H. Deconinck. A review of multi- Aerospace Sciences Meeting, Reno, January 1993.
dimensional upwind residual distribution schemes
for the euler equations. To appear in CFD Review, 11491 S.S. Samant, J.E. Bussoletti, ET. Johnson, R.H.
1995. Burkhart,B.L. Everson, R.G. Melvin, D.P. Youn
L.L. Erickson, and M.D. Madson. TRANAIR k
corn uter code for transonic anal ses of a r b i t r ~
[1331 N. A. Pierce and M. B. Giles. Preconditioning on
stretchedmeshes. Report9YlO 1995, OxfordUni- contf)gurations.AIAA Paper 87-0834.1987,
versity Computing Laboratory, Oxford, December
1995. [ 1501 K. Sawada and S. Takanashi. A numerical investi-
ation on wing/nacelle interferences of USB con-
U341 0. Pironneau. Optimal Sha e Desi n or ENiptic EgFation. In Pmceedin s AIAA 25th Aemspace
System. Spnnger-Verlag, dw
Yorf, f984. Sciences Meeting, Reno,&, 1987. AIAA paper
87-0455.
[ 1351 K.G. Powell and B. van Leer. A genuinely multidi-
mensional upwind cell-vertex scheme for the Eu- [151] C.W. Shu and S. Osher. Efficient implementa-
ler equations. AIAA Paper 89-0095, AIAA 27th tion of essentially non-oscillato shock-ca turing
Aerospace SciencesMeeting, Reno, January 1989. schemes. J. Comp. Phys., 77:43v471,198!.
[136] K. Prendergast and K. Xu. Numerical hydrody- [1521 C.W. Shu and S. Osher. Efficient implementa-
namics from gas kmetic theory. J. Comp. Phys., tion of essentially non-oscillato shock-ca turing
109:5366, November 1993. schemes 11. J. Comp. Phys., 83:%-78,1988.
[ 1371 T.H. pulliam and J.L. Steger. Implicit finite differ- [1531 C.W. Sbu, T.A. Zang. G. Er1ebacher.D. Whitaker,
ence simulations of three-dimensional compress- and S. Osher. High-order EN0 schemes ap lied
ible flow. AlAA Journal, 18:159-167,1980, to two- and three-dimensional compressible iow,
Appl. Num Math., 9:45-71, 1992.
[I381 J.J. Quirk. An altemauve to unstructured gnds [I541 B. R Smith. A near wall model for the IC - 1 two
for compuung as dynamics flowsnbout arbitranly equation turbulence model. AIAA paper 942386,
com lex two-cknensional bodies. ICASE Report
92-$Hampton. VA, February 1992. 25th AIAA Fluid D namics Conference, Colorado
Springs, CO, June 1'994.
11391 R. Radespiel, C. Rossow, and R.C. Swanson. An
efficient cell-vertex multigrid scheme for the three- [I551 L.M. Smith and W.C. Reynolds. On the Yakbot-
dimensional Navier-Stokes equations. In Pmc. Orszag renormalization ou for deriving turbu
AIAA 9th Corn utational Fluid D namics Con- lence statistics and modef PRys. FluidSA, 4:36L
ference, pages %9-260, Buffalo, I$Y, June 1989. 390,1992.
AIAA Paper 89-1953-CP. [1561 R.E. Smith. Three-dimensional aleebraic mesh
U1401 M.M. RaiandP. Moin. Direct numerical simulation eneration. In Pmc. AlAA 6th Corn uktional Fluid
of transition and turbulence in a spatially evolvin bynamics Con erence, Danvers, d A , 1983. AIAA
boundary layer. AIAA Pa er 91 1607 CP,
loth Corn utational Fluidgynakcs Conference,
d Paper 8 3 - 1 9 d
Honolulu,&. June 1991. [ 1571 R.L. Sorenson. Elliptic generauon of compressible
three-dimensional gnds about realistic aircraft. In
(1411 S.V.RaoandS.M.Deshpande. Aclassofefficient J. HauserandC.Taylor, editors,Infemotiono/Con-
kinetic U wind methods for com ressible flows. feynce on Numerical Grid Generation in Com U
Report9fFMI1, IndianInstituteo! Science, 1991. fafionalF/uidDynomics,Landshut, F. R. G., 1886
[ 1421 J. Reuther and A. Jameson. Aerod namic shape op- [ 1581 R.L. Sorenson. Tbree-dimensional elliptic gjid
eneration for an F-16. In J.L. Steger and
timization of wing and wing-bod; c o d
using control theory. AIAA paper 9 5 - 0 E % : B.Generation
F. Thompson, editors, Thne-Dimensiona/ ~ r i d
or Corn lex Confi umrions: Recent
Aerospace Sciences Meeting and Exibit, Reno,
Nevada, January 1995. Progress, lh8.AGhiDograpf?
H431 J. Reutherand A. Jameson. Aerod namic shape op- [159] P. Spalart and S. Allmaras. A one-equation tur-
J
timization of wing and wing-bo y c o d urations
usin control theory. AIAA aper 95-01f7, AIAA
bulent model for aerodynamic flows. AIAA Paper
92-0439, AIAA 30th Aerospace SciencesMeeting,
331Lf Aerospace Sciences d e t i n g , Reno,Nevada, Reno, Nv, January 1992.
January 1995. [160] C.G. Speziale,, E.C. Anderson, and R. Abid. A
[1441 H. Rieger q d A. Jameson: Solution of steady critical evaluatlon of two-equatlon models for near
three-hmensional comgressible Euler and Navier- wall turbulence. AIAA Paper9C-1481, June 1990.
Stokes e uations b an im licit LU scheme. AIAA ICASE Report 90-46.
p e r ' 8 h 6 1 9 , A h 26% Aeros ace Sciences [I611 J.L. StegerandDS Chaussee. Generation ofbodv- .-
ceeting, Reno, Nevada, January 19118. ~~
Figure 21: Navier-Stokes Predictions for the F-18 Wing-Fuselage at Large Incidence.
Figure 22: O-Topology Meshes, 160x32. 22a: RAE-2822 Airfoil. 22b: NACA-0012 Airfoil.
Figure 24: NACA-0012 Airfoil at Mach 0.800, H-CUSP Scheme (angle of attack illegible in source). 24a: Cp after 35 Cycles, Cl=0.3654, Cd=0.0232. 24b: Convergence.
Figure 25: NACA-0012 Airfoil at Mach 0.850 and α=1.0°, H-CUSP Scheme. 25a: Cp after 35 Cycles, Cl=0.3861, Cd=0.0582. 25b: Convergence.
[Figure 29 (plot not recoverable). Panel titles: 29a: span station z=0.00; 29b: span station z=0.312]
[Figure 30 (plot not recoverable). Panel title: 30c: span station z=0.625]
[Figure 31 (plot not recoverable). Panel title: 31c: span station z=0.625]
Paul E. Rubbert
The Boeing Company
Boeing Commercial Airplane Group
P.O. Box 3707, M/S 67-UC
Seattle, WA 98124-2207
U.S.A.
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
We in industry are pursuing those goals by focusing on processes (ref. 2). We now understand
that the key to developing better airplanes is to analyze, understand, and improve the
processes by which airplanes are created. Similarly, the key to developing better CFD is to
analyze, understand, and improve the processes by which CFD is created. We also now
understand that the leading principle of good processes is customer focus and customer
satisfaction. That principle applies equally to the processes that produce airplanes and the
processes that produce CFD.

And so it seems to me that the most significant pacing item in the world of CFD is the need
to analyze, understand and improve the process by which CFD capabilities are created. I call
that process the research engine. There is more leverage in fixing up the research engine and
adapting it to the changes in the world than in anything else I can think of. And so that is
what I am going to talk about.

The Research Engine and How It Worked

The research engine as we know it today involves industry, academia, and government. Those
three components interact with each other as a system. And like most systems, one component
cannot be changed without affecting the others. It doesn't work for industry to change and
the others not to change. We are all in this together.

The need to change pervades the entire research infrastructure. It involves information
systems and the methods by which we communicate, including the holy grails of technical
societies, publications, and technical conferences. It involves the changing of value
systems, which is almost a cultural characteristic. And reward systems. Changing is not easy.

I would like to begin by examining how the research engine functioned in the era that we are
leaving behind. Figure 1 (see Page 2-3) presents a description of the fundamental factors and
forces that powered the research engine. This description appears to be universal. It looks
the same, no matter whether you reside in industry, academia, or in a government laboratory.
It works the same way. Only the names of the players may differ.

Key players are the money givers. Their role is to divide up money into various large
buckets, each directed at a particular category of research, and to distribute it. We all
know who those people are. They are the ones to whom we write research proposals. Money
givers can be found in NASA, in the National Science Foundation, in the Department of
Defense, and in similar institutions in Europe. They also are present in industry.

Most money givers are not close to the real details of airplane design, or to the detailed
processes that use CFD as a tool. They operate at a higher, more strategic level. But they
still need criteria by which to decide how to divide up the money. It is instructive to take
a look at what some of those criteria were.

One such criterion was to divide up the money based on historical precedent. That was, and
still is, practiced far and wide. It is a symptom of zero accountability and zero ability to
discern what is important.

Money givers are also susceptible to being influenced by the visionary utterances of the
people who inhabit the lower left box of figure 1, the research leaders. Research leaders are
in the business of creating and marketing visions of how to make the world better. Many of
them have become very good at creating visions for research that will be looked favorably
upon by the money givers. They treat the money giver as the customer. One result of that, of
course, is that the research funding decisions that get made can be quite unrelated to the
true needs of the people who design airplanes for a living.

Money givers also are desirous of evaluating the caliber of the researchers to whom they will
give money. It is rarely possible to
point to a feature on an airplane and say "this research contributed to ---." So, it was
necessary to establish other measures in order to create a value system which could be
applied to individuals.

[Figure 1 (page 2-3): the fundamental factors and forces that powered the research engine. Legible labels: "monies captured", "Size of your empire".]

One popular measure was to look at the prestige bestowed upon a researcher, not by his
customers, but by his peers, the other researchers. Can you imagine what automobiles would be
like if the criteria for designing them was to please the other designers, rather than the
people who want to use cars to drive about in?

Another popular measure has been to count the number of refereed papers that are produced by
a researcher. One consequence of this is that our journals and conferences have become
littered with papers whose real contribution is low or nonexistent. The journals have evolved
into being primarily a scorekeeping system. Scientific information today travels largely by
other means.

Another consequence of the numbers game is that it encourages researchers to attack problems
that they know how to solve rather than the problems that need to be solved. And so our
entire research infrastructure was caught up in a value system that was largely unrelated to
what was important to the engineers who design and build airplanes for a living. What counted
was paying homage to a value system that controlled access to the annual pot of money
necessary to support the research leader and his/her staff.

A standard part of the job of being a research group leader was also to make all of the
important decisions concerning the detailed content of the annual research plan. After all,
since research leaders are normally exposed to new and emerging technology that a design
engineer is not, it was quite obvious to research leaders that they, and not design
engineers, should be in charge of
defining the annual research plan. And so the design engineering community was excluded from
participating.

The researchers themselves focused their work on paying homage to the value system, because
that is what entitled them to go back to the well for next year's funding, and to become
eminent in the eyes of their peers.

So there it is. A stable, self-sustaining research engine that was capable of functioning
quite smoothly, all by itself in its own little world. It did so for many years. Its
weakness, of course, is that it had been almost disconnected from the community of people who
we now understand to be the customers of CFD research, namely the practicing engineers who
design airplanes for a living.

Figure 2 (see Page 2-5) exhibits the interfaces, such as they were, that existed between the
research engine and the aeronautical industry. One such interface involved the money givers,
who were visited periodically by clouds of collective wisdom passing overhead. Those clouds
appeared in the form of high level advisory committees, wishes of the U.S. Congress, or of
industry executives, depending upon where the money giver happened to reside. It is not
entirely coincidental that these clouds are shown to be comprised of the condensation of hot
air rising from airplane companies. In any event, the resulting fallout from these clouds
caused the money givers to occasionally re-balance their research portfolios.

The other interface lay between the researchers and the practicing engineers who reside in
airplane companies. This interface is characterized by the fence in figure 2. Interestingly,
the site of the fence was not always in front of the door of the airplane company. It
frequently could be found inside the airplane company, standing between the internal company
research department and the practicing engineers who designed the airplanes. In those cases,
the company research departments paid most allegiance to the research engine and acted as an
integral part of it, particularly if they were dependent upon outside contract funding as a
source of research money.

Communication over the fence was mostly one way. It consisted primarily of attempts by
researchers to interest the engineering community in the products of their research. The
system coined a name for this, calling it "technology transfer."

The favored means of lofting the results of CFD research over the fence was to send it across
on the wings of a scientific publication. The publication was the messenger that told of its
charms and attributes. And to make sure that at least some folks in the airplane company
would see it, the researcher empowered his delivery system to honk, to attract attention.
Such honking is frequently heard at technical conferences and symposia. In fact, that seems
to have become the prime motivation for conference attendance. Overlooked was the fact that
airplane design engineers rarely attended those conferences.

The boards of the fence have names inscribed upon them, entitled "conferences," "journals,"
"perceptions," "value systems," "reward systems," etc. Those pillars of tradition and
conventionality are turning out to be among the factors that impede our ability to create a
research engine that is more properly connected to the customer.

It was a very eye-opening experience to us in the United States when NASA instituted some
dramatic changes in communication. They changed the format of some of their conferences from
one wherein the researchers did all of the honking to one in which industry did most of the
talking and researchers did most of the listening. Lo and behold, it was discovered that the
research community was not in fact immune to learning about what was important. We found that
they could even learn from people who didn't have PhDs and a lengthy record of refereed
publications. The power of two-way communication began to be unlocked!

So, somewhere along this journey of change we must abandon or at least supplement our old,
one-way habits of communication as
[Figure 2 (page 2-5): interfaces between the research engine and the aeronautical industry. Legible labels: "Research Leaders", "What has my job become?"]
- ability to draw upon all of our intellectual resources, both in developing the research
plan and in executing it.
- nimbleness in translating the output of research into products and processes.
- a value system that causes more of the "right" things to occur.
- and perhaps most important of all, a value system that supplies high levels of human
motivation and sense of worth, one that leads people from within to do more of the right
things, and to make it fun once again to be a researcher.

Even though my vision of how to accomplish all of that is yet incomplete, I find within
myself a growing conviction about some of the things that tomorrow's research engine must
contain. One of those things is a better understanding of the proper distribution of roles,
responsibilities, and core competencies that should prevail across the R&D food chain
comprising industry, academia, and government. What that distribution should be can be
derived by testing it against the axioms that accompany the new industrial paradigm, an
exercise that certain segments of the research establishment find to be somewhat threatening.
One outcome of that testing is the finding that a best and proper role for academia, and for
much of NASA, is to concentrate on the foundational, overarching, enabling technology
research which comprises the head of the R&D food chain.

Another of my convictions is that we must find a much better way of connecting the top and
the bottom of the R&D food chain. This is something that we as a country have not yet learned
to do well at all. And yet the issues involved are central to achieving a research engine
that contains the attributes that we desire.

Connecting the two ends of the food chain is an issue in communication. We have to develop an
understanding of what needs to be communicated, and when. And then we have to institute
mechanisms to make it happen.

I have been fortunate enough to have enjoyed the privilege of running a research operation
that encompassed the entire span of a research food chain, from foundational, enabling
algorithm technology, and fluid mechanics, to production software, and customer support. In
that position, I was able to experiment, so I learned about some things that don't work and
other things that do work in properly connecting the two ends of the R&D food chain.

One thing that doesn't work well at all is to have research leaders at the head of the food
chain simply ask the folks at the bottom of the food chain what they want or need, and then
to blindly carry out their wishes. That leads mostly to short-term, evolutionary improvements
of limited vision. It leads to tactical research rather than the strategic research which
belongs at the head of the food chain, and it places the researchers in the position of "the
boiler room" staff. They have much more to offer than that. Many people have yet to learn the
true meaning of the words "customer focus" that have entered our language.

What does work, not only well but incredibly well in connecting the two ends of the R&D food
chain, is to do the following four things:

1. Eliminate the constraints imposed in a researcher's mind by the value system under which
he/she was educated. Make it O.K. to do things that are outside of the limits imposed by an
overly narrow value system. Create a mind-set and a curiosity within the researcher to wander
freely up and down the R&D food chain and even into manufacturing.

This is easier said than done. But it can be done. I've done it! It has to be done, because,
more than anything else, it is the key that unlocks the power and the potential of the highly
educated, highly
opposition. This requires an act of courage on the part of the research leader and the money
givers. But in my experience it rarely fails to produce handsome dividends. The only problem,
if it should even be called a problem, is that at this stage nobody yet knows in exactly what
form that dividend will be experienced. That appears only in step four.

Step 4 is what I call "vision building." This is the key activity that converts push to pull
in the R&D food chain. The primary cause of failed research - and I define failed to mean
research that doesn't get picked up and used by anybody - is that the vision from the head of
the food chain that propelled the research, and the vision from the bottom of the food chain
about what those folks think is useful, have no common intersection. If those two visions,
originating from opposite ends of the food chain, cannot be made to intersect, the research
will not be accepted. It will be ignored by the people who call the shots in determining what
CFD gets used in the design of airplanes.

And so, a key element in the successful operation of an R&D food chain is the process that I
call "vision building," a process for bringing together the separate visions that originate
at the two ends of the R&D food chain. What does it take? Throwing publications or codes over
the fence, which is the traditional approach to vision building, doesn't work well at all.
Presenting "gee whiz" papers at conferences doesn't work. Arguing back and forth doesn't work
well either. Neither does voting. I've tried them all.

What works is for the research community to produce something that an engineer can "touch and
feel," usually a CFD code capable of performing a small number of computations that
illustrate what can be done. This is not the time or the place for well-documented code, user
friendly input formats, or polished and orderly software. Rather, the researcher at this
point is engaged in a race to discovery and understanding before his fragile support system
runs out of patience. Shortcuts are acceptable and encouraged, with one exception. That
exception is execution efficiency. This is one of the key measures that will be "touched and
felt" and usually should not be compromised.

Some people (managers and software specialists in particular) will be troubled with the idea
of producing code that is undocumented, which probably does not adhere to standards, and
which contains shortcuts. That is because they interpret the code to be the product. They
fail to realize that the primary product of research at this stage is vision, not code!

The best way I found to build vision was for the researchers to again return to the customer
site. They would identify real design problems being faced by the design engineers and they
would set up and run demonstrations of their new CFD technology on those problems. This led
to side-by-side comparisons of new versus old ways of doing things. It frequently did not
contribute much at that point to the engineering project's near term design goals because the
code was still developmental, fragile, hard to use, perhaps containing a few bugs, and not
yet trustworthy.

What it did do, and do well, was to build vision within the minds of design engineers. A
typical reaction to a set of these calculations would be "So that is what you can do! Well,
if you add this and that, I can use it for -----." That is vision building! At that point the
engineer becomes an advocate of the research. This is when "push" changes to "pull" in the
R&D food chain.

The other thing that must happen is that the researcher must be able to now let go of his
original vision, the one that led him to produce the CFD technology that is being
demonstrated. He must allow himself to be influenced by the engineer-now-becoming-the-customer.
He must adopt a new and better vision.

Vision building must be a two way street. It is a coming together, in the middle, of what
were originally different visions at opposite ends of the food chain. It is not for one end
of the food chain to convince the other end that its vision is best. It demands two-way
communication. It is intense. It requires
REFERENCES
1. Raj, P., "Requirements for Effective Use of CFD in Aerospace Design," NASA CP 3921, pp 15-28, May 1995
2. Rubbert, P. E., "AIAA Wright Brothers Lecture: CFD and the Changing World of Airplane Design," ICAS-94-0.2, September 1994
The paper presents an overview of parallel computing in computational fluid dynamics. A
taxonomy of parallel computing architectures and programming paradigms is described. Issues
in parallel computing are discussed including domain decomposition and load balancing,
performance, scalability, benchmarks and portability. Examples of experience with parallel
computing in the aerospace industry are described.

This section presents an anecdotal discussion of the earliest reference to parallel
computing, describes Flynn's and Bell's classifications of parallel computer architectures,
and briefly discusses the message passing and data parallel programming paradigms.

2.1 Introduction
War II, was a parallel computer with 25 independent computing units (20 accumulators, 1
multiplier, 1 divider/square rooter, and 3 table look-up units) performing different tasks for the
solution of the specific problem. Moreover, the ENIAC used decimal arithmetic internally (as
opposed to the binary arithmetic used on modern computers) and operated on all ten decimal digits
of a number in parallel. The ENIAC was programmed in hardware, i.e., using a plugboard to wire
connections between the units. However, the parallel computing capability of the ENIAC was never
fully realized in practice. After two years of operation, it was reconfigured as a serial
centralized computer [2].

There are four distinct levels of parallelism [1]. The highest level is job, where the computer
system operates simultaneously on unrelated tasks (e.g., a CFD simulation for an F-18 and a CEM
simulation for a B-2). The second level is program, where the computer system operates
simultaneously on different parts of the same program (e.g., the parallelization of a DO loop
across multiple processors). The third level is instruction, where the different instructions are
performed in parallel (i.e., fetching one instruction from memory while performing an arithmetic
operation). The fourth level is arithmetic and bit, where parallelism is achieved within an
individual arithmetic or bit instruction. This paper focuses on the second level.

Flynn [3] originated a classification of parallel architectures which has become widely accepted
(Table 1). Four distinct categories are defined based on the data stream, which is the sequence of
instructions and/or data executed or operated on by a processor. Single Instruction Stream/Single
Data Stream (SISD) is the conventional serial architecture employing a single stream of data and a
single processor. This is also known as the von Neumann computer (or architecture) or a serial
computer. Modern single-processor workstations or micro-computers are examples of this category.
Single Instruction Stream/Multiple Data Stream (SIMD) computers have several computational units
which can perform the same operation (e.g., adding two numbers) simultaneously on different parts
of the data stream. An example is the Cray C-90. Multiple Instruction Stream/Single Data Stream
(MISD) implies simultaneous different operations by separate computational units on the same data
stream. Examples of this type are rare. Multiple Instruction Stream/Multiple Data Stream (MIMD)
indicates multiple computational units operating simultaneously on multiple data streams. Examples
are the Thinking Machines Corporation CM-5, the Cray T3D and, indeed, the ENIAC.

Table 1: Flynn’s Taxonomy

Acronym   Definition
SISD      Single Instruction Stream - Single Data Stream
SIMD      Single Instruction Stream - Multiple Data Stream
MISD      Multiple Instruction Stream - Single Data Stream
MIMD      Multiple Instruction Stream - Multiple Data Stream

Figure 1: Bell’s taxonomy of MIMD architectures (with examples)

Flynn’s classification, although useful for broadly categorizing parallel computers and widely
cited, is nonetheless incomplete, and various other classifications have been introduced. Bell [4]
subdivides the MIMD category into two subcategories as indicated in Fig. 1. Multiprocessors are
parallel computers with a single address memory (shared memory), i.e., the central memory (RAM) is
organized into a single logical address domain which is accessible to all of the processors
(Fig. 2). Processors P1, ..., Pn can access the same data in memory (i.e., the same address
location), albeit not simultaneously. Examples are the Cray C-90 and SGI Power Challenge XL.
Multicomputers are parallel computers with multiple distributed memory address spaces (Fig. 2).
Processors P1, ..., Pn have dedicated, independent memories M1, ..., Mn which are not directly
accessible by each other. Examples are the Intel Paragon, IBM SP2 and networks of individual
workstations. If processor P1 needs to access data in the memory assigned to processor Pn, it
sends a message to Pn requesting the data, and Pn complies. The transfer of data from the memory
of one processor to the memory of another is denoted message passing, and is a principal
characteristic of multi-computers. All communications between processors occur through a
communications network (C in Fig. 2). Many different types of communications network topologies
have been developed (see Fig. 3 of Bell [4]).

The relative advantages and disadvantages of multi-processors vs. multi-computers have been widely
studied, and numerous research (and production) machines of both types have been constructed [4].
Although greatly oversimplified, the main issues are as follows. For a multi-processor, the shared
memory eliminates the computational cost and program complexity of message passing. However, a
multi-processor with a single shared memory is not scalable, i.e., the architecture cannot simply
be scaled to an arbitrary number of processors and arbitrary memory size. This arises from the
limitation on data transfer rate (bandwidth) between memory and processors. This has led to a
subdivision of multiprocessors into two cate-

Other classifications of parallel computers have been developed, e.g., Shore [5], and Hockney and
Jesshope [1].

2.3 Parallel Programming

There are two basic types of parallel programming paradigms (or environments). As the name
suggests, message passing involves the explicit use of send and receive functions by the
applications programmer. These functions communicate information between the memory assigned to
individual processors. Many manufacturers of distributed memory parallel computers have developed
specialized message passing libraries (e.g., nCUBE, Intel), although standards are emerging (see
§3.5.2 and 3.5.3). The data parallel paradigm involves a single program which controls the
distribution of data across all processors, and the operations on the data. Typically, the data
parallel language supports array operations and permits entire arrays to be used in expressions.
Manufacturers of shared memory parallel computers have developed specialized compiler directives
for data parallel programming (e.g., Cray C-90 and SGI). An emerging standard for a data parallel
language is High Performance Fortran (§3.5.1).

2.4 Examples of Parallel Computers

Table 2 lists a number of current parallel computers. It should be emphasized that the information
shown does not fully describe the capabilities (and limitations) of a parallel computer. Other
relevant factors include memory bandwidth, cache memory, I/O bandwidth, compiler technology,
debugging software, etc. Furthermore, the performance specifications change frequently due to
product upgrades.
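As a rough illustration of the two programming paradigms of Section 2.3, the following Python
sketch forms the same global sum twice. In the message passing version each simulated processor
owns a private slice of the data, and partial sums reach processor 0 only through explicit send and
receive calls. In the data parallel version a single expression operates on the whole array. The
Processor class, its send and receive methods and the mailbox are hypothetical names introduced
for this example only. They do not reproduce the interface of any vendor library, of PVM or of
MPI, and the processors are simulated one after another rather than run concurrently.

```python
from collections import deque


class Processor:
    """Simulated node of a multicomputer: private memory plus a mailbox."""

    def __init__(self, rank, local_data):
        self.rank = rank
        self.memory = list(local_data)   # private memory, not visible to other nodes
        self.mailbox = deque()           # messages delivered by the "network"

    def send(self, dest, payload):
        # Explicit send: the only way data leaves this node's memory.
        dest.mailbox.append((self.rank, payload))

    def receive(self):
        # Explicit receive: returns (source rank, payload) of the oldest message.
        return self.mailbox.popleft()


def message_passing_sum(data, n_procs):
    """Each node sums its own slice; node 0 collects the partial sums."""
    chunk = (len(data) + n_procs - 1) // n_procs
    procs = [Processor(r, data[r * chunk:(r + 1) * chunk]) for r in range(n_procs)]
    root = procs[0]
    for p in procs[1:]:
        p.send(root, sum(p.memory))
    total = sum(root.memory)
    for _ in procs[1:]:
        _, partial = root.receive()
        total += partial
    return total


def data_parallel_sum(data):
    """Single program, whole-array operation; data distribution is implicit."""
    return sum(data)


if __name__ == "__main__":
    values = list(range(1, 101))
    print(message_passing_sum(values, 4), data_parallel_sum(values))
```

The message passing version makes the data layout and every transfer visible in the program text,
which is the source of both its flexibility and its programming effort.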
LEGEND
GByte    Gigabyte (10^9 bytes)
MFlops   Millions of floating point operations per second (theoretical maximum)
NOTES
2. Memory does not include secondary memory storage (e.g., Solid-state Storage Device (SSD) on the
   Cray C-90/T-90).
3. Dates in parentheses indicate manufacturer’s published date for availability.
3.3 Scalability
tP/(t" t t P ) ,
1
(4)
$$ C \propto \frac{O(N^3/n)^{2/3}}{N^3/n} = \left( \frac{1}{N^3/n} \right)^{1/3} \qquad (5) $$

Thus, for a fixed number of cells, the relative cost of communications can increase as the number
of processors increases¹.

maximum performance. Future enhancements to the NAS Parallel Benchmarks include the development of
a version using High Performance Fortran and Message Passing Interface (see below).
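The estimate of Eq. (5) can be made concrete with a few numbers. The Python sketch below evaluates
the ratio (N^3/n)^(2/3) / (N^3/n) for a fixed grid and an increasing number of processors. The
proportionality constant is taken as one and the subdomains are assumed to be roughly cubic, so
only the trend is meaningful, not the absolute values. The function name is chosen for this
example.

```python
def relative_communication_cost(n_cells_per_side, n_procs):
    """Ratio of Eq. (5): surface-like cost over volume-like cost per processor.

    n_cells_per_side is N (the grid holds N^3 cells), n_procs is n.
    The proportionality constant is assumed to be 1.
    """
    cells_per_proc = n_cells_per_side ** 3 / n_procs
    return cells_per_proc ** (2.0 / 3.0) / cells_per_proc


if __name__ == "__main__":
    N = 100  # fixed problem size: 10^6 cells
    for n in (1, 8, 64, 512):
        print(n, round(relative_communication_cost(N, n), 4))
```

Since the ratio scales as n^(1/3) for fixed N, each eightfold increase in the number of processors
doubles the relative cost of communications.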
3.4 Benchmarks

Numerous benchmarks have been developed for parallel computers². All benchmarks have limitations,
of course, and the overemphasis on (and misuse of) benchmarks has naturally led to a somewhat
skeptical attitude towards them. This is perhaps best epitomized by Bailey’s “Twelve Ways to Fool
the Masses When Giving Performance Results on Parallel Computers” [12].

¹An alternate definition of efficiency (denoted as scaled efficiency, and its counterpart, scaled
speedup) has been proposed whereby the ratio of communications cost to computational cost remains
fixed as n is increased. This is achieved by increasing the problem size (i.e., N)

3.5 Portability

In recent years, significant effort has been devoted to the development of standardized
environments for development of parallel codes. Three specific areas of activity are discussed
here, namely, development of a standard Fortran for parallel computing (HPF), a standard for
heterogeneous, network-based parallel computing environments (PVM), and the more recently
developed standard message-passing interface MPI. There are many other similar research efforts in
progress; however, space does not permit their discussion here.

³In contrast, for example, to the LINPACK benchmark [15] for the matrix of order 100 which is
written in FOR-
3.5.1 High Performance Fortran

High Performance Fortran [17, 18, 19] is a data parallel language which extends Fortran 90 to
provide additional support for the data parallel programming style while maintaining
compatibility⁵ with Fortran 90. Development of HPF was initiated in 1991 through the establishment
of the High Performance Fortran Forum, and the language specification was published in May 1993.
At present, twelve vendors have announced support of HPF. Additional information may be obtained
at http://www.erc.msstate.edu/hpff/home.html. A good introduction to HPF is provided by Foster
[21].

⁵For a description of Fortran 90, see Metcalf and Reid [20].

HPF extends Fortran 90 to include specific compiler directives to control the alignment and
distribution of data on parallel machines, and introduces new parallel features and additional
intrinsic library functions. For example, the PROCESSORS directive specifies the shape and size of
an array of (abstract) processors, and the ALIGN directive aligns elements of different arrays
with each other, thereby indicating that they should be distributed across processors in the same
manner. New intrinsic functions introduced by HPF include NUMBER_OF_PROCESSORS and
PROCESSORS_SHAPE which allow a program to obtain information on the number of processors on which
it executes and the connection topology.

Examples of applications written in HPF are presented in Hawick and Fox [22] and Mueller and Ruehl
[23]. A more extensive list is available on http://www.npac.syr.edu/hpfa/bibl.html.

3.5.2 Parallel Virtual Machine (PVM)

A recent major advancement is the development of heterogeneous, network-based parallel computing
environments. Unlike fixed parallel computer architectures (e.g., Cray C-90, Intel Paragon, etc.),
these network-based parallel computers are created as a virtual machine using software tools such
as PVM, Linda, P4 or Express. Typically, any number of different networked computers may be
connected to form a parallel machine, although usually the computers are fairly similar. Below, we
provide a brief description of PVM. Descriptions of other systems are available (e.g., [24, 25,
26]), and a reasonably comprehensive listing has been compiled by Turcotte [27]. Comparisons of
the relative merits of different systems have also been published (e.g., [28]).

PVM (Parallel Virtual Machine), created by the Heterogeneous Network Project (Oak Ridge National
Laboratory, the University of Tennessee and Emory University) initiated in 1989, consists of two
software packages [29, 30, 31, 32, 33]. The first is a daemon pvmd3 which executes on all of the
computers which comprise the virtual parallel machine. PVM is designed to enable any user with a
valid login to install and initiate pvmd3. The user specifies a list of computers which comprise
the virtual parallel machine, and starts pvmd3 on each one. The PVM application can then be
initiated from any of the computers. The second is a library of PVM routines libpvm3.a which
contains the user callable routines for message passing, spawning processes, coordinating tasks
and modifying the virtual machine.

PVM has been successfully implemented on numerous computer architectures [33]. These include
heterogeneous and homogeneous networks of computers, and also “individual” massively parallel
computers (e.g., Intel Paragon and Cray T3D). PVM is widely utilized in academia, industry and
government laboratories. It is estimated that more than 10,000 individuals or installations have
obtained the PVM software and approximately 20% to 25% are actively using it [34]. An index of PVM
software may be obtained by sending the message send index from pvm3 to [email protected].

An example of a PVM application is the Korringa, Kohn and Rostoker coherent potential
approximation (KKR-CPA) method for computing the electronic properties, energetics and other
ground state properties of substitutionally disordered alloys [33]. An effort of approximately
three months converted the 20K line KKR-CPA code to PVM. The code achieved approximately 200
MFlops using a network of ten IBM RS/6000 (6 model 530’s and 4 model 320’s) workstations, which is
estimated to be approximately 82% of the maximum floating point capability of this virtual system.
Also, the PVM KKR-
CPA code achieved over 9 GFlops performance using a network of twenty seven Cray C-90 and Cray
Y-MP processors scattered across several sites. Furthermore, the PVM KKR-CPA code was successfully
demonstrated for a virtual machine consisting of two Intel Paragons, a CM-5, an Intel i860 and IBM
workstations, which were geographically distributed at several sites.

Load balancing, latency and bandwidth are clearly important issues for implementation of a virtual
machine with PVM or other similar tools. In a heterogeneous environment, due consideration of the
relative performance of individual hosts is obviously needed in domain decomposition. Latency
(i.e., the time required to initiate a message) can be a critical issue, depending on the ratio of
communications to computation. Network bandwidth may be restricted due to existing traffic. Recent
enhancements to PVM [34] provide for improved performance. For example, the message passing
performance of PVM on the Intel Paragon⁶ is only 5% to 8% slower than the native functions [34].

⁶Using the functions pvm_psend() and pvm_precv() introduced in PVM Version 3.3.

3.5.3 Message Passing Interface (MPI)

MPI (Message Passing Interface) is a message passing standard for homogeneous and heterogeneous
parallel and distributed computing systems. The development of the MPI standard is a multinational
effort which was initiated in 1992 and is supported by ARPA, NSF and the Commission of the
European Community. The MPI standard was published in 1994 and is described in [35, 36, 37]. A
good introduction to MPI is provided by Foster [21], and a brief description is presented in [38].

An MPI program includes one or more processes which communicate with each other through calls to
MPI library routines. There are two types of communications, namely, point-to-point communication
between pairs of processes, and collective communication between groups of processes. Several
variants of “send” functions are provided to enable users to achieve peak performance. Two basic
types of communications topologies are provided: a Cartesian grid and an arbitrary process graph
[38].

Due to its recent introduction, there are a relatively small number of applications to date using
MPI. A recent review by Skjellum, Lusk and Gropp [39] describes recent applications including
unsteady incompressible viscous flows, groundwater modeling, volume visualization and traffic
simulation. Native MPI implementations are currently under development by several parallel
computer vendors [40].

4 Parallel Computing in Aerospace Research

Despite the extensive research on parallel computing, only a small fraction of numerical
simulations of aerospace research problems employ parallel computing. A survey of the citations
for parallel and other computers for three journals is presented in Table 3. The period July 1993
through July 1995 was surveyed for all articles presenting research involving significant
numerical simulation. Approximately 44% of these articles indicated that a serial or vector
machine (single processor) was employed, while only 3.4% specifically noted that a parallel
computer was used. Approximately 52% did not indicate the machine used. If the statistics for the
first two categories are assumed statistically representative of the last group, then an overall
estimate (upper bound) for the parallel applications is 7%.

Why are so few research simulations performed on parallel computers? Certainly, research on
parallel computing has shown the capability for solving a wide range of fluid dynamics problems.
At the Parallel CFD ’95 Conference, applications of parallel computing were presented for reacting
flows, Euler and Navier-Stokes solvers, spectral methods, multigrid methods, and adaptive schemes.
Numerous other applications have been developed (see, for example, [41, 42, 43, 44]).

I posed this question to a number of experts in parallel CFD. The answers tended to be fairly
similar, and not at all surprising. All focused on the issue of calendar time required to solve a
particular problem. As one person stated, “The machine which you use to solve a problem is
irrelevant. The only thing that matters is how quickly you can get the problem done.” At the
present time, many CFD researchers who are not using parallel computing view parallel CFD as
1) lacking a decisive performance advantage (e.g., MFlops) over conventional serial (and vector)
computers in many instances, 2) difficult to program efficiently, and 3) lacking in portability.

All of these factors are likely to diminish in the near future, and thus the use of parallel
computing in aerospace research should increase. Microprocessor CPU performance continues to
improve by a factor of 1.5 to 2.0 per year⁷ [45], and consequently parallel machines are now
comparable to or faster than traditional vector supercomputers. For example, the Cray T3D (512
processors) is on average 41% faster⁸ than the Cray C-90 (16 processors) for the three simulated
CFD applications in the NAS Parallel Benchmarks. The Cray T3D (1024 processors) is 128% faster.
The IBM SP2-WN (160 processors) was also significantly faster than the Cray C-90 (16 processors)
[16]. Also, the emergence of standards in parallel programming languages (e.g., HPF) and message
passing functions (e.g., PVM, MPI) simplifies the development of parallel code and significantly
enhances its portability.

⁷The rate of improvement of microprocessor performance is much faster than for the specialized
processors developed for traditional vector machines (e.g., Cray C-90).
⁸I.e., the ratio of the execution time on the Cray C-90 to the Cray T3D was 1.41.

5 Parallel Computing in Aerospace Industry

Parallel computing has a major presence in the aerospace industry. Within the past few years,
several major aerospace corporations have developed extensive Networks of Workstations (NOWs) for
production analysis and design. Two examples are Pratt & Whitney (East Hartford, CT, and Palm
Beach, FL) and McDonnell Douglas Aerospace (St. Louis, MO).

Pratt & Whitney (P&W) initiated their Network of Workstations concept [46] in mid-1989. The
decision was motivated by two factors. First, P&W had an installed base of workstations which had
been acquired principally for design/drafting work, but which were effectively unused in the
evenings and on weekends. Thus, there was a surplus of compute cycles which could be employed for
analysis and design, provided that the computational tasks could be decomposed and parallelized.
Second, their existing Cray X-MP, purchased in 1986, was both severely overloaded and limited in
capability (e.g., memory). Hence, there was a significant incentive to invest resources in
development of a new paradigm for CFD analysis and design.

The P&W approach consists of several parts. The flow solver is NASTAR, a 3-D structured grid
multi-block Navier-Stokes code. Domain decomposition is straightforward, i.e., each block is
assigned to an individual processor (workstation). An example is shown in Fig. 3. The momentum,
energy and turbulence scalar equations are solved using Successive Line Under Relaxation (SLUR).
The SLUR iterations are performed independently within each block, with periodic updating of the
boundary conditions to transmit information between blocks. The optimal updating strategy is found
by numerical experiments. The pressure correction equation is solved to satisfy the continuity
equation, and employs a parallelized Preconditioned Conjugate Residual (PCR) algorithm. The
majority of the computational effort is expended in the pressure correction equation, and thus
considerable effort was focused on efficient paral-
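The block-per-workstation strategy with periodic boundary updates can be sketched on a model
problem. The Python example below is not NASTAR and does not implement SLUR: it substitutes point
Jacobi under-relaxation on a one-dimensional Laplace problem (u'' = 0, u(0) = 0, u(1) = 1), chosen
only because it fits in a few lines. Each block is relaxed independently with frozen interface
values, and the interface values are refreshed once per outer iteration. All function names, the
relaxation factor omega and the parameter sweeps_per_update are assumptions made for this example.
The latter plays the role of the updating strategy that the text says is tuned by numerical
experiments.

```python
def relax_block(u, left, right, sweeps, omega):
    """Under-relaxed Jacobi sweeps on one block; interface values stay frozen."""
    for _ in range(sweeps):
        padded = [left] + u + [right]
        u = [(1.0 - omega) * padded[i] + omega * 0.5 * (padded[i - 1] + padded[i + 1])
             for i in range(1, len(padded) - 1)]
    return u


def solve_multiblock(n_cells, n_blocks, outer_iters, sweeps_per_update, omega=0.9):
    """One block per (simulated) workstation, boundary update once per outer iteration."""
    size = n_cells // n_blocks          # assumes n_cells is divisible by n_blocks
    blocks = [[0.0] * size for _ in range(n_blocks)]
    for _ in range(outer_iters):
        # Boundary update: on a network of workstations these copies would be messages.
        lefts = [0.0] + [blocks[b - 1][-1] for b in range(1, n_blocks)]
        rights = [blocks[b + 1][0] for b in range(n_blocks - 1)] + [1.0]
        # Independent relaxation within each block (could run concurrently).
        blocks = [relax_block(blocks[b], lefts[b], rights[b], sweeps_per_update, omega)
                  for b in range(n_blocks)]
    return [value for block in blocks for value in block]


if __name__ == "__main__":
    u = solve_multiblock(n_cells=12, n_blocks=3, outer_iters=2000, sweeps_per_update=3)
    exact = [(i + 1) / 13.0 for i in range(12)]
    print(max(abs(a - b) for a, b in zip(u, exact)))
```

Raising sweeps_per_update reduces the number of boundary updates (messages) per unit of
computation, at the price of working longer with stale interface values.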
[11] G. Amdahl, “The Validity of the Single Processor Approach to Achieving Large Scale Computing
     Capabilities,” in AFIPS Conference Proceedings Spring Joint Computing Conference, vol. 30,
     pp. 483-485, 1967.
[12] D. Bailey, “Twelve Ways to Fool the Masses When Giving Performance Results on Parallel
     Computers,” Supercomputing Review, pp. 54-55, August 1991.
[13] D. B. et al, “The NAS Parallel Benchmarks,” International Journal of Supercomputer
     Applications, vol. 5, no. 3, pp. 63-73, 1991.
[14] D. Bailey, J. Barton, T. Lasinski, and H. Simon, “The NAS Parallel Benchmarks,” Tech. Rep.
     NASA TM 103863, NASA Ames Research Center, July 1993.
[15] J. Dongarra, “Performance of Various Computers Using Standard Linear Equations Software.”
     Report CS-89-95, Computer Science Department, University of Tennessee, and Oak Ridge National
     Laboratory, 1995.
[16] S. Saini and D. Bailey, “NAS Parallel Benchmarks Results 3-95,” Tech. Rep. NAS 95-011, NASA
     Ames Research Center, April 1995.
[17] H. P. F. Forum, “High Performance Fortran Language Specification,” Scientific Programming,
     vol. 2, pp. 1-170, 1993.
[18] C. Koelbel, D. Loveman, R. Schreiber, G. Steele, and M. Zosel, The High Performance Fortran
     Handbook. Cambridge, MA: The MIT Press, 1994.
[19] H. Performance Fortran Forum, “High Performance Fortran Language Specification,” tech. rep.,
     Center for Research on Parallel Computing, Rice University, November 1994.
[20] M. Metcalf and J. Reid, Fortran 90 Explained. New York: Oxford Science Publications, 1990.
[21] I. Foster, “Designing and Building Parallel Programs.” http://www.mcs.anl.gov/dbpp/.
[22] K. Hawick and G. Fox, “Exploiting High Performance Fortran for Computational Fluid Dynamics,”
     Tech. Rep. SCCS-661, Northeast Parallel Architecture Center, November 1994.
[23] A. Mueller and R. Ruehl, “Extending High Performance Fortran for the Support of Unstructured
     Computations,” in Proc. of 9th ACM Inter. Conf. on Supercomputing, 1995.
[24] N. Carriero and D. Gelernter, “Linda in Context,” Communications of the ACM, vol. 32,
     pp. 444-458, April 1989.
[25] R. Butler and E. Lusk, “User’s Guide to the P4 Programming System.” Argonne National
     Laboratory, Technical Report ANL-92/17, 1992.
[26] A. Kolawa, “The Express Programming Environment.” Workshop on Heterogeneous Network-Based
     Concurrent Computing, Tallahassee, October 1992.
[27] L. Turcotte, “A Survey of Software Environments for Exploiting Networked Computing
     Resources.” Engineering Research Center for Computational Field Simulations, Mississippi
     State University, January 1993.
[28] T. Mattson, “Programming Environments for Parallel and Distributed Computing: A Comparison of
     p4, PVM, Linda and TCGMSG,” The International Journal of Supercomputing, vol. 9, pp. 138-161,
     September 1995.
[29] V. Sunderam, G. Geist, J. Dongarra, and R. Manchek, “PVM: A Framework for Parallel
     Distributed Computing,” Journal of Concurrency: Practice and Experience, vol. 2, pp. 315-339,
     December 1990.
[30] G. Geist and V. Sunderam, “Network Based Concurrent Computing on the PVM System,” Journal of
     Concurrency: Practice and Experience, vol. 4, pp. 293-311, June 1992.
[31] J. Dongarra, G. Geist, R. Manchek, and V. Sunderam, “Integrated PVM Framework Supports
     Heterogeneous Network Computing,” Computers in Physics, vol. 7, no. 2, pp. 166-175, 1993.
[32] A. G. et al, “PVM3 User’s Guide and Reference Manual.” Oak Ridge National Laboratory, 1994.
[33] V. Sunderam, G. Geist, J. Dongarra, and R. Manchek, “The PVM Concurrent Computing System:
     Evolution, Experiences and Trends,” Journal of Parallel Computing, vol. 20, no. 4,
     pp. 531-547, 1994.
[34] A. Beguelin, J. Dongarra, A. Geist, R. Manchek, and V. Sunderam, “Recent Enhancements to
     PVM,” The International Journal of Supercomputing, vol. 9, pp. 108-127, September 1995.
[35] M. P. I. Forum, “MPI: A Message-Passing Interface Standard,” Journal of Supercomputer
     Applications, vol. 8, 1994.
[36] M. P. I. Forum, “MPI: A Message-Passing Interface Standard,” tech. rep., University of
     Tennessee, 1994.
[37] W. Gropp and E. Lusk, “Message Passing Interface.” http://www.mcs.anl.gov/mpi, 1994.
[38] L. Clarke, I. Glendinning, and R. Hempel, “The MPI Message Passing Interface Standard,” March
     1994. ftp://par.soton.ac.uk/pub/mpi/paper.ps.
[39] A. Skjellum, E. Lusk, and W. Gropp, “Early Applications in the Message-Passing Interface
     (MPI),” The International Journal of Supercomputer Applications, vol. 9, no. 2, pp. 79-94,
     1995.
[44] D. B. et al, ed., Proceedings of the Seventh SIAM Conference on Parallel Processing for
     Scientific Computing, SIAM, 1995.
[45] J. Hennessy and N. Jouppi, “Computer Technology and Architecture: an Evolving Interaction,”
     IEEE Computer, vol. 24, no. 9, pp. 18-29, 1991.
[46] C. Fischberg, C. Rhie, R. Zacharias, P. Bradley, and T. DesSureault, “Using Hundreds of
     Workstations for Production Running of Parallel CFD Applications,” in Parallel Computing
     1995, 1995.
[47] R. Comer, “Experiences at McDonnell Douglas in Converting CFD Production to Parallel
     Processing,” in Parallel Computing 1995, 1995.
SUMMARY
This paper describes the portable parallelization of the FLOWer code, a large, block structured
CFD solver for industrial use. Basic requirements for the parallelization are identified, and the
strategies applied for its parallelization are explained. Special emphasis is put on the parallel
heart of the program, the communications library CLIC-3D. Results obtained on several platforms
demonstrate the success of the method chosen and allow an assessment of today’s capabilities of
parallel computers in CFD applications. Parallel computations of aircraft configurations of
varying complexity prove that parallel computers have become operational in aircraft development.

LIST OF SYMBOLS
c_p   specific heat at constant pressure
D     vector of artificial dissipative fluxes
E     total energy
F     flux tensor
H     total enthalpy
k     heat transfer coefficient
N_B   number of blocks
n     outward pointing unit normal vector
Pr    Prandtl number
p     pressure
q     velocity vector
R     residual vector
S     speed-up
T     temperature
t     execution time
u     velocity in x-direction
V     volume
v     velocity in y-direction
W     vector of conservative variables
w     velocity in z-direction
γ     ratio of specific heats
μ     viscosity
ρ     density
σ     normal stress components
τ     shear stress components
Φ     components of the energy dissipation function

Indices
a k   algorithmic ideal
ijk   discrete point
l     laminar
t     turbulent
x     in x-direction
y     in y-direction
z     in z-direction
∞     at infinity

1. INTRODUCTION
Looking at the progress made in CFD during the last decade, one observes that improvements have
been made in two directions: The algorithms became more flexible and faster, e.g. by multigrid
techniques, and the hardware platforms increased in main memory and CPU performance. As far as the
progress in computer power is concerned, experts predict that only parallel architectures will
allow further improvements leading to peak performances of about 1 TFLOP/s [1, 2].
Therefore, since this type of supercomputer might require a new type of application software, the
development of parallel flow solvers is mandatory, if one wants to exploit their abilities in the
future. This could be treated as an isolated subject, when dealing with questions of basic
research interest, but when it concerns large codes in industrial use, several constraints limit
the development.
First of all the effort spent for parallelization must be justified by the gain in compute power
or the reduction of computing costs, respectively. Secondly, large CFD
$$ k = c_p \left( \frac{\mu_l}{Pr_l} + \frac{\mu_t}{Pr_t} \right), \qquad (7) $$

where the laminar viscosity μ_l is given by Sutherland’s formula.

In turbulent flows the eddy viscosity μ_t is computed from the algebraic Baldwin-Lomax model [5].

2.2 Discretization and Time Integration
The governing equations are discretized by the method of lines separating the space and time
coordinates. After the space discretization, a system of ordinary differential equations in time
results involving each finite volume. For any hexahedron of the structured grid one obtains the
equation

with R_ijk and D_ijk being the vector of the residuals and the artificial dissipative fluxes
respectively.
The time integration is carried out by an explicit, hybrid multi stage Runge-Kutta scheme which is
accelerated by the techniques of local time stepping, enthalpy damping (Euler) and implicit
residual smoothing [7].
This procedure is embedded into a powerful multigrid algorithm [3, 8] which allows standard single
grid computations as well as a successive grid refinement and simple or full multigrid,
respectively. As is illustrated in [3], where a more detailed description can be found, high
convergence rates can be obtained using this technique.

the flow field is split into regions for each of which the generation of a structured grid is
possible. Figure 1 shows schematically such a grid topology around a transport aircraft. As one
can see, the flow field is subdivided into four areas of similar size around the wing body. Three
subdomains are covered by one block each (blocks 1 to 3), whereas the fourth region is further
subdivided, due to the presence of an engine there (blocks 4 to 9). The engine is surrounded by a
polar grid (blocks 8 and 9) which is adapted by blocks 6 and 7 to the general O-H topology (blocks
3 to 5).
The program then treats the blocks more or less independently of each other, which can only be
done properly by exchanging data of the current solution at block interfaces before each time
step.
Therefore, the blocks are surrounded by layers of dummy cells, which at block intersections
correspond with the physical cells of the neighboring domain. The FLOWer code allows an overlap
width of two cells resulting in second order accuracy of the scheme at those boundaries. This is
necessary in order to treat the artificial dissipation terms correctly, which otherwise could
spoil the solution as shown in [9].
Currently, the FLOWer code allows different exchange

Fig. 1 Schematic multiblock decomposition of the flow field around a generic transport aircraft.
       Decomposition into 9 blocks due to the adaption of an engine fitted polar mesh to a global
       O-H topology.

3. PARALLELIZATION OF THE FLOWer CODE
procedure [9]. Therefore, certain objectives must be met, the most important of which are
specified in the following.

Portability
The FLOWer code is developed by a number of scientists working at different locations on a variety
of computers. Furthermore, it is applied by several users running the program on platforms other
than those of the developers. Finally, the life time of the program will certainly exceed that of
most of today's computers, so that portability is a major requirement:
The FLOWer code must run on any platform, be it sequential or parallel!

Conservation of the development history
When developing the parallel FLOWer code, its algorithm had already reached a high degree of
maturity, established by various scientists during a long period within the DLR-CEVCATS code.
Moreover, the users had become experienced with its handling and in interpreting its results.
Therefore, the parallelization had to respect that development history:
The FLOWer code must not be completely re-written due to its parallelization!

and how to achieve portability. When parallelizing the FLOWer code, general considerations led to
the following guidelines, which allow the requirements stated above to be met:

Grid partitioning as parallelization strategy
The idea is to map the different blocks to different processes where they are solved separately.
Between the iteration steps the boundary data are exchanged via the network.
This technique is not only said to be efficient when solving partial differential equations
[11, 12], but moreover guarantees the conservation of the sequential development history, because
it is directly based on the sequentially well established multiblock method.

Separation of computation and communication
A strict application of this rule allows an algorithmic development which remains independent of
the parallelization or other hardware aspects. Additionally, the code structure can be kept
modular more easily, which is highly desirable for software engineering reasons. Finally, the
portability problem becomes much easier to handle when the communication parts are concentrated
within separate units.
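
The grid partitioning idea, in which every block carries dummy cells that are refreshed from its neighbour between iteration steps, can be sketched in a few lines of Python. This is a minimal one-dimensional illustration under stated assumptions: two blocks, an overlap of two cells as quoted for FLOWer, and a three-point averaging step standing in for the flow solver. All function names are hypothetical and do not belong to the CLIC interface.

```python
HALO = 2  # overlap width in cells, as quoted for FLOWer


def smooth(u):
    """Three-point averaging of all cells that have both neighbours."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.25 * u[i - 1] + 0.5 * u[i] + 0.25 * u[i + 1]
    return new


def exchange(left, right):
    """Copy physical cells of each block into the dummy cells of the other."""
    left[-HALO:] = right[HALO:2 * HALO]
    right[:HALO] = left[-2 * HALO:-HALO]


def run_single(u, steps):
    """Reference computation on the undivided grid."""
    for _ in range(steps):
        u = smooth(u)
    return u


def run_two_blocks(u, steps):
    """Same computation on two blocks with a strict exchange/compute split."""
    mid = len(u) // 2
    left = u[:mid] + [0.0] * HALO      # physical cells followed by dummy cells
    right = [0.0] * HALO + u[mid:]     # dummy cells followed by physical cells
    for _ in range(steps):
        exchange(left, right)                       # communication phase
        left, right = smooth(left), smooth(right)   # computation phase
    return left[:-HALO] + right[HALO:]


u0 = [float(i % 5) for i in range(12)]
blocked = run_two_blocks(u0, 10)
reference = run_single(u0, 10)
```

Because the dummy cells are refreshed before every step, the two-block run reproduces the single-block result, which is the sense in which the sequential development history is conserved. Keeping `exchange` apart from `smooth` mirrors the separation of computation and communication described above.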
4. THE CLIC-3D COMMUNICATIONS LIBRARY

4.1 Background
The communications library CLIC-3D (Communications Library for Industrial Codes in 3 Dimensions)
is currently being developed by the GMD within the German research project POPINDA. It is based on
the former GMD Comlib library and supports general block structured PDE solvers, particularly
those involving multigrid algorithms. Its development was based on the observation that for this
class of programs the communication patterns are generally quite similar, although the numerical
algorithms might differ considerably.

Fig. 2 Software layers of the parallel FLOWer code

4.2 General Code Structure
Since the CLIC library is based on the PARMACS message passing system, it is designed for a
host-node (master-slave) programming model. The host process starts the distributed application on
several nodes, performs the input and output and transfers data to and from the node processes.
The host itself does not participate in the solution process, which is exclusively carried out by
the node processes. Consequently, the user application program is separated into a host and a node
program as shown in figure 3.

which may contain the complete sequential code. In the case of the FLOWer code, the only
differences in parallel mode are:
- The input data is not read in but received from the host process
- Global operations involving all blocks are passed to the CLIC library for execution
- The data exchange at block boundaries is carried out fully automatically by the CLIC library
- Write statements are replaced by parallel output routines of the CLIC library
A schematic flow chart of the parallel FLOWer code is given in figure 4.

Further activities of the CLIC library consist in the analysis of the given block structure, in
order to allow a special treatment of grid singularities. For each segment edge and point, the
adjoining blocks and the number of adjoining cells are determined, leading to a topological
classification. If the segment is part of the physical boundary, the boundary conditions of all
adjoining blocks are determined additionally. Finally, geometrical singularities are detected, so
that the user can inquire all data for a special treatment of irregular grid points.

are received in the order they come in, and the buffers are unpacked. If necessary, the procedure
is repeated for segment edges and corner points, so that finally all block interfaces are updated
correctly.
and 320000 cells were used, respectively, that were subdivided into 1, 4 and 8 blocks in the small
case and into 1, 4, 8, 16 and 32 blocks in the large case. Each block was of equal size and was
mapped to one CPU on the parallel machines, leading to an ideal load balance.

Fig. 7 NACA 0012 wing test case for performance measurements.

Figures 8 and 9 show the obtained computing times on various parallel and vector machines with
respect to the time needed on a Cray C90 single processor. As can be seen, the single processor
performance of the NEC SX-3 is hard to beat, even by parallel vector computers using up to 8 CPUs.
On the other hand, the results show that parallel RISC processor architectures, such as the IBM
SP2 or the NEC Cenju-3, are able to compete with or even to outperform the Cray C90 single
processor using a mod-

Fig. 9 Relative execution times on parallel computers

5.2 Speed-up for Aircraft Configurations
For evaluating the potential of the parallelization of the FLOWer code, speed-up measurements were
carried out for a more realistic configuration. The inviscid flow around the generic DLR-F4
wing-body combination shown in figure 10 was computed on a grid consisting of approximately 410000
cells which was subdivided into 1, 4 and 8 equally sized blocks, respectively. For the conditions
of Mach number M = 0.75 and incidence $\alpha = 0^\circ$, 35 W cycles involving 4 multigrid levels
were performed on an IBM SP1 computer.
The results obtained for the different communication systems available there are plotted as
speed-up versus processor number in figure 11. As can be seen, PVM using an Ethernet connection
restricts the number of processors that can usefully be employed to only four, indicating that
workstation clusters based on the Ethernet are not suitable for parallel computations with the
FLOWer code. The result can be markedly improved when replacing the Ethernet by the IBM high
performance switch, but still the fastest runs were obtained using the IBM MPL/p communications
system.
In order to study the corresponding effects on the parallel performance, 50 W-cycles were
performed mapping the 11 blocks to 1, 7, 8 and 10 processors of an IBM SP2, respectively. The
single processor result was obtained on a slightly more powerful wide node, whereas the parallel
runs were obtained on weaker thin nodes. As can be seen from figure 13, on 8 processors a speed-up
of 6.6 can be gained, but a further increase does not lead to any further improvement. This
behavior is exactly what must be expected looking at the block structure and the mapping strategy
of the CLIC library.
The work load per processor is determined by the number of grid points to be solved, and the
largest number of points on any processor determines the total execution time of the parallel run.
When mapping the 11 blocks to fewer than 11 processors, there will always be more than one block
on at least one CPU. Therefore, the CLIC library applies a mapping strategy that tries to
distribute the blocks according to their size, so that the work load on the nodes is as equal as
possible.
Up to 8 processors one is able to continuously reduce the maximum grid size per CPU by simply
mapping the largest block of the most heavily loaded node to an additional processor. But when
employing 8 nodes, the maximum work load is determined by the absolutely largest block, which of
course cannot be reduced any further by mapping the block structure to more CPUs. Therefore, the
minimum computing time or maximum speed-up, respectively, is obtained on 8 nodes and remains
constant afterwards, as illustrated by figure 13.
Any further increase of the speed-up would require an additional blocking of the largest block,
which is planned to be automatically supported by the CLIC library in the future.

[Plot labels: 575000 cells; IBM SP2]

5.4 Feasibility Study for Large Problems
Since parallelization is believed to be the appropriate method of tackling the future grand
challenge problems in design aerodynamics, attempts must be made to demonstrate the feasibility of
this approach. Therefore, the viscous flow field around the DLR-F4 wing-body combination (compare
figure 10) was computed on a grid generated by the Deutsche Airbus company consisting of 6.6
million grid points subdivided into 128 blocks of equal size. 800 multigrid cycles were performed
on a 129 processor IBM SP2 (1 host + 128 nodes), which took less than three hours of response time
(13 seconds per cycle). The convergence of the computation is given in figure 14 in terms of the
logarithmic density residual versus the number of multigrid cycles.

Fig. 14 Density residual versus number of multigrid cycles. DLR-F4 wing-body combination (6.6
million cells) on 129 processors of an IBM SP2.

A grid convergence study was carried out by repeating the computation on four grids, each
differing in the number of total points by a factor of 8. The result is given in figure 15 in
terms of the total lift coefficient versus the scaled grid size. As one can see from an
extrapolation of the development of the lift between the levels three and one, the large grid size
of 6.6 million cells is necessary in order to get the lift within an accuracy of one percent.
Since the quality of the solution was spoiled by regions of highly distorted cells, a repetition
of the study is
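
The saturation of the speed-up described above for the 11-block case follows from the block-to-processor mapping alone, which the following Python sketch tries to make concrete. It assumes a simple greedy heuristic (largest block first, onto the least loaded processor); the text does not specify the actual CLIC mapping algorithm, and the block sizes below are hypothetical values chosen only to sum to 575 thousand cells.

```python
def map_blocks(block_sizes, nprocs):
    """Greedy mapping: largest block first, onto the least loaded processor."""
    loads = [0] * nprocs
    for size in sorted(block_sizes, reverse=True):
        loads[loads.index(min(loads))] += size
    return loads


def ideal_speedup(block_sizes, nprocs):
    """Speed-up bound when the most loaded processor sets the run time."""
    return sum(block_sizes) / max(map_blocks(block_sizes, nprocs))


# Hypothetical sizes (thousands of cells) of 11 unequal blocks.
sizes = [87, 70, 65, 60, 58, 55, 50, 45, 40, 25, 20]
for p in (1, 7, 8, 10, 11):
    print(p, max(map_blocks(sizes, p)), round(ideal_speedup(sizes, p), 2))
```

With these assumed sizes the maximum load per processor stops decreasing once it equals the largest single block, so adding processors beyond that point cannot raise the bound. Communication cost and the difference between wide and thin nodes are ignored in this estimate.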
In the present paper we introduce and discuss an efficient parallel algorithm for the spectral
multi-domain solution of the incompressible Navier-Stokes equations. First, the algorithm is given
in its basic form for the 2-dimensional case and, later on, a possible extension to 3-dimensional
flows exhibiting a homogeneous (periodic) direction is proposed. The algorithm is validated both
for its parallel performance and for its accuracy.

1 INTRODUCTION

In recent years domain decomposition methods have gained much attention in the CFD community. One
of the most relevant features of such methods is the possibility of tuning the accuracy of the
numerical discretization according to the expected behaviour of the solution in each subdomain.
Consequently, subregions of the flow field containing sharp boundary layers can be enclosed within
subdomains with high resolution, while low resolution can be assigned to subregions where smooth
solutions can be expected.

These advantages can be fully exploited when discretizing the equations with spectral methods,
which guarantee a fast decay of the error with the number of nodes, termed "spectral accuracy".

On the other hand, domain decomposition methods might provide a natural stabilization strategy for
the spectral discretization, which is a "central one" in nature. In fact, the local cell Peclet
number can be locally diminished by reducing the mesh spacing within the critical subdomain,
without the introduction of any particular stabilization procedure.

From the computational point of view, the domain decomposition technique is well suited for
parallel computing, even if in practical cases several difficulties arise whenever good
performance has to be reached [1].

In the first part of the present paper, a parallel algorithm for the solution of the
two-dimensional incompressible Navier-Stokes equations is presented. After a brief introduction of
the time splitting scheme used for the time discretization of the unsteady incompressible
Navier-Stokes equations, the attention will be mainly focused on the spectral multidomain approach
and on its parallel features. Performance results concerning the parallel implementation on two
different MIMD parallel architectures will be presented. The second part of the paper is concerned
with the application of the algorithm to three dimensional unsteady problems.

When the incompressible Navier-Stokes equations

$$\frac{\partial u}{\partial t} + \frac{1}{2}\left(u \cdot \nabla u + \nabla \cdot (uu)\right) = -\nabla p + \frac{1}{Re}\Delta u \qquad (1)$$

$$\nabla \cdot u = 0 \qquad (2)$$

are solved by means of a projection method [2], with the diffusive terms treated in an implicit
fashion [3], the time stepping procedure consists of a cascade of scalar elliptic kernels to be
solved at each time step. Namely, two (for the two-dimensional equations) Helmholtz problems for
the inversion of the diffusive part and a Poisson problem for the pressure need to be solved at
each time step. It is then clear that, in order to achieve a globally efficient algorithm, it is
of fundamental importance to tackle the mentioned scalar problems effectively. For the sake of
completeness, the adopted fractional step scheme (i.e. Van Kan's pressure correction method [4])
is given in the following:

$$\frac{\hat{u} - u^n}{\Delta t} - \frac{1}{2Re}\Delta(\hat{u} + u^n) = -\nabla p^n - \frac{3}{2}L(u^n) + \frac{1}{2}L(u^{n-1}) \qquad (3)$$

$$\hat{u}|_{\partial\Omega} = u((n+1)\Delta t) \qquad (4)$$

$$\frac{u^{n+1} - \hat{u}}{\Delta t} + \frac{1}{2}\nabla(p^{n+1} - p^n) = 0 \qquad (5)$$

$$\nabla \cdot u^{n+1} = 0 \qquad (6)$$

where $L(u)$ represents the advective term $\frac{1}{2}\left(u \cdot \nabla u + \nabla \cdot (uu)\right)$.
In the first step, a non-physical intermediate velocity field $\hat{u}$ is computed. In fact,
$\hat{u}$ does not satisfy the incompressibility condition. Then, in the second step, $\hat{u}$ is
projected onto the divergence free space to get an adequate velocity approximation of $u^{n+1}$.
The scheme with the given boundary conditions is nothing else than a second order Crank-Nicolson
Adams-Bashforth scheme with an $O(\Delta t^2)$ deviation in the tangent direction of the boundary.
By applying the divergence operator to (6), it turns out that the latter is equivalent to

In the next section the attention will be focused on the way each scalar elliptic problem has been
tackled in the framework of a spectral multidomain discretization.
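
The projection step (5)-(6) can be illustrated on a doubly periodic square, where Fourier transforms make the Poisson problem trivial. Taking the divergence of (5) and using (6) gives $\Delta\phi = (2/\Delta t)\,\nabla\cdot\hat{u}$ for the pressure increment $\phi = p^{n+1} - p^n$, after which $u^{n+1} = \hat{u} - (\Delta t/2)\nabla\phi$. The sketch below is only a stand-in under stated assumptions: it uses a periodic Fourier discretization with NumPy instead of the Legendre multidomain collocation of the paper, and the function names and test field are hypothetical.

```python
import numpy as np


def wavenumbers(n, length):
    """Angular wavenumber grids for an n x n periodic square of given side."""
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    return np.meshgrid(k, k, indexing="ij")


def divergence(u, v, length=2.0 * np.pi):
    """Spectral divergence of a periodic velocity field."""
    kx, ky = wavenumbers(u.shape[0], length)
    div_hat = 1j * kx * np.fft.fft2(u) + 1j * ky * np.fft.fft2(v)
    return np.fft.ifft2(div_hat).real


def project(u_hat, v_hat, dt, length=2.0 * np.pi):
    """Periodic sketch of steps (5)-(6).

    Solves lap(phi) = (2/dt) div(u_hat) and returns
    u_new = u_hat - (dt/2) grad(phi), with phi = p^{n+1} - p^n.
    """
    kx, ky = wavenumbers(u_hat.shape[0], length)
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0  # the mean mode carries no divergence; avoid 0/0

    uh, vh = np.fft.fft2(u_hat), np.fft.fft2(v_hat)
    div_hat = 1j * kx * uh + 1j * ky * vh
    phi_hat = -(2.0 / dt) * div_hat / k2      # Poisson solve in Fourier space
    u_new = np.fft.ifft2(uh - 0.5 * dt * 1j * kx * phi_hat).real
    v_new = np.fft.ifft2(vh - 0.5 * dt * 1j * ky * phi_hat).real
    return u_new, v_new


# A smooth, non-solenoidal intermediate field on a 16 x 16 grid.
n = 16
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
u0, v0 = np.sin(X) * np.cos(Y), np.sin(X) * np.cos(Y)
u1, v1 = project(u0, v0, dt=0.01)
```

The example field is kept band-limited on purpose, so that the real-valued inverse transforms lose no information. In the multidomain setting of the paper the same Poisson problem is instead one of the scalar elliptic kernels discussed in the next section.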
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
3 SPACE DISCRETIZATION

In the present work, a Legendre spectral collocation technique coupled with a domain decomposition
method has been used for the space discretization of the differential equations. Additional
references can be found in [6], [5] for the projection decomposition method, and in [7] for the
spectral approximation method.

3.1 Elliptic terms

The following problem, representative of one of the elliptic scalar problems mentioned in the
previous section, is considered hereafter:

where $\alpha$ is a real constant $\geq 0$, and where $\Omega$ is an open connected set
$\Omega \subset \mathbb{R}^2$; in particular, $\Omega = \bigcup_i \Omega_i$, where each
$\Omega_i$ is a closed rectangle having either a common side or a common vertex with each
neighbour; $\alpha \geq 0$ is either identically equal to zero (i.e., for the Poisson problem
related with the pressure) or is equal to $2Re/\Delta t$ (i.e., for one of the momentum
equations), and the equivalent weak formulation of (10), (11) is:

$$l(u, v) = (f, v)_{L^2(\Omega)} \quad \forall v \in H_0^1(\Omega) \qquad (12)$$

where $H_0^1(\Omega)$ is the real Hilbert space defined as follows:

$$H_0^1(\Omega) \equiv \{u \in L^2(\Omega) : \partial u/\partial x_i \in L^2(\Omega), \ldots\} \qquad (13)$$

$$\Gamma = (\Omega \setminus \Omega_0) \setminus \partial\Omega \quad \text{with} \quad \Omega_0 = \bigcup_{i=1}^{N} \Omega_i \qquad (16)$$

Let $H_{00}^{1/2}(\Gamma)$ be the completion of the normed vector space $S$ defined as:

where $\phi_\Gamma$ is the restriction of $\phi$ on $\Gamma$.
The linearity and continuity of the operator $\phi \mapsto \phi_\Gamma$ into
$H_{00}^{1/2}(\Gamma)$ and the fact that $C_0^\infty(\Omega)$ is dense in $H_0^1(\Omega)$ lead to
the existence and uniqueness of a linear and continuous operator $\gamma$ (trace operator) from
$H_0^1(\Omega)$ onto $H_{00}^{1/2}(\Gamma)$ defined as

$$\gamma\phi = \phi_\Gamma \quad \forall \phi \in H_0^1(\Omega) \qquad (19)$$

The $\gamma$ operator allows two closed, mutually orthogonal subspaces to be identified:

$$K \equiv \ker(\gamma) = \{u_0 \in H_0^1(\Omega) : \gamma u_0 = 0\} \qquad (20)$$

where $\ker(\gamma)$ is the kernel of the operator $\gamma$, and its orthogonal complement
$K^\perp$ is defined as:

$$K^\perp \equiv \{\tilde{u} \in H_0^1(\Omega) : l(\tilde{u}, u_0) = 0 \;\; \forall u_0 \in K\} \qquad (21)$$

Therefore, the solution $u \in H_0^1(\Omega)$ of problem (12) can be uniquely decomposed as

$$u = u_0 + \tilde{u}, \quad u_0 \in K \text{ and } \tilde{u} \in K^\perp \qquad (22)$$

Since the restriction $\gamma_0$ of the operator $\gamma$ to $K^\perp$ is an isometric isomorphism
between $K^\perp$ and $H_{00}^{1/2}(\Gamma)$, it follows that

$$\forall \tilde{u} \in K^\perp \;\; \exists \phi \in H_{00}^{1/2}(\Gamma) : \tilde{u} = \gamma_0^{-1}\phi \qquad (23)$$

Identity (22) can be reformulated as:

Thus, problem (12) can be easily proven to be equivalent to the set of the two following ones:

Problem (P1): find $u_0 \in K$ such that:

$$l(u_0, v_0) = (f, v_0)_{L^2(\Omega)} \quad \forall v_0 \in K \qquad (25)$$

Problem (P2): find $\psi \in H_{00}^{1/2}(\Gamma)$ such that:

Problem P1 is nothing else than the solution of N decoupled elliptic problems with homogeneous
Dirichlet boundary conditions on both $\partial\Omega$ and $\Gamma$. To build its discrete
counterpart, a standard Legendre collocation method has been used [7]. To this end, the unknowns
are decomposed into a series of Legendre polynomials:

where $La_k$ is the $k$th Lagrange polynomial, for which $La_k(x_i) = \delta_{k,i}$. By taking
into account the expression of $u$ and $v$ and by replacing the scalar product $l(\cdot,\cdot)$
by its discrete counterpart, the differential problem reads:

$$\ldots = 0 \quad \forall i, j \qquad (29)$$

where $w_k$ are the Gauss-Lobatto weights for the quadrature. Using the definition of Lagrange
polynomials ($La_i(x_j) = \delta_{ij}$), the discretized equations become:

An efficient procedure to solve the given algebraic problem will be given in the next section.
As concerns problem P2, if $\{\xi_i\}$, $i = 1, \ldots, M$, is a set of linearly independent
functions which constitute a basis for $H_{00}^{1/2}(\Gamma)$, then the discrete version of
problem P2 reads as:

$$l\Big(\gamma_0^{-1}\sum_{i=1}^{M} c_i \xi_i,\; \gamma_0^{-1}\xi_j\Big) = (f, \gamma_0^{-1}\xi_j)_{L^2(\Omega)}, \quad j = 1, \ldots, M \qquad (31)$$
Typically $M$ corresponds exactly to the number of points on the interface. To set up an algebraic
equivalent of (31) the operator $\gamma_0^{-1}$ should be explicitly formulated. In practice, the
operator $\gamma_0^{-1}$ is never required if an iterative procedure is introduced to solve
problem P2. To illustrate this point, it must be remarked that $\tilde{u}^k \in K^\perp$ must
satisfy the orthogonality condition:

$$l(\tilde{u}^k, v_0) = 0 \quad \forall v_0 \in K \qquad (32)$$

which corresponds to the solution of N elliptic problems (25) with Dirichlet boundary conditions:
homogeneous on $\partial\Omega$ and to be iteratively determined on $\Gamma$.
To provide at each iteration $k$ the condition on $\Gamma$ for problem (32), Green's formula is
applied to (31)

where $\tilde{u}^k = \gamma_0^{-1}\sum_{i=1}^{M} c_i^k \xi_i$ is the solution at iteration $k$ of
problem (32), and the jump of the normal derivatives on $\Gamma$ enters the formula. $R^k$ is the
residual at iteration $k$, from which the updating of the boundary value $u^{k+1}|_\Gamma$ can be
obtained within the chosen iterative procedure.
The convergence rate of the iterative procedure strongly depends on the choice of the basis
$\{\xi_i\}$ [8]. For the present work the basis functions proposed by Ovtchinnikov [8] have been
used. These constitute a nearly optimal basis, in the sense that the condition number of system
(31) is bounded by a constant independent of $M$, where $M$ is the dimension of the subspace of
$H_{00}^{1/2}(\Gamma)$ generated by span$\{\xi_i\}$, $i = 1, \ldots, M$.
In view of the character of the algebraic problem (symmetric positive definite) the conjugate
gradient method has been employed to solve problem (31).

A final remark concerns the importance of achieving an efficient technique to invert the decoupled
Dirichlet problems (P1). To this end, we make use of a modified matrix diagonalization approach
[9]. The Legendre collocation approximation to one of the mentioned subproblems might be
re-written as:

$$U D + D^T U - \alpha U = F \qquad (36)$$

where $D$ is the collocated Lagrange second derivative matrix acting on the subdomain internal
nodes, $U$ is the unknown matrix ordered by rows, and $F$ is a modified right hand side matrix
taking into account the effects of the boundary values.
First, we determine the eigenvalues of $D$, its left and right eigenvector systems (ordered by
columns) and the respective inverses.

$$E_r^{-1} D E_r = \Lambda \qquad (37)$$

$$E_l^{-1} D^T E_l = \Lambda \qquad (38)$$

Matrices $E_r$, $E_l$, $E_r^{-1}$, $E_l^{-1}$ and the diagonal eigenvalue matrix $\Lambda$ are
computed and stored in a pre-processing stage. Indicating with $\hat{U} = E_l^{-1} U E_r$ and with
$\hat{F} = E_l^{-1} F E_r$, we invert the diagonalised problem:

$$\Lambda\hat{U} + \hat{U}\Lambda - \alpha\hat{U} = \hat{F} \qquad (39)$$

and recover the final solution as:

$$U = E_l \hat{U} E_r^{-1} \qquad (40)$$

Having solved the eigenvalue problem in a pre-processing stage, the recurring solution cost turns
out to be of order $n^3$ operations, $n$ being the number of nodes used to discretize each
direction within a single subdomain.
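
The matrix diagonalization procedure for the decoupled Dirichlet problems can be sketched with NumPy as follows. This is a minimal illustration, assuming the sign convention $UD + D^T U - \alpha U = F$ and using a three-point finite-difference second-derivative matrix on a non-uniform grid as a hypothetical stand-in for the Legendre collocation matrix $D$; the function names are not taken from the paper.

```python
import numpy as np


def second_difference(x):
    """Stand-in for D: 3-point second derivative on the interior nodes of x."""
    m = len(x) - 2
    D = np.zeros((m, m))
    for i in range(m):
        h1, h2 = x[i + 1] - x[i], x[i + 2] - x[i + 1]
        if i > 0:
            D[i, i - 1] = 2.0 / (h1 * (h1 + h2))
        D[i, i] = -2.0 / (h1 * h2)
        if i < m - 1:
            D[i, i + 1] = 2.0 / (h2 * (h1 + h2))
    return D


def precompute(D):
    """Eigen-decomposition of D, done once in a pre-processing stage."""
    lam, Er = np.linalg.eig(D)
    lam, Er = lam.real, Er.real       # assumes D has a real spectrum
    Er_inv = np.linalg.inv(Er)
    El, El_inv = Er_inv.T, Er.T       # then El^-1 D^T El = Lambda as well
    return lam, Er, Er_inv, El, El_inv


def solve(F, alpha, lam, Er, Er_inv, El, El_inv):
    """Solve U D + D^T U - alpha U = F using the stored eigen-systems."""
    F_hat = El_inv @ F @ Er
    # Entry (i, j) of the diagonalised problem: (lam_i + lam_j - alpha) U_hat_ij
    U_hat = F_hat / (lam[:, None] + lam[None, :] - alpha)
    return El @ U_hat @ Er_inv


x = -np.cos(np.linspace(0.0, np.pi, 10))   # clustered nodes, 8 interior points
D = second_difference(x)
stored = precompute(D)
F = np.random.default_rng(0).standard_normal((8, 8))
U = solve(F, 3.0, *stored)
```

Once `precompute` has been called, every further solve costs only a few dense matrix products, which is where the order $n^3$ operation count comes from. Because the eigenvalues of a second-derivative matrix are negative and $\alpha \geq 0$, the divisor in `solve` never vanishes in this example.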
3.4 Accuracy tests

To test the accuracy of the proposed spectral multi-domain algorithm we have considered the
classical Taylor-Green analytical test case for the 2-dimensional incompressible Navier-Stokes
equations:

$$u(x, y) = -\cos(\pi x)\sin(\pi y)\, e^{-2\pi^2 t/Re} \qquad (42)$$

$$v(x, y) = \sin(\pi x)\cos(\pi y)\, e^{-2\pi^2 t/Re} \qquad (43)$$

$$p(x, y) = -\frac{1}{4}\left(\cos(2\pi x) + \cos(2\pi y)\right) e^{-4\pi^2 t/Re} \qquad (44)$$

on the domain $\Omega = [0,2] \times [0,2]$. The following set of boundary conditions has been
applied:

- on the edges $x = 0$ and $x = 2$, homogeneous Dirichlet conditions for $v$ and homogeneous
  Neumann conditions for $u$;
- on the edges $y = 0$ and $y = 2$, homogeneous Dirichlet conditions for $u$ and homogeneous
  Neumann conditions for $v$.

The tests have concerned both time and space accuracy. The latter has been measured imposing an
extremely small value for the time step. Different configurations have been considered and the
error has always been measured in the $L^2(\Omega)$ norm. The following table, showing the results
of different tests with different domain partitioning configurations, summarizes the accuracy
measurements both for one of the velocity components and for the pressure.

master process calculates the guess values for the Dirichlet problems. These values are then
transmitted to the slave processes: each of the slaves solves the Dirichlet problems for the
assigned domains; it should be noted that, in this case, the domain decomposition (which allows
the slaves to operate in parallel) derives directly from the multi-domain approach. After this
first phase, the slaves transmit the calculated values at the domain interfaces to the master,
which calculates the new values by applying a Conjugate Gradient algorithm, and communicates these
values to the slaves for the next iteration.
The main causes of inefficiency in using parallel architectures are uneven load balancing and
communication overheads. In general, the multidomain technique can generate load balancing
problems because the size and/or computational cost of the blocks can differ widely; however, in
our case each domain has the same number of points. Thus, if the number of domains is a multiple
of the number of processors, we obtain an optimal load balancing. The communication overhead is
mainly related to the Conjugate Gradient algorithm: at each time iteration, data need to be
exchanged between processors containing adjacent domain interfaces and the master processor.
Because of the sequential flow of these activities, it is not possible to overlap computation and
communication, so the time spent on these communications can represent a non-negligible part of
the overall computing time.
The parallel version of the code has been developed for message passing environments. In
particular, the code has been written in Fortran 77 plus PVM 3.3 communication primitives. In
order to meet the goal of overlapping computation and communication, non-blocking communication
primitives have been used. Note that the parallelism is exploited only among slaves: the master
and the slaves cannot operate in parallel. Anyway, as the greater part of the computation is
delegated to the slaves, the performance obtained on various homogeneous parallel systems is quite
good.
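
The load balancing remark can be quantified with a small Python sketch. It rests on simplifying assumptions that are not measurements from the paper: every domain costs the same, communication and the master's work are free, and the most loaded slave sets the run time. The function name is hypothetical.

```python
from math import ceil


def load_balance_efficiency(n_domains, n_slaves):
    """Ideal parallel efficiency for equal-cost domains, ignoring communication."""
    busiest = ceil(n_domains / n_slaves)   # domains held by the most loaded slave
    return n_domains / (n_slaves * busiest)


for p in (2, 3, 4, 5, 6, 8, 12):
    print(p, load_balance_efficiency(12, p))
```

For 12 domains this bound equals 1 whenever the number of slaves divides 12 and drops otherwise, for instance to 0.8 on 5 slaves, which is the sense of the statement that the number of domains should be a multiple of the number of processors.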
Fig. 1 12 domains with 15 x 15 nodes; speed up (abscissa: Nproc)

Fig. 3 16 domains; speed up (abscissa: Nproc)

[Plot legend: 9 x 9 points; 15 x 15 points]
processors which handle the communication on behalf of the processor ... has to be taken into
account).

To further reduce the computing time, we have also used heterogeneous systems. In fact, whenever
the execution of different tasks constituting the same program is strictly sequential,
heterogeneous processing can help in enhancing performance by placing a task on the most suitable
machine for that task. To this end, tests have been performed by placing the master process on a
vector computer for a more efficient calculation, and the slave processes on a parallel
homogeneous system with scalar processors.
However, in our case the time spent by the master is a negligible part of the total computing
time; so, the tests performed using a heterogeneous environment have shown no appreciable
improvements.

5 3-DIMENSIONAL EXTENSION

to take advantage of the given multi-domain solution method to solve them efficiently.
In particular, let

$$u_i(x, y, z) = \sum_{k=-N/2}^{N/2-1} \hat{u}_{i,k}(x, y)\, e^{ikz}, \quad i = 1, 2, 3, \qquad (45)$$

$$r_i(x, y, z) = \sum_{k=-N/2}^{N/2-1} \hat{r}_{i,k}(x, y)\, e^{ikz}, \quad i = 1, 2, 3, \qquad (47)$$

and
Fig. 6 3 domains with 11 x 11 nodes; efficiency (legend: Convex, Meiko; abscissa: Nproc)

2 Solve for the pressure correction, for $k = -N/2, \ldots, N/2 - 1$:

(50)

3 For $i = 1, 2, 3$, update the velocity field, for $k = -N/2, \ldots, N/2 - 1$:

The subscript I has been introduced to stress the fact that the collocated derivatives are
computed in the two non-periodic directions only. The term $\hat{r}_{i,k}$ represents the $k$th
mode of the transform of the right-hand side of the $i$th momentum equation. The treatment of the
boundary conditions is straightforward and does not introduce any supplementary difficulty.
Despite its apparent complexity, this algorithm presents the advantage that all the computations
of the elliptic terms take place in the transformed space (for the periodic direction), leading to
the full exploitation of the 2-dimensional algorithm.

Fig. 7 Grid configuration in the normal plane

Five subdomains, the first and the last selected to embed the wall sublayer, are used. Each
subdomain contains 20 x 20 nodes, while in the Fourier direction 24 modes are employed.
The present case has been run on an IBM RS6000 360H workstation with about 100 Mflops peak
performance. The CPU time required for each full time iteration is about 4.5 seconds when the
Galerkin capacitance matrix is computed and stored in a pre-processing stage.

Fig. 8 Mean streamwise velocity near the wall

After having reached a statistical steady state we measured some typical turbulent values to
assess the quality of the obtained results. In figure (8) we compare the obtained velocity
The present authors are grateful for the support and the computer time provided by IRSIP (Istituto
Ricerche e Sistemi Informatici Paralleli, CNR, Napoli, Italy). We are also indebted to Dr. De
Pietro for his help and assistance when setting up the parallel version of the code. The first
author would like to mention the constructive discussions with and support of Prof. J. Jiménez.
REFERENCES

[1] G. De Pietro, A. Pinelli and A. Vacca, A Parallel Implementation of Spectral Multi-Domain
Solver for Incompressible Navier-Stokes Equations. In Proceedings of Parallel CFD Conference '95,
CalTech, Pasadena, CA (1995). Elsevier, Amsterdam.
[2] A. Chorin and J. Marsden, A Mathematical Introduction to Fluid Mechanics, Springer-Verlag,
New York, 1979.
Andrew M. Wissink
Aerospace Engineering and Mechanics, University of Minnesota
107 Akerman Hall, 110 Union St. SE, Minneapolis, MN 55455
Anastasios S. Lyrintzis
School of Aeronautics and Astronautics, Purdue University
1282 Grissom Hall, West Lafayette, IN 47907
Roger C. Strawn
US Army Aeroflightdynamics Directorate, Mail Stop 258-1
NASA Ames Research Center, Moffett Field, CA 94035
However, one drawback of TURNS is the amount of computation time it requires. An acceptable
calculation with TURNS requires a Cray-class supercomputer. A typical quasi-steady coarse-grid
Euler computation by TURNS requires about 30 minutes of CPU time on a Cray C-90, while an unsteady
computation requires 3-4 hours. Fine-grid viscous computations require considerably more time.
Parallel computers, which include massively parallel supercomputers as well as workstation
clusters, are beginning to replace traditional vector supercomputers for large scale computations
due to their lower cost and high peak execution rates. At present, TURNS is inefficient on
parallel machines. The main bottleneck preventing better parallel efficiency is the LU-SGS
algorithm [16] used for the implicit time step. The objective of our work is to study techniques
that will improve its efficiency. Thus, the majority of this paper will focus on the LU-SGS
algorithm and some modifications thereof which improve its parallel efficiency. Initial results of
this effort were presented in reference [13].
Although the TURNS code is primarily used for rotor CFD calculations, the solution algorithm is the

where $H = (e + p)$ and $U$, $V$, and $W$ are the contravariant velocity components (e.g.
$U = \xi_t + \xi_x u + \xi_y v + \xi_z w$). The Cartesian velocity components $u$, $v$, and $w$
are defined in the $x$, $y$, and $z$ directions, respectively. The quantities $\xi_t$, $\xi_x$,
$\xi_y$, and $\xi_z$ are the coordinate transformation metrics and $J$ is the Jacobian of the
transformation. The pressure $p$ is related to the conserved quantities through the perfect gas
equation of state

$$p = (\gamma - 1)\left\{e - \frac{\rho}{2}\left(u^2 + v^2 + w^2\right)\right\} \qquad (3)$$

The viscous flux vector $S$ is incorporated in the code but the calculations given in this paper
are all inviscid (i.e. $\epsilon = 0$ in Eq. 1), so the viscous terms are not described here.
Details can be found in [4].
The governing equations are applied to an inertial reference system that moves with the blade.
Because the blade is rotating, the system is continuously unsteady. In order to get a quasi-steady
starting solution, the blade must be held in a fixed position. This is done, in effect, by adding
source terms to the right hand side (Eq. 4).

3. IMPLICIT OPERATOR
The TURNS code uses the two-factor LU-SGS (Lower-Upper Symmetric Gauss-Seidel) algorithm of Yoon
and Jameson [16] for the implicit time step. The LU-SGS algorithm has been used in a number of
well-known CFD codes (e.g. INS3D [17], OVERFLOW [18]) primarily for its stability properties
with larger timesteps. Classic implicit methods such as Beam-Warming approximate factorization
have a large factorization error (of order $\Delta t^3$) which further restricts the size of the
time step. The two-factor LU-SGS method has enhanced stability along with a reduction in
factorization error (order $\Delta t^2$) that make it an attractive alternative. Unfortunately,
the LU-SGS method is difficult to parallelize.
The LU-SGS scheme resembles a typical LU factorization scheme with diagonal preconditioning to
increase robustness. The scalar diagonal terms are obtained by use of approximate Jacobians,
avoiding costly matrix inversions. The Jacobian terms $A$, $B$, and $C$ in Eq. 5 are split into
"+" and "-" parts, with positive parts containing only the positive eigenvalues and negative parts
containing only the negative eigenvalues. The positive matrix is backward differenced and the
negative matrix is forward differenced, as follows

(7)

which can also be written

In the first step of (14), sweeps updating $\delta Q^*$ are performed in the positive direction
(that is, from 1 to $j_{max}$, $k_{max}$, $l_{max}$) through the solution domain. The second step
then computes $\delta Q^n$ by sweeping back through the domain in the opposite direction.
This algorithm can be vectorized using a hyperplane approach, as outlined in [19]. Vectorization
is done across hyperplanes in which $j + k + l = \mathrm{const}$. This is outlined in Fig. 1.
steady state reacting flow problems, they found that, while the convergence rate of the operator is reduced with the domain breakup, the effect is relatively weak (e.g. with 64 subdomains, the number of iterations increases by less than 20%). Thus, the domain decomposition strategy appears promising, and is used as a basis for the Hybrid algorithm, discussed in section 3.2.

3.1 DP-LUR Method
A modification of LU-SGS, referred to as Data-Parallel LU Relaxation, has been introduced by Candler et al. [21, 22] for solving hypersonic flow problems. Essentially, the modification involves transferring the nondiagonal terms to the right hand side and using values from the previous iteration for these terms. The modified operator then becomes Jacobi-like and requires only nearest neighbor communication. This operator has been found to be very efficient in a data-parallel environment (e.g. [22, 23]). The DP-LUR modification of the LU-SGS algorithm is given in (15).

$$\delta Q^{(0)}_{j,k,l} = D^{-1}\, h\, \mathrm{RHS}^n$$
For $i = 1, \ldots, i_{max}$ Do
$$\delta Q^{(i)}_{j,k,l} = D^{-1} \left[ h\,\mathrm{RHS}^n + A^+_{j-1}\,\delta Q^{(i-1)}_{j-1} - A^-_{j+1}\,\delta Q^{(i-1)}_{j+1} + B^+_{k-1}\,\delta Q^{(i-1)}_{k-1} - B^-_{k+1}\,\delta Q^{(i-1)}_{k+1} + C^+_{l-1}\,\delta Q^{(i-1)}_{l-1} - C^-_{l+1}\,\delta Q^{(i-1)}_{l+1} \right] \qquad (15)$$
End Do
$$\delta Q^{n}_{j,k,l} = \delta Q^{(i_{max})}_{j,k,l}$$

The main difference between the LU-SGS and DP-LUR algorithms is that a Jacobi sweeping strategy is used in DP-LUR while Gauss-Seidel sweeps are used

cessing than LU-SGS, the use of Jacobi sweeps leads to a larger amount of computational work. It is well-known that a Jacobi method will have a theoretically slower convergence rate than Gauss-Seidel. Multiple sweeps (e.g. 4-6) are therefore required inside Eq. (15) to maintain a comparable convergence rate to LU-SGS. Although DP-LUR can be executed efficiently on a parallel machine, the added computational cost is a significant penalty, the specifics of which are discussed in section 5.1. The question is whether the computational penalty of DP-LUR is the best that we can do.

3.2 Hybrid Method
The motivation behind development of the Hybrid approach is to replace a source of inefficiency in DP-LUR. The DP-LUR algorithm was developed primarily for data-parallel computations. Its convergence is independent of the number of processors used because the same Jacobi sweeping strategy that allows nearest neighbor communications between the processors is also used for the computations on each processor. Doing the on-processor computations with Jacobi sweeps is a source of inefficiency, since the computational work can be performed more efficiently with the Gauss-Seidel sweeps of LU-SGS. The strategy behind the Hybrid approach is to use the communications structures of the DP-LUR algorithm, to maintain load-balanced parallelism with nearest neighbor communications, along with the more efficient LU-SGS algorithm for the on-processor computations. The algorithm is referred to as the Hybrid approach because it retains features of both the LU-SGS and DP-LUR algorithms.

$$\delta Q^{(0)}_{j,k,l} = D^{-1}\, h\, \mathrm{RHS}^n$$
For $i = 1, \ldots, i_{max}$ Do
$$\delta Q^{*(i)}_{j,k,l} = \delta Q^{(i-1)}_{j,k,l}$$
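The difference between the two sweeping strategies can be illustrated on a toy problem. The sketch below is not the TURNS operator: it assumes a one-dimensional, diagonally dominant tridiagonal system whose exact solution is all ones, and the function names (`jacobi_sweeps`, `hybrid_sweeps`, `model_rhs`, `max_error`) are hypothetical. It compares pure Jacobi sweeps (the DP-LUR idea) with a Hybrid-style sweep in which each subdomain performs a forward and a backward Gauss-Seidel pass while values from neighboring subdomains are frozen at the previous iteration.

```python
def jacobi_sweeps(diag, off, b, n_sweeps):
    """Jacobi relaxation for tridiag(off, diag, off) x = b, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    for _ in range(n_sweeps):
        old = x[:]  # every point only sees values from the previous sweep
        for i in range(n):
            left = old[i - 1] if i > 0 else 0.0
            right = old[i + 1] if i < n - 1 else 0.0
            x[i] = (b[i] - off * (left + right)) / diag
    return x


def hybrid_sweeps(diag, off, b, n_sweeps, n_sub):
    """Symmetric Gauss-Seidel inside each subdomain, lagged (Jacobi-like)
    coupling across subdomain boundaries."""
    n = len(b)
    x = [0.0] * n
    bounds = [(s * n // n_sub, (s + 1) * n // n_sub) for s in range(n_sub)]
    for _ in range(n_sweeps):
        old = x[:]  # what neighboring subdomains would have communicated
        for lo, hi in bounds:
            def value(i):
                if i < 0 or i >= n:
                    return 0.0
                # inside the subdomain use the newest value, outside the lagged one
                return x[i] if lo <= i < hi else old[i]
            # forward pass followed by backward pass over the subdomain
            order = list(range(lo, hi)) + list(range(hi - 1, lo - 1, -1))
            for i in order:
                x[i] = (b[i] - off * (value(i - 1) + value(i + 1))) / diag
    return x


def model_rhs(n, diag, off):
    """Right-hand side chosen so that the exact solution is all ones."""
    return [diag + off * ((i > 0) + (i < n - 1)) for i in range(n)]


def max_error(x, exact=1.0):
    return max(abs(v - exact) for v in x)


n, diag, off = 64, 2.5, -1.0
b = model_rhs(n, diag, off)
err_jacobi = max_error(jacobi_sweeps(diag, off, b, 20))
err_hybrid = max_error(hybrid_sweeps(diag, off, b, 20, n_sub=4))
print(err_jacobi, err_hybrid)
```

For the same number of outer iterations the Hybrid-style sweep leaves a smaller error than Jacobi on this model problem, at the price of two passes through each subdomain per iteration. Only the qualitative trend is meant to carry over; the sweep counts reported for TURNS come from the paper's own experiments.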
ist on each processor that increase the performance substantially (e.g. from 5 Mflops/processor to 128 Mflops/processor). Unfortunately, the only way to utilize the VU's at this time is to rewrite the code in CMFortran, a High-Performance-Fortran type language. Since TURNS is over 6K lines, rewriting the code would require considerable effort and was one of the main reasons we chose the MIMD implementation in the first place. In addition, rewriting the code to CMFortran would eliminate code portability. Consequently, the results presented here are determined without utilizing the VU's. Although this degrades the performance on the CM-5, it is not a big drawback overall, because our future plans are to run the code on parallel systems such as the IBM SP-2 and workstation clusters, which do not have vector units.

The code is run for a test problem that computes the quasi-steady flowfield around a symmetric OLS blade. The OLS blade has a sectional airfoil thickness to chord ratio of 9.71% and is a 1/7 scale model of the main rotor for the Army's AH-1 helicopter. A 135 x 50 x 35 C-H type grid is used, with the domain extending eight chords in all directions. The upper half of the grid is shown in Fig. 4. We chose

erally, most newer machines (e.g. IBM SP-2) allow the user to choose the exact number of processors they want for their partition, so this will most likely not be an issue on more modern machines.

The three dimensional quasi-steady starting solution is computed around the rotating blade in subsonic conditions, with Mtip = 0.664, and a more transonic condition, with Mtip = 0.80. In both cases, the freestream Mach number is $M_\infty$ = 0.17 and the blade position is fixed at zero degrees azimuth angle (Fig. 5). It should be noted that the first case, Mtip = 0.664, is a realistic test case for rotor calculations. The Mtip = 0.800 case, however, is far too transonic to be used in a practical helicopter application. It was added as an extreme test case to investigate the behavior of the implicit solvers with more nonlinear transonic flows.
Figure 5: Quasi-Steady solution. Blade fixed at zero
degrees azimuth angle.
Table 1 - Timing Results on the CM-5 for TURNS with DP-LUR for subsonic test case. 135 x 50 x 35 mesh, Mtip = 0.664, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations   % Comm.   Tot. Time
5        57      436          10.4 %    9330 sec
5        228     440          15.3 %    2508 sec
5        456     438          21.0 %    1445 sec
6        57      351          9.2 %     8505 sec
6        228     350          15.1 %    2233 sec
6        456     353          19.9 %    -
7        57      304          9.6 %     8229 sec
7        228     304          16.6 %    2110 sec
7        456     306          20.6 %    -

Table 2 - Timing Results on the CM-5 for TURNS with DP-LUR for transonic test case. 135 x 50 x 35 mesh, Mtip = 0.800, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations   % Comm.   Tot. Time
5        57      464          10.0 %    9902 sec
5        228     457          14.8 %    2628 sec
5        456     465          20.8 %    1511 sec
6        57      379          10.0 %    9210 sec
6        228     380          15.1 %    2424 sec
6        456     383          19.9 %    1402 sec
7        57      335          9.6 %     9068 sec
7        228     335          16.6 %    2383 sec
7        456     345          20.6 %    1380 sec
Figure 8: Parallel Speedups of the time per iteration using the DP-LUR operator
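A speedup of the time per iteration can be recomputed directly from the timing tables. The sketch below is only an illustration: it takes the three 7-sweep rows of Table 2 (processors, iterations, total seconds), assumes the 57-processor run as the baseline, and uses a hypothetical helper name `per_iteration_speedup`.

```python
def per_iteration_speedup(runs):
    """runs: list of (processors, iterations, total_seconds); the first entry
    is the baseline. Returns (processors, speedup, parallel efficiency)."""
    base_procs, base_iters, base_secs = runs[0]
    base_cost = base_secs / base_iters  # seconds per iteration on the baseline
    rows = []
    for procs, iters, secs in runs:
        speedup = base_cost / (secs / iters)
        ideal = procs / base_procs  # linear speedup relative to the baseline
        rows.append((procs, speedup, speedup / ideal))
    return rows


# Table 2, 7 sweeps (Mtip = 0.800): processors, iterations, total time in seconds
table2_7sweeps = [(57, 335, 9068.0), (228, 335, 2383.0), (456, 345, 1380.0)]

for procs, speedup, efficiency in per_iteration_speedup(table2_7sweeps):
    print(f"{procs:4d} procs: speedup {speedup:5.2f}, efficiency {efficiency:4.2f}")
```

With these rows the efficiency relative to 57 processors drops from about 95% on 228 processors to about 85% on 456 processors.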
Table 3 - Timing Results on the CM-5 for TURNS with the Hybrid method for subsonic test case. 135 x 50 x 35 mesh, Mtip = 0.664, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations   % Comm.   Tot. Time
1        57      461          10.3 %    4937 sec
1        228     470          15.1 %    1434 sec
1        456     502          18.8 %    863 sec
2        57      394          10.1 %    5410 sec
2        228     398          14.8 %    1524 sec
2        456     404          20.6 %    889 sec
3        57      386          10.0 %    6423 sec
3        228     385          14.8 %    1771 sec
3        456     385          19.7 %    1012 sec

Table 4 - Timing Results on the CM-5 for TURNS with the Hybrid method for transonic test case. 135 x 50 x 35 mesh, Mtip = 0.800, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations   % Comm.   Tot. Time
1        57      -            10.6 %    5719 sec
1        228     558          15.3 %    1707 sec
1        456     580          18.8 %    998 sec
2        57      -            10.1 %    6568 sec
2        228     -            14.8 %    1858 sec
2        456     492          20.4 %    1082 sec
3        57      467          9.9 %     7748 sec
3        228     466          14.4 %    2143 sec
3        456     470          19.2 %    1226 sec
Figure 11: Parallel Speedups of the time per iteration of TURNS using the Hybrid operator
The convergence plots show that a minimum of 5 inner sweeps (i.e. $i_{max}$ = 5) of DP-LUR are required to converge the solution. In both the subsonic and transonic cases, 4 sweeps began to diverge. For the Mtip = 0.664 case, 5 sweeps gives slightly worse convergence than single processor LU-SGS while 6 sweeps gives slightly better. For the Mtip = 0.800 case, 5 sweeps of DP-LUR gives about the same convergence as single processor LU-SGS, and 6 sweeps is better. This seems to indicate that DP-LUR maintains a good level of robustness for transonic cases, since it requires fewer inner sweeps to maintain the convergence rate of LU-SGS. The single processor LU-SGS method requires the work of approximately 1.8 sweeps of DP-LUR. Consequently, these results show that, in order to maintain the same convergence rate, the DP-LUR implicit operator requires about 3 times the computational work of single processor LU-SGS.

Timings of the DP-LUR method indicate that more sweeps seem to be the better choice. The overall CPU time with 7 sweeps is fastest, but the difference between 6 and 7 sweeps is small (less than 2%). Each additional sweep increases the CPU time per iteration by 10-15%. Communication represents a relatively small percentage of the total CPU time. The communication percentage increases with increasing number of processors. Also, the percentages tend to fluctuate for different cases, which is probably due to the fact that these runs were done on a loaded rather than dedicated machine.

It should be noted that, in theory, the solution using DP-LUR is the same regardless of the number of processors used, so the number of iterations should be the same for all processor partitions. However, Tables 1 and 2 show that the implementation did show some slight discrepancies in the number of iterations. Generally, the differences are small (less than 4%) and we attribute them to numerical roundoff in the machine. Differences in the overall solution are indistinguishable for the different partition sizes.

A plot of the parallel speedups of the time per iteration of TURNS with DP-LUR is shown in Fig. 8. The speedup from 57 to 228 processors is nearly linear, but some falloff is noted for 456 processors. This is believed to be due to the relatively small problem size of 236,250 gridpoints. It is expected that the speedup will be more linear with larger problems. The parallel speedup increases slightly for a larger number of sweeps, since the amount of computational work goes up. However, the difference is not significant.

5.2 Hybrid Results
Results of timings with the Hybrid algorithm are given in Tables 3 and 4, for the Mtip = 0.664 and Mtip = 0.800 cases, respectively. Plots of the density residual vs. CPU time are given in Figs. 9 and 10.

The efficiency of the Hybrid method is apparent in the number of inner sweeps required for convergence. While DP-LUR required a minimum of 5 sweeps, the Hybrid method converges at a comparable rate to single processor LU-SGS with only 1 sweep. This is due to the more efficient Gauss-Seidel procedure used for the on-processor computations. With 2 sweeps, the convergence of the Hybrid method is almost identical to single processor LU-SGS. With one sweep, there is significant spread between the convergence curves for the different numbers of processors, but with 2 sweeps, the spread is reduced considerably so that all processor partitions follow essentially the same convergence path as LU-SGS. Although it is not shown in the figures, the convergence plot with 3 sweeps is only slightly better than with 2, and it is therefore not plotted to keep the graph from becoming too crowded.

The Hybrid method is considerably faster than DP-LUR. The CPU times of the Hybrid method are only 55-60% those of DP-LUR. This is due to the larger amount of computational work in DP-LUR, because a larger number of sweeps are required for convergence.

It should be pointed out that each sweep with DP-LUR involves only a single sweep through the domain on each processor, whereas the Hybrid method performs the two-step LU-SGS algorithm on each processor, performing two sweeps through the domain. Thus, each sweep of the Hybrid method is approximately equivalent to the work of two sweeps in DP-LUR. This is indicated in the CPU times; the CPU time using 6 sweeps of DP-LUR is approximately equal to 3 sweeps using the Hybrid method.

Using 1 sweep in the Hybrid method gives the best CPU time, but requires 17-18% more iterations than single processor LU-SGS. The CPU time with 2 sweeps is worse than that of 1 sweep by about 8%, but the convergence rate is much closer to that of single processor LU-SGS. When 3 sweeps are used, the convergence is only slightly better (a reduction in iterations of less than 5%) than 2 sweeps, while the CPU time is about 11-15% more. Thus, 3 sweeps or more appears to be unnecessary.

A plot of the parallel speedups of the time per iteration is shown in Fig. 11. The parallel speedups are essentially the same as with DP-LUR.

6. SUMMARY AND CONCLUSIONS
A strategy is presented for implementing the three-dimensional Navier-Stokes Rotorcraft CFD code TURNS on massively parallel computer architectures. The main portion of the code that is difficult to parallelize is the implicit timestep using the LU-SGS operator. We study two modifications of this operator that make it more amenable to parallel implementation. The first is the Data-Parallel LU Relaxation (DP-LUR) technique, which essentially replaces the Gauss-Seidel sweeps in LU-SGS with Jacobi sweeps, and uses multiple sweeps of the domain to maintain the same convergence rate. The second is a new approach that couples the Jacobi communication strategy of DP-LUR with Gauss-Seidel sweeps of LU-SGS for the on-processor computations. It also uses multiple inner sweeps to maintain the convergence rate of LU-SGS. Because this second approach retains features of both the DP-LUR and LU-SGS algorithms, we call it a Hybrid method.

The TURNS code is tested on the Thinking Machines CM-5, using a MIMD approach for parallel implementation. It is run for an Euler quasi-steady calculation with 236,250 gridpoints, computing the flow around the tip of a helicopter blade rotating with subsonic and transonic tip Mach numbers. Results from various processor partitions show that both the DP-LUR and Hybrid modifications of LU-SGS are very parallelizable, showing good parallel speedups. Both methods are also able to maintain the convergence qualities of original LU-SGS for all test cases. The Hybrid method, however, requires less CPU time due to lower computational work requirements. The DP-LUR modification of LU-SGS causes the amount of computational work in the implicit solver to increase threefold, to maintain the same convergence rate. The Hybrid modification, however, can match to within 25% the convergence rate of single processor LU-SGS with no increase in the computational work. It can exactly match the convergence rate with twice as much work in the implicit solver, yielding CPU times that are only 8% higher than the single sweep cases. Overall, the CPU times for the Hybrid method are only 55-60% those of DP-LUR.

The computational work required of the Hybrid approach on a parallel machine will always be less than that of DP-LUR. On a few processors, the amount of computational work will be about the same as LU-SGS. The Hybrid approach is therefore ideally suited for machines that have smaller numbers of more powerful, non-vectorized, processors. One example of a machine that fits this category is the 150 processor IBM SP-2. We are currently implementing the code on the IBM SP-2 at NASA Ames, and expect better CPU times than what were obtained on the CM-5.

Finally, although the TURNS code is used primarily for rotorcraft CFD applications, the parallelization strategy is not unique to this application. The parallelization procedures proposed here could be readily used for other CFD codes that use the LU-SGS algorithm.

ACKNOWLEDGMENTS
The first author was supported by a NASA Graduate Student Fellowship. This work was supported by allocation grants from the Minnesota Supercomputer Institute (MSI) and Cray Research, Inc. The work is also supported in part by grant NSF CCR-9496327 and by the Army Research Office contract number DAAL03-89-C-0038 with the University of Minnesota Army High Performance Computing Research Center (AHPCRC) and the DOD Shared Resource Center at the AHPCRC.

References
[1] Strawn, R. C., and Caradonna, F. X., "Conservative Full Potential Model for Unsteady Transonic Rotor Flows," AIAA Journal, Vol. 25, No. 2, Feb. 1987, pp. 193-198.
[2] Bridgeman, J. O., Steger, J. L., and Caradonna, F. X., "A Conservative Finite-Difference Algorithm for the Unsteady Transonic Potential Equation in Generalized Coordinates," AIAA Paper 82-1388, 9th Atmospheric Flight Mechanics Conference, San Diego, CA, Aug. 1982.
[3] Srinivasan, G. R., "A Free-Wake Euler and Navier-Stokes CFD Method and its Application to Helicopter Rotors Including Dynamic Stall," JAI Associates, Inc., Technical Report 93-01, November 1993.
[4] Srinivasan, G. R., Baeder, J. D., Obayashi, S., and McCroskey, W. J., "Flowfield of a Lifting Rotor in Hover: A Navier Stokes Simulation," AIAA Journal, Vol. 30, No. 10, Oct. 1992, pp. 2371-2378.
[5] Srinivasan, G. R., and Baeder, J. D., "TURNS: A Free-Wake Euler/Navier-Stokes Numerical Method for Helicopter Rotors," AIAA Journal, Vol. 31, No. 5, May 1993, pp. 959-962.
[6] Srinivasan, G. R., and Raghavan, V., Duque, E. P. N., and McCroskey, W. J., "Flowfield of a Lifting Rotor in Hover - A Navier Stokes Simulation," AIAA Journal, Vol. 30, No. 10, October 1992.
[7] Srinivasan, G. R., and Ahmad, J. U., "Navier Stokes Simulation of Rotor-Body Flowfields in Hover Using Overset Grids," Proceedings of the Nineteenth European Rotorcraft Forum, Paper No. C15, September 1993, Cernobbio, Italy.
[8] Duque, E. P. N., and Srinivasan, G. R., "Numerical Simulation of a Hovering Rotor Using Embedded Grids," Proceedings of the 48th Annual Forum of the American Helicopter Society, Washington DC, June 1992.
[9] Duque, E. P. N., "A Structured/Unstructured Embedded Grid Solver for Helicopter Rotor Flows," Proceedings of the 50th Annual Forum of the American Helicopter Society, Vol. 11, Washington DC, May 1994, pp. 1249-1257.
[10] Baeder, J.D., Gallman, J.M., and Yu, Y.H., "A Computational Study of the Aeroacoustics of Rotors in Hover," Proceedings of 49th Annual Forum of the American Helicopter Society, St. Louis, Missouri, May 1993, pp. 55-71.
[11] Baeder, J.D., and Srinivasan, G.R., "Computational Aeroacoustic Study of Isolated Blade Vortex Interaction Noise," AHS Specialists' Aeromechanics Conference, San Francisco, CA, Jan 1994.
[12] Strawn, R. C., Biswas, R., and Lyrintzis, A. S., "Helicopter Noise Predictions using Kirchhoff Methods," presented at the 51st Annual Forum of the American Helicopter Society, 9-11 May 1995, Fort Worth, Texas; also, to be published in the Journal of Computational Acoustics.
[21] Candler, G.V., Olynick, D.R., "Hypersonic Flow Simulations Using a Diagonal Implicit Method," presented at the 10th International Conference on Computing Methods in Applied Sciences and Engineering, Paris, France, Feb. 1992.
[22] Candler, G.V., Wright, M., and McDonald, J.D., "A Data Parallel LU-SGS Method for Reacting Flows," AIAA Journal, Vol. 32, No. 12, Dec. 1994, pp. 2380-2386.
[23] Wright, M.J., Candler, G.V., and Prampolini, M., "A Data Parallel LU Relaxation Method for the Navier Stokes Equations," AIAA Paper 95-1750, 1995.
$$\frac{\partial u_i}{\partial x_i} = 0,$$
sure equation fully coupled to the velocity field. No simplification is made at this stage, the equation is
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
and, if ut-' > 0. The superscripts k - 1 and k denote
previous and current iteration.
The second member of equation (3) is discretized
axj (2) using second-order finite differences,
P
a2U
G = 4P
u j - 2u: + U; , (7)
Ax2
Face velocities
The relationship between the nodal and face values is found by discretization (in a control volume centred at the face of control volume P) of a simplified version of equation (2), obtained assuming mass conservation and constant viscosity

The equations for face velocities (9) are now substituted into (10), leading to the following algebraic equation,
nb nb nb
p (uf-'uf - ub-'u;) Ay +
P; - P&
p (v;-'uk - vf-'u:) AX = - AY
2
and in case of D2,
4 AY
+ uAE A-Y (,5)
which, after replacing the equations for the velocities at the faces, yields,

$$\sum_{nb} A^{UU}_{nb} U^{k}_{nb} + \sum_{nb} A^{VU}_{nb} V^{k}_{nb} + \sum_{nb} A^{PU}_{nb} P^{k}_{nb} = \ldots \qquad (13)$$

The coefficients $A^{UU}_{nb}$ represent links to the 9 nodal U velocities surrounding P, and $A^{VU}_{nb}$ represent links to 4 nodal velocities ($V_{NE}$, $V_{NW}$, $V_{SE}$ and $V_{SW}$). The $A^{PU}_{nb}$ coefficients include the contributions from 7 nodal pressures ($P_P$, $P_E$, $P_W$, $P_{NE}$, $P_{NW}$, $P_{SE}$ and $P_{SW}$).

The momentum equation in the i = 2 direction

$$\sum_{nb} A^{VV}_{nb} V^{k}_{nb} + \sum_{nb} A^{UV}_{nb} U^{k}_{nb} + \sum_{nb} A^{PV}_{nb} P^{k}_{nb} = \ldots \qquad (14)$$

may be obtained by an identical procedure.

Equations (11), (13) and (14) are all assembled in a single system of equations and solved simultaneously. The system of equations is of the form,

$$A x = b \qquad (15)$$

x is a vector with sequence of blocks with variables U, V and P. The order of matrix A is (NI - 2) x (NJ - 2) x K, where NI x NJ is the problem size, and K stands for the number of variables (i.e., 3 in case of a two-dimensional laminar flow). This is a sparse, unsymmetric, block-band (block tridiagonal) linear system. Because this is the most time consuming part of the algorithm, special attention was given to this subproblem (in Section 3.3.1).

After solution of the linear system (15) one global iteration is completed. Because of the non-linearity of the differential governing equations, several global iterations are needed to obtain convergence and new coefficients are calculated using the new velocity and pressure fields, repeating the process until convergence. The nomenclature "global iteration" is used here to distinguish from the number of iterations associated with the solver.

Because all the conservation equations are solved as part of a single set, with no decoupling (or segregation, according to the nomenclature in ref. [11]), the algorithm can converge in a small number of iterations, and for this reason it has been designated DIRECTO [10] (direct, in English).

3 DISCUSSION OF RESULTS
The code development was made using the two classical geometries of a two-dimensional cavity with a sliding lid and a sudden expansion. In this paper results will be presented for the two-dimensional square cavity only.

This Section discusses 3 major aspects of the algorithm: the accuracy, memory requirements and computing time, in subsections 3.1, 3.2 and 3.3, respectively.

3.1 Accuracy
To obtain the accuracy of DIRECTO we performed simulations of the flow in a two-dimensional square cavity with sliding lid for 2 Reynolds numbers (400 and 1000), and 3 grid sizes (64x64, 96x96 and 128x128). The Reynolds number definition was $Re = \rho U_{lid} H / \mu$. $U_{lid}$ is the lid velocity and H is the size of the square cavity.

The velocities were set constant at every boundary, and zero normal gradient for the pressure was used. This condition was implemented in an implicit fashion to preserve the implicit feature of the method. The calculations were stopped for residuals lower than 1x. The residuals are the sum of the absolute errors of the algebraic equations divided by reference quantities $\rho U_{lid}^2 H$ and $\rho U_{lid} H$ for momentum and continuity equations, respectively. Calculations were all performed in single precision.

Method       Grid          U_min      V_min      V_max
DIRECTO D1   64            -0.31999   -0.43943   0.29404
             96            -0.32443   -0.44721   0.29897
             128           -0.32614   -0.44996   0.30090
             Exact value   -0.32878   -0.45356   0.30399
             Accuracy      1.73       1.97       1.69
DIRECTO D2   64            -0.31956   -0.43968   0.29383
             96            -0.32430   -0.44729   0.29893
             128           -0.32608   -0.45000   0.30087
             Exact value   -0.32877   -0.45360   0.30380
             Accuracy      1.78       1.95       1.77
CPI          64            -0.32368   -0.44862   0.29925
             96            -0.32653   -0.45163   0.30183
             128           -0.32751   -0.45274   0.30271
             Exact value   -0.32873   -0.45431   0.30379
             Accuracy      2.05       1.85       -
SIMPLE       128           -0.32614   -0.45119   0.30143

Table 1: Square cavity results for Re = 400 (CPI results from Deng et al., 1994).

The estimated exact values and order of accuracy of the results were obtained following the generalization of the Richardson extrapolation method. The exact value can be approximated in terms of results on finite grids plus the leading term of the truncation error as,

$$\phi_i = \phi_{ex} + h_i^{n} X_n + \ldots , \qquad (16)$$

where h is the grid spacing in both directions and $X_n$ is a grid function, assumed the same for every grid spacing ($h_1$, $h_2$ and $h_3$). Provided that h is small enough for the leading term to be dominant, the order of the numerical scheme is estimated as [17],
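The Richardson-type estimate described for the accuracy study can be sketched numerically. The following is an illustration only, not the authors' procedure: it assumes the model $\phi_i = \phi_{ex} + C h_i^{p}$ on three grids, takes the grid spacing as h = 1/N (the paper may define it differently), and solves for the order p by bisection because the three grids are not related by a constant refinement ratio. The function name `richardson_order` is hypothetical; the input values are the $U_{min}$ results for DIRECTO D1 in Table 1.

```python
def richardson_order(h, phi, p_lo=0.1, p_hi=6.0, iters=200):
    """Estimate the observed order p and the extrapolated value phi_ex from
    three grid results, assuming phi_i = phi_ex + C * h_i**p.
    h and phi are ordered from the coarsest to the finest grid."""
    target = (phi[0] - phi[1]) / (phi[1] - phi[2])

    def ratio(p):
        # same ratio as `target`, written in terms of the grid spacings
        return (h[0] ** p - h[1] ** p) / (h[1] ** p - h[2] ** p)

    # ratio(p) grows with p for h[0] > h[1] > h[2], so bisection applies
    for _ in range(iters):
        p_mid = 0.5 * (p_lo + p_hi)
        if ratio(p_mid) < target:
            p_lo = p_mid
        else:
            p_hi = p_mid
    p = 0.5 * (p_lo + p_hi)
    # leading error term on the finest grid, C * h[2]**p
    fine_error = (phi[1] - phi[2]) * h[2] ** p / (h[1] ** p - h[2] ** p)
    return p, phi[2] - fine_error


h = [1.0 / 64, 1.0 / 96, 1.0 / 128]      # assumed spacing h = 1/N
u_min = [-0.31999, -0.32443, -0.32614]   # DIRECTO D1, Table 1
order, u_exact = richardson_order(h, u_min)
print(order, u_exact)
```

Under these assumptions the estimate lands close to the tabulated order (1.73) and extrapolated value (-0.32878) for that row.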
differential equation for momentum and a relationship between face and nodal values identical to our equations (3) and (4), designated D1. In CPI and CSG, the governing differential equations of momentum are discretized at the centre of the control volumes before integration, while in the DIRECTO method the equations are first integrated in the control volume and then discretized. This leads to

Figure 2: Evolution of $u_{min}$ with grid resolution for Re=400.

Fig. 3 shows the streamlines for D2 and SIMPLE methods for a Re=1000 and a grid of 96x96. SIMPLE is unable to predict the streamline distribution in the centre. The dashed line (SIMPLE) at the centre of the flow represents the value -0.11, whereas the solid line (D2) represents -0.115.

Figure 3: Stream function for D1 method (-) and SIMPLE method (- -) at Re=1000.

Given the similarities between DIRECTO and the algorithms CPI and CSG of Deng et al. [17] one would expect higher accuracy of the DIRECTO algorithm; this is an aspect requiring further attention.

3.2 Memory Requirements
Fig. 4 shows the memory requirements for different implementations of DIRECTO, compared to SIMPLE using hybrid differencing.

The coefficient matrix, derived from a 9-node star, has a dimension of (NI - 2) x (NJ - 2) x K by (NI - 2) x (NJ - 2) x K, where NI x NJ is the problem size, and K stands for the number of variables (i.e., 3 in case of a two-dimensional laminar flow). This is shown by line a) in Fig. 4.

Because of the block-tridiagonal structure it can be stored as a [(NI - 2) x (NJ - 2)] x K by 2 x [(NI - 2) x 3 + 5] + 1 matrix (b) in Fig. 4).

For finer grids the block band structure becomes sparser. This was exploited by using a sparse matrix structure, storing only the non-zero values in a vector, the column indices in an integer vector and using pointers to the beginning of each row. This structure reduced the memory requirements to [(NI - 2) x (NJ - 2) x K] x K x 9 (line c) in Fig. 4).

On the other hand SIMPLE only requires 5 matrices of dimension (NI - 2) by (NJ - 2) to store the coefficients (d) in Fig. 4).

Figure 4: Memory requirements of DIRECTO (a), b) and c)) compared to SIMPLE (d)) using hybrid finite difference discretization scheme.

3.3 Computing Time
The following computer tests were run on a DEC AlphaStation AXP 3000, model 600S, for sequential al-

3.3.1 Linear solvers
The Gaussian elimination method was used initially [10] during the FORTRAN implementation of the algorithm. The first idea was to optimize the Gaussian elimination method by adapting it to the block band structure and using BLAS kernels and the LAPACK library [20], on a vector processor VAX 6520-2VP. This reduced the computing time but it was still far from the SIMPLE+TDMA method, and required a large amount of storage [13].

The next stage was the use of an iterative method so that the sparse structure could be taken into account. An iterative method has the additional advantage of controlling the degree of accuracy for solving the linear system of equations. Because the solver is an inner step of a global iteration required because of nonlinearities, solving the equations to a high degree of accuracy may prove useless.

Several methods were tested and GMRES (Generalized Minimum Residual) [14] was retained for its robustness. GMRES is a Galerkin type method based on an orthonormal basis of a Krylov subspace. To obtain the solution of

$$A x = b ,$$

of the form

$$x_k = x_0 + z_k , \qquad (21)$$

where $x_0$ is an initial solution with residual

$$r_0 = b - A x_0 . \qquad (22)$$

$z_k$ is computed such that its residual projected onto the Krylov subspace generated by $r_0$ is minimized.

Iterative methods of this type require the use of preconditioners in order to improve the convergence rate. Several preconditioners were tested and the best proved to be the Incomplete LU factorization of degree zero ILU(0) and ILUT [14, 21]. The diagonal preconditioner, although very simple to implement, did not give as good results as the others [12].
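The residual-minimizing correction behind GMRES can be seen in its smallest possible form. The sketch below is not the solver used with DIRECTO: it restarts after every step, so the Krylov subspace contains only the current residual (often called GMRES(1) or the minimal residual iteration), it uses no preconditioner, and the 3x3 nonsymmetric test matrix and all function names are assumptions made for illustration.

```python
def matvec(a, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def minimal_residual(a, b, x0, n_iters):
    """GMRES restarted every step: x <- x + alpha * r, with alpha chosen to
    minimize the 2-norm of the new residual b - A (x + alpha * r)."""
    x = list(x0)
    history = []  # residual norm before each correction
    for _ in range(n_iters):
        r = [b_i - ax_i for b_i, ax_i in zip(b, matvec(a, x))]
        history.append(dot(r, r) ** 0.5)
        ar = matvec(a, r)
        denom = dot(ar, ar)
        if denom == 0.0:  # residual is exactly zero, nothing left to correct
            break
        alpha = dot(r, ar) / denom
        x = [x_i + alpha * r_i for x_i, r_i in zip(x, r)]
    return x, history


# Small nonsymmetric test matrix (assumed); its symmetric part is 4*I
A = [[4.0, 1.0, 0.0],
     [-1.0, 4.0, 1.0],
     [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]
x, history = minimal_residual(A, b, [0.0, 0.0, 0.0], 25)
print(x, history[:3])
```

Full GMRES keeps an orthonormal basis of a growing Krylov subspace instead of a single direction, and a preconditioner such as ILU(0) or ILUT would be applied to A before the projection; neither refinement is shown here.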
Table 3 shows the total CPU times, and corresponding number of outer iterations needed to achieve a residual of 1 x on a DEC AlphaStation AXP 3000 model 600S, for several grid sizes, using the DIRECTO+GMRES methods and for the SIMPLE+TDMA method [22]. The SIMPLE algorithm was used with 4 sweeps of TDMA for computation of the velocities and 8 for computation of the pressure. It can be seen in Table 3 that the CPU times are competitive. ILU(0) is a good choice for small grid sizes and ILUT is recommended for finer grids because it keeps the number of outer iterations low.

Grid   SIMPLE TDMA      D/GMRES ILU(0)   D/GMRES ILUT
32     13.1s (124)      8.5s (19)        11.7s (16)
64     169.9s (381)     91.4s (20)       81.2s (16)
96     764.2s (735)     608.4s (39)      285.2s (23)
128    2292.0s (1212)   10689.1s (91)    1667.2s (65)

Table 3: CPU time and number of iterations for a ...

3.3.2 Parallelisation
For this type of problems the parallelisation by domain decomposition was selected. A non-overlapping domain decomposition strategy was used, where the domain was decomposed into disjoint subdomains separated by interfaces. The grid nodes were numbered first inside each subdomain and then on the interfaces, leading to a bordered block diagonal matrix shown in Figs. 5 and 6 [23].

Figure 6: Reordered (one-way dissection) matrix

5 processes. There was a reduction in (elapsed) time, when passing from 3 to 4 processes; this is a 22% reduction corresponding to a relative speed-up of 1.27. For 5 processes, there is a degradation of CPU and elapsed times because the farm is composed only of 4 machines, and more than 1 process will have to share the same processing element.

Processes   CPU time, Master   CPU time, Slave (max.)   Elapsed time
3           192.7s             2734.3s                  3090.4s

Table 4: CPU and elapsed time for a 64x64 grid and Master-Slave approach.

To be able to use finer coarse-grain parallelism it is necessary to reduce the CPU time spent by the master to accompany the decrease of the total time induced by the reduction of the CPU time in the slaves. Based on this need, another parallel version of the code was created, based on a SPMD strategy. Table 5 shows the CPU and elapsed times of the Master-Slave and SPMD approaches for 3 subdomains on a 64x64 grid. The SPMD approach

               CPU time   Elapsed time
Master-Slave   1780.3s    2434.5s
SPMD           1568.5s    1642.3s
D.S. Jang, R. Jetli, and S. Acharya. Comparison of the PISO, SIMPLER and SIMPLEC algorithms for the treatment of the pressure-velocity coupling in steady flow problems. Numerical Heat Transfer, 10:209-228, 1986.
[17] G. B. Deng, J. Piquet, P. Queutey, and M. Visonneau. A new fully coupled solution of the Navier-Stokes equations. International Journal for Numerical Methods in Fluids, 19:605-639, 1994.
1. SUMMARY
This paper describes work done at Rockwell Science Center on the development and application of computational fluid dynamics (CFD) solvers for unstructured grids. A description of the use of “interior boundary” conditions in simulating moving bodies is also presented.

2. INTRODUCTION
The CFD group at Rockwell Science Center has been involved over the past fifteen years in the development and application of numerical techniques for the simulation of flow past complex aerodynamic shapes. Starting with small perturbation equations, codes have been developed to solve more and more complex governing equations on structured grids (Ref. 1-4). The latest version of the structured grid code solves Reynolds Averaged Navier-Stokes (RANS) equations in generalized curvilinear coordinates. It includes the ability to simulate reacting multispecies flows (Ref. 5). Simulations requiring grid movements are handled quite elegantly using this code (Ref. 6). CFD codes developed at Rockwell Science Center have played a significant role in several national projects including the Space Shuttle, B-1B and National Aerospace Plane (NASP) projects.

Time required for performing accurate numerical simulation of complex fluid-dynamics problems is still sufficiently large to discourage designers from including CFD techniques in the design cycle. Total time required for a numerical simulation consists of the time required for

a) preprocessing, which consists of modifying the CAD geometry to a form suitable for numerical simulation, (in the case of structured grids) dividing the computational domain into zones, choosing proper grid resolution at the boundaries and finally grid generation,

b) solver,

and

c) post-processing, which consists of extracting physical quantities like skin-friction and heat-transfer, from the numerical solution; and visualization of the solution.

Several years of research in structured-grid simulations and developments in computer software and hardware technologies have considerably reduced the turnaround time for numerical solutions. Still the time required to simulate flow past complex geometries is unacceptably large. Especially, the time required for preprocessing increases almost exponentially as more and more details of the geometry are included in the simulation. For example, in the case of the multibody space shuttle configuration (Ref. 7), several months were needed to generate a structured grid when the fidelity requirements for the model employed in the numerical simulation were increased considerably.

Unstructured grid methodologies appear to be very promising, since the preprocessing time could be orders of magnitude less than that required for structured grids. It is indeed the case for inviscid flows. But, our experience with unstructured grid computations has opened our eyes to several issues involved in such simulations. We propose to discuss some of those issues in this paper.

Research on the development of unstructured grid solvers for Computational Fluid Dynamics (CFD) and Computational ElectroMagnetics (CEM) has been in progress at the Rockwell Science Center for the past several years. An unstructured grid solver for CFD, called UNIV, that can handle tetrahedral, triangular prismatic and hexahedral cells, has been developed (Ref. 8). UNIV employs a finite-element-like formulation that uses piecewise polynomial interpolation for the dependent variables. The dependent variables are the cell averages of internal energy, mass, x-, y-, and z-momenta. Interpolating polynomials may be discontinuous across cell boundaries. An approximate Riemann solver is used to resolve discontinuities at cell boundaries. The domain of a dependent variable polynomial is restricted to a cell. The discretization of the governing equations is constructed directly from the integral form of the conservation laws. No variational principle or method of weighted residuals or other indirect approach is employed. The code has the option to use either a least-square polynomial or an ENO (Essentially Non-Oscillatory) reconstruction. Reconstruction is the process of constructing an interpolating function for a cell that satisfies the cell average. Please see Ref. 9 for details on ENO schemes.

Numerical formulation employed in UNIV and a new approach for simulating bodies in relative motion are discussed in the following sections. A generalized Lax-Wendroff scheme for Euler equations adapted from CEM is also presented. A pointwise turbulence model that is highly suitable for unstructured grids is discussed. Lessons learned from our experience with unstructured grid computations are elucidated.
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
3. NUMERICAL FORMULATION
Two different approaches to solving the initial-/boundary-value problem (IBVP) for a general hyperbolic system of conservation laws in the “conservation-law form” represented by

$$\frac{\partial q}{\partial t} + \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} = 0 \qquad (1)$$

have been developed. Equation (1) is satisfied at all (x, y, z) belonging to domain D with prescribed initial and boundary values for the dependent (conserved) variable vector q. Here, the Cartesian coordinate directions (independent variables) are x, y, and z. The components of the flux tensor in the three coordinate directions are the vectors f_1, f_2 and f_3. In both approaches the domain D is divided into several cells, and the integral form of the conservation equations in each cell given by

$$\frac{d}{dt}\left(\bar{q}\,V\right) + \oint_S F \cdot \hat{n}\; dS = 0 \qquad (2)$$

is solved with prescribed initial values for $\bar{q}$ and relevant boundary conditions. Here, $\bar{q}$ denotes the cell average of the dependent variables; $\hat{n} = n_x \hat{j} + n_y \hat{k} + n_z \hat{l}$ is the outward unit normal at any point on the boundary surface S of a cell; $\hat{j}$, $\hat{k}$ and $\hat{l}$ are the unit vectors in x, y and z directions respectively; V is the cell volume and F is the tensor of fluxes with (f_1, f_2, f_3) as components. Stated in words, Eqn. 2 implies that the rate of increase of a conserved quantity ($\bar{q}V$) inside a cell is given by the net inflow (flux) of that quantity into the cell. Therefore, as in the case of cell-centered finite-volume structured grid solvers (Ref. 3), solving the governing equations requires evaluation of surface integrals from known values of cell averages.

Surface integrals are evaluated using numerical quadrature formulas. In this method an integral is written as the weighted sum of the integrand evaluated at the quadrature points. The location and weights of quadrature points are so chosen as to give the best possible approximation for the integral. Higher order schemes require a larger number of quadrature points. Choosing the centroid of a surface as the quadrature point yields second order accuracy. Since only the cell-averages of the dependent variables are known, we need to develop a procedure for evaluating the dependent variables at the quadrature points in order to compute the surface integrals (fluxes). The spatial accuracy of the numerical scheme is determined by the accuracy of this “reconstruction” procedure. The dependent variable vector q at a quadrature point may not be uniquely specified, since the point belongs to two neighboring cells with different polynomial representations. If the two vectors evaluated at a quadrature point using the polynomial reconstruction in the two “containing” cells are q_L and q_R (Fig. 1), then a unique value q* is determined from the solution of a locally one-dimensional Riemann problem with q_L and q_R as the “left” and “right” values. An approximate Riemann solver suggested by Roe (Ref. 10) is employed for this purpose in the UNIVERSE-series of codes of which UNIV is a member.

Fig. 1 “Left” and “Right” states for locally one-dimensional Riemann problem.

The two approaches alluded to at the beginning of this section differ in their “reconstruction” procedure and also in the time-stepping scheme. Only explicit time-stepping schemes are considered in both approaches. Both approaches permit use of multiple quadrature points and curved surfaces for higher accuracy. Codes developed using these approaches can handle hexahedral, tetrahedral and triangular-prismatic cells.

3.1 The First Approach; a Finite-element Like Algorithm (1)

(1) The major credit for this work goes to Dr. Chakravarthy.

This approach employs a unified treatment for structured and unstructured grids. The codes developed using this formulation are called UNIVERSE-series of codes. The UNIVERSE-series includes “least-square” and “ENO” (Ref. 9) reconstruction options. Both these procedures involve development of an interpolating polynomial P_C(x,y,z) for each of the conserved quantities, where

$$P_C(x,y,z) = \sum_{i=0}^{n_p} p_i^C \; x^{j(i)}\, y^{k(i)}\, z^{l(i)} \qquad (3)$$

where p_i^C are the coefficients of the polynomial. P_C is applicable only within a given cell C. The integral of P_C over C reproduces the corresponding cell average. That is,

$$\frac{1}{V_C}\int_C P_C \; dV = \bar{q}_C \qquad (4)$$

where,

[Equation (5) is not legible in the source.]

The spatial accuracy of the numerical scheme is determined by the form of P. A linear polynomial in x, y and z results in second-order accuracy while a quadratic polynomial yields a third-order scheme. A linear polynomial requires evaluation of 4 coefficients, while the quadratic polynomial requires 10. In the case of the “least-square” option, the polynomial coefficients are computed such that the integral of P_C over cell C reproduces the corresponding cell average values (Eqn. 4), and the integrals over the neighboring cells satisfy the corresponding cell averages in a least-square sense. That is,
$$\frac{\partial E}{\partial p_i^C} = 0 \qquad (6)$$

for 0 ≤ i ≤ np. The error term E is given by,

[Equation (7) is not legible in the source.]

Therefore, the neighborhood of a cell should be properly defined to satisfy equation (8). The UNIVERSE-series CFD formulation defines a “neighbor” of a given cell in a very flexible and useful way.

(2) Proximity neighbors (PN)
This latter type is defined in terms of distance from a given cell.

A neighborhood is now defined to be a collection of neighboring cells. A neighborhood hierarchy is defined as follows:

This process may be continued recursively, and depending on the order of P_C, a neighborhood may be found such that equation (8) is satisfied.
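The “least-square” option can be illustrated with a deliberately reduced sketch, which is an assumption-laden simplification and not the UNIV implementation. It works in two dimensions with a linear polynomial, and it treats each cell average as the value at the cell centroid, which is exact for a linear function. The constraint for cell C is then satisfied identically, and the two gradient coefficients minimise the squared mismatch with the neighbor averages. The function names and the sample neighbor set are hypothetical.

```python
def least_squares_gradient(center, q_center, neighbors):
    """Gradient (gx, gy) of P(x, y) = q_center + gx*(x - xc) + gy*(y - yc).

    `neighbors` is a list of ((x, y), q_mean) pairs. The cell mean of cell C
    is reproduced exactly; neighbor means are matched in a least-squares sense.
    """
    xc, yc = center
    sxx = sxy = syy = bx = by = 0.0
    for (x, y), q in neighbors:
        dx, dy, dq = x - xc, y - yc, q - q_center
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        bx += dx * dq
        by += dy * dq
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-14:
        raise ValueError("neighbor centroids are collinear; enlarge the neighborhood")
    # Solve the 2x2 normal equations by Cramer's rule.
    gx = (syy * bx - sxy * by) / det
    gy = (sxx * by - sxy * bx) / det
    return gx, gy


def reconstruct(center, q_center, gradient, point):
    """Evaluate the reconstructed polynomial at a quadrature point."""
    return (q_center
            + gradient[0] * (point[0] - center[0])
            + gradient[1] * (point[1] - center[1]))


# Hypothetical data sampled from the linear field q = 1 + 2x - 3y.
field = lambda x, y: 1.0 + 2.0 * x - 3.0 * y
center = (0.1, 0.2)
centroids = [(1.0, 0.3), (-0.8, 0.1), (0.2, 1.1), (0.0, -0.9)]
neighbors = [(c, field(*c)) for c in centroids]
grad = least_squares_gradient(center, field(*center), neighbors)
print(grad)
```

The singular-matrix check mirrors the remark in the text that the neighborhood must be chosen so that the system is solvable: with too few, or badly placed, neighbors the coefficients are not determined.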
First, we consider two types of cell connectivities (Fig. 2):

(1) Node-aligned cells (NAC)

(2) Surface-aligned cells (SAC)

[Fig. 2 panel label: Node Aligned Cells (NAC)]

In the case of ENO (Essentially Non-Oscillatory) reconstruction, we seek to obtain a “best” polynomial rather than a “least-squares” one. The “best” polynomial corresponds to the “smoothest”. As always, the equation for cell C must be satisfied (Eqn. 4). From the remaining nc equations, we can select any combination of np equations and solve the resulting set of np + 1 equations. There are

$$\binom{n_c}{n_p} \qquad (9)$$

such combinations. The combination that yields the best polynomial in terms of its ENO property is to be preferred. For example, when the flow field contains a single shock wave, the neighbors selected should lie on the same side of the shock as cell C. This approach may be termed the “best stencil” formulation and has been applied very successfully in various forms to structured grid ENO formulations. Reference 9 contains many different strategies for this task. Note that the “least squares” strategy may result in a stencil that includes cells from both sides of a discontinuity and is hence not desirable.
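A one-dimensional sketch may help to make the “best stencil” idea concrete. It is an illustration under stated assumptions, not one of the strategies of Reference 9: the grid is uniform, the polynomial is linear, there are only two candidate stencils per cell, and “smoothest” is taken to mean the smallest slope magnitude. The function name `eno_slope` and the step data are hypothetical.

```python
from math import comb


def eno_slope(q, i, h=1.0):
    """Pick the smoother of the two one-sided slopes for cell i.

    Returns the chosen stencil side and its slope. Cells 0 and len(q)-1
    are not handled in this sketch.
    """
    left = (q[i] - q[i - 1]) / h
    right = (q[i + 1] - q[i]) / h
    if abs(left) <= abs(right):
        return "left", left
    return "right", right


# Hypothetical cell averages with a jump between cells 3 and 4.
q = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(eno_slope(q, 3))  # stencil stays on the left of the jump
print(eno_slope(q, 4))  # stencil stays on the right of the jump

# Number of candidate stencils in the general case: choose np of the nc
# neighbor equations (the values nc = 6, np = 3 are only an example).
print(comb(6, 3))
```

In both cells adjacent to the jump the selected stencil avoids crossing it, which is the behaviour the text asks for near a shock. The combinatorial count also shows why exhaustive stencil searches become expensive as nc grows.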
(1b) Common-face neighbor (CFN)

(1c) Touching-face neighbor (TFN)

where a > 1. In other words, p_i^C is selected such that the corresponding derivative at the centroid does not differ “too much” from its value in a neighborhood. This procedure attempts to construct a reconstruction
In the above, the explicit dependence of RHS on t is useful for time-dependent problems where the boundary conditions or other behavior explicitly depend on time.

4. BOUNDARY CONDITIONS
The implementation of boundary conditions ensures consistency in flux computations. That is, just as for any interior cell boundary, computation of fluxes for a cell boundary that lies on the boundary of the computational domain involves determination of “left” and “right” states and Roe’s approximate Riemann solver. The state that corresponds to the “outside” of the domain should satisfy the appropriate boundary conditions. For instance, when computing fluxes for a cell on the left boundary of the domain where the inviscid tangency condition is to be satisfied, the “left” state should be such that the corresponding velocity vector is tangential to the surface. This manner of imposing boundary conditions ensures that only the information at a boundary that corresponds to waves propagating into the computational domain is actually used in the computation of fluxes.
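As a minimal sketch of the tangency example in Section 4, the “outside” velocity can be obtained by removing the wall-normal component of the interior velocity. This is an assumption about one simple way to build such a state, not a description of what UNIV does; the function name `tangential_state` is hypothetical, and only the velocity is treated (density and pressure would be copied or extrapolated in a full boundary state).

```python
def tangential_state(velocity, normal):
    """Project a velocity vector onto the plane tangent to the wall.

    `normal` need not be of unit length; it is normalised here.
    """
    length = sum(c * c for c in normal) ** 0.5
    if length == 0.0:
        raise ValueError("wall normal must be non-zero")
    n = tuple(c / length for c in normal)
    v_n = sum(v * c for v, c in zip(velocity, n))  # normal velocity component
    return tuple(v - v_n * c for v, c in zip(velocity, n))


v_out = tangential_state((3.0, 4.0, 0.0), (0.0, 2.0, 0.0))
print(v_out)
```

The resulting state and the interior state are then handed to the Riemann solver exactly as for an interior face, which is the consistency property the section emphasises.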
boundary of the computational grid is referred to as “interior boundary conditions.” In this case the user specifies, among several attributes, the coordinate location of each boundary point as well as a vector normal associated with the point. The need for the normal arises from the fact that even though interior boundary points are specifiable as individual points, they arise from boundary surfaces that they are a part of. It is the surface normal along with its location that describes the local geometry. Note that the surface in question could very well be a surface of discontinuity (a shock wave), and it may not be possible to assign unique values for the dependent variables at the corresponding boundary point. To account for such a situation, for every interior boundary location identified by the user, two interior boundary points are created and added to the data base of the UNIV flow solver. One of the added points has the normal pointing one way and the second point the other way (Fig. 3).

associate with each (pair of) interior boundary point the cell that contains it. This chore of searching through the mesh to determine the one cell that contains the boundary point is efficiently accomplished in the UNIV flow solver using an “octree” sort and search procedure. Given an interior boundary point, an octree search of the sorted list of node points of the mesh quickly yields the nearest mesh node. All cells that contain the node as well as the common-node neighbors of this set of cells are searched, in that order, to determine if the given point is in any of those cells. If not, the “nearest” cell is identified.

In the previous paragraphs, it was convenient to describe the procedure as if the user provides pointwise information related to interior boundaries. Depending on the relative fineness or coarseness of the geometry description of the interior surface with respect to the surrounding mesh, there may be two or more user-specified (before the flow solver replaces each user-specified point with two points, with the normals facing in opposite directions) interior boundary points in a cell, or there may be none (Fig. 4). In Fig. 4 the cells 1, 5, 8 and 10 have two or more interior boundary points while cells 4, 7 and 9 have none. The case of multiple interior points in a cell can be dealt with easily (e.g., by replacing them with an equivalent single point, if necessary). But the case of no interior point in a cell that actually straddles the interior boundary is not acceptable. To avoid such problems, we start with the user describing the interior surface as an unstructured grid (triangular elements). Using an octree-based sort and search procedure, the intersection of the mesh with this surface is identified (Fig. 5). Interior boundary points are assigned to each such intersection. There could be interior surface geometry elements that do not participate in such intersections. The centroids of these elements are optionally added to the list of interior boundary conditions.
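The cell-location procedure described above (nearest node, then the cells around it, then their common-node neighbors, then a “nearest” fallback) can be sketched as follows. Several things here are assumptions made for brevity: the mesh is two-dimensional and triangular, a brute-force scan stands in for the octree search, and “nearest” is interpreted as nearest centroid among the cells already searched. None of the function names come from UNIV.

```python
def nearest_node(nodes, p):
    """Index of the mesh node closest to p (brute force; stands in for the octree)."""
    return min(range(len(nodes)),
               key=lambda i: (nodes[i][0] - p[0]) ** 2 + (nodes[i][1] - p[1]) ** 2)


def in_triangle(p, a, b, c, tol=1e-12):
    """True if p lies inside or on the triangle (a, b, c), for either orientation."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (min(d) < -tol and max(d) > tol)


def locate(nodes, cells, p):
    """Return (cell index, found flag) for the point p."""
    k = nearest_node(nodes, p)
    first = [c for c, cell in enumerate(cells) if k in cell]
    ring = {n for c in first for n in cells[c]}
    second = [c for c, cell in enumerate(cells)
              if c not in first and ring & set(cell)]
    candidates = first + second  # searched in that order
    for c in candidates:
        a, b, d = (nodes[n] for n in cells[c])
        if in_triangle(p, a, b, d):
            return c, True

    def centroid_dist2(c):
        xs = [nodes[n][0] for n in cells[c]]
        ys = [nodes[n][1] for n in cells[c]]
        return (sum(xs) / 3.0 - p[0]) ** 2 + (sum(ys) / 3.0 - p[1]) ** 2

    return min(candidates, key=centroid_dist2), False


# Hypothetical mesh: a unit square split into four triangles around its centre.
nodes = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
cells = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
print(locate(nodes, cells, (0.5, 0.1)))
print(locate(nodes, cells, (0.9, 0.5)))
print(locate(nodes, cells, (2.0, 0.5)))
```

Restricting the containment tests to the cells around the nearest node is what keeps the search cheap; the octree only has to answer the nearest-node query.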
points.

5. GRID GENERATION
The UNIVERSE-series of codes includes an unstructured grid generator, named UNIVG. UNIVG accepts specification of surface geometry in the form of a collection of patches. A patch geometry could be specified either in the IGES format or by specifying a sufficient number of non-intersecting lines on the patch. Each line in turn is discretized by an ordered collection of a sufficient number of points. Triangular elements are first generated on the boundary of the computational domain satisfying user specified clustering requirements. The computational domain is then discretized in the form of tetrahedral cells using the “advancing front” technique (Ref. 8).

7. TURBULENCE MODELING
Until recently all the turbulence models employed in numerical simulations required the knowledge of the normal distances of a point from surrounding walls. This information is very difficult to obtain in the case of unstructured grids. In the case of structured grids, mostly distances along grid lines were employed. This was sufficient since the grid lines were nearly orthogonal in the vicinity of a body where viscous effects are dominant. But when complex geometries requiring a multizone grid topology were encountered, it became difficult to maintain continuity of eddy viscosity at zonal interfaces. To circumvent this problem, a pointwise turbulence model that does not require any information regarding the distance of a point from surrounding walls was developed at Rockwell Science Center by Goldberg and Ramakrishnan (Ref. 14). Since then, several such models have been developed, and reliable computation of turbulent flows on structured grids has become a possibility.
preprocessing takes an unacceptably long time. Unstructured Euler solvers offer the most viable solution. Since Euler equations, unlike Navier-Stokes equations, do not require very fine grids in the vicinity of solid bodies, unstructured grid development becomes much easier to handle, and several solutions for many different configurations can be carried out in a matter of a few weeks. This was indeed demonstrated in the case of some modifications that were carried out for the B-1B bomber. Starting with the geometry of the aircraft in IGES format, an Euler solution was obtained for this complex configuration (Fig. 6) in about five working days. With the use of Massively Parallel Processing (MPP) computers, this process may be accelerated even more. From this point of view, unstructured grid solvers have a clear edge over their structured grid counterparts.

used to demand about 200. This situation has been vastly improved, and the storage requirement has been brought down to a manageable 60 words per conservation cell.

The concept of “interior” boundary conditions described in section 4.1 is very promising. It was used successfully in computing the trajectory of a store released from an F-18. This concept also proved its usefulness in analyzing the effect of mounting additional equipment on an aircraft. In this case, the grid and solution from an earlier computation could be used along with the geometry of the added equipment to obtain the required information in a timely manner. The present implementation of this concept has some shortcomings. To minimize the number of arithmetic operations, several approximations were introduced. Instead of computing the exact contribution of each face for updating the interior boundary points, some simple recipes were employed. This results in communication between the cells that lie on either side of a solid object. That is, the interior boundary point pairs 1 and 2 in Fig. 3 interact, resulting in an erroneous interaction between the inside and outside of the body. It appears that shortcuts may not work, and it may be necessary to consider the exact geometry of the intersecting surfaces when interior boundary conditions are encountered. Since this process is very involved, it may not be acceptable for many problems. Alternative solutions are currently being investigated.
[4] S. Palaniswamy, S.R. Chakravarthy and D.K. Ota, “Finite-Rate Chemistry for USA-Series Codes: Formulation and Applications”, AIAA Paper No. 89-0200, 1989.

[5] Rockwell Science Center CFD Department, “UNIS User Manual”, Version 94.11, November 1994.

[6] S.V. Ramakrishnan, C.L. Chen, S.R. Chakravarthy and K.Y. Szema, “Numerical Simulation of Two Opposing High Speed Trains in a Tunnel”, AIAA Paper No. 95-0746, 1995.

[7] D.F. Dominik et al., “Navier-Stokes Solution for the Space Shuttle Vehicle using High Fidelity Full Scale Grid Model”, AIAA-93-0419, Jan. 1993.

[8] S.R. Chakravarthy, et al., “Computational Fluid Dynamics Capability for Internally Carried Store Separation”, Technical Report (Phase III), Contract No. N6053&90-C-0393, Naval Air Warfare Center, Weapons Division, China Lake, CA 93555-6001, May 1994.

[9] A. Harten and S.R. Chakravarthy, “Multi-Dimensional ENO Schemes for General Geometry”, UCLA Computational and Applied Mathematics (CAM) Report 91-16.

[11] C. Rowell, V.V. Shankar, W.F. Hall, A.H. Mohammadian, “Algorithmic Aspects and Computing Trends in Computational Electromagnetics Using Massively Parallel Architectures”, Proceedings of First International Conference on Algorithms and Architectures for Parallel Processing, Brisbane, Australia, 19-21 April, 1995, Volume I, editor V. L. Narasimhan.

[12] S.R. Chakravarthy, K.Y. Szema and S.V. Ramakrishnan, “Unification of Exterior and Interior Boundary Conditions for Inviscid Computational Fluid Dynamics”, presented at the 5th International Symposium on Computational Fluid Dynamics, Sendai International Center, Japan, Aug. 31-Sep. 3, 1993.

[13] C.L. Chen, K.Y. Szema and S.R. Chakravarthy, “Optimization on Unstructured Grid”, AIAA Paper No. 95-0217, 1995.

[14] U.C. Goldberg and S.V. Ramakrishnan, “A Pointwise Version of Baldwin-Barth Turbulence Model”, International Journal of Computational Fluid Dynamics, Vol. 1, Dec. 1993.
M. Delanaye *
Ph. Geuzaine †
J.A. Essers ‡
P. Rogiest §
function, but not a quadratic function. This means that the dominant term of the truncation error involves a second order derivative and can be written:

[The equations at this point, including the quadrature formula (6) with its sum starting at j = 1, are not legible in the source.]

where (x_j, y_j) are the coordinates of the Gauss quadrature point j, and w_j denotes the weight associated with this point.

By using n quadrature points, the formula (6) allows the exact integration of polynomials with degree 2n - 1 at most. To meet the second-order accuracy requirement described in section 5.1, at least two quadrature points are needed in order to compute exactly the flux integral of a quadratic polynomial of the Cartesian coordinates. Schemes employing linear reconstruction only necessitate one quadrature point (located at the mid-point of the edge), but they are usually first-order accurate only, as already mentioned above. Essers et al. however proved that a one quadrature point integration can produce a full second-order scheme even for very irregular meshes. This accuracy can only be recovered by applying a non-conservative correction to the scheme, which definitely constitutes a drawback with respect to the present method.

A Riemann solver such as Roe's flux difference splitting or Van Leer's flux vector splitting is employed to compute the upwind numerical flux at each quadrature point.

5.2 Reconstruction phase

with:

$$\Delta u = \begin{bmatrix} u_1 - u_0 \\ \vdots \\ u_{N_a} - u_0 \end{bmatrix}$$

N_a is the number of neighbors of R, i.e. the cells connected to R by at least a common edge or a common vertex. D_1 is a 2 x N_a matrix with constant coefficients.

5.2.3 Extension of the linear to the quadratic reconstruction
By using a Taylor series expansion of u around node 0 in (9), the truncation error E corresponding to formula (9) can be expressed as:

$$E = E_r \, \nabla^2 u_0 \qquad (10)$$

with:

$$\nabla^2 u_0 = \begin{bmatrix} \partial^2_{xx} u_0 \\ \partial^2_{yy} u_0 \\ \partial^2_{xy} u_0 \end{bmatrix}$$

Note that E_r is a 2 x 3 matrix containing constant coefficients of O(h). For arbitrary meshes, second-order accuracy is nevertheless recovered by subtracting E from the right-hand side of (9):

$$\nabla u_0 = D_1 \, \Delta u - E_r \, \nabla^2 u_0 \qquad (11)$$
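The statement that n Gauss quadrature points integrate polynomials of degree up to 2n - 1 exactly can be checked numerically along a single edge. The sketch below is generic Gauss-Legendre quadrature on an interval and is not taken from the paper; the rule table and the function name `gauss_integrate` are assumptions for illustration.

```python
from math import sqrt

# Gauss-Legendre nodes and weights on the reference interval [-1, 1].
GAUSS_RULES = {
    1: ((0.0,), (2.0,)),
    2: ((-1.0 / sqrt(3.0), 1.0 / sqrt(3.0)), (1.0, 1.0)),
}


def gauss_integrate(f, a, b, n):
    """Integrate f over [a, b] with the n-point Gauss-Legendre rule (n = 1 or 2)."""
    nodes, weights = GAUSS_RULES[n]
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    return half * sum(w * f(mid + half * x) for x, w in zip(nodes, weights))


# One point (the mid-edge rule) is exact for linear integrands only.
print(gauss_integrate(lambda x: x ** 2, 0.0, 1.0, 1))   # 0.25, exact value is 1/3
# Two points are exact up to degree 3, so quadratic fluxes are integrated exactly.
print(gauss_integrate(lambda x: x ** 2, 0.0, 1.0, 2))
print(gauss_integrate(lambda x: x ** 3, 0.0, 2.0, 2))
# Degree 4 exceeds 2n - 1 = 3, and the two-point rule is no longer exact.
print(gauss_integrate(lambda x: x ** 4, -1.0, 1.0, 2))  # exact value is 0.4
```

This is why a quadratic reconstruction needs two points per edge, while the mid-edge rule is enough only when the integrand along the edge is linear.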
This second-order numerical gradient does indeed depend on some sufficiently accurate (first-order at least) second-order derivatives. By replacing (11) in (5), we obtain a quadratic reconstruction for which the only unavailable coefficients are the second-order derivatives of u:

[Equation (12), which splits the reconstruction into a linear part and a quadratic part, is not legible in the source.]

The second order derivatives are computed by a technique sometimes referred to as the minimum-energy reconstruction 5,26. It simply consists in fitting the cell quadratic polynomial to the values of the neighboring nodes. The following functional is minimized with respect to ∇²u_0:

[Equation (13) is not legible in the source.]

$$(A_2 - A_1 E_r)\, \nabla^2 u_0 = (I - A_1 D_1)\, \Delta u \qquad (14)$$

with D_2 a 3 x N_a matrix with constant coefficients:

$$D_2 = \left[ (A_2 - A_1 E_r)^T (A_2 - A_1 E_r) \right]^{-1} \cdots$$

ways: either by selecting another stencil for the reconstruction which does not involve the discontinuity (ENO schemes), or by modifying the reconstruction within the same stencil (TVD schemes 23).

The design of multidimensional limiters has been introduced by Barth and Jespersen. However, as shown by Venkatakrishnan 28, such limiters may severely hamper the convergence to the steady state. This problem is still more dramatic when employing implicit schemes with large CFL numbers. Venkatakrishnan 28 proposed some modifications to the limiter, and obtained convergence at the price of the evaluation of an additional constant.

We employ another approach by using the rather old idea of hybrid schemes 25, however applied to the reconstruction. The quadratic reconstruction is switched to a monotone constant reconstruction in the vicinity of discontinuities, while in “smooth flow regions” it remains unaltered.

[Equation (16), a sum over the neighbors i = 1, ..., N_a, is not legible in the source.]

u is the pressure and/or the velocity norm. The complete form of the term [symbol illegible], which acts as a filter term, is given in reference [number illegible]. The quadratic reconstruction (5) is finally modified as follows:
at one quadrature point located at the mid-edge, which requires the evaluation of the gradients of the primitive variables at the nodes. As pointed out in section 5.2.3, these gradients have been previously computed with a second-order accuracy during the quadratic reconstruction phase. They are obtained at the quadrature point by using a linear interpolation between the left (L) and the right (R) neighbors of the edge. Strictly speaking, that procedure is only valid if the mid-edge point lies on the line joining the left and right neighbors. If it does not, the following modified interpolation formula has to be used:

[The interpolation formula is not legible in the source.]

where P is the quadrature point, Q the projection of P on

where J(Q) (expression not legible in the source) is the Jacobian.

As pointed out by Kuffer 30, deciding when the Newton loop has to be stopped is not easy. A large residual decrease is not always required, since it necessitates many inner iterations and then costs a lot of computational time. Except for unsteady flow computations, for which equation (20) must be solved accurately, many authors usually limit the number of inner iterations to one (n = 0). The resulting descent direction is in fact usually accurate enough to decrease the residual satisfactorily. As the time step increases to infinity, the iterative time-marching scheme tends to a Newton-Raphson linearization of the steady state equations. Restricted to one inner loop iteration, the iterative process (21) becomes:
An analysis of the convergence of what is referred to as the inexact Newton/finite-difference projection methods is given by Brown 20. The interesting feature of equation (27) is that the calculation and the storage of the Jacobian are not required. Indeed, the computation of the jacobians of the advective and diffusive fluxes may be very complicated, and the exact jacobian of Roe's flux difference splitting is very expensive to compute. Furthermore, the introduction of turbulence modelling in the frame of future developments will also lead to difficulties for deriving jacobians. The stencil of the quadratic reconstruction usually involves an average of 9 to 13 cells. Therefore, the required storage would amount to 144 to 208 words per cell, which is quite expensive.

A proper choice of the parameter ε in (27) is given by the analysis of Dennis and Schnabel:

$$\varepsilon \, \| p \| = \sqrt{\eta}$$

where η is the machine zero and ‖·‖ represents the RMS norm.

In most of the results presented in this paper, Roe's flux difference splitting is employed. It is quite a complex and expensive task to derive analytically even an approximate form of the jacobian of the latter scheme. One alternative is to use the easily available jacobian of Van Leer's scheme in the preconditioner, which costs 2 to 3 times less computational time than the jacobian of Roe's scheme. A comparison between both preconditioners is addressed in the section devoted to the presentation of the results.

It should be mentioned that up to now the contribution of the viscous flux jacobian is not introduced in the preconditioner.

7.5 Time step increment control
As explained in the previous section, Newton's method is implemented in a time-stepping form. The evolution in time is monitored by the time step. During the time-marching, the time step is increased to infinity in order to ultimately achieve Newton's quadratic convergence. Like many authors, this
is performed by employing an empirical formula in which the number of points involved in the mesh. The edge data struc-
C F L number varies according to the inverse of a residual ture employed in the code and the relatively insensitivity of
norm: the accuracy of the numerical scheme to grid distortions al-
low the use of very general polygonal cells, and as a result
of somewhat distorted meshes. We developed a very general
adaptation strategy based on mesh enrichment and coarsening.
The method is based on an error indicator of the form (16).
There indeed subsists two different parameters to tune in order
Cells whose error indicator lies above a preset threshold are
to optimize the convergence rate : the initial C F L number
candidates for refinement, while others whose error indicator
and the exponent p . Typical values of the latter parameters
lies under another preset value are to be possibly coarsened.
are : C F L o = 10 and p = 0.5.
The refinement strategy, which is implemented for any type
8. BOUNDARY CONDITIONS of polygons is described in reference ( ’). In particular, trian-
gles and quadrangles can be divided anisotropically depending
The treatment of the boundary conditions has a strong influ-
on the value of an anisotropy sensor based on some standard
ence on the convergence of an implicit scheme. For inviscid
deviations of the gradients of a flow parameter computed in
flow computations, we use a very convenient procedure, which consists in imposing the boundary conditions in a weak manner via the modification of the advective flux through the boundary edges. Hence, according to the boundary type, some of the flow variables are imposed at the quadrature points of the edges, and others are computed from their values at interior nodes using extrapolation formulas similar to those used to evaluate left and right values at the quadrature points of inner edges. For viscous flow computations, inlet and outlet boundaries are treated in the same way as inviscid boundary conditions. At the solid walls, the viscous flux is modified in order to impose the no-slip boundary condition.

That method, however, generally turns out to be too weak to satisfy the no-slip condition correctly. Two additional procedures have been tested. The first introduces dummy nodes in the stencil of the boundary cells. These dummy nodes are located at the mid-point of boundary edges. The flow variables at these nodes are extrapolated or imposed by the no-slip boundary condition before each evaluation of the flow derivatives. That method has been implemented in a fully implicit manner and successfully used for the flat plate boundary layer computation. Unfortunately, the result is not so good for more complex flow computations. For these flows, we have tested another procedure. The boundary nodes are no longer located at the cell gravity center, but at the mid-point of the boundary edge, and the no-slip boundary condition is imposed in its strong form at each Newton iteration.

Finally, notice that it is essential to include the contribution of the boundary conditions in the preconditioner. The Jacobian of the modified boundary advective flux is calculated analytically for most of the boundary conditions except for the subsonic inlet. For the latter, it is derived from a finite difference formula similar to equation (27).

the directions pointing to the different neighbors of the cell.

Two types of coarsening procedures are considered. The first one is based on the refinement history. A tree containing the information between successive meshes is updated during the refinements. It is then rather easy to delete "son" cells and to recover the "parent". The second method is more general and coarsens the grid by deleting vertices and recombining others to build larger polygons.

10. RESULTS

10.1 Subsonic sine-bump
The effect of the various reconstructions (quadratic - linear - constant) has first been tested by computing the inviscid subsonic flow ($M_\infty = 0.5$) in a channel perturbed by a sine bump with a mesh of 1294 cells (fig. 2a). The geometry is defined as follows:

Lower wall:
  $-0.7 < x < 0$ : $y = 0$
  $0 < x < 1$ : $y = 0.05\,[1 + \sin(2\pi x - \pi/2)]$
  $1 < x < 1.7$ : $y = 0$
Upper wall:
  $-0.7 < x < 1.7$ : $y = 0.7$

Roe's scheme is employed as the Riemann solver. The solutions have been computed with an infinite CFL number and a maximum number of GMRES iterations equal to 60, with a restart every 30 iterations. Figures 2c and 2d show the evolution of the Mach number and the total pressure on the lower wall. The quadratic reconstruction clearly appears to lead to the lowest spurious entropy generation (fig. 2d). Hence, it predicts the highest peak Mach number: 0.835. For the sake of comparison, the peak values calculated with the linear and the constant reconstructions are equal to 0.804 and 0.754, respectively. Compared with the other reconstructions (results not shown), the symmetry of the solution obtained with the quadratic scheme is almost perfect, as can be seen from the iso-Mach lines pattern (fig. 2b).
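The channel geometry of section 10.1 can be written as a small function. This is only a sketch: the phase inside the sine is illegible in the source and is assumed here to be π/2, the value for which the bump meets y = 0 at both x = 0 and x = 1. The names `lower_wall` and `UPPER_WALL` are hypothetical.

```python
import math

UPPER_WALL = 0.7  # height of the flat upper wall, -0.7 < x < 1.7


def lower_wall(x: float) -> float:
    """Height of the lower channel wall for -0.7 <= x <= 1.7."""
    if not -0.7 <= x <= 1.7:
        raise ValueError("x lies outside the channel")
    if 0.0 < x < 1.0:
        # Assumed phase pi/2: equivalent to 0.05 * (1 - cos(2*pi*x)).
        return 0.05 * (1.0 + math.sin(2.0 * math.pi * x - math.pi / 2.0))
    return 0.0  # flat wall upstream and downstream of the bump
```

Under this assumption the bump peaks at mid-length, so the channel height there is reduced from 0.7 to 0.6.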
and number of Newton iterations (fig. 2f). The quadratic reconstruction however takes about 25 % more CPU time than the linear reconstruction scheme to achieve the same residual decay.

10.2 Subcritical NACA0012 airfoil
The second test case again illustrates the accuracy gain obtained with the quadratic reconstruction scheme. The inviscid subsonic flow over the NACA0012 airfoil has been computed at a freestream Mach number of 0.63 and an incidence of 2 deg. The mesh contains 4537 cells (fig. 3a). The far-field boundary is located 20 chords away from the airfoil. The starting solution is the uniform flow. The solutions computed with the various reconstructions behave similarly on the lower wall (fig. 3b). However, larger discrepancies occur at the upper wall due to the strong flow acceleration. The highest peak Mach number (0.981) is again obtained with the quadratic reconstruction and agrees very well with the value computed by Paillere (0.983) [36] on the same mesh with a fluctuation splitting scheme. Figure 3d shows the evolution of the total pressure along the wall. Notice that the level of spurious entropy generated by the quadratic reconstruction is very low. The lift coefficient $C_l = 0.323$ compares well with the value computed by Paillere [36] ($C_l = 0.322$), and the purely numerical pressure drag coefficient is found to be very low, $C_d = 0.00034$. The lift coefficient is however slightly lower than the exact one predicted by a full potential method ($C_l = 0.334$). That difference can be explained by the fact that no vortex correction is imposed at the far-field boundary condition [36]. The error between the present value and the exact one is equal to 3.4 %. According to the work of Thomas and Salas [37], a computation with a far-field boundary at about 20 chords and with no vortex correction should underpredict the lift coefficient by about 4 %.

The influence of the exponent (p) of the CFL update formula (29) on the convergence has been tested (fig. 3e and 3f). The code diverges when the computation is initiated with an infinite CFL number. The convergence history obtained when GMRES is replaced by an SOR iterative solver is provided in fig. 3f and 3e. Figure 3f clearly shows that Newton's quadratic convergence is never reached with the SOR strategy. Nevertheless, this strategy turns out to be competitive in terms of the computational cost (fig. 3e).

migrate to their right location. During that phase, the residual actually stagnates. That prevents the CFL number from increasing to infinity, and therefore dramatically slows down the convergence. We use the following two remedies: the starting solution is obtained with a cheap low-order scheme, and we use a grid sequencing strategy with mesh adaptation. The initial mesh contains 1420 rectangular cells (fig. 4a). After three adaptations, the final grid (fig. 4b) involves a lower number of cells (1296), which are very general polygons. The total computational time (not shown here) amounts to 400 CPU sec. on an HP9000/730 workstation (infinite CFL number). Fig. 4c shows the points where the detector is automatically activated. In order to avoid endless switching of the latter, it is frozen after 5 Newton iterations. As can be seen in fig. 4d and 4e, a very crisp shock is captured. Different convergence histories for computations performed on the initial mesh are presented in fig. 4g and 4h. The fastest convergence is again obtained with an infinite CFL number. Those figures also show that an exponent p equal to 2 yields a similar convergence history. For the sake of comparison, the GMRES algorithm appears to be about 3 times faster than the SOR scheme in terms of computational time.

10.4 Inviscid hypersonic flow over a double-ellipse
We now consider the inviscid flow over the double-ellipse test case proposed in the workshop of Antibes [38] at 30 deg. angle of attack and a Mach number of 8.15. The initial mesh of 2412 triangles (fig. 5a) is adapted three times (9527 cells, fig. 5b). The iso-Mach lines pattern is presented in fig. 5c. Next to it, fig. 5d shows the nodes where discontinuities are automatically detected. Convergence histories are presented for the computation on the final adapted mesh. Notice in fig. 5e the dramatic convergence improvement obtained with the implicit scheme with respect to a 4-step explicit Runge-Kutta scheme. Fig. 5h, 5g and 5i respectively give the evolution of the Mach number, the pressure coefficient and the total pressure along the windward and leeward sides. Our results are compared with those obtained by Gustafsson et al. and Khalfallah et al. published in the workshop proceedings [38]. The pressure coefficient and the Mach number agree with the results of the latter authors. Notice the fair agreement between the computed total pressure and the exact one, which can be obtained from normal shock theory (less than 0.02 % error on the leeward side).
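The exact total pressure referred to in section 10.4 follows from the classical normal-shock relation for the stagnation-pressure ratio. The sketch below assumes a calorically perfect gas with γ = 1.4 (the ratio of specific heats is not stated in this section), and the function name is hypothetical.

```python
def total_pressure_ratio(mach: float, gamma: float = 1.4) -> float:
    """Stagnation-pressure ratio p02/p01 across a normal shock (perfect gas)."""
    if mach < 1.0:
        raise ValueError("a normal shock needs a supersonic upstream Mach number")
    m2 = mach * mach
    # Contribution of the density jump across the shock.
    density_term = ((gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)) ** (
        gamma / (gamma - 1.0)
    )
    # Contribution of the static-pressure jump across the shock.
    pressure_term = ((gamma + 1.0) / (2.0 * gamma * m2 - (gamma - 1.0))) ** (
        1.0 / (gamma - 1.0)
    )
    return density_term * pressure_term


# Freestream Mach number of the double-ellipse test case.
ratio_double_ellipse = total_pressure_ratio(8.15)
```

At Mach 8.15 the ratio is below one percent, which is why the wall total pressure is such a sensitive accuracy indicator for this test case.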
10.6 Laminar viscous flow over a flat plate
The accuracy of the Navier-Stokes code has been assessed by investigating the development of a laminar compressible boundary layer over an adiabatic flat plate. For that calculation, the Mach and the Prandtl numbers are taken equal to 0.5 and 1, respectively. The viscosity is proportional to the temperature (Crocco's viscosity law) in order to compare our results with the exact solution predicted by boundary layer theory. The computation is performed with a full quadratic reconstruction and Roe's flux difference splitting. The initial CFL number is equal to 10 and the exponent p is 0.5. We noticed quite a strong numerical influence of the downstream boundary condition (inviscid subsonic outlet with pressure imposed), which forced us to locate that boundary quite far from the leading edge of the plate, i.e. at a Reynolds number based on x equal to 10,000. The mesh contains rectangular cells. There is an average of 15 cells in the displacement thickness of the boundary layer. An excellent agreement is found between the computed and exact velocity and temperature profiles (fig. 7c). Fig. 7b shows the evolution of the skin friction coefficient along the plate, which also agrees very well with the exact one. Notice also the good agreement, especially in the outer part of the boundary layer, between the computed and exact shear stress (fig. 7d). Unfortunately, some deviation occurs near the wall. Our latest investigations suggest that it is caused by a perturbation coming from the downstream boundary condition. The problem must be studied further. The convergence history is reported in fig. 7a. The relatively slow convergence is attributed on the one hand to the fact that no contribution of the viscous term Jacobians is introduced in the preconditioner, and on the other hand to the weakness of the ILU(0) decomposition.

10.7 Laminar viscous flow over the NACA0012 airfoil
In this final test case, we consider the laminar flow over a NACA0012 airfoil at 0 deg. incidence with a freestream Mach number of 0.5 and a Reynolds number of 5000. The wall is adiabatic. The Sutherland viscosity law is employed and the Prandtl number is equal to 0.72. The flexibility of the method is illustrated by employing a hybrid grid (fig. 8a). It consists of a structured C-type part around the airfoil and in the wake, surrounded by a triangular mesh. The far-field boundary is located at a distance of 33 chords from the airfoil. The cell aspect ratio varies from 100 near the wall to 50,000 in the wake. The iso-Mach lines pattern presented in fig. 8b shows the development of the boundary layer and its separation near the trailing edge to form a small recirculation bubble. The pressure and skin friction coefficients are presented in fig. 8c and 8d. The accuracy of the results may be estimated by comparing the location of the separation point (in percent of the chord) and the magnitudes of the pressure and viscous drag coefficients. We obtain $x_{sep} = 81.7\,\%$, $C_{d_p} = 0.0227$, $C_{d_v} = 0.0320$. These results agree with the reference values obtained by Swanson and Turkel [39] on a 518 x 128 structured mesh ($x_{sep} = 81.4\,\%$, $C_{d_p} = 0.02235$, $C_{d_v} = 0.03299$). Notice however that the present mesh involves only 7709 cells and is relatively coarse in the leading edge region, which is responsible for a slight underprediction of the skin friction. We obtain a maximum peak value of 0.143 instead of the reference value 0.15. Moreover, the cell longitudinal dimension in the region of the separation point is also relatively large: about 1 % of the chord.

11. CONCLUSION
In this paper, an original quadratic reconstruction finite-volume scheme for solving the Euler and full Navier-Stokes equations has been presented. The quadratic reconstruction is a higher-order extension of the robust Green-Gauss linear reconstruction. The accuracy of the resulting discretized advective derivatives is second-order, and is insensitive to grid distortions. The robustness and the high accuracy of the scheme have been demonstrated by various computations on very distorted meshes. The Newton-Krylov method based on the GMRES iterative solver has been successfully used to dramatically improve the convergence to steady state with respect to explicit methods. The implicit scheme has been tested on fully subsonic, transonic and supersonic inviscid flows, and on laminar viscous flow computations. For transonic and supersonic flows, a discrete discontinuity detector is employed to switch the scheme to a monotone constant reconstruction. This alternative does not encounter the major problems that classical multidimensional limiters have in driving the convergence to machine accuracy. For inviscid flow test cases in which Roe's flux difference splitting is employed, the preconditioner based on an approximate Jacobian of Roe's flux difference splitting always leads to a faster convergence than a preconditioner based on Van Leer's flux vector splitting, although the latter is much cheaper to compute. For viscous flow computations, the ability of the scheme to deal with hybrid grids is a real advantage. The quadratic reconstruction has led to very accurate solutions. However, the proper imposition of the boundary conditions remains a problem. Two methods have been tested. The first one, which modifies the stencil of the reconstruction for boundary cells to include the effect of the boundary conditions, has been successfully applied to a flat plate boundary layer computation. However, another procedure was required for the computation of a laminar viscous flow around the NACA0012 airfoil. It consists in locating the boundary nodes on the boundary edges rather than at the cell gravity center and then applying the boundary conditions in their strong form. This modified strategy, which is explicit, unfortunately artificially perturbs the convergence for nodes near solid walls. More effort should also be devoted to the improvement of the preconditioner, which seems to be too weak for viscous flow computations.

12. ACKNOWLEDGMENTS
The works of Ph. Geuzaine and P. Rogiest are presently supported by fellowships awarded by the Fund for the Formation in Research in Industry and Agriculture (F.R.I.A.) and by the National Fund for Scientific Research (F.N.R.S.), respectively. The authors wish to thank Prof. H. Deconinck from the Von Karman Institute (Belgium) for providing the mesh employed for the computation of the subsonic flow around the NACA0012 airfoil.
Fig. 2a: Mesh (1294 cells)
Fig. 2b: Iso-Mach lines (0.4 - 0.85, Δ 0.01875)
Fig. 2c: Wall Mach number
Fig. 2d: Wall total pressure
Fig. 2e: Convergence - CPU (HP9000) (curves: QUA-ROE, QUA-VL, LIN-ROE, CON-ROE, EXPLICIT)
Fig. 2f: Convergence - Newton iter.
Fig. 3c: Wall Mach number (curves: QUA, LIN, CON)
Fig. 3d: Wall total pressure
Fig. 3e: Convergence - CPU (DECalpha 250)
Fig. 3f: Convergence - Newton iter.
Fig. 4c: Discontinuity detector
Fig. 4d: Iso-Mach lines (0.59 - 1.33, Δ 0.03)
Fig. 4g: Convergence - CPU (HP9000)
Fig. 4h: Convergence - Newton iter.
Transonic bump, $M_\infty = 0.85$
Fig. 5a: Initial mesh (2412 cells)
Fig. 5b: Adapted mesh (9527 cells)
Fig. 5c: Iso-Mach lines (0 - 8.15, Δ 0.2)
Fig. 5d: Detector
Fig. 5e: Convergence - CPU (DECalpha 250)
Fig. 5f: Convergence - Newton iter.
Fig. 5g: Wall Cp
Fig. 5h: Wall Mach number
Fig. 5i: Wall total pressure
Fig. 6a: Initial mesh (6242 cells)
Fig. 6b: Final mesh (10149 cells)
Fig. 61: Total pressure distribution (curves: Present, Exact)
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
1.1 Definition of a kinetic scheme

Kinetic schemes are upwind finite-volume schemes for which the numerical flux function F(w, J, n) is of flux-splitting type, that is, of the following form:
[equation illegible in source]

Different choices are possible for the equilibrium function $f_w(\xi)$. The two most common are the 'square-wave' function proposed by Perthame [7,8]

only been proved formally in [SJl]. The question of the positivity of the scheme, on the other hand, has been settled in [11,14]. The main result is recalled below.

To be as general as possible, we work in a multidimensional setting (d denoting the number of space dimensions), and the mesh, denoted $M_h$, is assumed to be arbitrary. We denote by K an arbitrary element of $M_h$, m(K) its Lebesgue measure in $R^d$, $K_e$ its neighbour across the face e, $n_{K,e}$ the normal to the face e directed from K to $K_e$, and m(e) the Lebesgue measure of the face e in $R^{d-1}$ (cf. figure 2).

[Condition (3): a CFL-type bound involving a supremum over the cells K of $M_h$ and the measure m(K); the formula is illegible in the source.]

Remark: In practice this condition is slightly more restrictive than the usual CFL condition (for a perfect gas with γ = 1.4 it corresponds roughly to CFL = 0.5). However, it is only a sufficient condition and, in applications, no difficulty has ever been observed when taking CFL = 0.9.

2 Extension to second order and positivity

2.1 Principle of the second-order extension
To extend a finite-volume method to second order in space, there are at least two classical approaches:

- The first (probably the most widely used because of its simplicity and generality) is Van Leer's MUSCL method. It splits a time step into two stages: a first stage of affine interpolation of the approximate solution, and a second stage in which the finite-volume scheme is applied to the interpolated values of the approximate solution. The essential point is that, during the interpolation stage, the gradient of the approximate solution must be limited in order to avoid the appearance of oscillations.
- The second, known in the English-language literature as the 'corrected antidiffusive flux approach' (denoted CAFA hereafter), consists in adding to the first-order numerical flux an antidiffusive correction, which must be limited for reasons of numerical stability.

Whichever approach is adopted, MUSCL or CAFA, these empirical slope- or flux-limiting criteria, which make it possible to control the oscillations, must be supplemented (or possibly replaced) by a condition guaranteeing that the scheme leaves the set $W_{ad}$ invariant (which of course assumes that this property is already satisfied by the first-order scheme). It is interesting to note that this invariance property of the set $W_{ad}$ alone guarantees stability in
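The two-stage MUSCL idea described in section 2.1 (limited affine interpolation, then a first-order flux applied to the interpolated values) can be illustrated on a scalar model problem. The sketch below is not the kinetic scheme of the paper: it assumes the linear advection equation u_t + u_x = 0 on a periodic 1D grid, a minmod limiter (the paper does not name a limiter in this passage), and hypothetical function names.

```python
def minmod(a: float, b: float) -> float:
    """Return the smaller-magnitude slope when the signs agree, else zero."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def muscl_step(u: list[float], cfl: float) -> list[float]:
    """One explicit step for u_t + u_x = 0 (unit speed, periodic grid)."""
    n = len(u)
    # Stage 1: limited affine interpolation inside each cell.
    slopes = [minmod(u[i] - u[i - 1], u[(i + 1) % n] - u[i]) for i in range(n)]
    # Interpolated value at the right face of cell i, seen from inside cell i.
    right_face = [u[i] + 0.5 * slopes[i] for i in range(n)]
    # Stage 2: first-order upwind flux applied to the interpolated values.
    return [u[i] - cfl * (right_face[i] - right_face[i - 1]) for i in range(n)]


# Usage: advect a square pulse and observe that no new extrema appear.
profile = [1.0 if 10 <= i < 20 else 0.0 for i in range(50)]
for _ in range(40):
    profile = muscl_step(profile, cfl=0.5)
```

For this model problem each updated value is a convex combination of two old cell values as long as the CFL number does not exceed 2/3, so the bounds of the initial data are preserved; this is the scalar analogue of the invariance requirement discussed in the text.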
Remarks:
The function exp[...] is none other than the Maxwellian. Since the function exp is convex on
Figures 13 and 14 present the computation of a steady flow entering, at Mach 2, a tunnel containing a ramp inclined at 15 degrees. The mesh was obtained after three successive refinements starting from a coarse mesh of 2500 cells. The refinement criterion used (cf. section 4) made it possible to detect all the waves present in the flow; in particular, the mesh was refined along the slip line emanating from the triple point located on the upper wall (cf. figure 11). Figure 12 shows that the mesh refinement brought a noticeable improvement in the quality of the results.

Finally, figures 15 to 17 present 3D numerical results for the computation of a transonic flow (Mach: 0.84, incidence: 3.06 degrees) around the ONERA M6 wing. This test case is very well documented in [21]. The initial mesh consists of about 60000 tetrahedra, which is rather coarse for this type of computation. The final mesh (figure 20) was obtained after 2 successive refinements. The results obtained are in full agreement with those of the various contributors to the AGARD workshop [21]. Here again, mesh refinement noticeably improves the accuracy of the results.

6 Conclusion

This article has presented a new second-order kinetic scheme that preserves the positivity of the density and of the temperature under a CFL condition. The first numerical results obtained on unstructured meshes are very good and confirm the theoretical robustness properties of the scheme. In addition, the discrete entropy estimate associated with the first-order scheme naturally yields a mesh refinement criterion based on the local entropy production. This criterion appears to be an excellent candidate for capturing steady discontinuities.
The continuation of this study will consist in extending the scheme to the computation of reactive flows, for which the robustness of the numerical method is an essential criterion. This work is in progress.

References

[1] F. BOURDEL, J.P. CROISILLE, P. DELORME, P. MAZET, Sur l'approximation par éléments finis des systèmes hyperboliques K-diagonalisables, Application aux équations d'Euler et aux mélanges de gaz, La Recherche Aérospatiale, 1989-5, 15-34
[2] F. COQUEL, P.G. LEFLOCH, Convergence of finite difference schemes for scalar conservation laws in several space dimensions: the corrected antidiffusive flux approach, Math. of Comp. 57, 1991, p. 169-210
[3] B. EINFELDT, C.D. MUNZ, P.L. ROE and B. SJOGREEN, On Godunov type methods near low densities, J. of Comp. Phys., Vol. 92, 1991, 273-295
[4] S.M. DESHPANDE, On the Maxwellian distribution, symmetric form and the entropy conservation for the compressible Euler equations, Tech. Rep. 2583, NASA Langley, 1986
[5] S.M. DESHPANDE, A second order accurate, kinetic theory based, method for inviscid compressible flows, Tech. Rep. 2613, NASA Langley, 1986
[6] E. GODLEWSKI, P.A. RAVIART, Hyperbolic systems of conservation laws, SMAI, 1990
[7] B. PERTHAME, Boltzmann type schemes for gas dynamics and the entropy property, SIAM J. Num. Anal., Vol. 27-6, 1990, 1405-1421
[8] B. PERTHAME, F. CORON, Numerical passage from kinetic to fluid equations, SIAM J. Num. Anal., Vol. 28-1, 1991, 26-42
[9] B. PERTHAME, Second order Boltzmann schemes for compressible Euler equations in one and two space variables, SIAM J. Num. Anal., Vol. 29-1, 1992
[10] B. PERTHAME, Y. QIU, A new variant of Van Leer's method for multidimensional systems of conservation laws, Rapport technique 1562, INRIA, 1991
[11] P. VILLEDIEU, P.A. MAZET, Schémas cinétiques pour les équations d'Euler hors équilibre thermochimique, to appear in La Recherche Aérospatiale
[12] P. VILLEDIEU, Approximations de type cinétique du système hyperbolique de la dynamique des gaz hors équilibre thermochimique, Thèse de l'université Paul Sabatier, 1994
[13] P. VILLEDIEU, Schémas cinétiques d'ordre 2 sur maillages non structurés, Rapport Technique DEW 213526.00, Fev. 1995
[14] P. VILLEDIEU, J.L. ESTIVALEZES, High order positivity preserving schemes for the compressible Euler equations, to appear in SIAM J. Num. Anal.
figure 8: Emery test case: Refined Mesh (10000 cells) and Cp at the wall
figure 10: Medium Mesh (SOW cells) and Refined Mesh (14000 cells)
figure 12: Emery test case: Iso density lines on the refined Mesh
figure 13: Coarse Mesh (2500 cells) and Refined Mesh (9000 cells)
$$\hat{u}(x) = \sum_{i=1}^{m} p_i(x)\,a_i = p^T(x)\,a \qquad (1)$$

where the base functions are $p^T = [1, x]$ for m = 2 and $p^T = [1, x, x^2]$ for m = 3 in one dimension [5].

Figure 1: Nodal unknowns u and the interpolated function $\hat{u}$.

in the interpolation domain $\Omega_i$. This reduces both computational cost and memory.

3.1.3 Weighted Least Squares Approach
A drawback of the interpolation procedure presented so far is that equal weight is given to all the points in $\Omega_i$. This can rapidly cause a deterioration of the approximation [6]. A remedy can be the introduction of weighting functions, such as a Gauss function, which will be described next.
Following the least squares approach from above, we can directly include the weighting functions $w(x_j)$ in eq. (4):

$$J = \sum_{j=1}^{n} w(x_j)\,\bigl(u_j^h - \hat{u}(x_j)\bigr)^2 = \sum_{j=1}^{n} w(x_j)\,\bigl(u_j^h - p_j^T a\bigr)^2 \qquad (7)$$

Again minimising J with respect to a, we obtain

$$a = A^{-1} B\, u^h \qquad (8)$$

with $A = C^T W C$ and $B = C^T W$. W is now a diagonal matrix containing the weights $w(x_j)$ at each point in $\Omega_i$.
In [6], the authors demonstrate a strong sensitivity to the number of points chosen within each cloud $\Omega_i$ if no weighting functions are used. In an example, the shape function plots show a drastic deterioration for both linear and quadratic base functions p.

Taking A and k constant, the analytical solution of this first order homogeneous differential equation is obtained as:
[equation (11) illegible in source]
With $u_0 = 1$ and $u_L = 0$, equation (11) reduces to
[equation illegible in source]
A test for time marching schemes is solving equation (9) by iterating until steady state is reached to approximate the exact result. This is usually done by expanding equation (9) in time using a Taylor series:

$$u^{n+1} = u^n + \sum_{i=1}^{\infty} \frac{\Delta t^i}{i!}\,\frac{\partial^i u}{\partial t^i} \qquad (13)$$

A discretisation in space must be performed next. First, known and proven finite element methods will be presented, and then the proposed finite point method will be described.

3.2.1 Finite element solution
It is well known that the exact solution to this problem can be nodally reproduced by the finite element method using the following Petrov-Galerkin methods for all ranges of the Peclet number Pe [7]. This can be achieved by expanding equation (13) up to
Figure 3: Exact solution to the convection diffusion equation using a Taylor-Galerkin scheme.

3.2.2 The finite point method (FPM)
Let us now analyse the finite point method in the context of the 1D convection-diffusion equation. Integrating eq. (9) in time and performing a Taylor expansion of eq. (13) up to second order leads to:
[equation illegible in source]

For the linear case, it is not possible to directly compute the necessary second order derivatives. This can be overcome by performing an accumulation of differences at the central point and the rest of the points j within the cloud. Hence,
[equations illegible in source]
Substituting these derivatives into eq. (18) leads to a system of equations from which the un-

Inversion of A leads to
[equation illegible in source]
Figure 7: Convection diffusion equation using linear base functions and 4 nodes per cloud for a) Pe = 0.6, b) Pe = 1 and c) Pe = 2.6. (curves: EXACT, WITHOUT weighting, Using GAUSS weighting functions)
and 8 demonstrate this behavior for n = 4 and n = 5.

oscillations disappear if a weighted interpolation is used.

Using Gaussian weighting functions, the improvement of the solution is impressive. With $A_j = r_{min}$, where $r_{min}$ refers to the minimum distance r in $\Omega_i$, practically exact nodal values are recovered for this 1D test problem (see Figures 7 and 8).
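The weighted least-squares fit of eq. (8) can be sketched for a single 1D cloud with the linear base p = [1, x]. This is an illustration under stated assumptions, not the authors' implementation: the Gaussian form exp(-(r/r_ref)^2), the width parameter `r_ref` and both function names are hypothetical, and the 2x2 normal equations (C^T W C) a = C^T W u are solved in closed form.

```python
import math


def gauss_weight(r: float, r_ref: float) -> float:
    """Gaussian weight of a point at distance r (r_ref is an assumed width)."""
    return math.exp(-((r / r_ref) ** 2))


def wls_linear_fit(xs, us, x_star, r_ref):
    """Fit u ~ a0 + a1*x over one cloud, weighting points by distance to x_star."""
    s0 = s1 = s2 = b0 = b1 = 0.0
    for x, u in zip(xs, us):
        w = gauss_weight(abs(x - x_star), r_ref)
        # Entries of A = C^T W C for the base p = [1, x].
        s0 += w
        s1 += w * x
        s2 += w * x * x
        # Entries of the right-hand side C^T W u.
        b0 += w * u
        b1 += w * x * u
    det = s0 * s2 - s1 * s1  # non-zero when the cloud has two distinct points
    a0 = (s2 * b0 - s1 * b1) / det
    a1 = (s0 * b1 - s1 * b0) / det
    return a0, a1


# Usage: a cloud of 4 irregularly spaced nodes around the star node x = 0.1.
cloud_x = [0.0, 0.1, 0.25, 0.4]
cloud_u = [2.0 + 3.0 * x for x in cloud_x]
a0, a1 = wls_linear_fit(cloud_x, cloud_u, x_star=0.1, r_ref=0.1)
```

Because the data in the usage example are exactly linear, the fit returns the generating coefficients whatever weights are used; the weights only matter when the nodal values cannot be represented by the base functions, which is the situation discussed in the text.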
Figure 9: Convection diffusion equation: quadratic base functions and 3 nodes per cloud for a) Pe = 0.6, b) Pe = 1 and c) Pe = 2.6. (curves: EXACT, WITHOUT weighting, Using GAUSS weighting functions)

Figure 10: Convection-diffusion equation using quadratic base functions and 6 points per cloud for Pe = 1.

where $w_j$ are the same weighting functions used in the interpolations of eq. (8) and the coefficients of eq. (32) are obtained as:
[equation illegible in source]
there must be at least one point in every quadrant of orthogonal axes. This leads to a minimum of 5 points per cloud.

At the boundary, the two points adjacent to the central point on the boundary plus the closest points are chosen. Another condition is that no boundary section is crossed, so that points from the opposite side are not chosen. For instance, at the trailing edge of an airfoil, the closest points to a point on one side of the airfoil may lie across the wall on the other side. Since there is a physical separation of these points, they are not included in the same interpolation domain.

4.3 2D Results

4.3.1 Subsonic test case
The first 2D test case considered is a NACA0012 profile with a free stream Mach number of 0.5 and 0 degrees angle of attack, analyzed by Zienkiewicz et al. [10,11]. In order to compare solutions, a finite element solution has been taken for comparison. The meshless grid of 2556 points is shown in Figure 11. The same points have been used for the FE solution on the equivalent unstructured triangular mesh obtained using a standard advancing front technique [7,8].

Figure 11: Point distribution around a NACA0012 profile

Again, the idea is to compare the influence of the weighting functions in the finite point approximation. In previous reports [5,6] we have shown results proving the superiority of weighting functions in a 2D context, but without using weighting functions for the balancing diffusion terms (see eq. (32)). Here, the benefit of the weighted diffusion terms shall be presented.

The results of the FPM were obtained by employing 7 nodes in $\Omega_i$, X = Amin, c = 1 and linear base functions (m = 3). A global comparison of the meshless solution is shown in Figure 12. In a), b), c) and d) the mesh, the Taylor-Galerkin solution, a four-stage Runge-Kutta Galerkin result and the FPM solution for the density are presented, respectively. Qualitatively, all results are very similar. In Figure 13, close-up views in the stagnation area enhance the comparison of density contours of a) FPM without weighted diffusion, b) FPM with full Gauss weighting, c) RK-Galerkin and d) Taylor-Galerkin. Note the improvement of solution b) with respect to a), which does not exhibit any oscillations in the stagnation area thanks to the use of weighted diffusion terms.
In Figure 14, a) velocity contours and b) velocity vectors in the stagnation zone are displayed, respectively.

4.3.2 Supersonic test case
The second 2D test case is a hypersonic inviscid flow of Mach 8.15 around a double ellipse, which is well documented by the proceedings of the workshop in Antibes, 1991 [12]. The flow enters at an angle of 30 degrees. The solution is characterized by a strong primary bow shock and a weaker canopy shock.
To solve this problem, a grid of approximately 11000 points was generated, again using the advancing front technique. Linear base functions (m = 3) and 6 point clouds with Gaussian weighting were used. The residuals of the solution have been reduced by six orders of magnitude. Figures 15 a), b) and c) present the meshless grid, Mach number contours and density lines, respectively. Note that the solution is very smooth and the location of the shock is well captured. The numerical overshoot of about 3% in Mach number is within reasonable limits and could be improved by increasing the balancing diffusion. The convergence of this solution was slow due to a low Courant number of 0.25 (avoiding negative pressures).
Figure 16 a) demonstrates the high quality of the solution in the vicinity of the stagnation area, showing no oscillations. Figure 16 b) displays the pressure coefficient $c_p$ on the boundary of the double ellipse, which compares well to other contributors [12].

Figure 16: Double ellipse: a) Density contours in the stagnation zone and b) Pressure coefficient $c_p$ along the boundary of the body.

5. REFERENCES
[1] Nayroles, B., Touzot, G. and Villon, P. "Generalizing the Finite Element Method: Diffuse Approximation and Diffuse Elements", Computational Mechanics, 10, 307-318, 1992
[2] Belytschko, T., Lu, Y. and Gu, L. "Element Free Galerkin Methods", Int. Journal for Numerical Methods in Engineering, 37, 229-256, 1994
[3] Liu, W.K., Jun, S. and Belytschko "Reproducing Kernel Particle Methods", Int. Journal for Numerical Methods in Engineering (to be published)
[4] Batina J., "A Gridless Euler/Navier-Stokes Solution Algorithm for Complex Aircraft Applications", AIAA paper 93-0333, Reno NV, January 11-14, 1993
[5] Oñate E., Idelsohn S. and Zienkiewicz O.C. "Finite Point Methods in Computational Mechanics", Publication CIMNE No. 67, July 1995
[6] Oñate E., Idelsohn S., Fischer T., Zienkiewicz O.C. "A Finite Point Method for analysis of fluid flow problems", 9th Int. Conf. on Finite Elements in Fluids, Venezia, Italy, 15-21 October 1995
[7] Zienkiewicz, O.C. and Taylor R.L. "The Finite Element Method", 4th Edition, Volume 2, McGraw Hill, 1991
[8] Peraire, J. "A Finite Element Method for Convection Dominated Flows", PhD Thesis, University College of Swansea, 1986
[9] Jameson, A., Schmidt, W., and Turkel, E. "Numerical simulation of the Euler equations by finite volume methods using Runge-Kutta time stepping schemes", AIAA paper 81-1259, AIAA 5th Computational Fluid Dynamics Conf., 1981
[10] Zienkiewicz, O.C. and Wu, J. "A General Explicit or Semi-Explicit Algorithm for Compressible and Incompressible Flows", Inst. of Num. Meth. in Eng., University College of Swansea, CR/682/91, 1991
[11] Zienkiewicz, O.C., Codina, R., Morgan, K. and Sai, S. "A General Algorithm for Compressible and Incompressible Flow", Inst. of Num. Meth. in Eng., University College of Swansea, CR/842/94, June 1994
[12] Problem 6 of the Workshop on Hypersonic Flows for Reentry Problems, Antibes, France, January 22-25 1990
NUMERICAL SIMULATION OF INTERNAL AND EXTERNAL GAS DYNAMIC
FLOWS ON STRUCTURED AND UNSTRUCTURED ADAPTIVE GRIDS
U.G. Pirumov,
I.E. Ivanov,
I.A. Kryukov
Moscow Aviation Institute
Volokolamskoe sh. 4
125871 Moscow, Russia
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
$+\tanh(3(x^2+y^2-1))$. Figures 1b and 1c contain the solution-adaptive grids produced by our algorithm.

4. NUMERICAL RESULTS

4.1. Linear scalar 2D problems
The numerical dissipation of explicit high resolution schemes with various "reconstruction" algorithms was compared using numerical results for the two-dimensional advection equation [24].

The exact solution of (20), (20) consists in the rotation of the initial values around $(x_0, y_0)$ with angular velocity $\omega$. Two series of calculations are presented in this paper. A cut-out cylinder and a cone were chosen as initial values (Fig. 2). We used the angular velocity $\omega = 0.1$ and $x_0 = 50$, $y_0 = 50$. The region of computation was $[0,100]\times[0,100]$. The numerical calculations were done on three types of grids with 100 grid points in each direction. The first type is a uniform rectangular grid, the second type is a smooth curvilinear grid (Fig. 3a) described by a transformation of the points

$\xi_i = \Delta x\,(i-1)$, $\Delta x = 1$, $\eta_j = \Delta y\,(j-1)$, $\Delta y = 1$,

and the third type is a random grid (Fig. 3b)

$$x_{ij} = \xi_i + \alpha_{ij}\,\Delta x$$
$$y_{ij} = \eta_j + \beta_{ij}\,\Delta y$$

where $\alpha_{ij}$ and $\beta_{ij}$ are uniformly distributed random numbers on (-0.4, 0.4).

At time $t = 20\pi$ the initial values have carried out one full rotation and returned to their initial position. The approximations of the initial values on the uniform grid are shown in Fig. 2. To improve picture resolution in Figs. 2 and 4 we used only part of the computation region, $[50,100]\times[25,75]$. The size and initial position of the cut-out cylinder and the cone are the same as in [24]. We perform long time calculations until $t = 120\pi$, which corresponds to six full rotations of the initial values. As mentioned in [24], these problems are well suited to benchmark the numerical properties of the schemes.

In this paper three reconstruction procedures described in section 2.2 are compared. Numerical results obtained with these reconstructions on computational grids of three types are presented in Tab. 1 for the cone and in Tab. 2 for the cut-out cylinder. These tables contain numerical solution errors calculated with respect to the $L_1$ norm and maximum values of the obtained solutions. Fig. 4 shows numerical solutions computed using the uniform rectangular grid. TL reconstruction results are shown in Figs. 4a and 4d, MT results in Figs. 4b and 4e and BJ results in Figs. 4c and 4f. As expected, the TL reconstruction procedure is the most dissipative. MT and BJ results are rather close, but in the MT case the errors are a little smaller and the maximum values are a little greater. These problems also show that the MT reconstruction may not preserve the symmetry. We had some difficulties with the implementation of the BJ reconstruction on the curvilinear grids, so results for this grid are not presented in Tabs. 1 and 2.

4.2. A channel with a 15° compression-expansion ramp
The next case is the flow through a duct with a compression-expansion ramp in the bottom wall. The conditions for this case are: $M_\infty = 2$, $\gamma = 1.4$. The computational grid is equally spaced and contains 180 cells in the streamwise direction and 60 cells in the cross flow direction.
The computed Mach contours are presented in Fig. 4. Note that the induced and reflected shocks are quite thin and unphysical oscillations are absent. All of the characteristic features of the flow are well resolved on such a fine grid without adaptation.

4.3. An oblique shock-reflection problem
One of the most popular problems for checking out various elements of a numerical algorithm (such as reconstruction, adaptation etc.) is the regular reflection of an oblique shock wave by a flat plate. In Figs. 5 a-f, results are shown for a case with $M_\infty = 2.9$ and $\beta = 29°$, where $\beta$ is the angle made by the incident shock wave and the flat plate (Fig. 5a). First, a steady solution was obtained on the uniform 60x20 rectangular grid. The corresponding pressure contours are presented in Fig. 5a. Then grid adaptation was performed by the method proposed above. The pressure was used as the adaptation function. After that a new steady state was obtained. Figs. 5a and 5b depict the adapted grid and associated steady state flow solution. These figures show that the pressure gradients become much better resolved.

4.4. An underexpanded jet flow
The next case is the unsteady underexpanded supersonic jet flow. The conditions for this case are: $M_j = 1.5$, $n = p_j/p_\infty = 3$, $T_j = T_\infty$ and $\gamma_j = \gamma_\infty = 1.4$. The computational grid is equally spaced and contains 180 points in the streamwise direction and 80 points in the crossflow direction. Fig. 6 shows the computed Mach contours for the most characteristic time moments. Note that the nonreflecting boundary conditions make it possible to calculate such a rather complicated flow almost without unphysical reflection on the open boundaries.
The steady state solution is shown in Fig. 7a. Fig. 7b shows the steady state solution obtained using a coarse 90x40 rectangular grid. For this solution the grid adaptation was performed using the Mach number gradients as the adaptation function. In Fig. 7c, the adaptive grid is presented and Fig. 7b shows the computed Mach contours. The solution computed using the coarse adapted grid is much closer to the fine grid solution in Fig. 7a, but some flow features are not captured. The second Mach stem is located somewhat farther from the nozzle exit than in Fig. 7a.
But the first Mach stem is resolved better than by using the fine grid. This is the main deficiency of moving solution-adaptive grid algorithms: if there are large gradients of parameters in a flow region, then the grid is too coarse in regions of moderate and low gradients.

4.5. Nozzle flow
In the last example we present results of the numerical simulation of the internal axisymmetric nozzle flow of an ideal gas with $\gamma = 1.22$. The nozzle consists of two parts. The first part is a Laval nozzle and the second one is a cylindrical tube adjoining the supersonic part of the Laval nozzle.

Initially the steady state solution was obtained using a 130x40 simple grid (Fig. 8a). In Figs. 8a and 8b the top half of each figure shows the computed Mach contours and the bottom half shows the computational grid. For the computed steady state solution the grid adaptation was performed using the Mach number gradients as the adaptation function. The new steady state solution and the adaptive grid are presented in Fig. 8b.

5. CONCLUSIONS
In the present paper an upwind monotone numerical method for the solution of the Euler equations is presented. This method is based on the high order version of Godunov's scheme. It is realized using both structured quadrilateral and unstructured triangular grids. Essentially 2D reconstruction procedures make it possible to perform calculations using strongly skewed grids. Some features of the 2D reconstruction procedures are studied by solving the linear scalar problem. A new solution-adaptive grid algorithm is proposed.
The presented numerical results illustrate the capability of the proposed algorithms. It can be seen that the grid adaptation procedure makes it possible to obtain significantly more accurate results.

6. REFERENCES
1. Pirumov U.G., Roslyakov G.S., "Gas Flows in Nozzles", Springer-Verlag, 1985, 425 p.
2. Gorbunov V.N., Pirumov U.G., Ryzhov Yu.A., "Non-Equilibrium Condensation in High-speed Gas Flows", Gordon and Breach Science Publishers, 1989, 290 p.
3. Pirumov U.G., Roslyakov G.S., "Numerical Method for Gas Dynamic", Moscow, Vishai Shkola, 1987, 231 p., (in Russian).
4. Godunov S.K., "Finite Difference Method for Numerical Calculation the Discontinue Solutions of Gasdynamic Equations", Mathem. Sb., 47, 3, 1959, pp. 271-306, (in Russian).
5. Godunov S.K., Zabrodin A.V., Ivanov M.Ya., Krayko A.N., Prokopov G.P., "Numerical Solution of Multidimensional Gasdynamics Problems", Moscow, Nauka, 1976, 400 pp., (in Russian).
6. Connett W.C., Agarwal R.K., Schwartz A.L., "An Adaptive Grid-Generation Scheme for Flowfield Calculations", AIAA Pap., 87-0199, 1987.
7. Connett W.C., Agarwal R.K., Schwartz A.L., Wheeler J.C., "An Algebraic Adaptive Grid Technique for the Solution of Navier-Stokes Equation", AIAA Pap., 90-1605, 1990.
8. Tilyaeva N.I., "Generalization of Modified Godunov's Scheme for Unstructured Grids", Uchenye Zapiski TSAGI, XVII, 2, 1986, pp. 18-26, (in Russian).
9. Casper J., Atkins H.L., "A Finite-Volume High-Order ENO Scheme for Two-Dimensional Hyperbolic Systems", J.Comput.Phys., 106, 1993, pp. 62-76.
10. Shu C-W., Osher S., "Efficient Implementation of Essentially Non-Oscillatory Shock-Capturing Schemes", J.Comput.Phys., 77, 1988, pp. 439-471.
11. Rodionov A.V., "High Order Godunov Scheme", Zhurnal vichislitelnoy mathem. i mathem. phys., 27, 4, 1987, pp. 585-593, (in Russian).
12. Rumsey C.L., van Leer B., Roe P.L., "A Multidimensional Flux Function with Application to the Euler and Navier-Stokes Equation", J.Comput.Phys., 105, 1993, pp. 306-323.
13. Roe P.L., "Discrete Model for the Numerical Analysis of Time-Dependent Multidimensional Gas Dynamics", J.Comput.Phys., 63, 1986, pp. 458-476.
14. Kryukov I.A., Ivanov I.E., "High Resolution Monotone Method for Computation Internal and Jet Inviscid Flows", in "Nonequilibrium processes in nozzles and jets", Proc. 1st Int. Conf., Coll. Abstr., June 1995, pp. 92-93.
15. Roe P.L., "Approximate Riemann Solver Parameter Vector and Difference Schemes", J.Comput.Phys., 43, 1981, pp. 357-372.
16. Osher S., Solomon F., "Upwind Difference Schemes for Hyperbolic Systems of Conservation Laws", Math.Comput., 38, 158, 1982, pp. 339-374.
17. Dukowicz J.K., "A General, Non-iterative Riemann Solver for Godunov's Method", J.Comput.Phys., 61, 1985, pp. 119-137.
18. Davis S.F., SIAM J.Sci.Stat.Comput., 9, 1988, pp. 445-473.
19. Einfeldt B., SIAM J.Numer.Anal., 25, 1988, pp. 294-318.
20. Thompson K.W., "Time Dependent Boundary Conditions for Hyperbolic Systems", J.Comput.Phys., 68, 1, 1987, pp. 1-24.
21. Vankeirsbilck P., Deconinck H., "Solution of the Compressible Euler Equations with Higher Order ENO-schemes on General Unstructured Meshes", in "Computational Fluid Dynamics '92", vol. 2, 1992.
22. Barth T.J., Jespersen D.C., "The Design and Application of Upwind Schemes on Unstructured Meshes", AIAA Pap., 89-0366, 1989.
23. Kania L.A., "An Adaptive Grid Algorithm for Accurate Flowfield Calculations", AIAA Pap., 90-0327, 1990.
24. Munz C-D., "On the Numerical Dissipation of High Resolution Scheme for Hyperbolic Conservation Laws", J.Comput.Phys., 77, 1988, pp. 18-39.
25. Harten A., Chakravarthy S.R., "Multidimensional ENO Schemes for General Geometries", Tech. Report 91-76, ICASE, 1991.
26. Abgrall R., "On Essentially Non-oscillatory Schemes on Unstructured Meshes: Analysis and Implementation", J.Comput.Phys., 114, 1994, 45-58.
Fig. 1. Adaptive response of uniform grids to weighting function H(x,y) = tanh(3(x - ...)) + tanh(3(x^2 + y^2 - 1)). a) weighting function; b) regular quadrilateral grid; c) unstructured triangular grid.
Fig. 2. Initial values and exact solution after each full rotation. a) cone; b) cut-out cylinder.
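The rotating test of section 4.1 has a simple exact solution: the initial profile is carried rigidly around $(x_0, y_0)$, so one revolution takes $2\pi/\omega = 20\pi$ for $\omega = 0.1$. The following is a minimal sketch of that exact solution. The counter-clockwise sense of rotation and the cone's centre, radius and height are assumptions made for illustration only; the text takes the actual shapes from [24] without listing them.

```python
import math

OMEGA = 0.1           # angular velocity quoted in section 4.1
X0, Y0 = 50.0, 50.0   # centre of rotation quoted in section 4.1


def rotate_back(x, y, t, omega=OMEGA, x0=X0, y0=Y0):
    """Return the point that is carried to (x, y) after time t.

    Assumes counter-clockwise rigid rotation about (x0, y0).
    """
    angle = -omega * t
    dx, dy = x - x0, y - y0
    c, s = math.cos(angle), math.sin(angle)
    return x0 + c * dx - s * dy, y0 + s * dx + c * dy


def cone(x, y, xc=50.0, yc=75.0, radius=15.0):
    """Hypothetical cone of unit height; centre and radius are placeholders."""
    r = math.hypot(x - xc, y - yc)
    return max(0.0, 1.0 - r / radius)


def exact_solution(u0, x, y, t):
    """Exact solution of rigid rotation: the initial profile traced backwards."""
    return u0(*rotate_back(x, y, t))


# One full revolution; six revolutions correspond to t = 120*pi.
period = 2.0 * math.pi / OMEGA
```

Evaluating `exact_solution(cone, x, y, 6 * period)` on the grid points gives the reference field against which a computed solution after six rotations can be compared.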
Fig. 3. Examples of curvilinear (a) and random (b) calculation grids (20x20).
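A random grid of the kind shown in Fig. 3b can be sketched by displacing each node of a uniform grid by a uniformly distributed fraction of the spacing, as described in section 4.1. The function name, the seed argument and the nested-list layout are illustrative assumptions, not taken from the paper.

```python
import random


def random_grid(n, dx=1.0, dy=1.0, amplitude=0.4, seed=0):
    """Perturb an n x n uniform grid by random offsets of up to amplitude * spacing.

    Returns grid[j][i] = (x, y). Since amplitude < 0.5, neighbouring nodes
    keep their ordering in each index direction.
    """
    rng = random.Random(seed)
    grid = []
    for j in range(1, n + 1):
        row = []
        for i in range(1, n + 1):
            xi = dx * (i - 1)    # uniform-grid abscissa
            eta = dy * (j - 1)   # uniform-grid ordinate
            x = xi + rng.uniform(-amplitude, amplitude) * dx
            y = eta + rng.uniform(-amplitude, amplitude) * dy
            row.append((x, y))
        grid.append(row)
    return grid
```

Note that `random.Random.uniform` may return an end point of the interval, whereas the text specifies the open interval (-0.4, 0.4); for a sketch the difference is immaterial.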
Fig. 4. Computed solutions on the uniform grid after six rotations. a)-c) cone; d)-f) cut-out cylinder.
Table 1. Results of solution of the linear scalar problem for rotating cone.
Table 2. Results of solution of the linear scalar problem for rotating cut-out cylinder.
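The two quantities reported in these tables, the error in the $L_1$ norm and the maximum of the computed solution, can be evaluated as sketched below. The normalisation of the norm (a plain sum weighted by a constant cell area) is an assumption; the paper does not state which scaling was used.

```python
def l1_error(numerical, exact, cell_area=1.0):
    """Discrete L1 error: sum of |u - u_exact| over all cells times the cell area."""
    if len(numerical) != len(exact):
        raise ValueError("both fields must have the same number of cells")
    return cell_area * sum(abs(u - v) for u, v in zip(numerical, exact))


def peak_value(numerical):
    """Maximum of the computed solution; a dissipative scheme lowers it."""
    return max(numerical)
```

For a unit-height cone, a peak value well below one after six rotations indicates how much the reconstruction has clipped the profile.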
Fig. 6. Pressure contours for oblique shock reflection problem (M=2.9, β=29°). a) solution on uniform grid; b) solution on adaptive grid; c) solution-adaptive grid.
Fig. 8. Steady underexpanded jet. a) fine grid solution; b) coarse grid solution without adaptation; c) adaptive grid; d) adaptive coarse grid solution.
Fig. 9. Nozzle flow problem (Mach number contours plotted against x). a) solution and grid without adaptation; b) adaptive solution and grid.
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
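$$\frac{\partial U}{\partial t} + A\,\frac{\partial U}{\partial \xi} + B\,\frac{\partial U}{\partial \eta} = \frac{\partial F_v}{\partial \xi} + \frac{\partial G_v}{\partial \eta} \qquad (7)$$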
where $A$ and $B$ are the Jacobian matrices for the transformation of $F$ and $G$ respectively. Therefore A and B can be described as

3.2 Discretisation of the Equations
The Navier-Stokes equations, in the form shown in equation 5, can be simply discretised into a two step scheme. The first step solves the flow in the $\xi$-direction, and the second step solves in the $\eta$-direction. This method of discretisation can lead to second order accuracy in time.

Note that the flux functions with a superscript * are calculated using $U^*$. In reference [1], Yee has offered several other methods of calculating the flux functions, but they are not covered here.

Roe's averaging is applied to cell vertex points in order to calculate the flow variables at $(i+1/2, j)$ and $(i, j+1/2)$.

The vectors $\Phi_A$ and $\Phi_B$ in equation (10) contain the anti- ...

The function $\sigma(z)$ is defined as

$$\sigma(z) = \tfrac{1}{2}\left[\psi(z) - \lambda z^2\right] \qquad (14)$$

For the function $\psi$ one finds,

... viscosity. A relationship is provided in order that $\delta$ is suitably scaled for highly skewed grids. This relation is,

$$\delta_1 = \delta\left\{|\hat{U}| + |\hat{V}| + 0.5\,c\left(\sqrt{\xi_x^2+\xi_y^2} + \sqrt{\eta_x^2+\eta_y^2}\right)\right\} \qquad (15b)$$

where $\hat{U}$ and $\hat{V}$ are the co-variant velocities. A study of the actual effects of varying the value of the constant $\delta$ in the solution appears later in this report. This constant is referred to as the entropy parameter throughout this report.

The term 'g', in equations (11) and (13), is the flux limiter. Five different limiters have been implemented and investigated; they are given in reference [1].

The minmod function of a list of arguments is equal to the smallest argument in absolute value if the arguments are all of the same sign, or is equal to zero if any arguments are of opposite sign.

In equation (16c) $\varepsilon$ is included to stop any division by zero. $\varepsilon$ is given a small value, usually of the order of $10^{-7}$.
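The verbal definition of the minmod function given above translates directly into code. This is a minimal sketch; the function name and the variadic argument list are illustrative choices rather than the authors' implementation.

```python
def minmod(*args):
    """Return the argument of smallest magnitude if all share one sign, else zero."""
    if not args:
        raise ValueError("minmod needs at least one argument")
    if all(a > 0 for a in args):
        return min(args)   # all positive: smallest value has smallest magnitude
    if all(a < 0 for a in args):
        return max(args)   # all negative: largest value has smallest magnitude
    return 0.0             # mixed signs or a zero argument


# Example: slopes of equal sign are limited to the gentlest one,
# while a sign change (a local extremum) switches the correction off.
limited = minmod(0.8, 0.3)
at_extremum = minmod(0.8, -0.3)
```

Returning zero at a sign change is what removes the higher-order correction near extrema and keeps the scheme free of new oscillations there.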
3.3 Grid Generation
The grid used for this problem was calculated using a very simple analytical approach followed by a short smoothing operation. It is necessary to have a high concentration of points at the jet exit, where large changes in the flow will occur, and near the wall, in order to accurately capture the boundary layer.

The grid generator introduces regions of highly concentrated points where specified. This leads to the production of a grid which is highly irregular. Rapid changes in grid point concentration can cause the numerical code to fail, or spurious glitches in the solution of the flow to occur.
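A common way to reduce such rapid changes in point concentration is Laplacian smoothing, in which each interior node is moved towards the average of its four neighbours. The sketch below assumes a structured grid stored as nested lists, fixed boundary nodes, a Jacobi-style sweep and a relaxation factor of one; none of these details are given in the text, and the names are illustrative.

```python
def laplacian_smooth(x, y, iterations=15, relax=1.0):
    """Smooth a structured grid; x[i][j], y[i][j] are node coordinates.

    Boundary nodes are left untouched and the input lists are not modified.
    """
    ni, nj = len(x), len(x[0])
    x = [row[:] for row in x]
    y = [row[:] for row in y]
    for _ in range(iterations):
        new_x = [row[:] for row in x]
        new_y = [row[:] for row in y]
        for i in range(1, ni - 1):
            for j in range(1, nj - 1):
                avg_x = 0.25 * (x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1])
                avg_y = 0.25 * (y[i - 1][j] + y[i + 1][j] + y[i][j - 1] + y[i][j + 1])
                # Move the node towards the neighbour average.
                new_x[i][j] = x[i][j] + relax * (avg_x - x[i][j])
                new_y[i][j] = y[i][j] + relax * (avg_y - y[i][j])
        x, y = new_x, new_y
    return x, y
```

Because each sweep also relaxes deliberate clustering, the number of iterations has to stay small if the concentration of points near the jet and the wall is to survive.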
A Laplacian smoothing operator is applied to the grid to solve these problems. Up to 15 smoothing iterations are performed on the grid. The number of iterations executed depends on the degree of irregularity of the grid, and the number of points in the regions where there is a high concentration of points. The grid generated for this test case, using this method, is shown in figure 2.

Figure 2. Grid generated for the transverse jet case.

4. RESULTS AND DISCUSSION
The test case for the investigation is a laminar boundary layer developing over a flat plate. A jet issues perpendicularly into the supersonic crossflow from $x/L = 1$, where $L = x_{jet}$, the position of the jet. The inflow conditions for the flowfield are calculated for a Mach number of 2.61. The inflow conditions of the jet are as follows: the jet pressure ratio, $P_j/P_\infty = 7.0$; the temperature ratio, $T_j/T_\infty = 1.0$; and $M_j = 1.0$. The Reynolds number of the flow is $Re_L = 749{,}000$, and the freestream temperature is $T = 300$ K. These values were derived from the data provided in reference [3].

4.1 Comparison with Experiment
Other than the results obtained to test the grid dependency of the solution, all of the numerical results produced were calculated on a 100x100 grid as shown in figure 2. The grid has 6 points in the jet, and contains between 30 and 35 points in the boundary layer region. Simple boundary conditions have been used everywhere. The flat plate is modelled using no-slip conditions and is considered to be adiabatic, the inflow condition is fixed at the initial condition, and all outflow conditions use a simple linear extrapolation technique. The jet has a rectangular profile; the actual profile would be closer to a quadratic profile. The jet boundary conditions are fixed at initial conditions for the injected flow.

The pressure distribution along the surface of the plate has been compared with experimental results produced by Zukoski et al [3], for the test case described above. The experiments have been done for a three dimensional case, and the jet injection hole was circular. The comparison is shown in figure 3. This numerical solution has been calculated using limiter 1, and with δ=0.001.

Figure 3. Pressure distribution along the surface of the flat plate.

Upstream of the jet, the results produced by the code compare well with experiment. Downstream of the jet there is a large discrepancy between experimental and numerical data. This discrepancy is probably due to three dimensional effects which are coming into play around the jet. Some of the flow will be passing around the jet, reducing the mass flow through, and just downstream of it. Another possibility can be attributed to a turbulent region forming just downstream of the jet. At this stage of the work, the numerical code does not include a turbulence model and this part of the flow can not be accurately represented. However, as an initial foray into this problem and for comparing the effects of changing different parameters, the resolution of the results especially near and upstream of the jet is adequate.

4.2 The Entropy Parameter
In order to examine the effects of the entropy parameter, δ, as defined in equation (15), numerical solutions for the test case were found for seven different values of δ, varying from 0.001 to 1. All of the cases in which the entropy parameter was being studied were calculated using Limiter 1. This limiter is the most robust one, and was least likely to fail when the more extreme values of the entropy parameter were being tested.

The skin friction and pressure distributions across the flat plate, for different values of δ, are presented in figures 4 and 5, respectively. These plots show the effect of δ on the numerical solution. The pressure plots show that as the entropy parameter is increased the shock wave becomes less well defined. For the highest values of the entropy parameter the shock wave has smeared all the way to the jet injection point. Also by increasing the entropy parameter the low pressure region after the jet, denoting a recirculation region, becomes damped out and the pressure plateau related to the separation region upstream of the jet reduces considerably.

The skin friction distributions show the change in the point of boundary layer separation, $x_{sep}$ (i.e. where the skin friction

Figure 4. A graph comparing the effects of different values of δ on the skin friction distribution across the flat plate.
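To make the role of the entropy parameter concrete, the sketch below evaluates the entropy-corrected absolute value $\psi(z)$ that is commonly used in Yee-type TVD schemes. The specific form is an assumption, since the corresponding equation (15a) is not reproduced in the surviving text: $\psi(z) = |z|$ for $|z| \ge \delta$ and $(z^2 + \delta^2)/(2\delta)$ otherwise.

```python
def psi(z, delta):
    """Entropy-corrected |z| (assumed Harten-type form, see lead-in).

    For |z| >= delta this is just |z|; below delta it is replaced by a
    parabola that never drops under delta / 2.
    """
    if delta <= 0.0:
        return abs(z)
    az = abs(z)
    if az >= delta:
        return az
    return (z * z + delta * delta) / (2.0 * delta)


# Dissipation coefficient seen by a vanishing characteristic speed
# for the range of delta studied in section 4.2.
floor_values = {delta: psi(0.0, delta) for delta in (0.001, 0.05, 1.0)}
```

In this form the dissipation coefficient at vanishing characteristic speed is δ/2, so raising δ from 0.001 to 1 raises that floor by three orders of magnitude. This is in line with the progressive smearing of the shock and the boundary layer reported for large δ.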
Figure 5. A graph comparing the effects of different values of δ on the pressure distribution along the flat plate.
Figure 6. Velocity profiles in the boundary layer for different values of δ.
Figure 9. Contour plots of real dissipation (a-1 to e-1) and artificial dissipation (a-2 to e-2) for the flowfield in the vicinity of the flat plate using several different limiters and values of δ. Panels c-1) and c-2): Limiter 1, δ=1.0; horizontal axis x/L.
Figure 10. Contour plots of pressure across the whole flowfield (a-1 to e-1) and Mach number near the flat plate (a-2 to e-2) using several different limiters and values of δ. Panels b-1) and b-2): Limiter 1, δ=0.05; c-1) and c-2): Limiter 1, δ=1.0; e-1) and e-2): Limiter 5, δ=0.05; horizontal axis x/L.
Figure 11. Pressure distribution along the flat plate for five different limiters.
Figure 12. Skin friction plots for the flat plate, for each limiter.
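The difference in compressiveness between the limiters compared in these figures can be illustrated with the textbook ratio form $\phi(r)$ of three of them, where $r$ is the ratio of consecutive solution differences. This is a sketch under a stated assumption: the paper takes its limiters from reference [1] in a form written on characteristic differences, which is not reproduced in the text, so the functions below are the standard equivalents, not the authors' exact expressions.

```python
def phi_minmod(r):
    """Minmod in ratio form: the least compressive of the three."""
    return max(0.0, min(1.0, r))


def phi_van_leer(r):
    """Van Leer's smooth limiter."""
    return (r + abs(r)) / (1.0 + abs(r))


def phi_superbee(r):
    """Roe's Superbee: the most compressive of the three."""
    return max(0.0, min(2.0 * r, 1.0), min(r, 2.0))


def compare(r):
    """Return the three limiter values at a given difference ratio."""
    return phi_minmod(r), phi_van_leer(r), phi_superbee(r)
```

For any positive ratio the three values are ordered minmod <= van Leer <= Superbee, and all three vanish for negative ratios. A larger value means more anti-diffusive flux is retained, which is the sense in which Superbee is described as highly compressive.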
of G2-real and G2-artificial for three values of δ have been included. These plots can not be directly compared with each other as the variation of the contour lines is different for each case. They show general trends only. Of particular interest is the small region just upstream of the jet where both the artificial and real dissipations are being added in comparable amounts.

Figures 10a-1 to c-1 show pressure contours for the whole flowfield. These plots illustrate the effects of increasing δ on the whole solution. The separated shock wave becomes less well defined until it is not even clear that there is a shock wave, and the shock wave caused by the reattachment of the boundary layer, downstream of the jet, also becomes less easy to recognise.

Figures 10a-2 to c-2 are contour plots of Mach number for the flow near the wall. These plots illustrate the Mach disk, the separation and reattachment of the boundary layer, and regions of recirculation. The plots also show the slight increase in boundary layer thickness with increasing δ. The small pocket of recirculation just downstream of the jet is reduced in size when large values of δ are used.

In conclusion, when δ is increased to a value of 0.5 and beyond, the solution becomes highly inaccurate. However, for values of the entropy parameter equal to and below 0.01, the solution seems to be less sensitive to changes in δ. This profile represents the solution where very little artificial dissipation is being added to the boundary layer and it is also the solution closest to the experimental pressure data.

Other methods of modelling the entropy parameter, see equation (15b), have been defined by Muller [10,11] and Lin [2]. Muller has used an entropic function of the local spectral radii to model the entropy parameter. A brief examination of this technique showed that for δ=0.005, and using limiter 1, the results produced were very similar to the results produced using the method described by equation (15b). It is possible that Muller's entropy function will be more effective when used with other limiters, and this will be investigated in later work.

Lin [2] states that the viscous flow results using the scheme described in this investigation are unacceptable if a value of δ=0.25 is used. This agrees with the results presented here. Lin suggests using a form of the entropy function which is similar to that shown in equation (15b). Lin also uses different values of the entropy parameter for the linear and non-linear waves. This means a much smaller value for the entropy parameter can be used for the linear waves and hence the boundary layer will be less affected by the artificial dissipation term. An investigation of how effective this concept is, is underway. The results presented by Lin are encouraging.

4.3 Comparing Limiters
The choice of flux limiter is an important factor in TVD schemes. Five different limiters have been investigated here. Limiter 1 is the well known minmod limiter. Limiter 2 is the limiter formulated by Van Leer and limiter 5 is known in the literature as "Roe's Superbee" and is highly compressive.

It was found that the numerical scheme failed when limiter 5 was used if the entropy parameter was set below a value of 0.05. Therefore in order that the limiters could be compared, the computed solution was found for each limiter with the entropy parameter set at a value of 0.05. In all likelihood this means that the best possible results for limiters 2, 3 and 4 have not been found.

Figures 11 and 12 show pressure and skin friction distributions along the flat plate for each of the limiters. The pressure distributions show that the major effect of using different limiters is to change the point of boundary layer separation. The shock definition on the surface of the flat plate, denoted by the pressure gradient of the shock wave, is not greatly improved by using a more compressive limiter.

From figures 11 and 12 it can be seen that limiters 2, 3 and 4 produce similar results. Also, these results compare well with the results obtained using limiter 1 with values of δ below 0.01, as shown in figures 6 and 7, but generally it seems limiters 2, 3 and 4 are better than limiter 1.

From figure 12, one can see that limiter 5, the "Superbee", does not seem to be well conditioned for this problem, and is probably unacceptable for use when solving any viscous flow problem. The corresponding skin friction distribution is considerably different from those obtained by the other limiters, both upstream and downstream of the jet.

Figures 13 and 14 present velocity profiles for each of the limiters. The plots given in figure 13 are for the unseparated
boundary layer. As expected, the data suggests that limiters with a more compressive nature reduce the thickness of the boundary layer. In the case of the scheme using limiter 5, the boundary layer thickness has been significantly reduced. Again, limiters 2, 3 and 4 produce similar profiles. Limiter 1 produces a slightly thicker boundary layer than the others.

Figure 13. Velocity profiles in the unseparated boundary layer for different limiters.
Figure 14. Velocity profiles for the separated boundary layer for different limiters.

The results in figure 14 are velocity profiles for the boundary layer after it has separated. Limiters 2, 3 and 4 produce similar profiles, however there is now a more significant degree of difference in the results being obtained. The variation in boundary layer thickness for different limiters is reversed compared to the unseparated case. The more compressive limiters produce a thicker boundary layer. Primarily this is caused by the change in $x_{sep}$ for the different limiters. These results are similar to those obtained for low values of the entropy parameter using limiter 1, as shown in figure 7.

Dissipation profiles for limiters 1, 3 and 5, shown in figures 8b, d and e respectively, indicate that for the unseparated boundary layer the artificial dissipation introduced into the y-momentum equation is high for limiter 3, but low for the other two limiters. Limiter 5 introduces artificial dissipation of the opposite sign to the real dissipation in the x-momentum equation. This result explains the decrease in the boundary layer thickness when using limiter 5. Also the amount of real dissipation being added when limiter 5 is being used is slightly higher than for the other limiters.

The dissipation profiles for the separated boundary layer show a marked difference between each of the limiters used. The amount of real dissipation being added when limiter 5 is being used is much smaller than that added when the other limiters are used. The artificial dissipation being added to the x-momentum also seems to be reduced, although there is a region at the edge of the boundary layer where large amounts of dissipation of the opposite sign are being added. Each of the other limiters also adds this opposing dissipation but not in such a comparably vast quantity. However the other limiters add more dissipation in other regions of the boundary layer, e.g. near the wall. Also in contrast to limiters 1 and 3, the artificial dissipation added to the y-momentum equation by limiter 5 is of the opposite sign to the real dissipation being added. These graphs go some way towards explaining why the results from limiter 5 are so different to the results obtained from the other limiters.

The contour plots of dissipation for limiters 1, 3 and 5 are shown on figures 9b, d and e. Two main conclusions can be drawn from these graphs. Firstly, on the real dissipation plots, the concentration of the contours upstream of the jet is far higher than for the other limiters, and there is a small region of concentrated contours downstream of the jet for limiters 1 and 3 which is not as prominent for limiter 5. Secondly, the artificial dissipation added by limiter 5 is quite different in pattern to the dissipation added by the other limiters.

The pressure contours shown in figure 10 show how the more compressive limiters produce better defined shock waves and expansion fans. The movement of the separation shock wave further upstream of the jet, for limiter 5, is also clearly illustrated here. The Mach number contours show the change in the thickness of the boundary layer for different limiters. The plot for limiter 5 most clearly illustrates the reattachment of the boundary layer. The recirculation region downstream of the jet is much bigger for limiter 5 than for the other two limiters present.

The results presented here show that limiter 5 is not a good choice of limiter for a viscous flow problem, because of its highly compressive nature. Limiter 1 produces acceptable results if the value of the entropy parameter is limited to 0.01 for this test case. Limiters 2, 3 and 4, the mid-range limiters including the Van Leer limiter, produce similar results and are probably best suited for this problem.

4.4 Effects of Grid Size
In order to check the dependence of the solution on the available number of grid points, the numerical code was run on three different grids with different point concentrations. The grid sizes used for this study were 50x50, 100x100 and 200x200. These calculations were all done using Limiter 3 and the entropy parameter δ=0.05.

A comparison of the pressure distribution along the flat plate for each grid is shown in figure 15. The plots show a severe degradation of results between the different meshes used. The most coarse grid does not define the shock wave well, and the pressure valley and plateau upstream of the jet do not reach the values produced by the other two grids. The main differences between the 200x200 and the 100x100 grid are the improved shock definition at the wall, and a slight increase in the pressure plateau for the higher grid concentration.

The skin friction distribution comparison given on figure 16 shows a marked difference in profile for each grid type. The skin friction given by the 200x200 grid is far higher than the
Figure 15. A graph showing the effects of grid size on the pressure distribution along a flat plate (horizontal axis: distance along flat plate, x/L).
Figure 16. A graph showing the skin friction profile along the flat plate for different grids (horizontal axis: distance along flat plate, x/L).
other grids, and the recirculation region just downstream of the jet is much better defined when more points are used. The differences can be attributed to the fact that, for a higher point concentration, the boundary layer is better defined. The location at which the boundary layer separates does not vary greatly over a change in grid point concentration.
Boundary layer velocity profiles are provided on figures 17 and 18 for each of the grids. The unseparated boundary layer profiles, presented in figure 17, show the simple improvement in boundary layer definition as more points are put near the wall. Table 1 gives the number of points in the boundary layer for each case. The grid used earlier in the report does not provide an accurately defined boundary layer, but it was of acceptable quality for the comparative study being performed.
Figure 18 shows velocity profiles in the separated region for each of the grids. In the recirculation region near the wall, the 100x100 and 200x200 grids produce similar results. The 50x50 grid produces a solution with a larger amount of recirculation than the other two grids.
Both sets of velocity profiles show that the grid concentration has a large effect on the calculation of the boundary layer, and hence the rest of the solution.

Table 1. Grid points in the boundary layer.
Grid size    Points in the boundary layer
50x50        23
100x100
200x200

Figures 19 and 20 show the artificial dissipation terms added to the x-momentum equation profiled through the unseparated and separated boundary layer, respectively. For the unseparated boundary layer, the 100x100 and 200x200 grids produce a similar profile, and near the wall the amount of artificial dissipation being added for each of these cases is quite close. The real differences in the profiles occur towards the edge of the boundary layer. The amount of dissipation provided by the scheme using the 50x50 grid is sometimes twice as much as that introduced when the other grids are being used.
In the case of the separated boundary layer, the 50x50 grid seems to add less artificial dissipation than the other two grids. Although it is not clear from figure 20, the dissipation being added by the 100x100 and 200x200 grid is almost identical in
Figure 19 Dissipation profiles through an unseparated boundary layer for different grids. (Horizontal axis: artificial dissipation in GI term.)
Figure 20 Dissipation profiles through a separated boundary layer for different grids. (Horizontal axis: artificial dissipation in GI term.)
the recirculation region. It is only when the edge of the separated boundary layer is reached that large differences become apparent. In the region of the separated boundary layer, the 100x100 grid seems to add far more dissipation in this region than the other two grids. It can be clearly seen that in this region the grid is becoming more coarse, and numerical truncation errors are becoming dominant.
It is clear from this brief study that grid point density has a great effect on the viscous regions of the solution, and the amount of artificial dissipation being added by the numerical scheme should not be analysed in isolation.

5. CONCLUSIONS
The numerical scheme described in this study has been used to model a transverse jet interacting with a supersonic flow. This test case differs from others that have been used to study artificial dissipation, as the problem includes a separated boundary layer and reverse flow regions.
This scheme can be used to produce adequate viscous supersonic flows. It has been shown here that the choice of limiter and value for the entropy parameter is of great importance for producing good results. For the transverse jet test case, limiters 2, 3 and 4 produce good results, and limiter 4 is probably best suited. Although the examination of the entropy parameter was not done for each limiter separately, when limiter 1 was used, it was found that the best results were produced when the entropy parameter was no greater than 0.01. It is reasonable to assume that this range of values will produce acceptable results with the other limiters.

6. REFERENCES
1. Yee, H.C., Klopfer, G.H., Montagne, J.L. “High-Resolution Shock-Capturing Schemes for Inviscid and Viscous Hypersonic Flows”, J. Comput. Phys., 88, 1990, pp31-61.
2. Lin, H-C. “Dissipation Additions to Flux-Difference Splitting”, J. Comput. Phys., 117, 1995, pp20-27.
3. Zukoski, E.E., Spaid, F.W. “Secondary Injection of Gases into a Supersonic Flow”, AIAA J., 2, Oct 1964, pp1689-1696.
4. Allmaras, S.R. “Contamination of laminar boundary layers by artificial dissipation in Navier-Stokes solutions”, Proceedings of the Conference on Numerical Methods in Fluid Dynamics, Reading, UK, 1992.
5. Tatsumi, S., Martinelli, L., Jameson, A. “Design, Implementation, and Validation of Flux Limited Schemes for the Solution of the Compressible Navier-Stokes Equations”, AIAA-94-0647.
6. Caughey, D.A., Varma, R.R. “Evaluation of Navier-Stokes Solutions Using the Integral Effect of Numerical Dissipation”, AIAA J., 32, Feb 1994, pp294-300.
7. Turkel, E., Vatsa, V.N. “Effect of Artificial Viscosity on Three-Dimensional Flow Solutions”, AIAA J., 32, Jan. 1994, pp39-45.
8. Swanson, R.C., Turkel, E. “Aspects of a High-Resolution Scheme for the Navier-Stokes Equations”, AIAA-93-3372-CP.
9. Yee, H.C., Kutler, P. “Application of Second order Accurate Total Variation Diminishing Schemes to the Euler Equations in General Geometries”, NASA TM-85845, 1985.
10. Muller, B. “Implicit Upwind Finite Difference Simulation of Laminar Hypersonic Flow over Flared Cones”, Notes on Numerical Fluid Mechanics, 29, 1990.
11. Muller, B. “Comparison of Upwind and Central Finite Difference Methods for the Compressible Navier-Stokes Equations”, Notes on Numerical Fluid Mechanics, 30, 1991.
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
-1lall (9)
The Jacobian matrix A possesses a complete set of real eigenvectors, hence
0 \le \frac{\Delta t}{\Delta x} \lambda_i \delta_i \le 1
Therefore, to obtain a stable scheme the \delta_i must be zero when \lambda_i is negative and \frac{\Delta t}{\Delta x} (\lambda_i \delta_i)_{max} \le 1. The first condition defines the Flux Filter Operator:

3.2 Implementation of the Flux-Filter operator
The previous section has defined an operator which will theoretically allow a stable iteration process. For its implementation three facts must be taken into account:
In order to assure the conservation of the scheme, the introduction of the filter operators may not lead to additional sources, hence

This leads to the quasi one dimensional Flux-Filter scheme:

with U = \frac{u_i + u_{i-1}}{2}
The trivial solution for steady state solution is
4 TWO DIMENSIONAL FLUX-FILTER SCHEME
In this section, the construction of the two-dimensional Flux-Filter [9] scheme will be outlined. The numerical equation for point i,j of structured grid becomes:

4.1 Two-Dimensional Flux-Filter Operator
The two-dimensional Flux Filter is based on the one dimensional operators defined in the last section. The following constructions have been tested:

Analysis has led to the following requirements for the two-dimensional Flux-Filter:
- for conservation, the sum of all Flux-Filters for one cell must add up to the identity matrix: \sum T = I. Although for a steady state solution all residuals should disappear, this seems to be an unnecessary requirement. However, at a shock the temporal change becomes zero with residuals which are non-zero. If this constraint is not applied the shock position and strength are incorrect.
- if the cell has a supersonic velocity component pointing away from a grid point then the corresponding Flux-Filter must be the null-matrix. This reflects the perception that in the above case no information can propagate towards this point.
This leads to the following scheme for quadrilateral cells (Fig 6.3):
T^{--} = \frac{1}{4} \left( T^{-}(U, \vec{n}_s) + T^{-}(U, \vec{n}) \right)
where U is the averaged flow vector for the cell. This formulation reflects the region of dependency. Similar for the remaining operators.
The conservation requirement is imposed with
Q = I - (T^{--} + T^{-+} + T^{++} + T^{+-})   (25)

where U is the averaged flow vector for the cell. Similar for the remaining operators.
The conservation requirement is imposed with
Q = I - \sum T   (29)
if T \ne [0] then
T = T + Q/n
where n is the number of non-null matrices.

4.2 The Flux Residual
The flux residual or flux balance is the flux integral over the cell's circumference.
\frac{\partial U}{\partial t} + \oint_R \vec{F} \cdot \vec{n} \, dS = 0
where R is the cell's surface. The first order discrete flux integration based on point i, j for a quadrilateral cell is
Note the inclusion of the information attached to the vertex i+1, j+1. The first order integration violates the criterion which requires that the flux calculation of a cell's face is independent of the cell in consideration. The first order integration has the advantage that the resulting scheme is stable. The second order integration fulfills the latter criterion but it has a stability problem, which will be analyzed and clarified in detail. For those reasons, a blended first and second order integration will be used, called the preferential direction integration, given by

To obtain a meaningful viscous solution, the influence of the viscous term must exceed the influence of the artificial viscosity in the viscous dominant flow regions.

5 SCALAR MODEL EQUATION
\frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} + b \frac{\partial u}{\partial y} = 0   (39)
In this equation a quantity u is convected with a velocity a in the x-direction and a velocity b in the y-direction. The theoretical solution preserves the initial function along the convection direction. The accuracy by which a numerical solution approaches this theoretical solution is fully dependent on the numerical scheme.
Equation (39) represents the numerical model equation for a one-by-one dimensional first order upwind scheme and for the Flux-Filter scheme with the first order flux integration
(41)
The second order approach leads to solutions which match the theoretical solution for convection directions of 0 and 45 degrees. The solution for the 22.5 degree convection is dispersive. The accuracy levels of both methods are given in the next table where the error is the mean square deviation from the theoretical solution:
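As an illustration of the scalar model equation u_t + a u_x + b u_y = 0, the following is a minimal sketch of a dimension-by-dimension first order upwind scheme marched to steady state. It is not the Flux-Filter scheme itself. The function names, the grid size, the step inflow profile and the restriction to a, b >= 0 are assumptions made for the example only.

```python
import numpy as np

def upwind_step(u, a, b, dx, dy, dt):
    """One explicit first order upwind step for u_t + a u_x + b u_y = 0, assuming a, b >= 0."""
    new = u.copy()                      # row i = 0 and column j = 0 are fixed inflow boundaries
    new[1:, 1:] = (u[1:, 1:]
                   - a * dt / dx * (u[1:, 1:] - u[:-1, 1:])    # backward difference in x
                   - b * dt / dy * (u[1:, 1:] - u[1:, :-1]))   # backward difference in y
    return new

def solve_steady(a, b, n=41, cfl=0.9, steps=400):
    """March a step profile imposed on the x = 0 boundary to (near) steady state."""
    dx = dy = 1.0 / (n - 1)
    dt = cfl / (a / dx + b / dy)        # keeps every update a convex combination of old values
    u = np.zeros((n, n))                # u[i, j] with i along x and j along y
    u[0, n // 2:] = 1.0                 # hypothetical inflow profile: a step in y
    for _ in range(steps):
        u = upwind_step(u, a, b, dx, dy, dt)
    return u

u_0deg = solve_steady(1.0, 0.0)         # convection aligned with the grid
u_45deg = solve_steady(1.0, 1.0)        # convection along the diagonal
```

With grid-aligned convection the inflow profile is carried through unchanged, while for the diagonal direction the first order scheme stays bounded but smears the step, which is the kind of accuracy loss the comparison in this section is concerned with.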
u_{i,j} = -\frac{1-a}{1+a} u_{i,j-1}   (44)
hence
(45)
For the case that a approaches zero the scheme will produce an undamped oscillatory profile in the complete domain. This reduces the robustness of the Flux-Filter scheme where even a slight disturbance is immediately transmitted, in an unfavorable manner, throughout the numerical domain. In contrast, the profile for the first order integration scheme is

which does not have an oscillatory behaviour. The problem can be remedied with the addition of artificial viscosity and/or the use of a preferential integration direction. The preferential integration is a blended form between the trapezoidal integration and the first order integration (Eq.34). Hence the preferential integration is a first order integration:
(47)
For the critical case that \vartheta_x = 0 the profile becomes

5.1 Von Neumann Stability Analysis
The results of a Von Neumann stability analysis [10] are presented in figure 5, where the maximum amplification factor is plotted as a function of \vartheta_x, \vartheta_y for \eta = 0.25 and a small amount of artificial viscosity. Stability is obtained for values of \vartheta_x, \vartheta_y up to 0.3, which is the upper limit for the CFL number.
Each time discretization method has its own stability contour within which the roots of the space discretization must lie. Figure 4 presents the spatial roots for the Flux-Filter scheme for CFL numbers of 0.3, 0.5, 2.0, 10.0. The conclusion is that the gain from using Runge-Kutta is minor and that in theory large CFL numbers can be used with an implicit scheme. However, in practice, the implicit scheme worked for a maximum CFL number of 1.0. The gain of a factor 3 is not sufficient to justify the use of an implicit scheme, due to the significant increase in workload.

6 RESULTS
The supersonic wedge
The geometry is a two dimensional channel with a 15° wedge followed by a 15° expansion corner (Fig 6). The inflow Mach number is set to 2. This configuration induces interactions between shock and expansion waves [11]. A shock wave is produced at the wedge and reflected at the upper boundary. The reflected shock wave is weakened by the expansion fan. The expansion fan is also reflected by the upper boundary. Dependent upon the length of the channel, the shock wave and expansion fan are reflected several times. Analytical results predict a Mach number of 1.454 behind the first shock wave. The maximum deflection for this Mach number is 10.5°, which implies that the first reflection will induce a subsonic flow with an entropy layer (or slip stream). This
problem is a widely used standard test case [11]. The grids are a structured 180 by 60 grid, an unstructured grid with 1664 quadrilateral elements and an unstructured grid with 3633 triangular elements. Figure 6 presents the Mach number distribution for the 3 grids.

NACA0012 Transonic
This standard AGARD test case [12] consists of a NACA0012 profile in a transonic flow. The angle of attack is 1.25° and the free stream Mach number is 0.8. The main features are a strong shock located at x=0.62 on the upper surface and a weak shock at x=0.37 on the lower side. The Mach number distribution is given in figure 7.
This problem is solved with a Runge-Kutta scheme on a 160 by 50 structured grid. The maximum CFL number was 0.4. After 3000 iteration steps, a pseudo convergence was obtained, where the upper shock position continued to move back and forward around one grid cell.
The predicted Cl (0.3708) and Cd (0.0213) coefficients correspond well to those given by the AGARD test [12].

NACA0012 Mach 0.5 Reynolds 5000
This test case [13] demonstrates the use of the Flux-Filter scheme on the Navier-Stokes equations. The test case consists of a subcritical flow (Ma=0.5) over a NACA-0012 with a Reynolds number of 5000. The angle of attack is 0 degrees. The flow has a recirculation at the trailing edge. The predicted location of separation from references is at x=0.82. This scheme predicts the flow separation at x=0.92 (Fig. 8), which indicates that the Flux-Filter scheme has an incorrect degree of dissipation.

The integration, spatial and temporal, could be improved to allow much larger time steps and convergence rates. This would be advantageous for the 3-dimensional version. One way is the use of higher order flux integration or the use of flux limiters. The temporal integration can be improved by solving the implicit method more accurately or by implementing a multigrid scheme. The improvements should also be aimed at reducing the level of artificial viscosity or at eliminating its use.
Extension to a 3-dimensional Flux-Filter scheme is in progress.

ACKNOWLEDGEMENT
This research project was sponsored by the DFG under contract Wa424/10.

REFERENCES
[1] Steger, J.L., Warming, R.F.: Flux Vector Splitting of the Inviscid Gasdynamic Equations with Application to Finite-Difference Methods. Journal of Computational Physics, Vol 40, pg 263-293 (1981)
[2] Van Leer, B.: Flux Vector Splitting for the Euler Equations. Proc. 8th International Conference on Numerical Methods in Fluid Dynamics, Berlin, Springer (1982)
[3] Roe, P.L.: The Use of the Riemann Problem in Finite Difference Schemes. Lecture Notes in Physics, Vol 141, pg 354-359, Berlin, Springer Verlag (1981)
[4] Osher, S.: Numerical Solution of Singular Perturbation Problems and Hyperbolic Systems of Conservation Laws. Mathematical Studies, Vol 47, Amsterdam, North Holland (1981)
[5] Powell, K.G., Barth, T.J., Parpia, I.F.: A Solution Scheme for the Euler Equations Based on a Multidimensional Wave Model. AIAA 93-0065 (1993)
[6] Rossow, C.C.: Efficient Cell-Vertex Upwind Scheme for the Two Dimensional Euler Equations. AIAA Journal vol. 32, pg 278-284 (1994)
[7] Giles, M., Anderson, W., Roberts, T.: Upwind Control Volumes: A New Upwind Approach. AIAA 90-0104 (1990)
[11] Levy, D.W., Powell, K.G., Van Leer, B.: An Implementation of a Grid Independent Upwind Scheme for the Euler Equations. AIAA 89-1931 (1989)
[12] Viviand, H.: Numerical Solutions of Two-Dimensional Reference Test Cases. AGARD-AR-211, pg 6.1 (1985)
[13] Venkatakrishnan, V.: Viscous Computations Using a Direct Solver. Computers & Fluids Vol 18, No 2, pg 191-204 (1990)
Plot panels: Mach Number; Massflux- and Enthalpy Error
Figure 2 Solution of the scalar model equation with first order, respectively second order flux integration
Figure 4 Spatial roots of flux filter scheme vs stability contours of different time integration methods
Figure 5 Amplification factor (CFL = 0.3) for the flux filter scheme.
Figure 6 Supersonic flow over a 15 degree wedge solved on different grid types. (Panel: Unstructured Grid, 1664 Quad-Elements, Advancing Front Technique.)
Implicit Multidimensional Upwind Residual Distribution
Schemes on Adaptive Meshes
H. Paillere, J.-C. Carette, E. Issman, E. van der Weide, H. Deconinck and G. Degrez
von Karman Institute for Fluid Dynamics
Chaussée de Waterloo 72, B-1640 Rhode-St-Genèse, Belgium
linrs. which are the fieldlines of the velocity vector < with axc- - +a,.c-
kla-1
= 0,
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
or in Cartesian coordinates

Fig. 1 : Mach angles \pm\mu and streamline coordinate system (\xi, \eta)

where \tan\mu = 1/\sqrt{M^2 - 1} defines the Mach angle \mu between Mach lines and streamline, see Figure 1. Again, the physics dictate a multidimensional upwinding discretization of the acoustic equations by a space operator upwinded along the Mach lines, as in the method of characteristics. The point of deviation from the method of characteristics is to achieve this in a conservative formulation, such that shocks and contacts can be handled without any special treatment. This indeed is the hallmark and basis of success of the state-of-the-art Finite Volume approach. The price paid in these methods, however, is that the upwinding is based on locally one-dimensional physics, by considering the states adjacent to each finite volume face as the initial data for a one-dimensional Riemann problem in the direction of the face normal. Such an approach precludes the use of Mach lines or the streamline as the upwinding directions.
For subsonic flow, the two acoustic equations form an elliptic subset, and it is less clear what should be the optimal space discretization.
To fix the ideas, consider again the Euler system in the form of eq. (3), but assume that the inlet conditions are such that the flow is irrotational, so that the third and fourth equations decouple from S and H:

it is given by

in terms of the coefficients \nu^-, \nu^+, \chi and \beta, where \beta = \sqrt{\max(\epsilon^2, |M^2 - 1|)}. To circumvent the singularity at the sonic point, \epsilon is different from zero and given a small value (typically 0.05).
Clearly, the third and fourth characteristic equations decouple for all flow regimes, implying as before that entropy and total enthalpy are conserved along streamlines in the steady state. Considering the first and second equations, \nu^- = 0 and \nu^+ = 1 for supersonic flow, and the equations are fully diagonal; the system is in fact identical to eqn(5), where the acoustic variables are made to propagate along the Mach lines. In the subsonic case, the system is no longer diagonal and the two acoustic equations become coupled and form a system which is elliptic at steady state.
The residual in conservative variables is obtained by transforming eqn(9), giving

where R is the transformation matrix from characteristic to conservative variables.
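The two scalar quantities that survive legibly in this section, the Mach angle and the regularised parameter \beta, can be evaluated as in the short sketch below. The function names are hypothetical, and the default eps = 0.05 simply takes the "typical" value quoted in the text.

```python
import math

def mach_angle(M):
    """Mach angle mu in radians from tan(mu) = 1 / sqrt(M^2 - 1); defined for M > 1 only."""
    if not M > 1.0:
        raise ValueError("the Mach angle is only defined for supersonic flow")
    return math.atan(1.0 / math.sqrt(M * M - 1.0))

def beta(M, eps=0.05):
    """beta = sqrt(max(eps^2, |M^2 - 1|)); eps keeps beta away from zero at the sonic point."""
    return math.sqrt(max(eps * eps, abs(M * M - 1.0)))

for M in (0.5, 0.999, 1.0, 1.001, 2.0):
    print(M, beta(M))
```

Near M = 1 the value of beta is clipped at eps, which is the purpose of the small parameter: any division by beta then stays bounded through the sonic point.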
2 RESIDUAL DISTRIBUTION SPACE
DISCRETIZATION
The finite volume setting with its underlying discontinuous solution representation naturally leads to the definition of 1D Riemann problems at the discontinuous inter-
vanish, leaving only the contributions from the boundaries.
Various residual distribution schemes have been proposed, such as the central scheme of Jameson [3] or the Lax-Wendroff schemes of Ni [4], Hall [5] and Morton [6]. In the present context the residual distribution framework has been used to formulate conservative multidimensional upwind advection schemes. At the scalar level, linear and non-linear advection schemes are obtained by distributing the cell residual to the downstream nodes only. In this way, properties such as positivity and second order accuracy (linearity preservation) [7, 8, 9] can be built in. As long as the distributed parts sum up to the conservative cell residual, the schemes satisfy discrete conservation.
The application to the Euler equations, e.g. for advection of entropy, total enthalpy and acoustic variables, is straightforward, provided that a conservative linearization can be found which ensures that the flux balance over a cell can be written exactly in terms of the quasilinear equations discussed in section 2.

2.1 Distribution schemes for scalar advection
The subject of multidimensional shock-capturing advection schemes on triangles has been extensively treated in previous publications, and the reader is referred to [9, 10], as well as to the work of Roe and Sidilkover [8, 11] for details. Only the most important aspects are recalled here, and the extension to quadrilaterals is briefly discussed.
Consider the linear advection equation, with constant \vec{\lambda}:
\frac{\partial u}{\partial t} + \vec{\lambda} \cdot \nabla u = 0   (11)

where \Gamma is the boundary of the control volume \Omega. Because the solution is stored in the vertices of the cell, the contour integral can be easily evaluated by the trapezium rule. In the fluctuation-splitting approach, fractions of \phi^T are sent to the cell vertices, which after assembling contributions from all cells leads to the nodal update. The semi-discretization at point i is then
S_i \frac{d u_i}{dt} = -\sum_T \beta_i^T \phi^T = -R(u_i)   (14)
where S_i is the area of the median dual cell around node i, and the \beta_i^T are the distribution coefficients which sum up to one for each cell. The way these coefficients are evaluated determines the properties of the scheme. The most important of these are:

Positivity
A monotonic scheme can be obtained by demanding positivity. Suppose that the numerical solution at mesh point i is u_i. Then the positivity property requires that in the discrete form of eqn(11)
(15)
the coefficients c_k are all non-negative for k \ne i. Stability and monotonicity preservation is then guaranteed under the CFL-like condition
(16)
Indeed, if u_i^n is a local maximum, i.e. (u_k^n - u_i^n) \le 0 \; \forall k, then u_i^{n+1} - u_i^n \le 0. Consequently a local maximum cannot increase and similarly a local minimum cannot decrease. The condition (15) is called global positivity and is difficult to impose for fluctuation splitting schemes. Therefore a more restrictive property is introduced, namely local positivity, see [1?]. This means that condition (15) is imposed for each contributing control volume in eqn(14), which is very easy to check. Positivity will in general be linked to some upwinding. In the fluctuation splitting context upwind biasing is obtained by limiting the distribution of the cell residual to the downstream nodes.

Linearity Preservation or Residual Property
Second order truncation error in the steady state is obtained by demanding that no updates are sent to the vertices if the cell residual is zero. This is obtained when the distribution coefficients \beta_i^T are bounded, such that
\beta_i^T \phi^T \to 0 when \phi^T \to 0   (17)
It can be proven that only non-linear schemes (a scheme is called linear if the coefficients c_k in eqn(15) are independent of u_k) can satisfy both properties.

Fig. 2 : Triangle and inward normals \vec{n}_i

schemes. Since the inward normals \vec{n}_i sum up to zero, one has also \sum_i k_i = 0. Four important distribution schemes are:

The N and PSI or limited N scheme
Define k_i^+ = \max(0, k_i) and k_i^- = \min(0, k_i), then the distribution to the nodes for the N scheme is given by:
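The formula for the N scheme is not legible at this point, so the sketch below uses the form in which the scheme is commonly stated in the fluctuation-splitting literature; it is an assumption and not a transcription of the paper's equation. The assumed ingredients are k_i = (1/2) \vec{\lambda} \cdot \vec{n}_i with inward normals scaled by edge length, an inflow state u_in = (\sum_j k_j^- u_j) / (\sum_j k_j^-), and nodal contributions \phi_i = k_i^+ (u_i - u_in). All function names are hypothetical.

```python
import numpy as np

def inward_normals(xy):
    """Inward normals of a triangle scaled by edge length; n[i] belongs to the edge opposite node i."""
    n = np.zeros((3, 2))
    for i in range(3):
        a, b = xy[(i + 1) % 3], xy[(i + 2) % 3]
        edge = b - a
        normal = np.array([-edge[1], edge[0]])   # perpendicular to the edge, same length
        # flip if needed so that the normal points towards node i, i.e. into the triangle
        n[i] = normal if np.dot(normal, xy[i] - a) > 0.0 else -normal
    return n

def n_scheme(xy, u, lam):
    """Return (contributions sent to the three nodes, cell residual) for one triangle."""
    k = 0.5 * inward_normals(xy) @ lam           # inflow parameters k_i
    k_plus, k_minus = np.maximum(0.0, k), np.minimum(0.0, k)
    residual = float(k @ u)                      # cell residual, sum_i k_i u_i
    if k_minus.sum() == 0.0:                     # lam = 0: nothing is advected
        return np.zeros(3), residual
    u_in = (k_minus @ u) / k_minus.sum()         # assumed inflow state
    return k_plus * (u - u_in), residual

xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
u = np.array([0.3, 1.0, -0.2])
phi, residual = n_scheme(xy, u, lam=np.array([1.0, 0.5]))
print(phi, residual)
```

In this form only nodes with k_i > 0 (the downstream nodes) receive a contribution, and the contributions add up to the cell residual because the k_i sum to zero, which corresponds to the upwinding and conservation properties described above.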
Fig. 3 : Normals for the triangle FV scheme
Fig. 5 : Normals for the quadrilateral FV scheme

Compared with the scheme on triangles, eqn(19), every point has its own inflow state, given by

and the distribution to the nodes
(29)
The limited version of this scheme is obtained as before.

2.2 System distribution schemes
Because the system is hyperbolic, K_i can be written as
K_i = R_i \Lambda_i L_i
where the columns of R_i contain the right eigenvectors, \Lambda_i is a diagonal matrix of the eigenvalues and L_i = R_i^{-1}.
The matrices K_i^+ and K_i^- are given by
K_i^+ = R_i \Lambda_i^+ L_i, \quad K_i^- = R_i \Lambda_i^- L_i   (33)
Here \Lambda_i^+ contains the positive and \Lambda_i^- the negative eigenvalues.
With these definitions the system schemes on triangles can be obtained just by replacing the scalar k_i by the matrix K_i in the equations (19), (23), (24) and (25). On quadrilaterals the N-scheme in the form (26) does not generalize to systems and only system versions of the FV, LDA and Lax-Wendroff schemes can be obtained.

3 CONSERVATIVE LINEARIZATION
We now consider system (30) in conservative form
\frac{\partial U}{\partial t} + \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y} = 0, \quad or \quad \frac{\partial U}{\partial t} + \nabla \cdot \vec{F} = 0   (34)
(35)
On the other hand, the positive advection distribution schemes require a quasi-linear form of the residual. A conservative linearization is defined such that the quasi-linear form integrated over the surface is identical to the flux integration over the boundaries obtained by a particular integration rule. For the Euler equations on triangles, this is easily achieved by assuming that the Roe parameter vector Z = \sqrt{\rho} (1, u, v, H) varies linearly over each element. Since U, F and G are quadratic in the components of Z, the Jacobian matrices \partial U / \partial Z, \partial F / \partial Z and \partial G / \partial Z are linear in the components of Z, making the integration over a triangle trivial. Defining the average state \bar{Z} over the cell:

Because the exact Jacobians are used, one can transform

On quadrilaterals it is more difficult and for the moment a linearization is used which is only exact for parallelograms.
The global update for the system, analogous to the scalar case eqn(14), is then given by
(42)
where D_i^T is the cell distribution matrix
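The splitting of a system matrix into its positive and negative parts, K^+ = R \Lambda^+ L and K^- = R \Lambda^- L with L = R^{-1}, can be sketched with a general eigen-decomposition. The 2x2 matrix below is a hypothetical stand-in with real eigenvalues and is not an Euler Jacobian; the function name is likewise an assumption.

```python
import numpy as np

def split_matrix(K):
    """Split K (diagonalisable, real eigenvalues) into K_plus + K_minus = K."""
    lam, R = np.linalg.eig(K)                   # columns of R are right eigenvectors
    L = np.linalg.inv(R)                        # L = R^{-1}
    K_plus = (R * np.maximum(lam, 0.0)) @ L     # R @ diag(lam^+) @ L
    K_minus = (R * np.minimum(lam, 0.0)) @ L    # R @ diag(lam^-) @ L
    return K_plus.real, K_minus.real

K = np.array([[0.0, 1.0],
              [1.0, 0.0]])                      # eigenvalues +1 and -1
K_plus, K_minus = split_matrix(K)
print(K_plus)
print(K_minus)
```

By construction K_plus carries only the non-negative eigenvalues and K_minus only the non-positive ones, so the two parts play the roles that k^+ and k^- play in the scalar schemes.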
5 IMPLICIT ACCELERATION
Explicit time-integration of the semi-discrete equations (42), although straightforward and robust, suffers from stability limits for some classes of problems, such as subsonic flows with stagnation regions and viscous flows.
Implicit time-integration is in turn less limited by restrictions on the time-step, but requires on the other hand large non-linear systems of equations to be solved.
(43)
.UPDATE: U*+' = U* + A l l *
( b ) M a c h number isolines lbl Msch nurnbcriaolincs, general v i c y and zoom o i Irading edge
R,(U, +Elm)-R,(U)
0 00 o 33 0.67 LOO (44)
E
m
- Perturb m-th component of U, t U, €1,. + Broyden's method allows to update the Jacobian matrix
- Compute new fluctnation, without having to compute twelve residual evaluations.
- Distribute the 3 contributions (i=1,2,3): On the other hand, non-linear convergence will he at most
linear and more iterations will he needed at the non-linear
level.
+
where * T [ I J J el,) denotes the residual contribution 5.3 Solution of the linear system
t,o node i when the m-th component of U at node j has Following the linearization process, the linear system (43)
lieen perturbed. is iteratively solved with left (or right) preconditioning:
A key issue t,o the numerical computation of the Jacobian
.U a finite-difference approximation is the proper choice ~F(U')-~JF(U*)A* = - . j p ( ~ k ) - 1 ~ ( ~ k ) 3(48)
of E . which can be determined here on a component-by-
coinpouent. basis. The question is treated by Schnabel[lS] with .?,(Uk)obtained by some incomplete approximate
who advocates: factorization of JF(U*). Block ILU factorization is used
in our numerical experiments. Krylov subspace acceler-
E = Jiimax[I~,.ml,t~~(~,.m)lsign(U,,m), (45) ation techniques have been considered to accelerate the
convergence of the iterative solve. In the framework of
d h typ( U,,,) typical user-defined order of magnitude
a this paper, we have favoured GMRES[lG] among other
for the m-th component of U at node j and q a lower solvers because of its optimality and since it does not
bound on the inaccuracy in the residual R ( U ) evaluation represent a severe limitation for 2D medium size p r o b
(relative noise). This lower hound is at best the machine- lems on today's computers despite its storage require-
epsilon of the computer and can he larger if R(U) is com- ments. We refer to [l7] for a description and as-ment
puted by a lengthy piece of code. Should q be worse or if of alternate preconditioners and other Krylov subspace
R(L1) is not differentiable everywhere, one might rather techniques, such as Q M R and TFQMR[l8]. A constant
resort t,o the secant method, known for multidimensional Krylov snhspace dimension of 30 is used in the numerical
problems as the Broyden's update. experiments and the linear solver is stopped when the
normahed linear residual drop5 below lo-', This linear
convergence criteria is easily met within the 30 Krylov
5.2.2 Broyden's method subiterations in the early stages of the Convergence pro-
Broyden's update method is the multidimensional exten- cess when the CFL number is not too large.
sioii of the secant method used for nnivariate problems,
avoiding the need for computing any derivative. If the kth Newton-Raphson step is denoted(1) by $\Delta^k U = U^{k+1} - U^k$, the generalization of the one-dimensional secant condition is that $J_R(U^{k+1})$ satisfies:

$$J_R(U^{k+1})\,\Delta^k U = \Delta^k R, \qquad (46)$$

where $\Delta^k R = R(U^{k+1}) - R(U^k)$. However, this does not determine $J_R(U^{k+1})$ uniquely in more than one dimension. In Broyden's update approach, $J_R(U^{k+1})$ is chosen by making the least change (see [15] for proper matrix norms) to $J_R(U^k)$ consistent with condition (46). As such, the method suffers a major drawback: it entails a complete fill-in of the Jacobian matrix, whereas the true Jacobian matrix is sparse. Alternatively, we can look for the solution to the same least-change problem under the additional condition $B \in S(J_R)$, where $S(J_R)$ represents the set of $n \times n$ matrices with the same sparsity pattern as $J_R$. The resulting update is given by:

$$J_R(U^{k+1}) = J_R(U^k) + P_{S(J_R)}\left\{ D^{+}\,[\Delta^k R - J_R(U^k)\,\Delta^k U]\,(\Delta^k U)^T \right\},$$

where $P_{S(J_R)}$ is the matrix operator which maps any matrix onto the same matrix but restricted to the sparsity pattern.

(1) The time-step has been left out of the formulation. However, the argumentation which follows still holds, as backward Euler discretization in time amounts to a classical Newton's method where the increment $\Delta^k U$ has been under-relaxed for the update.

5.4 Global convergence and fixed-point method

The choice of an optimal time-step is a key issue to ensure a fast and robust convergence. It seems logical to increase the time-step when approaching the converged solution, as the likelihood of being within the radius of convergence of the Newton method increases. Automatic time-increment control algorithms have been set up to relieve the user from explicitly monitoring the CFL number following the convergence level. Some experiments with such algorithms can be found in [17]. We now present a technique which consists, after some approximation, in accelerating a fixed-point method. The technique never reaches any Newton-like convergence but shows, for a constant limited CFL number, a good global convergence behaviour. The technique consists in solving the steady-state Euler/Navier-Stokes equations with an infinite CFL number, i.e. full Newton time integration, but using a finite CFL number in the preconditioning matrix at each linearization: $J_F(U^k) = J_R(U^k)$ and $\tilde J_F(U^k)$ obtained from some factorization of $1/\Delta^k t + J_R$. It should be pointed out that, since the Krylov subspace dimension is not increased, this also results in solving the linear system less accurately.

The scheme, already used in [19], builds up the main features of the flow at the very early stages of the convergence process much faster than the classical backward Euler discretization in time. Asymptotically though, the method shows a monotonic linear convergence behaviour and never reaches the convergence rate, possibly quadratic, of backward Euler. The method appears therefore as complementary to backward Euler as
it can be used for the first non-linear iterations and so provide a well-featured initial guess for backward Euler.

The scheme can be viewed as an accelerated fixed-point method. The basic implicit technique consists in a simple relaxation procedure immediately followed by a non-linear update. The relaxation procedure is based on some approximation $\tilde J_F(U^k)$ of the augmented Jacobian of the residual, $J_F(U^k) = 1/\Delta^k t + J_R(U^k)$, and the overall process reads:

Implicit time integration was performed by updating the Jacobian with Broyden's formula, with a maximum CFL number of 200. Convergence history is shown in Fig. 10 and was achieved in about 750 CPU seconds. In comparison, about 40000 CPU seconds were needed to reduce the residual to 10^-[...] using explicit Euler time-stepping.
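As an illustration of the sparsity-preserving Broyden update of the Jacobian described earlier in this section, the following is a minimal sketch in plain Python. It assumes a dense list-of-lists storage and a boolean sparsity mask; the function name `sparse_broyden_update` and this data layout are hypothetical and are not taken from the solver discussed in the paper. Row by row, the secant defect $\Delta^k R - J_R\,\Delta^k U$ is distributed over the entries allowed by the sparsity pattern, so that the updated matrix satisfies the secant condition while keeping its zeros.

```python
def sparse_broyden_update(J, pattern, dU, dR):
    """Least-change secant update restricted to a fixed sparsity pattern.

    J       : current Jacobian approximation (list of rows)
    pattern : pattern[i][j] is True where J[i][j] may be non-zero
    dU, dR  : solution and residual increments of the last step
    """
    n = len(J)
    J_new = [row[:] for row in J]
    for i in range(n):
        # Secant defect of row i: (dR - J dU)_i
        defect = dR[i] - sum(J[i][j] * dU[j] for j in range(n))
        # dU restricted to the admissible non-zero positions of row i
        s_i = [dU[j] if pattern[i][j] else 0.0 for j in range(n)]
        norm2 = sum(v * v for v in s_i)
        if norm2 == 0.0:
            continue  # pseudo-inverse convention: leave the row unchanged
        for j in range(n):
            J_new[i][j] += defect * s_i[j] / norm2
    return J_new


if __name__ == "__main__":
    # Illustrative tridiagonal example (values are arbitrary)
    J = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    pattern = [[v != 0.0 for v in row] for row in J]
    dU = [1.0, 2.0, -1.0]
    dR = [0.5, 3.0, -2.0]
    J1 = sparse_broyden_update(J, pattern, dU, dR)
    # Each row of J1 applied to dU reproduces the corresponding entry of dR
    print([sum(J1[i][j] * dU[j] for j in range(3)) for i in range(3)])
```

A production solver would store only the non-zero entries; the dense layout is used here purely to keep the sketch short.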
6 MESH ADAPTIVITY

In [21], it was proposed to use the residual decomposition technique developed in the context of multidimensional upwind methods as a tool to extend the SUPG method to compressible flows. This idea was shown to lead to increased performance and robustness compared to the standard system extensions of SUPG [22, 23].

In the present section, we report the continuation of this work with focus on mesh adaptivity [24, 25], and we will show that the use of the multidimensional residual decompositions introduced to generalize the SUPG scheme to hyperbolic systems allows for the derivation of an error estimation procedure for the Euler equations in a very natural and inexpensive way.

6.1 SUPG a posteriori error estimate

The main ingredient of the proposed error estimation is the a posteriori error estimate developed by Johnson and Eriksson [26, 27] for the SUPG scheme applied to the following convection-diffusion equation:

$$\vec\lambda\cdot\nabla u - \nabla\cdot(\kappa\nabla u) = f \quad \text{in } \Omega, \qquad (50)$$

with Dirichlet boundary conditions on the boundary $\Gamma$ of the computational domain $\Omega$. If we assume that the advection vector $\vec\lambda$ is constant, the a posteriori error estimate for the scalar shock capturing SUPG scheme applied to the stationary problem (50) can be written from [27] as:

$$\|\hat u - U\|_{L_2(\Omega)} \le C\,\|\min(1, \hat\kappa^{-1}h^2)\,R(U)\|_{L_2(\Omega)} + \ldots \qquad (51)$$

where

$$R(U) = |\vec\lambda\cdot\nabla U - f| + \max_{S \in \partial T} \ldots \left|\frac{\partial U}{\partial n_S}\right| \quad \text{on } T \in \mathcal{T}, \qquad (52)$$

with $T$ a triangle of mesh $\mathcal{T}$, $\hat\kappa$ the artificial viscosity of the SUPG scheme and $n_S$ the normal to side $S$ of $T$. Note that, for simplicity, the computed solution $U$ is compared with the solution $\hat u$ of a perturbed continuous advection-diffusion problem obtained by replacing $\kappa$ by $\hat\kappa(U)$ in eq. (50). In general, $\|u - \hat u\|$ is expected to be dominated by $C\|\hat u - U\|$, where $C$ is a constant, so that control of $\|\hat u - U\|$ suffices.

which is the stopping criterion. Notice that (53) seeks to equidistribute the contribution from each element to the global error bound.

From the adaptivity criterion (53), one can isolate $h_T$ for each triangle $T$, which provides us with a new "reference" size for each triangle. Then, it is easy to decide whether a given triangle has to be refined, coarsened or kept as it is. Of course, when dealing with the 2D Euler equations, one can compute four different required mesh sizes $h_T^{(k)}$ for the next triangulation $\mathcal{T}_j$. At that point, several options can be taken. It could be decided for instance to control the error only on one of the 4 equations, but this is risky because one could miss some of the flow features which are not "seen" by the corresponding variable. Our preferred choice therefore consists in taking the minimum of the four mesh sizes,

$$h_T = \min_{k=1,\ldots,4} h_T^{(k)}, \qquad (55)$$

which ensures an equation-by-equation control of the error over the mesh under the required tolerance TOL.

6.3 Adaptivity technique

The adaptivity technique developed in the present research is inspired by the innovative work of Richter [28]. It consists in non-hierarchical h-refinement/derefinement allowing efficient mesh optimization operations such as edge swapping and Laplacian smoothing.

The refinement operation is achieved by the introduction of an additional node for each edge of an element for which the calculated spacing is less than the element parameter h. For interior edges, the additional node is placed at the mid-point of the edge and the solution at the new vertex is interpolated from the solution at the extremities, whereas for boundary edges, the geometrical location of the new node is determined through a spline interpolation involving the four closest existing points. For any edge that is subdivided in this manner, the two adjacent triangles associated to this edge both have to be divided in order to preserve the consistency of the final grid.

Our coarsening strategy is based on the use of a non-hierarchical data structure which enables the deletion of nodes of the initial grid and the use of the structural optimization techniques described below. The coarsening
(b) Mach number isolines

Fig. 14: Three "pathological" low degree node configurations and their associated treatments

For more details about the adaptivity technique we refer the reader to [24].

ACKNOWLEDGMENTS

The second and third authors are supported by a fellowship of the Belgian Fund F.R.I.A. Part of the research was supported by the CE through Brite/EuRam contract AERO-CT-0040 and the ESA MSTP program.
[4] Ni, R.-H.: A Multiple Grid Scheme for Solving the Euler Equations. AIAA Journal, Vol. 20, 1981, pp 1565-1571.

[5] Hall, M.: Cell-vertex multigrid schemes for solution of the Euler Equations. Numerical Methods for Fluid Dynamics, II (K.W. Morton and M.J. Baines, eds.). Oxford University Press, 1986.

[6] Morton, K.; Paisley, M.: On the cell-centre and cell-vertex approaches to the steady Euler equations and the use of shock fitting. Lecture Notes in Physics, Vol. 264. Springer-Verlag, 1986.

[7] Roe, P.: Linear advection schemes on triangular meshes. Cranfield Institute of Technology report, November 1987. CoA 8720.

[8] Roe, P.: "Optimum" upwind advection on a triangular mesh. ICASE Report 90-75, 1990.

[9] Struijs, R.; Deconinck, H.; Roe, P.: Fluctuation Splitting Schemes for the 2D Euler Equations. VKI LS 1991-01. Computational Fluid Dynamics.

[10] Deconinck, H.; Struijs, R.; Bourgois, G.; Roe, P.: Compact advection schemes on unstructured grids. VKI LS 1993-04. Computational Fluid Dynamics, 1993.

[11] Sidilkover, D.; Roe, P.: Unification of Some Advection Schemes in Two Dimensions. ICASE Report 95-10, 1995.

[12] Deconinck, H.; Struijs, R.; Bourgois, G.; Roe, P. L.: Compact advection schemes on unstructured grids. Proc. of the VKI Lecture Series on Computational Fluid Dynamics. VKI LS 1993-04, 1993.

[13] Sidilkover, D.: Multidimensional upwinding and multigrid. Proc. of the 12th AIAA CFD Conference, San Diego, June 19-22, 1995. AIAA paper 95-1759-CP, 1995.

[14] Degrez, G.; Issman, E.: Solving steady compressible flow problems with subspace iteration methods. 3rd Int. Congress on Industrial and Applied Mathematics, Hamburg, July 1995.

[15] Schnabel, R. B.; Dennis, J. E.: Numerical methods for unconstrained optimization and non-linear equations. Prentice-Hall Series in Computational Mathematics. Prentice-Hall, 1983.

[16] Saad, Y.; Schultz, M. H.: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., Vol. 7, No 3, July 1986, pp 856-869.

[19] Issman, E.; Degrez, G.: Convergence acceleration of a 2D Euler/Navier-Stokes solver by Krylov subspace methods. ECCOMAS 94, 1994.

[20] Numerical simulation of compressible Navier-Stokes flows. Bristeau, M.-O., Glowinski, R., Periaux, J., Viviand, H. eds., Proceedings of the GAMM workshop, held at INRIA, Sophia-Antipolis (France), on December 4-6, 1985, Vol. 18 of Notes on Numerical Fluid Mechanics. Vieweg, 1987.

[21] Carette, J.-C.; Deconinck, H.; Paillere, H.; Roe, P. L.: Multidimensional upwinding: Its relation to finite elements. Int. J. of Num. Meth. Fluids, Vol. 20, 1995, pp 935-955.

[22] Hughes, T. J. R.; Mallet, M.: A new finite element formulation for computational fluid dynamics: III. The generalized streamline operator for multidimensional advective-diffusive systems. Comp. Meth. Appl. Mech. Engrg., Vol. 58, 1986, pp 305-328.

[23] Hansbo, P.: Explicit streamline diffusion finite element methods for the compressible Euler equations in conservation variables. J. of Comp. Phys., Vol. 109, 1993, pp 274-288.

[24] Carette, J.-C.; Deconinck, H.: Unstructured mesh adaptivity for SUPG formulations based on residual decomposition of the Euler equations. Proc. of the VKI Lecture Series on Computational Fluid Dynamics. VKI LS 1995-02, 1995.

[25] Carette, J.-C.; Deconinck, H.: A posteriori finite element error estimation for the Euler equations based on multidimensional residual decomposition. Venice, Proc. 9th Int. Conf. Finite Elements in Fluids, 1995.

[26] Johnson, C.: Adaptive finite element methods for diffusion and convection problems. Comp. Meth. Appl. Mech. Engrg., Vol. 82, 1990, pp 301-323.

[27] Eriksson, K. E.; Johnson, C.: Adaptive streamline diffusion finite element methods for stationary convection-diffusion problems. Math. Comp., Vol. 60, 1993, pp 167-188.

[28] Richter, R.: Schémas de capture de discontinuités en maillage non-structuré avec adaptation dynamique. Applications aux écoulements de l'aérodynamique. PhD thesis, IMHEF, Ecole Polytechnique Fédérale de Lausanne, Switzerland, 1993.

[29] Müller, J.-D.; Roe, P. L.; Deconinck, H.: Delaunay-based triangulations for the Navier-Stokes equations with minimum user input. Lecture Notes in Physics, Vol. 414. Springer, 1992.
1. SUMMARY

Genuinely multidimensional upwind dissipation models are developed for the 2D/3D Euler/Navier-Stokes equations using a cell-centered finite-volume approach on structured grids. The numerical flux is formulated using the artificial dissipation concept. An overview is given for 2D/3D compact upwind dissipation for stencils up to respectively 6 and 8 points. A classification is set up for first and second order accurate schemes that have respectively minimum and zero cross diffusion. Second order monotone schemes are developed using the concept of non-linear limiter functions applied on multidimensional ratios of flux differences. A classification is presented for different families of 2D ratios. 3D multidimensional limiters based on 3D ratios of flux differences are introduced. The scalar dissipation models are extended and applied to the Euler/Navier-Stokes equations based on a characteristic decomposition of the inviscid operator. The resulting characteristic compatibility equations, consisting of convective and source terms, depend on a set of 3 propagation directions. An overview is given for different choices of directions. The multidimensional discretisation is considered for both the convective and source terms along their associated advective speeds.

2. INTRODUCTION

In the last ten years extensive research has been ongoing towards the development of genuinely multidimensional upwind schemes. The main motivation is to reduce the mesh dependency appearing in classical dimensional-split schemes and as a result to capture the physics more accurately. Two main approaches are found in literature: the fluctuation splitting schemes and the finite volume schemes; for a review see refs. 19, 23, 30.

The fluctuation splitting schemes consist of an upwind distribution of a fluctuation (residual) over the nodes of a triangular or tetrahedral cell (refs. 2, 3, 16, 17, 18, 21, 22). In the finite volume methods the numerical flux is determined using multidimensional extrapolation (refs. 1, 6, 7, 15, 24, 25). Application of both methods to the Euler/Navier-Stokes equations consists of two basic elements: (1) a suitable wave modelling or characteristic decomposition of the inviscid operator and (2) a scalar convection scheme.

The concept of artificial dissipation associated to central schemes became a key element in Euler and Navier-Stokes calculations during the last 15 years. The family of upwind schemes, which can be considered as a rational way of defining dissipation in a numerical algorithm, has led to a matrix dissipation form, as opposed to scalar models. One of the essential elements of the upwind dissipation is the concept of non-linear limiters, leading to high resolution, 2nd order schemes satisfying some condition of monotonicity, such as the one-dimensional concept of 'Total Variation Diminishing' (TVD) schemes.

Very recently a more formal approach towards a general formulation of artificial dissipation terms, applicable to structured as well as unstructured meshes, was developed, based on the concept of Local Extremum Diminishing (LED) schemes, by way of generalised limiters (ref. 14). All these developments however still remain in the dimensional splitting approach.

In this framework, 2D multidimensional upwind schemes have been reformulated as a way of defining dissipation terms, with the requirements of positivity and classical limiter concepts. In contrast to the dimensional-split models, the multi-D dissipation depends on the direction of the convection speed and on variations of the solution or fluxes in different mesh directions. The corresponding numerical flux for a cell face is determined by a multidimensional interpolation inside an upstream triangle. As a result the multi-D dissipation is more compact than the models with a one-dimensional interpolation along the mesh lines. Recently a comparison and unification was performed for the underlying scalar linear and non-linear positive convection schemes for both the finite volume and fluctuation methods (refs. 11, 27, 32).

The idea of multidimensional limiters was first introduced for a 2D scalar convection problem. Different classes of 2D limiters have been classified and applied to the 2D Euler/Navier-Stokes equations. In the present paper an overview is given for compact 2D convection schemes for stencils up to 6 points. Different classes of 2D ratios are determined by the choice of i) a triangular interpolation domain and ii) variations along meshlines or diagonals. The analysis is extended for 3D convection schemes as basis for the development of dissipation models including 3D limiters and ratios. A classification is given concerning first and second order schemes with respectively minimum and zero cross diffusion for stencils up to 8 points.

The scalar dissipation models are extended to the Euler/Navier-Stokes equations based on a characteristic decomposition of the inviscid operator. The resulting characteristic compatibility equations represent the convection of an entropy, a shear and 2 acoustic waves. They consist of convective and source terms that depend on a set of 3 propagation directions. An overview is given of different strategies concerning the choice of the directions. The resulting equations are discretized using the scalar dissipation models. The multi-D dissipation models are considered for both the convective and source terms based on their associated advective speeds.
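The non-linear limiter concept mentioned in the introduction can be illustrated with the classical one-dimensional case. The sketch below is a minimal example, assuming a scalar convection speed a > 0 and the minmod limiter (the limiter named later in the results section); it is the standard dimensional-split construction, not the multidimensional limiter developed in this paper, and the function names are hypothetical.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if the signs agree, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def limited_upwind_flux(u, i, a=1.0):
    """Limited second-order upwind flux a*u at face i+1/2, valid for a > 0.

    The first-order upwind state u[i] is corrected by half of a limited
    slope; at extrema the correction vanishes and the scheme falls back
    to first order.
    """
    du_upwind = u[i] - u[i - 1]
    du_downwind = u[i + 1] - u[i]
    return a * (u[i] + 0.5 * minmod(du_upwind, du_downwind))


if __name__ == "__main__":
    smooth = [0.0, 1.0, 2.0, 3.0]
    peak = [0.0, 1.0, 0.0]
    print(limited_upwind_flux(smooth, 1))  # second-order correction active
    print(limited_upwind_flux(peak, 1))    # limiter switched off at the extremum
```

In smooth monotone data the correction is active, while at a local extremum the limiter returns zero, which is the mechanism that enforces monotonicity.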
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
Figure 1: Second order dimensional-split upwind dissipation (a>0).

Two options have been investigated for the choice $(\delta u_x, \delta u_y)$ in (10), leading to compact 2D upwind schemes; they are illustrated in figure 2. Both variations in x- and y-direction determine a triangular domain of dependence for the multi-D artificial dissipation.
$$f^{*}_{i+1/2,j} = a\,u_{i,j} - \alpha\,\delta u_{i,j-1/2} + \beta\,\delta u_{i+1/2,j-1} \qquad (11)$$

Using the definition of the numerical flux (3), the multi-D upwind dissipation model (12) is determined from (11), with positive interpolation coefficients $\alpha$ and $\beta$ depending on a and b. Similar formulas are valid for the fluxes on cell faces i,j+1/2, introducing analogous coefficients $\delta$ and $\gamma$.

It is interesting to notice the sign of the multi-D contributions in (12). The term based on coefficient $\beta$ and defined in the same direction as the 1st order term reduces the dissipation, as for classical higher order schemes, while the term depending on $\alpha$ in the other mesh direction increases the dissipation (12). This addition of dissipation is not a loss of accuracy; on the contrary, it reduces the diffusion in the cross flow direction as shown in ref. 10.

Writing out the residual, the resulting 6-point families are determined by 3 parameters $A = \alpha + \delta$, $\beta$ and $\gamma$. Figure 3 shows for both configurations the interpolation triangles for the four cell faces.

A more severe constraint is the condition for general second order accuracy, in the classical sense, defining a unique member of the class of compact zero cross diffusion second order schemes for the non-homogeneous convection equation. Notice that for configuration I the scheme is an upwind scheme, while for configuration II the scheme is the classical central scheme.

4.1.2 Linear 4-point upwind schemes

Both 6-point families have a subclass of 4-point stencils in common with the choice of $\beta = \gamma = 0$ in (12) and figure 3, yielding the numerical dissipation for a,b>0

$$d_{i+1/2,j} = \tfrac{1}{2}\,a\,\delta u_{i+1/2,j} + \alpha\,(\delta u_{i,j-1/2}) \qquad (13)$$

The resulting 4-point schemes are actually a one-parameter ($A = \alpha + \delta$) family of schemes, although the parameters $\alpha$ and $\delta$ can be chosen independently. The general 4-point scheme is split into a central part and a dissipation term:
[Figure: classification chart of 2D linear 6-point upwind schemes (A, β, γ). Legible labels: 2nd order, homogeneous convection; 2-parameter class, zero cross diffusion; config. I, config. II; continuous interpolation, e.g. 5-point linearity; Ransbeeck ('94); monotone, 2nd order, (1st order); 0 ≤ A ≤ min(a,b); dimensional-split; zero cross diffusion; minimum cross diffusion.]
For more details about the 2D convection schemes and dissipation including 4-point, 5-point and 6-point schemes, see e.g. ref. 1.

and is related to the choice of a triangular interpolation. Two triangle configurations are shown in figure 6. For each configuration two options are considered when fixing $\delta u_y$, with the numerator of (17) taken as a variation along the x-direction or a variation along the diagonal.

In ref. 10 monotonicity conditions on the coefficients in (17) are derived for the 4 classes. All 2D ratios found in literature fit in one of these classes, e.g. refs. 25, 26, 27, 32.

with interpolation coefficients $\alpha_x$, $\beta_x$ and $\gamma_x$ depending on a, b and c. Similar formulas are valid for the fluxes on cell faces i,j+1/2,k and i,j,k+1/2 using respectively the sets $(\alpha_y, \beta_y, \gamma_y)$ and $(\alpha_z, \beta_z, \gamma_z)$.
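$$\mathcal{D}_{i,j,k} = \tfrac{1}{2}\,(a\,\delta_x^{+}\delta_x^{-} + b\,\delta_y^{+}\delta_y^{-} + c\,\delta_z^{+}\delta_z^{-})\,u_{i,j,k} + (A\,\delta_x^{-}\delta_y^{-} + B\,\delta_y^{-}\delta_z^{-} + C\,\delta_x^{-}\delta_z^{-} - D\,\delta_x^{-}\delta_y^{-}\delta_z^{-})\,u_{i,j,k} \qquad (23)$$

with

$$0 \le A \le \min(a,b), \qquad 0 \le B \le \min(b,c), \qquad 0 \le C \le \min(a,c),$$

$$\max(0,\; A+B-b,\; A+C-a,\; B+C-c) \le D \le \min(A,B,C).$$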
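The bounds on the coefficients A, B, C and D quoted with equation (23) are easy to check numerically. The following is a small sketch, assuming positive convection speeds a, b, c; the function name `admissible_cross_coefficients` and the tolerance argument are hypothetical conveniences, not part of the paper.

```python
def admissible_cross_coefficients(a, b, c, A, B, C, D, tol=1e-12):
    """Check the bounds on A, B, C and D quoted with equation (23)."""
    ok_A = -tol <= A <= min(a, b) + tol
    ok_B = -tol <= B <= min(b, c) + tol
    ok_C = -tol <= C <= min(a, c) + tol
    lower = max(0.0, A + B - b, A + C - a, B + C - c)
    upper = min(A, B, C)
    ok_D = lower - tol <= D <= upper + tol
    return ok_A and ok_B and ok_C and ok_D


if __name__ == "__main__":
    a, b, c = 1.0, 2.0, 3.0
    # Illustrative choice: A, B, C at their upper bounds
    A, B, C = min(a, b), min(b, c), min(a, c)
    print(admissible_cross_coefficients(a, b, c, A, B, C, min(a, b, c)))
    print(admissible_cross_coefficients(a, b, c, A, B, C, 0.5))
```

For this illustrative choice of A, B and C at their upper bounds, the lower and upper bounds on D coincide, so D = min(a,b,c) is the only admissible value.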
optimum linear schemes in two and three dimensions in ref. 24. In the present approach we choose the scheme that corresponds with the value of D which is identical with the value for the first order minimum cross diffusion scheme: D = min(a,b,c).

4.2.4 3D Multidimensional Limiters

In this section the 2D multidimensional limiters discussed before are extended for the 3D upwind schemes. The 2nd order zero cross diffusion scheme related to D = min(a,b,c) (fig. 8) is rewritten as the first order minimum cross diffusion scheme plus an anti-diffusive limited correction term (24), with

$$\Delta\alpha_x = \alpha_x^{(2)} - \alpha_x^{(1)} = \tfrac{1}{2}\,(b - \min(a,b)), \qquad \Delta\beta_x = \beta_x^{(2)} - \beta_x^{(1)} = \tfrac{1}{2}\,(c - \min(a,c)).$$

Remark that only one limiter is applied in (24). An alternative possibility would be to add a different limiter/ratio to each component of the correction term. Near discontinuities the limiter is switched off ($\Phi = 0$) and the 1st order multi-D dissipation is applied. In smooth regions $\Phi = 1$ and the linear 2nd order scheme is applied.

Notice that this definition of $\Delta$ is not the same as for the linear multi-D models (20), because the reference dissipation has been changed to the minimum cross diffusion scheme instead of the classical 1st order upwind scheme. The definition of the 3D ratio is based on the variations $\delta u_y$ and $\delta u_z$ in the correction term of (24) and some extra variation in the third direction. Thus for face i+1/2,j,k a variation $\delta u_x$ is introduced in the 3D ratio (25).

Equation (25) has the same form as the definition of a 3D ratio in the formulation of a new fluctuation splitting scheme in ref. 26. Different possibilities can be considered for $\delta u_x$, as shown in ref. 12, where three different classes of 3D ratios are defined. Each definition corresponds with the variations in a tetrahedron constructed by the three variations along the x-, y- and z-axis. For more details concerning monotonicity conditions see ref. 12.

5. EXTENSION FOR THE EULER/NS EQUATIONS

The conservative form of the 3D Navier-Stokes equations is written as:

5.1. Characteristic Decomposition

5.1.1 Characteristic variables / compatibility equations

The 2D Euler equations are expressed by

$$\frac{\partial U}{\partial t} + \nabla\cdot\vec F = \frac{\partial U}{\partial t} + \vec A\cdot\nabla U = 0 \qquad (27)$$

where $\vec A = (A, B)$ are the Jacobian matrices. The eigenvalues of the matrix $K = \vec A\cdot\vec\kappa$ associated to an arbitrary unit propagation direction $\vec\kappa$ define for a large part the behaviour of the solutions to the Euler equations. Wave-like solutions exist if the eigenvalues of K are real and the corresponding eigenvectors linearly independent (ref. 5). The latter define a similarity transformation which diagonalizes matrix K (28), with the left eigenvectors being the rows of $P^{-1}$, the right eigenvectors being the columns of $P$ and the diagonal matrix $\Lambda$ consisting of the eigenvalues,

$$\lambda^{(1)} = \lambda^{(2)} = \vec v\cdot\vec\kappa, \qquad \lambda^{(3)} = \vec v\cdot\vec\kappa + c, \qquad \lambda^{(4)} = \vec v\cdot\vec\kappa - c \qquad (29)$$

Using the left eigenvectors, a set of characteristic variables can be constructed,

$$\delta W = P^{-1}\,\delta U \quad \text{or} \quad \delta U = P\,\delta W = \sum_{k=1}^{4} \delta w^{(k)}\, r^{(k)} \qquad (30)$$

or

$$\delta w^{(1)} = \delta\rho - \delta p/c^2, \qquad \ldots, \qquad \delta w^{(4)} = -\vec\kappa_3\cdot\delta\vec v + \delta p/(\rho c) \qquad (31)$$

with p being a free parameter. Eq. (31) is not the only possible definition of characteristic variables (ref. 5), but the above choice is well suited to our purpose and is based on 3 arbitrary propagation directions,

$$\vec\kappa_i = (\kappa_{ix}, \kappa_{iy}) = (\cos\theta_i, \sin\theta_i), \qquad \vec l_i = (\kappa_{iy}, -\kappa_{ix}) \quad \text{for } i = 1,2,3 \qquad (32)$$

In order to identify appropriate wave decompositions, the characteristic variables are defined by different propagation directions: $w^{(1)}$, $w^{(2)}$ are related to $\vec\kappa_1$, and $w^{(3)}$ and $w^{(4)}$ are related to respectively $\vec\kappa_2$, $\vec\kappa_3$. Multiplying eq. (27) by the matrix $P^{-1}$ and introducing the characteristic variables (30), (31) leads to the characteristic compatibility equations:

$$\frac{\partial W}{\partial t} + P^{-1}AP\,\frac{\partial W}{\partial x} + P^{-1}BP\,\frac{\partial W}{\partial y} = 0 \qquad (33)$$

or after working out (33) explicitly,
The corresponding 4-wave model consists of one entropy wave, one shear wave and two acoustic waves (ref. 8). The first two terms in each equation of (34) represent the convection of the associated wave in the characteristic direction (35). The subscript c in (35) refers to the convective part. The third terms in (34) are the coupling or source terms, and their presence results from the fact that the Jacobian matrices in (27) are not simultaneously diagonalizable by matrix P. Notice that the coupling terms also show an advective behaviour associated to the directions (36), which are the normal directions to the propagation directions $\vec\kappa_1, \vec\kappa_2, \vec\kappa_3$. The subscript s refers to the source terms.

In smooth regions a continuous switch between the pressure gradient (38) and the streamline direction is introduced. This model shows good accuracy in both subsonic and supersonic regime (refs. 7, 9, 10). A better convergence behaviour than with (37) is obtained, especially in supersonic flow. In some cases (e.g. subsonic flows) convergence can only be obtained by freezing the directions after a certain residual drop of 1 or 2 orders (refs. 9, 10).

obtained in a laminar boundary layer on very coarse meshes.

Convection of entropy and enthalpy

The first characteristic direction $\vec\kappa_1$ is taken perpendicular to the velocity (40). Using the definition of specific entropy and total enthalpy (41), the first two characteristic equations of (34) are rewritten as

$$\theta_1 = \theta + \frac{\pi}{2}, \qquad \theta_2 = \theta + \mu, \qquad \theta_3 = \theta - \mu \qquad (44)$$

with $\theta$ the flow angle and $\mu = \arctan(1/\sqrt{M^2-1})$ the Mach angle. A fully decoupled system of characteristic equations is obtained in steady state,

$$\vec v\cdot\nabla S = 0, \qquad \vec v\cdot\nabla H = 0, \qquad (\vec v + c\,\vec\kappa_2)\cdot\nabla R_3 = 0, \qquad (\vec v - c\,\vec\kappa_3)\cdot\nabla R_4 = 0 \qquad (45)$$

Pseudo-Mach angle splitting

This model is an algebraic continuation of the supersonic Mach angle decomposition in the subsonic range with continuity at M=1, developed in the fluctuation splitting approach (refs. 17, 18, 19). The corresponding directions are given by (47), with $\mu = \arctan(1/\sqrt{|M^2-1|})$ defined as the pseudo-Mach angle. Notice the 2 sets of propagation directions leading to 2 splittings of the residual. For the final residual the average of both splittings is taken.

Considerable research is still being performed to identify the most suitable directions.

5.2. Numerical Flux Formulation

The space operators in the characteristic compatibility equations (34), (43) and (45) discussed in section 5.1 are discretised. In the case that there is no source term, the space operator is expressed by

$$\text{no source term:} \quad \vec a_c^{(k)}\cdot\nabla w^{(k)} \quad \text{or} \quad \vec a_c^{(k)}\cdot\nabla R^{(k)} \qquad (48)$$

where the gradient acts on the characteristic variable or a steady Riemann invariant. When a source term is present, the space operator can be written as (49), where the convective and source terms are written as an advection of respectively a characteristic variable and a 'source' variable along the associated directions (35) and (36). In both cases the source and convective terms have the same form and can be treated by the same multi-D scheme or dissipation model.

In the formulation used in previous work (refs. 10, 12, 31, 33) the scalar multi-D dissipation models were applied only to the convective terms, while the source terms were discretised by a central approximation without artificial dissipation. In the present approach the source terms can also be treated with a multi-D scheme based on the associated speed (36). Discretising both convective and source terms leads to two numerical fluxes or dissipations for every scalar equation. Next the scalar multi-D discretisation is re-transformed to the conservative residual by use of the right eigenvectors. The resulting inviscid numerical flux on cell face i+1/2,j is defined by (50), where $d_c$ and $d_s$ represent respectively the scalar multi-D dissipation of the convective and source part for each of the 4 characteristic equations. The old formulation, where the convective term is treated by a multi-D scheme and the source term by a central scheme without artificial dissipation, is easily recovered by putting the dissipation for the source term in (50) to zero.

6. RESULTS

6.1. 2D supersonic Laval nozzle

The inviscid supersonic flow in a Laval nozzle is calculated at a Mach number of 2.91 on an H-type mesh with 128x32 cells. The first order minimum cross diffusion scheme (4MCD) and the 5-point continuous zero cross diffusion scheme (5ZCD), combined with the minmod limiter and the ratio of subclass (1a), are investigated. The multi-D schemes are compared with the classical 2nd order Flux Difference Splitting scheme (FDS2) with minmod limiter. The classical scheme is also tested on a finer mesh of 256x64 cells, as reference solution. The extension to the Euler equations is based on the 2D characteristic decomposition (34) with the 3 characteristic directions $\vec\kappa_1, \vec\kappa_2, \vec\kappa_3$ defined by the Mach angle splitting. Both the convective and source terms of (34) are discretised with the same scalar multi-D dissipation model.

Figure 9 shows the isomach lines for the 4 solutions. The first order multi-D scheme performs well in comparison with the classical 2nd order scheme up to the 2nd reflection of the shock structure. The 2nd order multi-D scheme is superior to the classical scheme on the same mesh. It compares very well with the reference solution on the finer mesh. Figures 10 and 11 show respectively the Mach number and total temperature distribution along the symmetry axis. The total temperature or total enthalpy should be constant in the whole field. The errors for the multi-D schemes are much smaller than for the classical results, even on the finer mesh.

Figure 12 shows the convergence history. Both first and second order 2D results show a good convergence behaviour, obtained with a 3 level multigrid acceleration combined with a 5-stage Runge-Kutta procedure and residual smoothing with a CFL of respectively 10.0 and 8.0.

6.2. 3D supersonic corner flow

An inviscid supersonic corner flow (ref. 4) (M=3.0) is considered, which is generated by two unswept compression ramps with 9.5 deg. wedge angle as illustrated in figure 13. The first order 3D minimum cross diffusion scheme is tested in comparison with classical first and second order (minmod limiter) upwind schemes on a uniform mesh with 32x32x32 cells. The accuracy of the 3D scheme is investigated for both 3D and 2D flow phenomena appearing in this testcase. The extension to the Euler equations is performed using the 3D extension of the characteristic variables (31) and equations (34), see ref. 12. The three characteristic directions $\vec\kappa_1, \vec\kappa_2, \vec\kappa_3$ are taken along the pressure gradient direction. When the pressure gradient goes to zero a blending is performed with the velocity direction.

Figure 14 shows the convergence history. No freezing of the directions was needed to reach convergence with the 3D scheme. Convergence is obtained with the use of multigrid acceleration and residual smoothing with a 5-stage Runge-Kutta procedure with CFL = 10.

Isomach lines are shown in figure 15. The classical first order scheme shows smeared out shocks and no contact discontinuities, while the first order 3D result shows an accuracy comparable with classical 2nd order. From the isomach lines near the solid walls one can conclude that the multi-D result shows less entropy creation than the classical schemes.
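As a numerical companion to the Mach angle and pseudo-Mach angle definitions of section 5.1, the sketch below evaluates both angles and the three direction angles of the Mach angle splitting. It is a minimal illustration: the function names are hypothetical, and `mach_angle_directions` follows one reading of the partly illegible equation (44), namely theta + pi/2, theta + mu and theta - mu, which should be treated as an assumption.

```python
import math


def mach_angle(M):
    """Mach angle mu = arctan(1/sqrt(M^2 - 1)), defined for M > 1."""
    if M <= 1.0:
        raise ValueError("the Mach angle requires supersonic flow (M > 1)")
    return math.atan(1.0 / math.sqrt(M * M - 1.0))


def pseudo_mach_angle(M):
    """Pseudo-Mach angle mu = arctan(1/sqrt(|M^2 - 1|)); pi/2 at M = 1."""
    d = abs(M * M - 1.0)
    if d == 0.0:
        return 0.5 * math.pi  # limit of arctan(1/x) as x -> 0+
    return math.atan(1.0 / math.sqrt(d))


def mach_angle_directions(theta, M):
    """Direction angles (theta1, theta2, theta3), assumed reading of (44)."""
    mu = mach_angle(M)
    return theta + 0.5 * math.pi, theta + mu, theta - mu


if __name__ == "__main__":
    # Inlet Mach number of the Laval nozzle test case, flow along the x-axis
    print(math.degrees(mach_angle(2.91)))
    print([math.degrees(t) for t in mach_angle_directions(0.0, 2.91)])
```

For M > 1 the two angle definitions coincide, and the pseudo-Mach angle extends the same expression to subsonic flow through the absolute value, which is the continuity at M = 1 mentioned in the text.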
FIGURES
Figure 9 Supersonic Laval nozzle (M=2.91), isomach lines, 1.41<M<2.91, ΔM=0.02. Panels: 2nd order FDS + MINMOD 128x32; 2nd order FDS + MINMOD 256x64; Multi-D 1st order 128x32; Multi-D 2nd order + MINMOD + RATIO IA 128x32.
Figure 11 Supersonic Laval nozzle (M=2.91). Total temperature distribution along symmetry-axis.
Figure 13 Supersonic corner flow, flow behaviour.
Figure 14 Supersonic corner flow, convergence history RMS. Horizontal axis: CYCLE.
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
vector differences:

$v_1 = V^{m+1} - V^m$
$v_2 = V^* - V^m$
$v_3 = V^{m+1} - V^*$

the second order explicit fractional step algorithm, in fully discretised matrix form, over a single time step is defined by:

[...]

The factor appearing in (12) and (13) is used for second order accuracy in time. In the formulation given above, $V^*$ is a velocity vector which is not solenoidal.

2.2.3 Second order implicit formulation

The implicit fractional step formulation follows the same steps as does the explicit one. However, the formulation is obtained by adopting a Crank-Nicolson representation for the diffusion terms, but otherwise retaining the explicit formulation as before.

Using the same velocity difference formulas defined for the second order explicit formulation above, the second order implicit Galerkin fractional step algorithm, in fully discretised matrix form, over a single time step is defined by:

[...]

For the implicit solutions of equations (14)-(17), the Element By Element (E-B-E) technique[1] is employed in order to ease the memory requirements needed by the storage of the stiffness matrix of FEM. The iterative solution is fully vectorized[7]. The right hand side values of these equations are scaled with the square of the time step to increase accuracy. This scaling is found to reduce the number of iterations by almost 50%.

2.3 Artificial Dissipation

In the present study, a fourth order accurate artificial dissipation term on the momentum equations is used for stabilizing. The diffusion term is added explicitly to the right hand side of equation (1). The formulation given in reference[8] is extended to three dimensions. The artificial viscosity term is computed in two steps at element level. First a second-order differencing is accomplished:

[...]

These values give the second-order distributions to cell corners (i) for the momentum equations. Then, fourth order distributions to cell corners are formed using the above values:

[...]
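The two-step construction described above (second differences first, then second differences of those values) can be illustrated with a small one-dimensional finite-difference analog. This is only a sketch under stated assumptions: the paper works at element level in a 3-D finite element code and its formulas are not legible here, so the function names, the coefficient `eps4`, and the zero end-node treatment are hypothetical.

```python
import numpy as np

def second_difference(v):
    """Undivided second difference; the two end nodes are left at zero."""
    d2 = np.zeros_like(v, dtype=float)
    d2[1:-1] = v[2:] - 2.0 * v[1:-1] + v[:-2]
    return d2

def fourth_order_dissipation(v, eps4=0.01):
    """Step 1: second differences. Step 2: second differences of step 1.
    eps4 is a hypothetical coefficient; the sign is chosen so the term damps."""
    return -eps4 * second_difference(second_difference(v))

x = np.linspace(0.0, 1.0, 11)
smooth = x**2                                           # smooth profile
wiggle = np.where(np.arange(11) % 2 == 0, 1.0, -1.0)    # odd-even oscillation

d_smooth = fourth_order_dissipation(smooth)
d_wiggle = fourth_order_dissipation(wiggle)
```

Away from the end nodes the term vanishes for the smooth quadratic profile and opposes the odd-even oscillation, which is the behaviour one wants from a fourth order dissipation: it targets grid-scale noise while leaving smooth solutions nearly untouched.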
second order artificial dissipation, gives low velocity gradients in the vicinity of the walls as seen in Fig.1 a and b. The second order accurate scheme, on the other hand, predicts the velocity profiles, even with a coarse grid, in agreement with the results given with spectral methods[2].

Shown in Fig.1 c and d are the symmetry plane velocity vectors obtained with the first and second order schemes respectively. The flow Reynolds number is 1000 and the dimensionless time level is 30, at which the steady state is practically reached.
[Fig.1 panels: a) vertical centerline, b) horizontal centerline; c) Solution with second order dissipation, d) Solution with fourth order dissipation. Plot axes: U, V, X.]
Fig.1 Cubic cavity velocity profiles for Re=1000 in comparison with the results of Ref[2] on the symmetry plane at steady state (a-b). Present solutions with fourth and second order artificial dissipations are shown. Flow velocity vectors on the symmetry plane (c-d). 11x11x11 stretched grid.
Fig.3 Velocity vectors on the symmetry plane of the sphere, Re=162 000.
a) Solution with second order dissipation. b) Solution with fourth order dissipation.
Fig.4 Velocity vector details at the symmetry plane on the right shoulder of the sphere.
a) Full grid.
a) Solution with second order dissipation.
Shown in Table 1 are the drag coefficient values for the sphere compared with the experimental data[9]. As seen from the values, the first order scheme overestimates the coefficient values whereas the second order scheme underestimates them as compared to experimental values.

The drag coefficient values evaluated for the helicopter fuselage with both schemes are also given in Table 1.

a) Solution with second order dissipation, b) Solution with fourth order dissipation.
Fig.7 Velocity vector details at the wake region of the fuselage, Re=50 000.

a) Solution with second order dissipation, b) Solution with fourth order dissipation.
Fig.8 Velocity vector details at the unfavorable pressure region of the fuselage.

[Plot: explicit scheme, T=3.5, Re=50 000; horizontal axis: X/D, distance.]
Fig.9 Pressure coefficient (Cp) values on the body surface at the symmetry plane.

CONCLUSION

A computer code based on a second order accurate scheme is developed and implemented for flows involving large separations and strong recirculations about arbitrary shapes.

The results obtained for various test cases are in good agreement with the existing numerical and experimental data.

The code is implemented satisfactorily to predict the drag coefficient of a generic helicopter fuselage.

ACKNOWLEDGEMENT

This work is partially supported by ITU, Research Fund under project number 494.

REFERENCES

[1] U. Gülçat, An Explicit FEM for 3-D General Viscous Flow Studies Based on E-B-E Solution Algorithms, Comp. Fluid Dyn., vol.4, pp.73-85, 1995.
[2] Hwar C. Ku, Richard S. Hirsh and Thomas D. Taylor, "A Pseudospectral Method for Solution of the Three-Dimensional Incompressible Navier-Stokes Equations", Journal of Computational Physics, 70, pp.439-462, 1987.
[3] A. Mizukami and M. Tsuchiya, A Finite Element Method for the Three-Dimensional Non-Steady Navier-Stokes Equations, Int.J.Num.Meth.in Fluids, vol.4, pp.349-357, 1984.
[4] O.C. Zienkiewicz and R.L. Taylor, The Finite element method, volume 1, McGraw-Hill book company, London, 1989.
[5] A.R. Aslan, F.O. Edis, U. Gülçat and E. Gurgey, Prediction of General Viscous Flows Using a Finite Element Method, Proceedings of the 8th International Conference on Numerical Methods in Laminar and Turbulent Flows, Swansea, U.K., July 18-23, 1993.
[6] M.F. Webster and P. Townsend, Development of a Transient Approach to Simulate Newtonian and Non-Newtonian Flow, NUMETA'90, Numerical Methods in Engineering: Theory and Applications, Edts: G.N. Pande and J. Middleton, Vol.2, Elsevier Publications, U.K., pp.1003-1012, 1992.
[7] U. Gülçat, A.R. Aslan and A. Mısırlıoğlu, Aerodynamics of Fuselage and Store-Carriage Interaction using CFD, AGARD 76th Fluid Dynamics Panel Symposium on Aerodynamics of Store Integration, 24-27 April, Ankara, Turkey, 1995.
[8] Y. Kallinderis and K. Nakajima, Finite Element Method for Incompressible Viscous Flows with Adaptive Hybrid Grids, AIAA Journal, vol.32, No.8, pp.1617-1625, August, 1994.
[9] S.W. Churchill, Viscous flows, the practical use of theory, pp.360-400, Butterworths series in chemical engineering, Butterworth publishers, USA, 1972.
SUMMARY

Chorin's method of artificial compressibility is extended to both compressible and incompressible fluids by using physical arguments to define artificial fluid properties that make up a local preconditioning matrix. In particular, perturbation expansions are used to provide appropriate temporal derivatives for the equations of motion at both low speeds and low Reynolds numbers. These limiting forms are then combined into a single function that smoothly merges into the physical time derivatives at high speeds so that the equations are left unchanged at transonic, high Reynolds number conditions. The effectiveness of the resulting preconditioning procedure for the Navier-Stokes equations is demonstrated for wide speed and Reynolds number ranges by means of stability results and computational solutions. Nevertheless, the preconditioned equations sometimes fail to provide a solution for applications for which the non-preconditioned equations converge. Often this is because the reduced dissipation in the preconditioned equations results in an unsteady solution while the more dissipative non-preconditioned equations result in a steady state. Problems of this type represent a computational challenge: it is important to distinguish between non-convergence of algorithms, and the non-existence of steady state solutions.

1 INTRODUCTION

Time-marching techniques have proven to be very effective for the computation of high Reynolds number flows in the transonic, supersonic and hypersonic regimes. These methods, however, become inefficient at low speed or low Reynolds number conditions including the near wall regions of high Reynolds number flows. For this reason, incompressible and low speed computations were dominated by pressure-based procedures [1] for many years. Chorin's pseudo-compressibility method [2], which has become widely accepted for incompressible flows [3], opened one avenue for applying time-marching procedures to incompressible flows but there was little realization that this procedure could be broadened to enable computations at all speeds until recently.

Extensions of time-marching methods to low Mach number compressible flows became possible with the realization that it was the stiffness of the eigenvalues that slowed convergence at low speeds. Low Mach number perturbation procedures were first used to remove these problems [4] and were used in pressure-based methods to compute low speed compressible solutions. The implementation of time-marching methods to the low Mach number perturbation equations was first reported by Gustafsson [5], followed by extensive applications by the present authors [6]. Perturbation expansion methods have also been extended to combustion problems [7]. Of these perturbation expansion methods, some [6,7] used the more conventional expansion procedures based on the square of the Mach number, while others [5,8] expanded the equations in terms of the first power of the Mach number.

In parallel with these perturbation procedures, local preconditioning methods in which the time derivatives of the equations of motion are multiplied by a matrix to control the eigenvalues have also been used to enhance convergence [8-16]. Unlike the perturbation equations, these preconditioned equations are valid at all speeds, and so have a potential for providing uniform convergence over all Reynolds and Mach number regimes. Two distinct philosophies have been followed in developing these preconditioning methods. One uses the perturbation procedures described above and deals with the full Navier-Stokes equations and includes the Euler equations as a special case [11-14]. The intent of this approach is to improve convergence at low speeds and Reynolds numbers only, while leaving it unaltered at high Reynolds numbers and high speeds (transonic and above) where it is already quite efficient. This method has been applied extensively to a wide variety of practical applications.

The second approach [15,16] provides a rigorous method for developing a preconditioning matrix for the Euler equations, but equally rigorous extension to the Navier-Stokes equations appears doubtful. This preconditioning procedure is intended to provide optimum convergence over the entire Mach number regime, but limited applications have thus far demonstrated convergence enhancement only in the low Mach number regime [16]. Even there, this second method is generally less effective than that provided by the perturbation-expansion-based methods. Further, the convergence enhancement to be had at transonic and supersonic speeds is very limited because time-marching methods are already efficient there so that substantial improvements are unlikely.

The purpose of the present paper is to demonstrate how a viscous preconditioning procedure can be developed from the basic physics of the flow using low speed and low Reynolds number perturbation expansions. As a part of this development, the link between our compressible preconditioning method and the artificial compressibility method of Chorin is shown. Following some representative examples of convergence enhancement for a wide variety of problems, the paper closes by addressing the issue of the robustness of preconditioning methods. One specific example is given in which the preconditioned methods fail to provide convergence to a steady state. Detailed investigation shows that the physical problem is unsteady and a steady solution fails to exist. The reduced artificial dissipation in the preconditioned solution makes this unsteadiness more apparent. The prospect of distinguishing non-convergence from the non-existence of steady state solutions is thus raised as a challenge facing CFD techniques.
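The eigenvalue stiffness mentioned in the introduction can be made concrete with a short sketch. It assumes the familiar one-dimensional Euler wave speeds u, u + c and u - c and measures stiffness as the ratio of the largest to the smallest magnitude; the function name is hypothetical and the measure is one common choice, not necessarily the one used by the authors.

```python
def euler_condition_number(mach):
    """Ratio of largest to smallest |wave speed| among u, u + c, u - c,
    with speeds normalized by the sound speed c (so u / c = mach).
    Not defined at mach = 0 or mach = 1, where one speed vanishes."""
    speeds = [abs(mach), abs(mach + 1.0), abs(mach - 1.0)]
    return max(speeds) / min(speeds)

for m in (0.5, 0.1, 0.01, 0.001):
    print(f"M = {m:6.3f}   condition number = {euler_condition_number(m):10.1f}")
```

The ratio grows like 1/M as the Mach number goes to zero, so a time step sized for the acoustic waves advances the particle wave only a small fraction of a cell per step. This is the disparity that preconditioning is meant to remove.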
where the viscous terms are given by the operator $L_v$, and the vectors $Q$, $Q_v$ and $E$ are

[...]

$e = \rho\varepsilon + \tfrac{1}{2}\rho(u^2+v^2+w^2)$   (3)

The enthalpy, $h$, is related to the internal energy and the pressure,

$\rho h = \rho\varepsilon + p$   (4)

and for a perfect gas can be expressed as a function of the temperature alone, $h = h(T)$. The stagnation enthalpy is defined as $h_0 = h + (u^2+v^2+w^2)/2$. The formulation is completed by the perfect gas equation of state which we write as,

[...]

where $\rho_p$, $\rho_T$, and $h_T$ are partial derivatives. For a perfect gas $\rho_T = -\rho/T$; $\rho_p = 1/RT$, $h_T = \gamma R/(\gamma-1)$ where $\gamma$ is the ratio of specific heats. Note that $h_T$ is the specific heat at constant pressure.

[...]

with analogous expressions for $B_v = \partial F/\partial Q_v$ and $C_v = \partial G/\partial Q_v$.

3 LOW MACH NUMBER SCALING

The eigenvalues of (6) determine the convergence rate of the time-marching algorithm. These eigenvalues are obtained from the roots of the fifth order polynomial:

[...]   (9)

which are readily found to be $u$, $u$, $u$, $u \pm c$ where the acoustic speed, $c$, is given by,
eigenvalues at all speeds, thereby ensuring fast, efficient convergence. The definition of the three parameters in the preconditioning matrix, $\Gamma_v$, will be obtained from perturbation analyses of the equations of motion at low speeds and low Reynolds numbers. Their presence introduces an artificial speed of sound, $c'$, that ensures that eigenvalue stiffness is avoided. Additional restrictions on the preconditioning matrix have been given by Viviand [9] and Choi and Merkle [12].

To overcome the difficulties at low speeds, we expand $Q_v$ in the power series,

[...]

The quantity $h_T'$ is, as yet, free.

With the special values for $\rho_p'$ and $\rho_T'$ given in (15) and (16), two eigenvalues of $\Gamma_v^{-1}A_v$ become equal to the particle speed $u$. The third eigenvalue also equals $u$ if $h_T = h_T'$ or if the physical properties $\rho_p$ and $\rho_T$ are zero as in incompressible flow. For these conditions, the full set of eigenvalues of $\Gamma_v^{-1}A_v$ is:

[...]   (20b)

where $M_r = V_r/c$ is the reference Mach number. The behavior of this eigenvalue is difficult to determine from the algebraic form, but it is seen that as the Mach number goes to zero, the eigenvalues approach a constant times the particle speed. Numerical checks verify that this constant is of order unity and the eigenvalues are well-conditioned. Stability results for this condition are presented later.

For incompressible flow, the coefficients $a$ and $b$ become:

[...]   (21)

which are clearly well-behaved. Choosing $k_p = 2$ gives eigenvalues whose ratio is no worse than 2. This choice is identical to the artificial compressibility method of Chorin [2]. We also note that for incompressible flow, it is not necessary to set $h_T = h_T'$ to obtain simple algebraic eigenvalues, and the third "particle" eigenvalue becomes $\lambda = u h_T/h_T'$, so the parameter $h_T'$ can be selected to control convergence in the energy equation if desired.

4 LOW REYNOLDS NUMBER EQUATIONS

Having obtained some understanding of the way the Euler equations scale, we now turn to the Navier-Stokes equations and consider their proper scaling in the limit of low Reynolds numbers. Here, we use a similar perturbation expansion, but we let the small parameter, $\varepsilon$, be the Reynolds number, Re. We begin by premultiplying (11) by the matrix $P_L^{-1}$,

[...]

and results in the convective terms,

[...]

for the x-direction. Multiplication by $P_L^{-1}$ does not affect the viscous terms in the momentum equations, but the corresponding terms in the energy equation reduce to the conduction term plus the viscous dissipation. The modified energy equation becomes:

[...]

scaled for low Reynolds numbers to see how our three parameters $\rho_p'$, $\rho_T'$ and $h_T'$ must behave in the diffusion-dominated limit.

For low Reynolds numbers, we scale the momentum equations such that the temporal derivatives and the pressure gradient remain of the same order as the viscous terms as the Reynolds number goes to zero. This defines the proper scaling for the pressure ($p_r = \mu_r V_r/L$) and the time ($t_r = \rho_r L^2/\mu_r$), but imposes no conditions on any of our three preconditioning parameters.

Using this reference pressure and time and requiring that the temporal term in the continuity equation balance the convective terms at low Reynolds numbers, results in the condition on $\rho_p'$,

[...]

To prevent the temporal derivative of the temperature from appearing in the momentum equation, we also require,

$\rho_T' = k_T\,\rho_r\,Re/T_r$   (27)

In these expressions, $k_p$ is a constant of order unity, while $k_T$ is less than or equal to one.

Scaling the energy equation in a manner consistent with these definitions results in the requirement,

[...]

Note that in the energy equation, we have assumed that the quantity $V_r^2/c_{p_r}T_r$ is small. Retaining it adds a pressure gradient term to the energy equation, but does not affect the requirements placed upon our three parameters. Equations (30) are the creeping flow equations.

5 SUMMARY OF PRECONDITIONING PROCEDURE
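The low Reynolds number reference scales quoted in Section 4, the viscous pressure $\mu_r V_r/L$ and the diffusive time $\rho_r L^2/\mu_r$, can be compared with the usual convective scales in a few lines. This is a minimal sketch; the function name and the sample property values are hypothetical and serve only to show how the two sets of scales differ by a factor of the Reynolds number.

```python
import math

def low_reynolds_scales(rho_r, mu_r, v_r, length):
    """Return (Re, viscous pressure scale, diffusive time scale)."""
    re = rho_r * v_r * length / mu_r
    p_r = mu_r * v_r / length        # viscous pressure scale
    t_r = rho_r * length**2 / mu_r   # diffusive time scale
    return re, p_r, t_r

# Hypothetical creeping-flow sample: air-like properties, tiny speed and length.
rho_r, mu_r, v_r, length = 1.2, 1.8e-5, 1.0e-4, 1.0e-3
re, p_r, t_r = low_reynolds_scales(rho_r, mu_r, v_r, length)

dynamic_pressure = rho_r * v_r**2    # inviscid pressure scale
convective_time = length / v_r       # inviscid time scale
```

Here the viscous pressure scale equals the dynamic pressure divided by Re, and the diffusive time equals the convective time multiplied by Re, which is why the inviscid scaling becomes inappropriate once Re drops below unity.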
transitions smoothly between these limits while also approaching the non-preconditioned equations for high Reynolds number, transonic flows. This functional form will be developed by combining these limiting values into a single continuous function and then verifying the results first by means of stability theory, then by simplified computational problems, and finally by practical applications at low speed, low Reynolds number and transonic conditions.

Preconditioning the Euler equations is relatively easy, but preconditioning the Navier-Stokes equations is more difficult for several reasons. First of all, the appropriate Reynolds number must be determined. Stability results show that the cell Reynolds number, $u\Delta x/\nu$ (where $u$ represents the local velocity, $\Delta x$ represents the grid spacing and $\nu$ is the kinematic viscosity), is the appropriate viscous scale, and that diffusive effects become dominant at cell Reynolds numbers less than unity. The transition from inviscid- to viscous-dominated flows thus depends on both the flowfield and the grid. Viscous flows can switch from convection-dominated to diffusion-dominated because of increased grid resolution or stretching. The second reason for difficulty arises because the presence of boundary layers at high Reynolds numbers requires high aspect ratio grids with fine resolution normal to the walls. Correspondingly, there are two cell Reynolds numbers of widely differing magnitude. The one based on the normal grid spacing is generally diffusion dominated, while the one based on the streamwise spacing is generally convection-dominated. The issue in viscous preconditioning is to deal with near-wall cells that are viscously-dominated in one direction and convectively-dominated in the other, while simultaneously treating convectively-dominated cells in regions away from the walls.

We demonstrate two ways in which the limiting forms of the artificial properties in Table I can be combined into a single function that can be used over the full Reynolds-Mach number domain. The parameter $\rho_p'$ is the primary quantity in controlling eigenvalues, and we begin by considering this quantity. The simplest procedure is to choose $\rho_p'$ as the minimum of the viscous and inviscid values,

$\rho_p' = (k_p/V_r^2)\,\mathrm{Min}\{1, Re^2\}$   (30)

where we have used the same constant at both conditions. This is equivalent to using the smaller of an inviscid or viscous time step and has proven effective in many problems [11]. Clearly, this corresponds to switching from the inviscid to the viscous value when the Reynolds number goes below unity. The most appropriate Reynolds number for this switch is the cell Reynolds number.

The function in Eq. 30 can likewise be made to merge smoothly with the physical properties at transonic conditions by noting that at Mach one, $V_r^2 = c^2$, so that if we choose $k_p = k_p' = \gamma$, Eq. 30 degenerates to $\rho_p' = \rho_p$ at Mach one. (In computations for incompressible flow we have generally chosen $k_p = k_p' = 1.33$.) The remaining artificial properties can be made continuous by setting $k_T = k_T' = 0$, and by setting $k_h = 1$ and $k_h' = Pr$. This latter choice does not precisely satisfy the viscous matching condition, but since the Prandtl number for most gases is near one, it is close enough to give good results. All the examples we give are based on this combination of artificial properties.

The second procedure is similar, but instead of using a function with a discontinuous slope for $\rho_p'$ we make both the function and its derivatives continuous. Here we define the three parameters as:

[...]

These functions reach the proper limits at low Reynolds numbers, low Mach numbers, and at high Reynolds number, transonic conditions. In particular, the function for $h_T'$ switches continuously from unity to 1/Pr as the cell Reynolds number goes through unity. When the Mach number approaches unity, $\Gamma_v$ approaches the physical Jacobian, $\partial Q/\partial Q_v$, and the preconditioned equations become identically the physical equations. Choosing $\rho_T' = 0$ as in the first example gives simpler preconditioned equations, but only causes the modified eigenvalues of the equations to approach the physical eigenvalues as the Mach number goes to unity. The equations remain distinct.

In summary, we scale the time derivatives at high cell Reynolds numbers to keep the convective eigenvalues well-conditioned, whereas at low cell Reynolds numbers, we scale so that the equations reduce to simple diffusive equations. We also scale the dominant convective speed so that it is the same order as the diffusive time-scale. The low Reynolds number scaling causes the convective terms to become stiff, but because they are small, this doesn't slow convergence. To assess this scaling, we use Fourier stability theory for the full Navier-Stokes equations using Reynolds number and Mach number as parameters.

Table I. Preconditioning Parameters Dictated by Reynolds and Mach Number Scaling

Term        Low Mach Number        Low Reynolds Number
$\rho_p'$   $k_p/V_r^2$            $k_p' Re^2/V_r^2$
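The "minimum of the viscous and inviscid values" rule for the artificial property $\rho_p'$ can be sketched as a small function. The names below are hypothetical, and the power of the Reynolds number is exposed as a parameter because it is an assumption here: the default of 2 follows the low Reynolds number entry of Table I. The check at the end uses the perfect-gas relations $\rho_p = 1/RT$ and $c^2 = \gamma RT$ to confirm the Mach-one matching described in the text.

```python
import math

def rho_p_artificial(v_r, re_cell, k_p, re_power=2):
    """Artificial d(rho)/dp: inviscid value k_p / v_r**2, reduced by
    re_cell**re_power once the cell Reynolds number drops below one.
    re_power=2 is an assumption taken from the Table I entry."""
    return (k_p / v_r**2) * min(1.0, re_cell**re_power)

# Perfect-gas sample state (hypothetical values).
gamma, gas_constant, temperature = 1.4, 287.0, 300.0
c = math.sqrt(gamma * gas_constant * temperature)   # acoustic speed
rho_p_physical = 1.0 / (gas_constant * temperature)

# At Mach one the reference velocity equals c; with k_p = gamma and a
# convection-dominated cell the artificial property equals the physical one.
at_mach_one = rho_p_artificial(v_r=c, re_cell=50.0, k_p=gamma)

# In a diffusion-dominated cell the value is reduced by the Reynolds factor.
viscous_cell = rho_p_artificial(v_r=c, re_cell=0.1, k_p=gamma)
```

The `min` gives a function with a discontinuous slope at a cell Reynolds number of one, which is the feature the second procedure in the text replaces with a smooth blend.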
Figure 1: Euler: CD-ADI, M=0.01, No Preconditioning, CFL=5
Figure 3: Euler: LGS-4, I/III, M=0.01, No Preconditioning, CFL=20
of 45°. Figure 1 shows results for the non-preconditioned case, while Fig. 2 is for the case with preconditioning. Both of these stability predictions are for central differencing in space and ADI approximate factorization in time. The stability results without preconditioning indicate the amplification factor is nearly unity (0.9999) over the mid- and low-wave-number regions, thereby vividly demonstrating the stiffness that is encountered at low speeds.

By contrast, the amplification factors in Fig. 2 for the preconditioned case are quite reasonable with damping rates of around 0.9 over most of the mid- and low-wave-number ranges with sharp fall-off along the axes (except at the corners) indicating that the preconditioned system will provide fast, efficient convergence at this low Mach number condition. We do note that the amplification factor goes to unity in all four corners, but these peaks are easily removed by a small amount of artificial dissipation. Companion stability results (not shown) indicate that this preconditioned stability result is independent of Mach number, and that it is nearly identical to that for the non-preconditioned equations at a Mach number of 0.7, suggesting that convergence with the preconditioned system will be similar to the efficient convergence observed with the non-preconditioned system at high subsonic Mach numbers. The non-preconditioned eigenvalues in Fig. 1, however, indicate that this case will converge very slowly, an indication that is verified by computations. This demonstrates the ease with which the stiffness in the Euler equations can be removed.

To further demonstrate the effectiveness of the preconditioning, we show stability results for similar conditions in Figs. 3 and 4, except that upwind differencing is used for the spatial discretization and line Gauss-Seidel approximate factorization is used for the solution procedure. Figure 3 shows the non-preconditioned stability results for M = 0.01. These eigenvalues again contain an unacceptable stiffness in the low-wave-number region. This stiffness is, however, removed by the preconditioning as shown in Fig. 4. Again, this preconditioning renders the stability results essentially
Figure 2: Euler: CD-ADI, M=0.01, With Preconditioning, CFL=5
Figure 4: Euler: LGS-4, I/III, M=0.01, With Preconditioning, CFL=20
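The flavour of these stability results can be reproduced on a much simpler model problem. The sketch below is not the two-dimensional ADI or line Gauss-Seidel analysis of the paper: it applies Fourier (von Neumann) analysis to implicit Euler time stepping with central differencing for scalar advection, for which the amplification factor is $g = 1/(1 + i\,\sigma\sin\theta)$ with $\sigma$ the CFL number. The function names and the idealization that preconditioning lets every wave see the full CFL number are assumptions made for illustration.

```python
import numpy as np

def amplification(cfl_eff, theta):
    """|g| for implicit Euler + central differencing of u_t + a u_x = 0."""
    return 1.0 / np.sqrt(1.0 + (cfl_eff * np.sin(theta))**2)

def slow_wave_cfl(cfl, mach):
    """Effective CFL of the particle wave (speed u) when the time step is
    chosen from the fastest acoustic wave (speed u + c)."""
    return cfl * mach / (1.0 + mach)

theta = np.pi / 2   # mid-wave-number mode
# Without preconditioning: the particle wave barely moves at M = 0.01.
g_plain = amplification(slow_wave_cfl(5.0, 0.01), theta)
# Idealized preconditioning: all waves travel at comparable speeds,
# so the particle wave sees the full CFL number.
g_precond = amplification(5.0, theta)
```

In this model the non-preconditioned amplification factor stays within a fraction of a percent of unity at M = 0.01, while the idealized preconditioned value is far below one, mirroring the contrast between Figs. 1 and 2.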
[Figure: curves labeled "Original" and "Inviscid Preconditioning"; horizontal axis: Mach Number; Re = 100.]
broad variety of applications including low speed compressible flows, combustion problems, incompressible flows, supercritical fluids and extrusion modeling. Space does not permit a complete demonstration of all these examples, but we present some representative results to demonstrate the capabilities.

Figure 9 shows results for laminar flow over a backstep at a Reynolds number of 200. The v-velocity contours are shown, along with the convergence rate with the preconditioned and non-preconditioned cases. These computations are done with the line Gauss-Seidel algorithm. Clearly, viscous preconditioning provides a major enhancement to the convergence rate.

As a second example, we consider the flow through a converging-diverging rocket nozzle. The turbulent boundary layers in this nozzle are very thin because of the high Reynolds number and strong wall cooling. The corresponding strong grid stretching (aspect ratios larger than $10^6$) required near the wall introduces important low Reynolds number effects in this otherwise high Reynolds number flow. With standard algorithms, the solution converges at a reasonable rate for about four orders of magnitude (which would appear to be sufficient), and switches to a very slow rate of convergence. With preconditioning, the convergence continues to machine zero at a rate that is faster than the initial convergence of the non-preconditioned solution. The heat flux to the wall is shown in Fig. 10 as a function of axial distance for both calculations at several time steps. The lower plot shows

Figure 9: Contours of velocity and convergence for the backward-facing step at a Reynolds number of 100 using the four-sweep Line Gauss-Seidel scheme.
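The way strong grid stretching brings low Reynolds number effects into a high Reynolds number flow can be seen by evaluating the cell Reynolds number separately with each grid spacing. This is a minimal sketch: the function name and the sample values (local speed, viscosity, spacings) are hypothetical, and the same local speed is used in both directions for simplicity.

```python
import math

def directional_cell_reynolds(speed, nu, dx, dy):
    """Cell Reynolds numbers built from the streamwise spacing dx and the
    wall-normal spacing dy, using the same local speed and viscosity."""
    return abs(speed) * dx / nu, abs(speed) * dy / nu

# Hypothetical near-wall cell with an aspect ratio of one million.
speed, nu = 10.0, 1.5e-5
dx = 1.0e-2
dy = dx / 1.0e6
re_x, re_y = directional_cell_reynolds(speed, nu, dx, dy)
print(f"streamwise cell Re = {re_x:.3g}, wall-normal cell Re = {re_y:.3g}")
```

The streamwise value is far above unity (convection-dominated) while the wall-normal value is far below it (diffusion-dominated), so a single inviscid scaling cannot suit both directions of such a cell.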
One issue with regard to preconditioned systems is their impact on code robustness. We have encountered many experiences in which preconditioning improves robustness (i.e., cases where the non-preconditioned code fails to converge while preconditioning makes convergence very reliable). It must, however, be recognized that preconditioning increases the local time step dramatically and this large time step may require some restriction at early stages of the computation (although the restricted time step may still be larger than the corresponding non-preconditioned time step). The present example, however, demonstrates that there are some cases for which the preconditioning may not improve convergence because a steady solution does not exist. In these cases, the preconditioned system proves its worth in an unsteady, iterative solution procedure.

8 CONCLUSIONS

The proper limiting forms of the equations of motion at low speeds and in diffusion-dominated regions have been obtained by perturbation expansions and used as the basis for defining a preconditioning matrix for convergence enhancement. The expansion results show that $\rho_p$ is the most important variable in controlling convergence while $\rho_T$ is of secondary importance. Convergence control can be obtained by replacing these physical derivatives by artificial ones in the time derivatives, while retaining the physical quantities in the flux terms so the solutions are unchanged. Appropriate replacement terms for these quantities obtained from the expansion procedures are then generalized so that they approach the physical quantities in the transonic and supersonic regimes.

Following the development of a generalized preconditioner that ensures that the condition number of the Jacobian matrices of the equations of motion remains of order one at all Mach numbers, the resulting convergence characteristics are first checked by means of stability theory. The effectiveness of the methods is then verified by computations of a variety of problems, starting first with simple applications and then going to practical examples. Efficient, uniform convergence is demonstrated for a variety of applications covering a range of Reynolds and Mach number conditions. Overall, it is demonstrated that convergence enhancement of the Euler equations at low speeds is quite easy and can be readily achieved. Extension to the Navier-Stokes equations requires more care, but the present procedure provides much improved convergence in some of the traditional problem areas for time marching methods, while having no adverse effects in regimes where the methods already work efficiently.

ACKNOWLEDGMENT

This work was supported under NASA grants NAGW-1399, NAS 8-3886, and NCC 8-46.

Figure 13: Time History of Axial Velocity at one point in the Injector Flowfield for various differencing schemes
Practical Aspects of Krylov Subspace Iterative Methods in CFD
Thomas H. Pulliam
Advanced Computational Methods Branch
Stuart Rogers
Design Cycle Technologies Branch
Timothy Barth
Advanced Computational Methods Branch
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
with $\Delta Q = Q^{n+1} - Q^n$, the Jacobian ($A$) of the Function Evaluation, $R(Q)$, and $D$ a positive diagonal matrix. For $\Delta t \to \infty$ this is exactly Newton's method and for finite $\Delta t$ a relaxed form of Newton's method. In many applications of Newton-like methods to the Euler and Navier-Stokes equations this time-like relaxation is used to start the solution process. A finite time step $\Delta t$ is used initially to get past the rather violent nonlinear startup and then increased to $\Delta t \to \infty$ leading to rapid linear, [illegible in source]

parameter for the GMRES method is the size of the subspace $m$. As $m$ increases, the memory increases linearly and the computation quadratically. The parameter $m$ is usually chosen based on storage requirements and effectiveness of the Inner Iteration. In the discussion below, we will have more specific things to say about this requirement and its effect on the overall process. To avoid the increasing memory and computation requirements with increasing $m$, a common modification of GMRES is to apply restarts. An $m$ is chosen and if convergence is
not reached, the Krylov subspace process is restarted with the current residual $r_m$ replacing $r_0$. In this case, the memory requirements are traded off against the convergence of the Inner Iteration process, and this will definitely affect the Outer Iteration.

Numerical experience has shown that success or failure of GMRES hinges critically on adequate preconditioning of the linear systems to be solved. A preconditioning matrix $M$ is usually applied in either a left-preconditioning $M(b - Ax) = 0$ or right-preconditioning $b - AMy = 0$ (with $y = M^{-1}x$) fashion. Ideally $M$ should be chosen to be an approximation to $A^{-1}$. Although the most successful and popular form of $M$ appears to be ILU [13] (Incomplete Lower-Upper Factorization), we will also consider alternate preconditioners in the next section.

The important ingredients of the Newton-GMRES method which we will focus on in this paper are the $Ap$ products required to form the Krylov subspace vectors $K_m(A, v)$, the choice of the preconditioner $M$, the size of the subspace $m$ and restart size $m_r$, and the storage requirement influenced by all these factors. We will attempt to put the various trade-offs in terms of memory requirements, convergence and efficiency in perspective (in particular for the two approaches discussed here, but also in general).

3 Structured Mesh Incompressible Navier-Stokes

Rogers [17] has implemented the Newton-GMRES algorithm into a two-dimensional incompressible Navier-Stokes code (INS2D) and has made some significant comparisons with the conventional techniques of implicit point and implicit Gauss-Seidel line relaxation. The INS2D flow code [18] solves the Reynolds-averaged incompressible Navier-Stokes equations using the method of artificial compressibility [9]. It is capable of handling multiple-zone structured grids using either a patched multi-block (pointwise continuous) interface, or an overlaid (chimera) interface between zones. The boundary conditions at the physical boundaries and at zonal boundaries are applied in an implicit fashion during the solution process. A third-order, upwind-differencing scheme based on the method of Roe [16] is used to discretize the convective terms, and the viscous terms are differenced using second-order central differences. The system of equations is integrated in pseudo-time using an implicit Euler time discretization. Typically, the timestep is set to infinity ($10^8$) which results in a Newton's method approach where the implicit point or line relaxation schemes are used for the Inner Iteration or, more specific to this paper, GMRES is used for the Inner Iteration.

In the current implementation, the Jacobian $A$ is formed based on a first-order differencing of the convective terms, whereas third-order differencing is used for $R(Q)$. In addition, approximate Jacobians of the Roe flux differences from the upwind-differencing scheme are used in the definition of $A$, see [1] for more details. The first-order difference operator is used to reduce the bandwidth of the resulting $A$ matrix, which has lower memory and computational requirements for the solution of Eq. 3. However, this use of approximate Jacobians can also slow the convergence to a steady state, that is, the Outer Iteration nonlinear Newton process is affected.

The GMRES implementation is preconditioned using block ILU(0) [13] and the matrix $A$ is stored so that $Ap$ products can be efficiently formed and the ILU process streamlined. For comparison, block point relaxation and block line relaxation are used as both the Inner Iteration solver and as preconditioners for the GMRES Inner Iteration process. Including a subspace size typically on the order of $m = 10$ leads to additional storage requirements as discussed below which are somewhat of a burden in two dimensions and would be a significant hindrance in three dimensions. The use of the approximate Jacobian (due to the first order form and the linearization errors associated with the Roe solver) produces an approximate Newton's method and therefore linear convergence is realized as opposed to the potential for quadratic convergence.

Rogers [17] examines a wide range of cases and options in his paper on the Newton-GMRES implementation. Table 1 shows the characteristics of the cases presented and itemizes the costs of the various schemes for each case broken down by the fundamental steps in the algorithm. Base memory (B MW) includes all overhead storage for the algorithm including memory for either L (line relaxation), P (point relaxation) or the $Ap$ product in G (GMRES) ($\approx$ 76 words/point). The additional memory (A MW) is composed of subspace size ($\approx 3 \times (m + 4)$ words/point) and preconditioner ($\approx$ 9 words/point) contributions for GMRES. The timings are in ms/pt: milliseconds/point to convergence (maximum divergence < 10^[illegible in source]). The standard approaches of point relaxation and line relaxation are compared directly with the Newton-GMRES scheme and are also assessed as preconditioners for Newton-GMRES. The first few cases are for a NACA 4412 airfoil at an angle of attack $\alpha = 13.87^\circ$ and a Reynolds number $Re = 1.5 \times 10^6$ and are computed on a set of refined grids. The multi-element case is a three element airfoil at $\alpha = 8^\circ$ and $Re = 9 \times 10^6$; a schematic of the grid system is shown in Figure 1.

Figures 2, 3, 4, 5 show comparisons for the above cases, where the symbols represent every 50 Outer Iterations. It seems obvious from these results that the GMRES-ILU combination is the more efficient in terms of computation time. On the other hand, the negative aspect of the GMRES-ILU combination is the memory requirements. By examining the trade-offs between CPU time to convergence and memory requirements optimal choices can be made.

For example, Figure 2 shows the effect on CPU time to convergence for various choices of $m$. A subspace size $m = 10$ seems to be optimal in terms of computational costs, including reasonable added memory requirements. Also, note that for the converging cases of $m = 10, 20, 40$, it required 50 Outer Iterations to reach the same level of convergence (CPU times are larger reflecting the added computational costs of a larger subspace size). This is not surprising since the inexact Jacobian used in this scheme limits the Inner Iteration process to linear convergence. Therefore, after some point, it does not pay to converge the Inner Iteration past some tolerance level without incurring additional cost in terms of CPU time and memory.

Iteration tolerance level $\epsilon$ thereby solving the GMRES step more accurately. In this case, to reach a certain convergence criterion, e.g. Outer Iteration residual to [illegible in source] in the least number of iterations, requires decreasing $\epsilon$; $\epsilon$ = [illegible in source] gets there in 50 Outer Iterations. On the other hand, the CPU time cost (shown in milliseconds/point) and average subspace size $m$ (which leads to additional memory requirements proportional to $m$) indicate that a loose tolerance, say $\epsilon \approx 10^{-1}$, and small $m$ produce the most efficient combination. This leads to $m = 10$ as the optimal choice both in terms of CPU efficiency and memory requirements.

Memory estimates for the three-dimensional code INS3D include a base memory of 146 words/point, additional GMRES memory of $4 \times (m + 4)$ words/point and preconditioner memory of 16 words/point. Thus for GMRES(10)+ILU(0) the total memory is 218 words/point. Examples include a simple wing: 0.2 million points (Mpoints): 43.6 MW, a wing+slat+flap: 1.6 Mpoints: 349 MW, and a C17 Aircraft: 25 Mpoints: 5450 MW = 5.45 GW. These requirements are excessive in three dimensions and need to be reduced if these codes are to be used in practice.
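The INS3D memory estimate quoted above can be checked with a few lines of arithmetic. The following is only a sketch: the function names are hypothetical, and the constants are the per-point figures stated in the text (146 base words, 4 x (m + 4) GMRES words, 16 preconditioner words).

```python
def gmres_words_per_point(m, base=146, per_vector=4, extra_vectors=4, precond=16):
    """Storage in words per grid point for GMRES(m) with ILU(0).

    Defaults are the INS3D figures quoted in the text; the function
    itself is an illustrative helper, not part of INS3D.
    """
    return base + per_vector * (m + extra_vectors) + precond


def total_megawords(points_in_millions, m=10):
    """Total storage in megawords for a grid of the given size."""
    return points_in_millions * gmres_words_per_point(m)


if __name__ == "__main__":
    cases = [("simple wing", 0.2), ("wing+slat+flap", 1.6), ("C17 aircraft", 25.0)]
    for name, mpoints in cases:
        print(f"{name}: {total_megawords(mpoints):.1f} MW")
```

Because the subspace term grows linearly with m, doubling the subspace size to m = 20 adds 40 words/point in this model, which is why the text treats small m as preferable in three dimensions.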
the physical flux Jacobian, evaluated at some combination of the right and left states of the flow variables, $u_R$, $u_L$. Exact analytical expressions for these terms are available [1]. In constructing the Jacobian matrix for the entire scheme it is useful to conceptualize the finite-volume scheme in composition form:

$$R(Q) = \mathcal{L}_1(\mathcal{L}_2(Q)), \qquad (9)$$

with $\mathcal{L}_1$ representing the flux quadrature and accumulation step and $\mathcal{L}_2$ representing the data reconstruction step. In this form, each operator requires distance-1 information. The Jacobian matrix can then be written as

$$\frac{dR}{dQ} = \frac{d\mathcal{L}_1}{d\mathcal{L}_2} \frac{d\mathcal{L}_2}{dQ} \qquad (10)$$

with the critical observation that the Jacobian matrix can be calculated as the sparse product of two matrices. This could potentially be an expensive task, but because of the special form of $\mathcal{L}_1$ and $\mathcal{L}_2$, the resulting sparse product produces at most distance-2 fill and can be computed at reasonable cost.

4.3 Exact and Approximate Jacobian Matrix-Vector Products

Consider the standard matrix equation $b - Ax = 0$. Iterative matrix solution algorithms for this problem require the computation of matrix-vector products of the form $Ap$ for some arbitrary vector $p$. In the approximate Newton algorithm

$$A = \left[ \frac{D}{\Delta t} - \frac{dR}{dQ} \right]$$

where $D$ is a positive diagonal matrix. In practice the diagonal entries are locally scaled as an exponential function of the norm of the residual

[equation illegible in source]

so that when $\|R(Q)\| \to 0$, the CFL number $\to \infty$ and the scheme approaches Newton's method. It should be emphasized that by using this strategy, the scheme is technically an approximate Newton method which becomes exact only in the final few iterations of the computation.

A major step in the matrix-vector product $Ap$ is the computation of Jacobian derivatives in the direction of $p$ (a Fréchet derivative)

$$Ap = \frac{D}{\Delta t} p - \frac{dR}{dQ} p.$$

Several possible strategies exist for computing the needed Fréchet derivatives:

4.3.1 Sparse Matrix-Vector Multiply

The most straightforward strategy is to analytically compute and store the Jacobian matrix using a compressed storage scheme designed for sparse matrices. This strategy has the added benefit that a copy of the matrix can also be used as a preconditioner for the iterative solver. In addition, the explicit storage also permits the formation of the transposed matrix problem which is often encountered in optimization procedures coupled with Newton's method. Obviously, a drawback of this approach is the large storage requirement.

4.3.2 Approximate Fréchet Derivatives

An alternative to analytically calculating Fréchet derivatives is to approximate them using finite differences [12] [8] [10]. The required Fréchet derivative is a limiting form of the difference approximation

$$\frac{dR}{dQ} p = \lim_{\epsilon \to 0} \frac{R(Q + \epsilon p) - R(Q)}{\epsilon}.$$

The primary concern with this approach is the accuracy of derivatives and the optimal choice for $\epsilon$. If derivatives are not computed accurately then methods such as GMRES iteration may stall or fail. Using a forward difference approximation, $\epsilon$ must be carefully chosen. In general it is insufficient to choose $\epsilon$ as a constant such as the square root of machine precision. Johan [12] also mentions this fact and gives some analysis for choosing $\epsilon$, but this analysis assumes that $R(Q)$ is well scaled. A common choice for $\epsilon$ is given by

$$\epsilon = \epsilon_0 + \epsilon_1 \frac{\|Q\|}{\|p\|}$$

with suitably chosen constants $\epsilon_0$ and $\epsilon_1$. An alternative to forward differencing is to use a higher order accurate formula such as central differencing at double the computational cost.

The clear attraction of this approach is the low memory requirement. On the other hand, the numerical computation of Fréchet derivatives does not produce a matrix approximation which can be used to precondition the system.

4.3.3 Exact Product Forms

In this section we will present a technique for constructing matrix-vector products which is an exact calculation of the Fréchet derivative. Extension to systems and the inclusion of diffusion terms are also handled using this technique.
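The finite-difference strategy of Section 4.3.2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy residual, its analytical Jacobian, the function names, and the default values of the two step-size constants are all hypothetical, chosen only so the approximation can be compared against an exact product.

```python
import numpy as np


def residual(q):
    """Hypothetical two-equation nonlinear residual R(Q), for illustration only."""
    return np.array([q[0] ** 2 + q[1] - 3.0,
                     q[0] + np.sin(q[1])])


def exact_jacobian(q):
    """Analytical dR/dQ of the toy residual above."""
    return np.array([[2.0 * q[0], 1.0],
                     [1.0, np.cos(q[1])]])


def step_size(q, p, eps0=1e-8, eps1=1e-8):
    """epsilon = eps0 + eps1 * ||Q|| / ||p||; the constants are assumed values."""
    return eps0 + eps1 * np.linalg.norm(q) / np.linalg.norm(p)


def frechet_forward(R, q, p):
    """Forward-difference approximation of (dR/dQ) p: one extra residual evaluation."""
    eps = step_size(q, p)
    return (R(q + eps * p) - R(q)) / eps


def frechet_central(R, q, p):
    """Central-difference approximation: two extra residual evaluations."""
    eps = step_size(q, p)
    return (R(q + eps * p) - R(q - eps * p)) / (2.0 * eps)


def matvec(R, q, p, d_over_dt):
    """Matrix-free product A p = (D / dt) p - (dR/dQ) p with a diagonal D / dt."""
    return d_over_dt * p - frechet_forward(R, q, p)


q = np.array([1.0, 0.5])
p = np.array([0.3, -0.7])
exact_product = exact_jacobian(q) @ p
```

A routine like `matvec` is what a Krylov solver would call in place of a stored matrix, which is the source of the low memory requirement noted above; the price is that no matrix is available for building a preconditioner.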
Let $G(E, V)$ denote the triangulation in 2D or 3D with $n$ vertices and $m$ edges. Next we define the incidence matrix, whose entry for vertex $v_i$ and edge $j$ is

-1 if $v_i$ is the origin of edge $j$
1 if $v_i$ is the destination of edge $j$
0 otherwise

Finally, the linearized fluxes are assembled using the same procedure as the residual vector assembly. In actual calculations, the conservative flow variables are not reconstructed, thereby necessitating that a change of variable transformation be embedded in the formulation. This is not a serious complication.
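As a small illustration of the incidence matrix defined above, the sketch below builds it for a single triangle. The function name, the edge list, and the choice of an n x m layout (vertices by edges) are assumptions made for the example; the source only specifies the entries.

```python
import numpy as np


def incidence_matrix(n_vertices, edges):
    """Return an n x m array: -1 at each edge's origin, +1 at its destination."""
    C = np.zeros((n_vertices, len(edges)))
    for j, (origin, destination) in enumerate(edges):
        C[origin, j] = -1.0
        C[destination, j] = 1.0
    return C


# Hypothetical mesh: one triangle with directed edges 0->1, 1->2, 2->0.
edges = [(0, 1), (1, 2), (2, 0)]
C = incidence_matrix(3, edges)

u = np.array([1.0, 4.0, 9.0])   # one value per vertex
edge_differences = C.T @ u      # u[destination] - u[origin] for each edge
```

With this layout, the transpose maps vertex data to edge differences and the matrix itself scatters edge quantities back to vertices with opposite signs, mirroring an edge-based flux accumulation.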
Figure 12: Viscous Flow matrix solution convergence histories for the GMRES(30) algorithm at CFL = 10^[illegible in source] using ILU(0) distance-1 and distance-2 preconditioning matrices. (Horizontal axis: Matrix-Vector Products.)
5 Summary

Practical aspects for Newton-GMRES algorithms from working Navier-Stokes codes have been presented. In particular, implementation issues, such as memory requirements, accuracy requirements for $Ap$ products, tradeoffs between full Newton and relaxed Newton and other pertinent approximations, have been discussed. Two approaches have been highlighted. In the incompressible Navier-Stokes code,

References

[1] T. J. Barth. Analysis of implicit local linearization techniques for upwind and TVD algorithms. Technical Report AIAA-87-0595, Reno, NV, 1987.
[2] T. J. Barth. A Three-Dimensional Upwind Euler Solver of Unstructured Meshes. Technical Report AIAA 91-1548, Honolulu, Hawaii, 1991.
[3] T. J. Barth. Aspects of Unstructured Grids and Finite-Volume Solvers for the Euler and Navier-Stokes Equations, March 1994. von Karman Institute Lecture Series 1994-05.
[4] T. J. Barth. Steiner Triangulation for Isotropic and Stretched Elements. Technical Report AIAA 95-0213, Reno, NV, 1995.
[5] T. J. Barth. An unstructured mesh Newton solver for compressible fluid flow and its parallel implementation. Technical Report AIAA-95-0221, Reno, NV, 1995.
[6] T. J. Barth and D. C. Jespersen. The Design and Application of Upwind Schemes on Unstructured Meshes. Technical Report AIAA 89-0366, Reno, NV, 1989.
[7] R. M. Beam and H. E. Bailey. Newton's method applied to finite-difference approximations for the steady-state compressible Navier-Stokes equations. Journal of Computational Physics, 93(1):108-127, 1991.
[8] P. Brown and Y. Saad. Convergence Theory of Nonlinear Newton-Krylov Algorithms. SIAM J. Optimization, 4:297-330, 1994.
[9] A. J. Chorin. A numerical method for solving incompressible viscous flow problems. Journal of Computational Physics, 2(1):12-26, Aug. 1967.
[10] S. Eisenstat and H. Walker. Globally Convergent Inexact Newton Methods. SIAM J. Optimization, 4:393-422, 1994.
[11] D. C. Jespersen and T. H. Pulliam. Approximate Newton methods and flux vector splitting. Technical Report AIAA Paper 83-1899, AIAA 6th Computational Fluid Dynamics Conference, Danvers, MA, 1983.
[12] Z. Johan. Data Parallel Finite Element Techniques for Large-scale Computational Fluid Dynamics. PhD thesis, Stanford University, Department of Mechanical Engineering, 1992.
[13] J. A. Meijerink and H. A. van der Vorst. Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems. Journal of Computational Physics, 44(1):134-155, 1981.
[14] T. H. Pulliam. Efficient solution methods for the Navier-Stokes equations. 1985. Lecture Notes for the von Kármán Institute For Fluid Dynamics Lecture Series: Numerical Techniques for Viscous Flow Computation In Turbomachinery Bladings, von Kármán Institute, Rhode-St-Genese, Belgium, 1985.
[15] T. H. Pulliam. Implicit Methods in CFD. Clarendon Press, March 1985. The Institute of Mathematics and Its Publications Conference Series, Proceeding of the ICFD 1988 Conference on Numerical Methods for Fluid Dynamics.
[16] P. L. Roe. Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes. J. Comput. Phys., 43, 1981.
[17] S. E. Rogers. A comparison of implicit schemes for the incompressible Navier-Stokes equations with artificial compressibility. Technical Report AIAA-95-0567, Reno, NV, 1995.
[18] S. E. Rogers and D. Kwak. An upwind differencing scheme for the time accurate incompressible Navier-Stokes equations. AIAA Journal, 28(2):253-262, Feb 1990.
[19] Y. Saad and M. H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7(3):856-869, 1986.
[20] W. Valarezo, C. Dominik, R. McGhee, and W. Goodman. High Reynolds Number Configuration Development of a High-Lift Airfoil. Technical Report AGARD Meeting In High-Lift Aerodynamics 10-01, 1992.
[21] V. Venkatakrishnan. Newton solution of inviscid and viscous problems. Technical Report AIAA Paper 88-0413, AIAA 26th Aerospace Sciences Conference, Reno, NV, 1988.
[22] V. Venkatakrishnan. A perspective on unstructured grid flow solvers. Technical Report AIAA Paper 95-0667, AIAA 33rd Aerospace Sciences Conference, Reno, NV, 1995.
[23] L. B. Wigton, N. J. Yu, and D. P. Young. GMRES acceleration of computational fluid dynamics codes. Technical Report AIAA Paper 85-1494, AIAA 7th Computational Fluid Dynamics Conference, Cincinnati, OH, 1985.
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
defined as:

[equation illegible in source]

$$+ \int_K \nabla w_l^T(x)\, F(u_h)\, d\Omega, \qquad (2)$$

gradient using data from surrounding cells. The same limiting procedure can, however, be followed. In this paper the multi-dimensional limiter from Barth and Jespersen [1], with the modifications proposed by Venkatakrishnan [15], is used. The limiter from Barth and Jespersen has as benefit that it is a truly multi-dimensional limiter and yields a positive scheme.

The limiter from Barth and Jespersen can, however, seriously degrade convergence to steady state. This was analysed by Venkatakrishnan [15] and the two main causes for this phenomenon are the non-smoothness of the limiter, which uses min- and max-functions, and the fact that the limiter is active in smooth parts of the flow, e.g. in the far field.

The limiter according to Venkatakrishnan [15] is directly applied to the conservative variables, which saves the considerable expense of computing the local characteristic decomposition.

Define for each component $\bar{U}_K^i$ of the cell average $\bar{U}_K = \frac{1}{\mathrm{meas}(K)} \int_K u_h(x)\, d\Omega$:

[equation illegible in source]

with $N(K)$ the set of neighboring cells which connect to cell $K$. In order to maintain monotonicity the approximate flow field $u_h$ must satisfy $u_h(x) \in [U^{min}, U^{max}]$, $\forall x \in K$, which is accomplished with the limiter function $\Phi_K$ defined as:

[equation illegible in source]

Here $U_{K_j}^i$ are the components of $u_h$ at the Gauss quadrature points in $K$, used to evaluate the integrals in equation (2). The function $\phi(y)$ replaces $\min(1, y)$ in the original Barth and Jespersen limiter and is defined as:

[equation illegible in source]

Defining $\Delta = U_{K_j} - \bar{U}_K$, $\Delta_+ = U^{max} - \bar{U}_K$ and $\Delta_- = U^{min} - \bar{U}_K$ and replacing $\Delta_\pm$ with $\Delta_\pm + \epsilon^2$ a smoother limiter is obtained:

[equation illegible in source; it has separate branches for $\Delta > 0$ and $\Delta < 0$ and equals 1 if $\Delta = 0$]

The coefficient $\epsilon_K$ is set from $C \Delta s_K$ raised to a power that is illegible in the source, with $\Delta s_K$ the minimum distance between the cell face centers of two opposite faces of element $K$. The constant $C$ determines the balance between limiting and no limiting and thereby influences the convergence to steady state. If $C = 0$ the original Barth and Jespersen limiter is obtained. In this paper $C = 1$ is used.

The limiter $\Phi_K$ is applied independently to each component of the flow field: $\hat{U}_{Km}^i = \Phi_K^i \hat{U}_{Km}^i$, $m = \{1, 2, 3\}$. This is slightly less robust than using $\Phi_K = \min_i \Phi_K^i$, but gives significantly less numerical dissipation. The coefficients $\hat{U}_m$, $m = \{1, 2, 3\}$ in equation (1) represent the gradient of the flow field with respect to the local coordinates in $\hat{K}$. This modification of the local gradient would violate conservation of $U$ in $K$, which can be corrected by modifying the coefficient $\hat{U}_0$:

[equation illegible in source]

This relation is obtained from the condition $\frac{1}{\mathrm{meas}(K)} \int_K u_h(x)\, d\Omega = \bar{U}_K$. The limited flow field in cell $K$ is then equal to:

[equation illegible in source]

The final discontinuous Galerkin finite element discretization is now obtained by evaluating the integrals over the element $K$ and its boundary $\partial K$ in equation (2). This is done using the transformation $F_K$ between $K$ and the master element $\hat{K}$. The integrals $\int w_l u_h\, d\Omega$ are calculated analytically, which requires quite some algebra, whereas the other integrals are calculated with Gauss quadrature rules. Cockburn et al. [5] proved that if the quadrature rules for the surface integrals in equation (2) are exact for polynomials of degree $(2k + 1)$ and exact for polynomials of degree $2k$ for the volume integrals then the spatial accuracy of the DG method is $k + 1$. In order to preserve uniform flow it is necessary to use quadrature rules which are exact for polynomials of order 3. For $k = 1$ the surface integrals are calculated with four point Gauss quadrature rules. The volume integrals require six point Gauss quadrature rules.

The use of four and six point Gauss quadrature rules is, however, unnecessarily expensive. The number of flux calculations in the approximation of the surface integrals can be reduced from four to one using the following approximation, which is second order accurate in the mean:

[equation illegible in source]

with $F(U)|_c$ calculated at the cell face center and $J_s$ the Jacobian of the transformation of the cell face $\partial K$ to $\partial \hat{K}$ on the master element $\hat{K}$. The remaining geometric surface integrals are precalculated with four point Gauss quadrature rules, which are exact using elements defined with linear shape functions, and therefore free stream consistency is preserved with this approximation. A similar approximation can be made for the volume integral $\int_K \nabla w_l^T(x)\, F(u_h)\, d\Omega$, with $F(U)$ calculated in the center of $\hat{K}$ and the geometrical part of the volume integral precalculated with a six point Gauss quadrature rule. This formulation requires about four times less computing time than using the more accurate evaluation of the flux integrals and yields similar results. The discretization using four and six Gauss quadrature points for the surface and volume integrals yields, however, a slightly more robust scheme on coarse grids. This is mainly due to the fact that the cross-coupling terms in the moment equations are retained in this case.

For each element $K$ a system of ordinary differential equations is now obtained:

$$[M_K] \frac{d}{dt} \hat{U}_K = R_K$$
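The smoothed limiter described above can be sketched as follows. This is only an illustration: it assumes the commonly published form of the Venkatakrishnan limiter, which appears consistent with the fragments that survive in the garbled equations but cannot be confirmed from them. The function names are hypothetical, and the squared smoothing coefficient is passed in directly as `eps2` because its defining exponent is illegible in the source.

```python
def smooth_limiter(delta, delta_plus, delta_minus, eps2):
    """Smoothed limiter value at one quadrature point (assumed standard form).

    delta       : U at the quadrature point minus the cell average
    delta_plus  : neighbourhood maximum minus the cell average (>= 0)
    delta_minus : neighbourhood minimum minus the cell average (<= 0)
    eps2        : squared smoothing coefficient; 0 removes the smoothing term
    """
    if delta == 0.0:
        return 1.0
    d = delta_plus if delta > 0.0 else delta_minus
    numerator = d * d + eps2 + 2.0 * delta * d
    denominator = d * d + 2.0 * delta * delta + delta * d + eps2
    return numerator / denominator


def cell_limiter(u_cell, u_points, u_min, u_max, eps2):
    """Limiter for one solution component: minimum over all quadrature points."""
    return min(
        smooth_limiter(u - u_cell, u_max - u_cell, u_min - u_cell, eps2)
        for u in u_points
    )


# Hypothetical cell: average 1.0, neighbourhood range [0.5, 1.5].
phi = cell_limiter(1.0, [0.9, 1.2, 1.5], 0.5, 1.5, eps2=0.0)
```

In this assumed form, a large `eps2` drives the limiter toward 1 (no limiting) in smooth regions where the local variations are small compared with the smoothing coefficient, which matches the stated purpose of the constant C.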
1 11
&I Medium( 1G)
Large(1G)
Long(4G)
Flux Limit
406
371
484
258
241
445 452
sx-3
Small(4G)
sx-3
Medium(1G)
C
F
C
F
1 Flux
1.5
1.6
1.5
1.3
Corn I Limit I Total Corr I MFlopls I
1.6 1.6
1.6 1.7
1.8 1.2
1.6 1.4
1.4 1.5
1.3 1.3
1.5 1.6
1.5 1.6
624
566
364
376
Long(1G) 463 318 314

Table 3: Megaflop rates on single processor NEC SX-3 (based on elapsed times)

SX-3 C 1.4 1.8 1.2 1.2 1.3 356
Large(1G) F 1.1 1.4 1.2 1.1 1.2 322
SX-3 C 1.5 1.5 1.4 1.3 1.4 614
Long(4G) F 1.7 1.7 1.6 1.5 1.6 701
SX-3 C 1.5 1.8 1.3 1.3 1.4 440
Small(4G) F 3.7 - 2.9 2.3 - 94
SGI LL 1.5 - 2.0 1.5 - 37
Medium(1G) F 2.9 - 2.3 2.0 - 51

Table 4: Speedups relative to single processor performance (based on elapsed times); SX-3 two processors; SGI four processors; C: parallel loop over colors (microtasking); F: parallel loop over faces within one color (macrotasking); LL: low-level microtasking

egy executes the loops over the colors in parallel and vectorizes the inner loop over the faces. Part of the inner loop over the faces consists of an update of the residuals at the cell centers. Within one color all faces connect to cells with different cell addresses, but this is not assured between different colors, causing a data dependency. Hence, in the above parallelization strategy, the residual updates have to be performed in a critical section, where only one processor is active at a time. The second strategy divides the loop over the faces within one color over the available processors. The main problem with this approach is that sufficient vector length should remain after loop division.

The MFlop rates and speedup results are presented in Table 4. The timings and speedups are influenced by the use of the external memory unit XMU of the SX-3. The XMU allows for fast access to data which cannot be placed in core memory. Sequentially, the use of the XMU instead of core memory hardly decreases performance. During parallel execution, however, locks applied during I/O seriously deteriorate the performance. If we compensate for the time spent during I/O to the XMU, speedups increase; the corrected speedups are labeled Corr in Table 4. The MFlop rates in Table 4 are based on the corrected speedups.

The results for the first parallelization strategy, namely parallel execution of loops over the colors, are obtained using microtasking and are labeled 'C' in Table 4. The speedups are with respect to elapsed times. It is clear from the results that the efficiency of the parallelization is rather low. This has two reasons. First, the critical section consumes 20% of the computing time, and second, the parallel system overhead is about 10%. This large sequential part limits the maximum attainable speedup on more processors to 5.

The second parallelization strategy, namely parallel execution of loops over the faces within one color, does not suffer from a critical section. At first the code was parallelized using microtasking. The program structure is such that the flux computation is split into many different loops in different functional subroutines. Therefore the computational load per loop is low, less than 1.5 msec. It turned out that this load is too low to be efficient on the NEC SX-3: the parallel overhead was as large as, or even larger than, the parallel gain and no speedup was obtained.

Using macrotasking the parallel overhead could be reduced significantly. Instead of parallelizing each loop separately, the work is divided into two tasks in the subroutines Flux and Limit, each task doing the same job as the subroutines, but on only half the loop. This not only reduced the parallel system overhead, but also reduced memory use. In microtasking local data is copied for each processor; in macrotasking the local data can be defined per task, and thus approximately halved with respect to the sequential program. Memory use for the medium sized problem is 498 MByte for the sequential program, 540 MByte for the microtasked program and 515 MByte for the macrotasked program. Speedups for the macrotasked program are presented in Table 4 and labeled 'F'.

The decrease in parallel performance with increased problem size can be attributed to the reduced vector length. This is clearly demonstrated by the results of test case Long, which has an average vector length of 120000 in the loops over the cell faces. This problem reaches the highest parallel performance, with a speed-up of 1.9 in routine Flux. Another factor which significantly reduces the performance of the flow solution algorithm on a NEC SX-3 computer is the limited memory bandwidth. This is especially important for the large number of indirectly addressed loops and a main reason for the big gap between sustained and peak performance. The memory bandwidth limitations are the most evident in Limit, where the ratio between computations and load/stores is rather low.

SGI Power Challenge

The SGI Power Challenge has scalar processors and therefore no problems with data dependencies within a processor. The code was therefore parallelized using the second parallelization strategy, namely parallel execution of the loops over the cell faces. Only the Small and Medium problems were tested, since the other problems did not fit in memory.

Two implementations are made, one by parallelizing each loop separately (low-level), and one using the same macrotasking structure as described in the previous section. Parallelization is straightforward using the parallel code of the SX-3. Directives are changed to SGI directives. The macrotasking is accomplished using the CONCURRENT CALL assertion.

Results of speedups and MFlop rates are presented in Table 4. The low-level parallelization is labeled 'LL' and the macrotask-
22-8
+ +
of Hexadap is 8(12n 4 0 ) N 2 . lo8 Byte. With an avail-
able memory of 8 Gbyte and 8 bytes per variable the maximum
number of grid points N = 9 . lo6. Using the estimates given
by Chapman [3], this number allows for a LES with sublayer
resolution around a clean wing at a Reynolds number of approx-
i
\
\
\ imately lo6.
\
3.0 \
\ The computing time for one time step is estimated from the
0
5 - relation:
G n. N 1.3. 1.1. fF +-+f”))
fL
P
U - rF f L fR
\
OI
2.5 - \
,
\
\ -
-- with: SA a factor to account for grid adaptation, SA = 0.9,
SC the single processor speedup of the NEC SX-4 compared to
the SX-3, SC = 2. The suffixes S, F, L and R refer to the
following parts of the algorithm: S, serial part, F, subroutine
2.0 - Flux(lG), L, subroutine Limit, and R the remaining part of the
I
flow solution algorithm which is parallelizable. The variables
loop length f. denote flop counts in the respective parts of the algorithm
to advance one flow variable one time step in one grid point.
Figure 2: Cache dependency of speedups on the SGI Power The measured values are: fs = 90, f F = 1570, f L = 880
Challenge in routine Flux(4G) (- Small - - - - Medium) and fR = 180. The variables r , denote the measured flop
rates in the respective parts of the algorithm and are equal to:
f s = f R = 350. lo6, f F = 463. lo6and r L = 350. lo6 flOp/S.
The flop count in routine Flux is increased with 10% for the
ing results are labeled ’F’. Since the SGI has no XMU there
viscous contribution and 30% for a one-equation subgrid model
is no correction for the speedups: the entire program is run in
using the German0 approach. The parallel speedup, denoted by
core memory. The speedups for macrotasking are better than
,916 on a 16 processor NEC SX-4 is estimated as twelve. The
for the low level parallelization. The performance in MFlops of
computing time required to advanceone time step on a grid with
the SGI four processor Power Challenge, as listed in Table 4, is
9 . lo6 grid points is then approximately 28 seconds.
between 10% and 17% of the two processor SX-3 performance
and not sufficient for large scale computing. The percentage The time scale of the smallest eddies in the flow field will be
of peak performance is between 3% and 7% on the SGI Power approximately 100 times larger than the CFL limit for an explicit
Challenge and between 6% and 13% on the two processor NEC scheme. The CFL time step limitation can be removed with an
sx-3. implicit, time accurate temporal discretization using multigrid
acceleration. With these assumptions a Large Eddy Simulation
Results of the SGI Power Challenge are rather sensitive to cache
of a clean wing at a Reynolds number lo6 on a mesh with
misses. A parameter in the flow solution algorithm determines
9 . lo6 grid points which evolves 6500 time steps, which should
the number of cell faces in the flux calculation processed at one
be sufficient to obtain a reasonable statistical sample, would
time. Varying this parameter changes the amount of the data
require 50 hours on a 16 processor NEC SX-4.
being processed, and can be used to optimize the cache use of
the program. Significant differences can occur, and the optimal Conclusions of the parallelization
value of the parameter depends on the problem at hand. (see Provided that the vector length is sufficient, the most efficient
Figure 2). The speedups of Table 4 are computed using the parallelization strategy for the present flow solution algorithm is
optimal timing results. a high level parallelization of loops over faces of one color using
macrotasking. Macrotasking reduces parallel system overhead
Estimate of the computing time for a LES of a clean wing on
and memory use. Correcting for the XMU a maximum speedup
a NEC SX416 computer
of 1.9 is reached on a two processor SX-3.
The parallel performance on the NLR NEC SX-3/22 has been
used to estimate the problem size of a Large Eddy Simulation There are three causes for the not perfect overall performance
of a clean wing on a 16 processor NEC SX-4, which will be on the NEC SX-3:
delivered to NLR in 1996. The NLR NEC SX-4/16 is expected 0 U 0 between Main Memory and XMU in parallel processing
to have a peak performance of 32 Gflopk, a main memory of takes significantly more time,
4 GByte and 8 GByte XMU. With respect to the SX-3/22 its 0 Vector length decreases, and hence single processor speed,
architecture is more suited for indirect addressing and a single 0 Parallel system overhead.
processor speedup of 2 is expected for programs using indirect Concerning the latter cause, the balance between the two pro-
addressing. cessors is, when corrected for the U 0 between MMU and XMU,
The size of the LES is primarily determined by the available as predicted by the size of the parallel part of the algorithm.
memory. Let N be the number of grid points, and n the number Hence, the computational load is well balanced, and the remain-
ing performance loss can only be explained by parallel system
of flow variables. For a Large Eddy Simulation with a one-
overhead. Since the NEC SX-3 is not primarily suited for par-
equation turbulence model we have n = 6. The memory use
22-9
allel use. the relatively high parallel system overhead is not too REFERENCES
surprising. It is expected that the NEC SX-4 has significantly [I1 Barth, T.J.andJespersen,D.C. Thedesignandapplication
less overhead. of upwind schemes on unshuctured meshes. AlAA Paper
894366,1989.
Low-level do-loop parallelkation on the NEC SX-3 b i n s out
to be only sufficientfor loops with a computational load greater 121 Bey, KS. and Oden, J.T. A Runge-Kutta discontinuous
than 1.5 msec. finite element method for high speed flows. AIAA Paper
91-1575CP, 1991.
The parallel efficiency on the SDI Power Challenge is sim-
lar, the peneritage of peak performance is relatively low, even 131 Chapman, D.R. Computational aerodynamics develop-
compared with the NFL! SX-3. Moreover. the cache sensitivity ment and outlook. AIAA Paper 790129.1979.
makes the optimization problem dependent.
141 Cockbum. B and Shu, C.W. TVB Runge-Kutta local pro-
The present parallelization on the NEC SX-3 will not be suf- jection discontinuous Galerkin finite element method for
ficiently efficient on the 16 processor SX4. The parallel ex- conservation laws II: General framework. Math. Comp.,
ecution of the loops over the cell faces is inefficient since the 52411435.1989.
loop length will be too short to be divided over 16 processors.
151 Cockbum. B.. Hou. S. and Shu, C.W. The Runge-
This problem can be solved by limiting the number of neigh-
Kutta local projection discontinuous Galerkin 6nite ele-
boring cells connected to one cell face to at most four, which
ment method for conservation laws N The multidimen-
significantlyreduccs the number of colors and thereby increases
sional case. Math Comp..54545-581.1990.
vector length. The parallel executionof the loop over the colors
contains a sequential pM of 20%. and hence has a maximum 161 Cockbum. B., Lin. S.Y. and Shu, C.W. TVB Runge-
speedup of 5 . This sequential part can be eliminated using a do- Kutta local projection discontinuous Galerkin 6nite ele-
main decompositionof the grid. which also has as main benefit ment method for conservation laws UL One-dimensional
that the grid adaptation parl can be executed in parallel. systems. JCP, 849(L113,1989.
CONCLUDING REMARKS 171 Elsenaar, A., Hjelmberg. L.,BUtefisch, K.A. andBannink.
The discontinuous Galerkin finite element method with lo- WJ. The international vortex Bow experiment. AGARD
cal grid enrichment has been demonstrated on the three- SymposiumonValidation of ComputationalFluid Dynam-
dimensional, inviscid flow field amund a delta wing at ban- ics, Lisbon, AGARD CP 437, 1987. also AGARD Advi-
sonic speed. The use of anisobupic grid refinement of hexa- sory Report 303,1994.
hedron type cells is effective in capturing the shock shucture
and primary vortex on the leeward side of the delta wing. The 181 Hoeijmakers, H.W.M., Jacobs. J.M.J.W. and Van Den
discontinuous Galerkin method works well on highly irregular Berg, J.I. Numerical simulation of vortical flow over a
grids and is therefore a good candidate for Large Eddy Sim- delta wing at subsonic and transonic speed. Presented at
ulations. because it offers the oppolblNty to capture viscous 17th ICAS Congress, 1990. StockholmSweden, 1990.
sublayerswith successively finer grids thmugh local grid refine- I91 Marchant M.J. and Weatherhill. N.P. Adaptivity tech-
ment. An estimate of the required computational resources for niches for compressible inviscid flows. Comp. Meih. in
such a simulation is presented. The use of a face based data Appl. M e c h Md Eng., 10683-106.1993.
shucture works well in combination with local grid refinement
andallowsefficientvectorizationandpmllelizationofthecode. 1101 Moin, P. and Jimenez, 1. Largeeddy simulation of complex
On the NEC SX-3 the possible speedup thmugh parallelization turbulent flows. AIAA Paper 93-3099.1993.
sbungly depends on the vector length. A maximum sped-up of
1.9 on the two processorNEC SX-3 is obtained when sufficient 1111 Osher, S. and Chakravarthy, S. Upwind schemes and
boundary conditions with applications to Euler equations
vector length was available. A good parallel performance, with
in general geometries. JCP, 50447481,1983.
a speed-up of 3.7, is obtained on the four processor SGI Power
Challenge. but the results are sensitive to cache misses. 1121 Shu, C.W. and Osher, S. Efficient implementation of es-
sentially non-oscillatory shockcapturing schemes. JCP.
From the present results it is estimated that for future LES ap-
77439471,1988.
plications in wall bounded flows, the gain from the increased
computational efficiency obtained from highly adapted grids 1131 VanDerVegt J.J.W. Higherader accurateosherschemes
more than compensatesthe increased number of operations and with application to compressibleboundary layer stability.
memory use. A LES of a clean wing at a Reynolds number AlAAPaper93-3051,1993.
of IO6 will become feasible on a 16 processor NEC SX4 in
a lumamund time of one weekend. SigniGcant huther devel- I141 Van DerVegt JJ.W. Anisotropic grid refinemcutusing an
opments, such as the addition of the viscous contribution and unshuctured discontinuousGalerkin method for the three-
implicit time-accurate temporal discretization using multigrid dimensionalEulerequationsofgasdyuamics. AlAA Paper
acceleration (in progress). will. however, be needed to reach 95-1657,1995.
this goal.
1151 Venkatakiishnan.V. Convergenceto steady statesolutions
of the Euler equations on unstructured grids with limiters.
JCP. 118:12(L130.1995.
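The resource estimate for the LES on the NEC SX-4/16, given in the section on the computing time before the concluding remarks, can be checked with a few lines of arithmetic. The sketch below is only an illustration. It assumes that the memory use reads 8(12n + 40)N + 2·10^8 bytes and that the time per step reads T = nN/(S_A S_C) · (f_S/r_S + (1.3 · 1.1 · f_F/r_F + f_L/r_L + f_R/r_R)/S_16), which is one reading of the printed relation consistent with the quoted numbers. The function names are hypothetical.

```python
def max_grid_points(memory_bytes, n):
    """Solve 8*(12*n + 40)*N + 2e8 = memory_bytes for N (assumed form of the memory use)."""
    return (memory_bytes - 2.0e8) / (8.0 * (12 * n + 40))


def time_per_step(N, n, S_A=0.9, S_C=2.0, S_16=12.0):
    """Seconds per time step under the assumed reading of the timing relation."""
    # Measured flop counts per flow variable, grid point and time step (from the text).
    f_S, f_F, f_L, f_R = 90.0, 1570.0, 880.0, 180.0
    # Measured flop rates in flop/s (from the text).
    r_S, r_F, r_L, r_R = 350e6, 463e6, 350e6, 350e6
    # Flux count raised by 10% (viscous terms) and 30% (one-equation subgrid model).
    f_F_les = 1.3 * 1.1 * f_F
    serial = f_S / r_S
    parallel = f_F_les / r_F + f_L / r_L + f_R / r_R
    return n * N / (S_A * S_C) * (serial + parallel / S_16)


def amdahl_limit(serial_fraction):
    """Upper bound on speedup for a given sequential fraction."""
    return 1.0 / serial_fraction


if __name__ == "__main__":
    n = 6
    print("max grid points:", max_grid_points(8.0e9, n))
    t = time_per_step(9.0e6, n)
    print("seconds per step:", t)
    print("hours for 6500 steps:", 6500 * t / 3600.0)
    print("speedup bound for a 20% sequential part:", amdahl_limit(0.2))
```

Under these assumptions the sketch gives somewhat under 9 million grid points, a little over 27 seconds per step and just under 50 hours for 6500 steps, in line with the rounded figures in the text. The last line reproduces the speedup bound of 5 quoted for a sequential part of 20%.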
Figure 5. Total pressure loss and adapted grid in cross-section through primary vortex core ($M_\infty = 0.85$, $\alpha = 20^\circ$).
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
2 Modelling of turbulent flow

The equations describing compressible flow are the well known Navier-Stokes equations, which represent conservation of mass, momentum and energy:

$\partial_t \rho + \partial_j(\rho u_j) = 0$
$\partial_t(\rho u_i) + \partial_j(\rho u_i u_j) + \partial_i p - \partial_j \tau_{ij} = 0$    (1)
$\partial_t e + \partial_j((e + p) u_j) - \partial_j(\tau_{ij} u_i - q_j) = 0$

Here the symbols $\partial_t$ and $\partial_j$ are abbreviations of the partial differential operators $\partial/\partial t$ and $\partial/\partial x_j$ respectively. The components of the velocity vector are denoted by $u_i$, while $\rho$ is the density and $p$ the pressure which is related to the total energy density $e$ by:

$p = (\gamma - 1)\{e - \frac{1}{2}\rho u_i u_i\}$    (2)

in which $\gamma$ denotes the adiabatic gas constant. The viscous stress tensor $\tau_{ij}$ is a function of temperature $T$ and velocity vector $u$

$\tau_{ij}(T, u) = \frac{\mu(T)}{Re}\left(\partial_j u_i + \partial_i u_j - \frac{2}{3}\delta_{ij}\partial_k u_k\right)$    (3)

where $\mu(T)$ is the dynamic viscosity for which we either use Sutherland's law for air or treat it as a constant. In addition $q_j$ represents the viscous heat flux vector, given by

where $\Omega$ is the flow domain and $\Delta$ denotes the filter width of the kernel $G$ which is assumed to be normalized, i.e. the integral of $G$ over $\Omega$ equals 1 independent of $x$. For compressible flow Favre [2] introduced a related filter operation $\tilde{f} = \overline{\rho f}/\bar{\rho}$. The filtered Navier-Stokes equations contain so-called subgrid-terms, which cannot be expressed in the filtered flow variables, and have to be modelled with subgrid-models. In this paper we will mainly focus on the modelling of the subgrid-terms in the momentum equations, which can be expressed in the turbulent stress tensor, defined as

where $\tilde{u}$ is the filtered velocity vector. This turbulent stress tensor has several algebraic properties which can be used in the construction and qualification of subgrid-models [3, 4]. Expressions for the subgrid-terms in the energy equation can be found in ref. [5]. They can be neglected in simulations at low Mach numbers, but have to be modelled at high Mach numbers.

In total six models for the turbulent stress tensor $\tau_{ij}$ as it appears in the subgrid-terms in the momentum equations will be investigated and compared in this paper. The first subgrid-model is the Smagorinsky model
order term in $\Delta$ in this expansion can be proposed as subgrid-model:

cells, uniform in the stream- and spanwise directions and clustered near the isothermal, no-slip wall in the normal direction. A second order accurate finite volume method was used.

The similarity and gradient model correlate much better with the turbulent stress tensor than the Smagorinsky model (see [9] and section 2.1). However, while the Smagorinsky model is too dissipative in transitional regions, the similarity and gradient model are not sufficiently dissipative in turbulent regions.

The dynamic procedure overcomes the excessive dissipation of the Smagorinsky model and adds sufficient dissipation to the similarity and gradient models. We consider three dynamic models. The dynamic eddy-viscosity model [3] is obtained when the model constant $C_S$ in the Smagorinsky model is replaced by a coefficient which is dynamically obtained and depends on the local structure of the flow. In order to calculate the dynamic coefficient, the eddy-viscosity model is substituted in the Germano identity, which is a relation between the turbulent stress tensor for different filter widths [3]. The second dynamic model is the dynamic mixed model, in which a relatively accurate representation of the turbulent stress by the similarity model and a proper dissipation provided by the dynamic eddy-viscosity con-
finite grid-spacing, there is a maximum wavenumber which can be represented on the grid. Modes with a higher wavenumber appear as low-frequency modes on the grid. Therefore, numerically, the effective energy contained in the low-frequency modes can be increased during the onset of turbulence. One remedy could be to take a grid that is sufficiently fine to represent the highest mode which due to physics would emerge in the simulation. Another possibility is to use upwind-biased discretizations of the convective flux, as has been done by Rai and Moin [12]. We have used a discretisation of the viscous flux with a wider stencil than necessary to achieve the desired order of accuracy. In this way we constructed a better approximation of the viscous flux. As an example, we were able to calculate a full transition to turbulence on $96^3$ points using a fourth order method on a $5^3$-point stencil for the convective flux, and repeated application of a fourth order method on $6^3$ points for the viscous flux, resulting in an $11^3$-point stencil, whereas repeated application of a $4^3$-point operator for the viscous flux on this grid failed. At this moment, further investigation is needed to understand this phenomenon more clearly.

The DNS mentioned in the previous section has been calculated at Mach number 0.5. In the future we intend to perform DNS at higher Mach numbers. For that purpose we need to be able to capture shocks. This can be done by switching to upwind discretizations in the presence of a shock, which has been applied successfully to the supersonic compressible mixing layer, cf. ref. [13]. In that application a fourth order central difference operator has been used for the convective term, which was replaced by a third order accurate upwind scheme in the presence of a shock. See Figure 3. In this way it is possible to capture time-dependent shocks which appear spontaneously after the transition to turbulence.

Figure 3: Shock-capturing in 3D turbulent mixing-layer.

3.2 Time integration

For the time integration of the resulting discretized equations we use an explicit 4-stage Runge-Kutta method. We also studied the use of a second-order accurate implicit method. The system of equations resulting from the implicit discretization is solved by means of pseudo-time stepping and accelerated by local pseudo-time stepping and a nonlinear multigrid technique. Since we use central spatial discretizations and no artificial dissipation is added to the equations, the smoothing method is less effective than in the traditional use of multigrid in steady-state calculations. In the laminar regime and in the first stages of turbulence the implicit method provides a speed-up of a factor of 2 relative to the explicit method on a relatively coarse grid ($64^3$). At increased resolution this speed-up is enhanced correspondingly. See [14].
4 Parallel implementation of the explicit solver

In this section we consider some implementational aspects of the explicit solver. We use a simple domain-decomposition technique to obtain an implementation on a parallel computer. This is explained in the first subsection. In the next subsection we discuss how the parallel efficiency of the resulting code depends on the spatial discretization. We distinguish between the intrinsic efficiency of an algorithm, and the hardware efficiency. The former is related to the algorithm only, whereas the latter tells us how well a certain algorithm performs on certain hardware. The quantity which is usually called the efficiency is the product of these efficiencies. We show that the intrinsic efficiency of the algorithm decreases as the order of the spatial discretization increases. We illustrate these concepts by some performance results obtained from implementations on 3 different parallel machines, viz. the Cray T3d, the Intel Paragon and the SGI Power Challenge array. Closely related to the concept of efficiency is the scalability. We discuss the scalability in the sense of Amdahl and Gustafson (see e.g. ref. [15]).

4.1 Domain decomposition

Suppose our computational domain consists of $N_x \times N_y \times N_z$ gridpoints. This domain is divided into $B_x \times B_y \times B_z$ blocks. For a distributed memory computer, we assume that each block is allocated on a separate processor. If the total size of the stencil used for the discretisation is $(2d+1)^3$ (recall that we use central differences, cf. (11), (12)), then a point which has a distance less than $d+1$ grid-points from the boundary of a block not coinciding with the boundary of the physical domain, is called an interior boundary point. This definition can easily be extended to other discretisation methods. For the computation of the fluxes for the interior boundary points, some values of the flow-quantities which reside on processors dealing with neighbouring blocks are needed. To store these quantities, each block is dressed with $d$ dummy-layers. In order to retain the second-order accuracy of the time-integration method, at each stage in the Runge-Kutta time-integration, these dummy-layers have to be transferred between the various processors. It may be clear that the amount of communication increases with the size of the stencil.

Not only the amount of communication is affected by the size of the stencil, but also the number of floating point operations increases with increasing stencil-size. To see why, recall the general form of the $\partial_x$-operator, eq. (11)-(12). This derivative is computed as a one-dimensional derivative acting on two-dimensional averages over y and z. For the derivative in an internal boundary point these averages have to be computed for points in the dummy-layers as well. But these averages are also computed by the processors dealing with the neighbouring block in order to contribute to the $\partial_x$ derivative of some points in that block. For a discretization on a stencil with $N_x \times N_y \times N_z$ points, careful counting reveals that the number of floating-point operations for the computation of one derivative is

(3N,N,N, + 4dN,N, + 2dN,N, + 4d2N,)(2d - 1).

Note that this expression is not symmetric in $N_x$, $N_y$, $N_z$. For the other derivatives the discrete averaging and differentiation operators can be applied in such an order that the same expression is valid. In the case $N_x = N_y = N_z = N$, this reduces to

$(3N^3 + 6N^2 d + 4d^2 N)(2d - 1)$.    (13)

Now consider e.g. a given partition of the computational domain into $B^3$ equal blocks, each containing $(N/B)^3$ points. Then the total number of floating-point operations to compute a $\partial_x$ for all grid-points is

$\left(3\left(\frac{N}{B}\right)^3 + 6\left(\frac{N}{B}\right)^2 d + 4d^2\frac{N}{B}\right)(2d - 1)B^3$,

which is obviously greater than (13).

4.2 Parallel efficiency

To quantify the considerations of the previous paragraph, we define the concept of intrinsic efficiency. Consider a given partition of the computational domain into $B_x \times B_y \times B_z$ blocks. Denote the total number of floating point operations for a given number of timesteps by $f(B_x, B_y, B_z)$. Then the intrinsic efficiency $\sigma_{intr}$ is given by

$\sigma_{intr}(B_x, B_y, B_z) = \frac{f(1,1,1)}{f(B_x, B_y, B_z)}$.    (14)

Note that, on a shared memory machine, if we use fine-grained parallelism (on do-loop level), we could define $\sigma_{intr} = 1$.

We can estimate the dependence of the intrinsic efficiency on the size of the stencil just by counting the number of floating-point operations for various
block-sizes (by using expressions like (13)). In Figure 4 this has been done for several central differencing discretizations, using equal shapes and sizes for all blocks. From the pictures it can be seen that the efficiency decreases rapidly if the stencil-size grows. Due to the wider stencil, application of higher-order discretizations results in more floating-point operations, but this performance penalty is even more severe on distributed memory systems, where also a decrease of parallel performance occurs. As an example, consider a central differencing second order $\partial_x$ operator on a 3-point stencil as compared to a central differencing fourth order operator $\partial_x$ on a 5-point stencil. To compute the former derivative on a single-cpu machine costs approximately $5/9 \approx 0.56$ times the time to compute the latter, whereas on e.g. a 64 x 64 x 32 grid and 128 processors on a distributed memory machine this ratio is approximately 0.33.

Figure 4: Algorithmic efficiency for various discretizations.

In general, due to the finite communications bandwidth of the machine, the simulation will last longer, say $T(B_x, B_y, B_z)$ seconds. Then the hardware-efficiency $\sigma_{hw}$ is

$\sigma_{hw} = \frac{T(1,1,1)}{T(B_x, B_y, B_z)\, B_x B_y B_z\, \sigma_{intr}(B_x, B_y, B_z)}$.    (15)

The traditional (total) efficiency $\sigma$ is the product

$\sigma = \sigma_{hw}\,\sigma_{intr}$.    (16)

Note that, in general, these efficiencies not only depend on the number of blocks in each direction, but also on the number of points per block in each direction, i.e. on the actual shape of the blocks. This is not only due to the ratio of interior boundary points as compared to the interior points of each block, but also because many processors perform better on long inner loops in the code, due to vectorisation or pipelining.

The efficiency $\sigma$ is related to scalability in the sense of Amdahl, meaning that a problem which is solved on one processor in $T_1$ seconds is solved on $P$ processors in $T_1/(P\sigma)$ seconds. We define one notion of efficiency related to scalability in the sense of Gustafson. Suppose we solve a problem with $N$ gridpoints on one processor in $T_1$ seconds, and a problem with $PN$ gridpoints in $T_P$ seconds. Then the efficiency $\sigma_G$ is

$\sigma_G = \frac{T_1}{T_P}$.    (17)
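The efficiency definitions above lend themselves to a short numerical check. The sketch below is illustrative only. It takes expression (13) at face value for cubic blocks, assumes that the intrinsic efficiency is the ratio of the single-block flop count to the flop count of the partitioned domain, and uses hypothetical function names and timings.

```python
def flops_one_derivative(N, d, B=1):
    """Flop count for one derivative on an N^3 grid split into B^3 equal blocks.

    Uses expression (13) per block of (N/B)^3 points, multiplied by B^3 blocks.
    """
    m = N / B
    return (3 * m**3 + 6 * m**2 * d + 4 * d**2 * m) * (2 * d - 1) * B**3


def intrinsic_efficiency(N, d, B):
    """Assumed definition: flops without partitioning divided by flops with it."""
    return flops_one_derivative(N, d, 1) / flops_one_derivative(N, d, B)


def hardware_efficiency(T1, TB, nblocks, sigma_intr):
    """Relation (15): measured times corrected for the extra flops."""
    return T1 / (TB * nblocks * sigma_intr)


def gustafson_efficiency(T1, TP):
    """Scaled efficiency: time for N points on 1 processor over time for P*N points on P."""
    return T1 / TP


if __name__ == "__main__":
    for d in (1, 2, 3):
        print("d =", d, "sigma_intr =", intrinsic_efficiency(64, d, 4))
    # Hypothetical timings: 100 s on one processor, 2 s on 64 blocks.
    s_intr = intrinsic_efficiency(64, 2, 4)
    s_hw = hardware_efficiency(100.0, 2.0, 64, s_intr)
    print("total efficiency:", s_hw * s_intr)
```

The factor (2d - 1) cancels in the ratio, so in this model the intrinsic efficiency depends only on the block size relative to the number of dummy-layers. The product of hardware and intrinsic efficiency reduces to the usual T(1,1,1) divided by the number of processors times the parallel time, as it should.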
Processors    T3d     Paragon
    16        17.1     27.4
    24        13.5     20.4
    32         9.9     15.7
    48         7.5     11.7
    64         6.0     11.1
    96         4.8      7.6
   128         3.8       -

Table 1: CPU times in seconds (averaged over several block-divisions).

Figure 5: Efficiency for the T3d (dashed) and the Paragon (dotted). The solid line is the intrinsic efficiency.

dependent on the actual shape of the blocks. Therefore in Table 1 we averaged over some block-divisions which give roughly the same (approximately best) CPU-time. This dependency is illustrated in Table 2 for the case of 8 blocks. All timings are accurate to about 5 %. It can be seen that subdivisions with an equal number of blocks in all directions are optimal. In general, better subdivisions are obtained by using fewer blocks in z-direction. This is partly due to the algorithm, since an asymmetry is introduced by the sequence of averaging-operators in the derivative-calculations, and partly due to software-pipelining in the processors, which is reflected in the megaflop-rates (between parentheses).

Bx x By x Bz    T3d          Paragon      Mflop per block
1 x 1 x 8       34.9 (77)    58.0 (47)    335
1 x 8 x 1       34.3 (82)    54.4 (52)    353
8 x 1 x 1       39.2 (77)    67.3 (45)    378
1 x 2 x 4       32.4 (80)    50.9 (52)    323
1 x 4 x 2       31.6 (83)    50.0 (53)    328
2 x 1 x 4       33.0 (79)    52.9 (50)    327
2 x 4 x 1       31.7 (84)    51.5 (53)    335
4 x 2 x 1       32.9 (80)    54.6 (50)    327
4 x 1 x 2       32.9 (82)    55.4 (49)    339
2 x 2 x 2       31.1 (84)    50.3 (53)    325

Table 2: CPU times for various subdivisions into 8 blocks. Between parentheses the Mflop-rates. The last column is the number of millions of floating point-operations to be performed for each block.

From the pictures it can be seen that on the T3d and the Paragon, the machine efficiency is somewhat lower than the algorithmic efficiency. This means that increasing the algorithmic efficiency by e.g. exchanging information between the processors after every calculation of averages will not result in a substantially faster execution of the code. Further, all efficiencies eventually approach zero as the number of processors approaches infinity. It can be shown (using expressions like (13)) that the intrinsic efficiency drops as $B^{-2/3}$, where $B$ is the total number of blocks. However, $\sigma_G$ remains nearly constant, as is shown in Table 3. Here each block contains 32 x 16 x 16 points. From this table it follows that, using this algorithm, doubling the size of the problem and the number of processors results in equal computation times. This can also be shown if in (17) the times $T_P$ and $T_1$ are calculated as ideal, i.e. assuming no communications delays. Then $\sigma_G = 1$.

Bx x By x Bz    T3d            Paragon
1 x 2 x 1       16.2 (21.1)    26.9 (12.7)
1 x 4 x 1       16.3 (42.1)    27.0 (25.4)
1 x 4 x 2       16.4 (83.6)    27.3 (50.2)
2 x 4 x 2       16.5 (166)     27.6 (99.4)
2 x 8 x 2       16.5 (332)     27.4 (200)
2 x 8 x 4       16.6 (661)     27.7 (396)
2 x 8 x 6       16.6 (991)     27.8 (592)
4 x 8 x 4       16.6 (1322)    -

Table 3: CPU times and Megaflop-rates (between parentheses) for increasing domain-sizes, illustrating that $\sigma_G$ remains approximately constant.

From the above results it can be concluded that
the T3d and the Paragon show comparable efficiencies for this algorithm, the T3d being about 40 % faster.

Besides the implementation on the T3d and the Paragon, we have made a preliminary implementation on the SGI Power Challenge Array. This machine consists of 4 nodes each comprised of a 16-CPU shared memory parallel machine. We used explicit message-passing between the nodes. On each node, fine-grained parallelism has been employed using the vendor-supplied parallelizing compiler. The combination of fine-grained parallelism and explicit message passing is not entirely trivial. On the one hand, using fine-grained parallelism results in an algorithmic efficiency of 1, since no additional floating-point operations are introduced. Therefore, this form of parallelism seems to be promising at first sight. On the other hand, however, parallelizing a do-loop containing only a few iterations (in the order of magnitude of the number of grid-points in one direction) causes much system-overhead, and seriously affects pipelining efficiency. Moreover, suboptimal speedup can arise due to the cache-coherency mechanism. The use of explicit message-passing has two disadvantages, namely an algorithmic efficiency less than one, and usually a slow data-transfer. The advantage of explicit message-passing as compared to fine-grained parallelism is that parallelization takes place on a (much) higher level, leading to less system overhead.

As an example, consider a problem with 64 x 64 x 32 grid-points (the same as discussed above). With 4 processors on one node working on one block, this yields an execution time of 23 seconds for 5 Runge-Kutta timesteps, whereas on 4 nodes with 4 blocks (1 x 2 x 2) and one processor per node the execution time is 18 seconds. As another example, we compare the subdivision into 1 x 2 x 2 and 2 x 4 x 2 blocks, both running on 4 nodes. In the first case, each node deals with 1 block, and in the second case each node does the computations for 4 blocks, and uses 2 processors for each block. So in that case the distributed memory model is adopted also within each single node. It appears that the latter case has a shorter execution time. It may be clear that some restructuring of the code is necessary in order to obtain reasonable performance. This will be the subject of another paper [16].

4.3 Optimization for cache-machines

In many parallel machines the processors use a hierarchical memory structure, consisting of a small amount of memory with a short access time (the cache) and a large amount of main memory with much longer access time. This long access time is the main reason why the performance of these machines is way below their (often impressive) peak. In the implementation of a numerical algorithm, it is essential to use the cache efficiently. Therefore, the number of load and store operations should be kept to a minimum, and quantities which are loaded from main memory should be reused as much as possible before being stored back. Further, since elements from main memory are loaded into cache in chunks of a few consecutive elements, do-loops should be arranged such that main memory is traversed linearly (as is also necessary for efficient use of traditional vector-processors). Moreover, it will enable software-pipelining on RISC-processors, resulting in substantially faster execution.

To illustrate this, we compare two different ways to calculate the viscous flux. In the first method (method A) the various derivatives of the velocity fields and the temperature are calculated consecutively, and the viscous stress tensor and viscous heat flux are assembled and stored. Then the outer derivatives of the viscous flux are calculated, again consecutively. The resulting code is very well vectorizable and consists of very simple do-loops. In the second method (method B), we use the following observation. In the calculation of the derivatives, some averages can be used to contribute to various derivatives. Moreover, for all derivatives, the averaging weights in one direction are equal. Therefore we calculate all inner derivatives simultaneously, which also has the advantage that e.g. a vector $u_1$ needs to be loaded only once for the calculation of all its derivatives. An analogous fact holds for the weights. Further, the derivatives are not stored, but directly used to assemble the stress tensor and the heat flux. After that, all outer derivatives are calculated simultaneously. This results in about 30% fewer floating point operations, and substantially fewer load and store operations, resulting in better memory-performance. The drawback is the occurrence of (much) more complicated do-loop bodies, which puts a severe demand on the compiler in order to obtain suitable pipelining. It appears that on the T3d and the Paragon there is hardly any performance gain, and the performance is only about 20 % of peak. On one R8000 processor in
the SGI Power Challenge (coupled to 4 MBytes of cache), the CPU-time of method B is half that of method A, with a performance of about 37 % of peak (110 Mflops). More details are to be found in ref. [17].

Acknowledgement

The time for the computations on the T3D was provided by the Stichting Nationale Computerfaciliteiten (National Computing Facilities Foundation, NCF), which is financially supported by the Nederlandse Organisatie van Wetenschappelijk Onderzoek (Netherlands Organization for Scientific Research, NWO). One of the authors (HK) thanks the Institute for Fluid Dynamics at ETH Zurich for its hospitality during his stay there. The use of the Paragon has been made possible by courtesy of B. [...], ETH Zurich. By courtesy of Silicon Graphics Inc. we were able to use the European Power Challenge Array of the SGI Supercomputer Technologies Centre in Cortaillod. We would like to thank Ruud van der Pas (SGI) for his assistance.

References

[1] B. Vreman, B. Geurts, H. Kuerten, J. Broeze, B. Wasistho and M. Streng, “Dynamic subgrid-scale models for LES of transitional and turbulent compressible flow in 3-D shear layers,” Turbulent Shear Flow (1995).
[2] A. Favre, Physics of Fluids 26, 2851 (1983).
[3] M. Germano, “Turbulence: the filtering approach,” J. Fluid Mech. 238, 325 (1992).
[4] B. Vreman, B. Geurts and H. Kuerten, “Realizability conditions for the turbulent stress tensor in large-eddy simulation,” J. Fluid Mech. 278, 351 (1994).
[5] A.W. Vreman, B.J. Geurts and J.G.M. Kuerten, “Subgrid-modelling in LES of compressible flows,” Direct and Large-Eddy Simulation I, P.R. Voke, L. Kleiser and J.P. Chollet (editors), Kluwer, 133 (1994).
[6] U. Piomelli, T.A. Zang, C.G. Speziale and M.Y. Hussaini, “On the large-eddy simulation of transitional wall-bounded flows,” Phys. Fluids A 2, 257 (1990).
[7] J. Bardina, J.H. Ferziger and W.C. Reynolds, “Improved turbulence models based on LES of homogeneous incompressible turbulent flows,” Department of Mechanical Engineering, Report No. TF-19, Stanford (1984).
[8] R.A. Clark, J.H. Ferziger and W.C. Reynolds, “Evaluation of subgrid-scale models using an accurately simulated turbulent flow,” J. Fluid Mech. 91, 1 (1979).
[9] S. Liu, C. Meneveau and J. Katz, “On the properties of similarity subgrid-scale models as deduced from measurements in a turbulent jet,” J. Fluid Mech. 275, 83 (1994).
[10] B. Vreman, B. Geurts and H. Kuerten, “On the formulation of the dynamic mixed subgrid-scale model,” Phys. Fluids 6, 4057 (1994).
[11] B. Vreman, B. Geurts and H. Kuerten, “Large Eddy Simulation of the temporal mixing layer using the Clark model,” Memorandum No. 1213, University of Twente (1994).
[12] M.M. Rai and P. Moin, “Direct numerical simulation of transition and turbulence in a spatially evolving boundary layer,” J. Comp. Phys. 109, 169 (1993).
[13] B. Vreman, H. Kuerten and B. Geurts, “Shocks in direct numerical simulation of the confined three-dimensional mixing layer,” Physics of Fluids, to appear (1995).
[14] J. Broeze, B. Geurts, H. Kuerten and M. Streng, “Multigrid acceleration of time-accurate DNS of compressible turbulent flow,” Copper Mountain (1995).
[15] E.F. van de Velde, “Concurrent Scientific Computing,” Springer Verlag, New York (1994).
[16] M. Streng and R. van der Pas, “Implementation of a compressible flow solver on the Power Challenge Array”, in preparation.
[17] M. Streng and R. van der Pas, “Some performance considerations for the R8000 Microprocessor”, in preparation.
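
Returning to the comparison of method A and method B in the performance discussion of this paper, the difference between the two loop structures can be sketched in a few lines. The following is only a one-dimensional toy, not the authors' solver: the function names, the central-difference stencil and the scalar "flux" `u*mu*du/dx + kappa*dT/dx` are all assumptions made for illustration. The staged version stores every intermediate array, as method A does, whereas the fused version computes the inner derivatives of `u` and `T` in one loop body and never stores them, in the spirit of method B.

```python
def central_diff(f, h):
    """Second-order central difference on interior points (zero at both ends)."""
    d = [0.0] * len(f)
    for i in range(1, len(f) - 1):
        d[i] = (f[i + 1] - f[i - 1]) / (2.0 * h)
    return d


def viscous_term_staged(u, T, mu, kappa, h):
    """Method-A style: one simple loop per quantity, all intermediates stored."""
    du = central_diff(u, h)                 # inner derivative of velocity
    dT = central_diff(T, h)                 # inner derivative of temperature
    tau = [mu * d for d in du]              # toy 1D "stress"
    q = [-kappa * d for d in dT]            # toy 1D "heat flux"
    flux = [u[i] * tau[i] - q[i] for i in range(len(u))]
    return central_diff(flux, h)            # outer derivative


def viscous_term_fused(u, T, mu, kappa, h):
    """Method-B style: inner derivatives computed together and used at once."""
    n = len(u)
    flux = [0.0] * n
    for i in range(1, n - 1):
        du = (u[i + 1] - u[i - 1]) / (2.0 * h)   # u is read once per point
        dT = (T[i + 1] - T[i - 1]) / (2.0 * h)
        flux[i] = u[i] * (mu * du) + kappa * dT  # assembled without storing du, dT
    return central_diff(flux, h)
```

Both functions return the same values; the fused form trades simpler loops for fewer loads and stores, which is the trade-off the text attributes to method B.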
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
shocks. The subgrid-scale turbulence models that we currently use are briefly presented in section 4, although they will be presented in more detail in Lesieur & Métais (1996). The soundest of these models is then applied

3. NUMERICAL SCHEME

In Cartesian co-ordinates, the compressible LES equations can be cast, after several crude simplifications discussed in Comte et al. (1994), in the conservation-like form

with

$$U = {}^T(\rho, \rho u, \rho v, \rho w, \rho e), \qquad (2)$$

in which $\rho e$ stands for the resolved total energy defined,

one can then re-write (1) as

$$\frac{\partial \hat{U}}{\partial t} + \frac{\partial \hat{F}}{\partial \xi_1} + \frac{\partial \hat{G}}{\partial \xi_2} + \frac{\partial \hat{H}}{\partial x_3} = 0, \qquad (9)$$

$$\hat{H} = \frac{1}{J} F_3, \qquad (10d)$$

using the chain rule (7) for the derivatives arising in the fluxes $\hat{F}$, $\hat{G}$ and $\hat{H}$. Vector $U$ is still a function of the Cartesian co-ordinates $x_i$ and time $t$. In the limit of zero viscosity and conductivity (Euler equations without SGS model), the fluxes $F_i$ - still defined by (3) - would be functions of $U$ only.
(5)
J
$$\mu(T) = \mu(273.15)\,\sqrt{\frac{T}{273.15}}\;\frac{1 + S/273.15}{1 + S/T}, \qquad \forall\, T > 120 \qquad (6a)$$
in the corrector step (11b)
(W and
This is only first-order accurate, which is justified by the fact that the grids we use are not very distorted, except very locally. Therefore $\partial x_i/\partial \xi_j$ remains almost everywhere close to $\delta_{ij}$.

In the same way, the chain rule (7) has to be applied to eliminate all derivatives with respect to $x_1$ and $x_2$ from the fluxes $F_i$. This introduces metrics to be evaluated as said above, together with derivatives of velocity and temperature with respect to $\xi_1$ and $\xi_2$. Consistency then determines the way these derivatives, and also $\partial/\partial \xi_3 \equiv \partial/\partial x_3$, should be discretized.

The boundary conditions are based on a decomposition into characteristics, in the spirit of Thompson (1981, 1990) and Poinsot and Lele (1992). The Riemann invariants of outgoing characteristics are extrapolated, whereas the incoming ones are either prescribed (e.g. at the inflow boundary) or set to zero (non-reflective or open boundary condition). For example, going back to Cartesian co-ordinates for the sake of simplicity, in the case of a boundary perpendicular to the direction $x_1$, the Euler equations are recast in their quasi-linear form

$$\frac{\partial V}{\partial t} + A\,\frac{\partial V}{\partial x_1} = 0, \quad \text{with } V = {}^T(\rho, \rho u_1, \rho u_2, \rho u_3, p). \qquad (13)$$

The matrix $A$ is, as per usual, diagonalized in the form $A = L^{-1} \Lambda L$. Assuming $L$ to be locally constant and introducing the vector $W = LV$, system (13) decouples into 5 equations of the form

$$\frac{\partial W}{\partial t} + \Lambda\,\frac{\partial W}{\partial x_1} = 0,$$

to be solved at the boundary point $N$ through the semi-implicit scheme

(15)

For the outgoing characteristics ($\Lambda_{ii} > 0$), the values of $W_N^{n+1}$ are obtained from that of $\Lambda_{ii}$, $W_N^{n}$ and $W_{N-1}^{n+1}$, which

with

$$\kappa_t(k, t) = \nu_t(k, t)/Pr_t \quad \text{with } Pr_t = 0.6. \qquad (16c)$$

$C_K$ denotes Kolmogorov's constant, and $\nu_t^* = 1$ for $k/k_c < 0.3$. It rises for higher $k/k_c$. A good fit of it is (in the case $m = 5/3$ at least),

$$\nu_t^*(k/k_c) = 1 + 34.5\,\exp[-3.03\,k_c/k]. \qquad (17)$$

Until now, this model has been used with a fixed value $m = 5/3$, giving satisfactory results, not only in the case of isotropic turbulence but also stratified and/or rotating homogeneous turbulence and temporally-growing free shear flows (mixing layers, wakes). For streamwise-and-spanwise periodic wall-bounded flows, the easiest way of accommodating grid refinement at the wall is to work on $x$-$z$ planes, normal to the wall, over which 2D spectra $E_{2D}(k_{2D}, y, t)$ can be computed. Assuming again isotropy with $E(k) \propto k^{-m}$, one can relate $E_{2D}$ to $E$ and express eddy viscosity and conductivity $\nu_t(k_{2D}, y, t)$ and $\kappa_t(k_{2D}, y, t)$ from (16a). One of us (E.L.) did it in the case of a plane turbulent channel flow. With $m = 5/3$, results are qualitatively correct, but the wall shear stress $\tau_w$ is underestimated by about 20% (Fig. 1, top). This is because the model is too dissipative near the wall, where experimental measurements show spectra steeper than $k^{-5/3}$. Much better statistics are obtained with a variable $m(y, t)$ estimated at each timestep from $E_{2D}(k_{2D}, y, t)$ through a least-square fit between $k_c/2$ and $k_c$, the cut-off wavenumber (Fig. 1, bottom).

These results correspond to simulations at $R = U_c h/\nu = 5000$, in which $h$ denotes the channel's half height and $U_c$ the centerline velocity of a laminar Poiseuille flow of same flow rate (usual convention). This should yield $R_\tau = u_\tau h/\nu \approx 200$, which is the case for the top plot of Fig. 1 (instead of $\approx 180$ for the bottom one). Both calculations are performed by means of de-aliased pseudo-spectral methods on $x$-$z$ planes and 6th order compact schemes in the $y$ direction (details will be provided in Lamballais et al., in preparation). The resolution is 64 x 65 x 32, for a domain of size $2\pi h \times 2h \times \pi h$, so that the cut-off wavenumbers along $x$ and $z$ are the same. Extension to non-square
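
The fit (17) for the non-dimensional spectral eddy viscosity is simple enough to evaluate directly, which makes the "plateau and cusp" behaviour described in the text easy to see. The sketch below is only an illustration; the function name and the restriction of the argument to $0 < k/k_c \le 1$ are assumptions.

```python
import math


def nu_t_star(k_over_kc):
    """Fit (17): 1 + 34.5 * exp(-3.03 * kc / k), for resolved modes 0 < k <= kc."""
    if not 0.0 < k_over_kc <= 1.0:
        raise ValueError("k/kc must lie in (0, 1] for resolved wavenumbers")
    return 1.0 + 34.5 * math.exp(-3.03 / k_over_kc)


# Tabulate the plateau (k/kc < 0.3) and the rise towards the cut-off.
for ratio in (0.1, 0.3, 0.5, 0.8, 1.0):
    print(f"k/kc = {ratio:.1f}  nu_t* = {nu_t_star(ratio):.4f}")
```

The value stays within a fraction of a percent of 1 up to $k/k_c = 0.3$ and rises to roughly 2.7 at the cut-off, consistent with the statement that $\nu_t^* = 1$ for $k/k_c < 0.3$.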
This model enabled Ducros to perform the LES of a spatially-growing boundary layer (at Mach 0.5) between $Re_x = 3.3 \times 10^5$ and $1.14 \times 10^6$, which widely encompasses the transition region, for a cost of about 80 hours of Cray 2. With the first mesh line at $y^+ \approx 3$ (i.e. with just one point in the viscous sublayer) and only 32 points along $y$, statistics were found to be within 20% agreement with experimental data, as in Fig. 1 top.

FIG. 2 - From top to bottom: isosurfaces $\nu_t = 2/3\,\nu$ given by the SF, FSF and SFS model, respectively, in the transitional portion of a spatially-growing boundary layer at Mach 0.5 simulated with the FSF model (Ducros and Ducros et al., 1995, or Comte et al., 1994). The same velocity field was used for the three plots (a priori test).

a LES with the original structure-function model
FIG. 4 - (cont'd) - Note the good behaviour of the outflow boundary conditions.

Investigating sensitivity to the nature of the upstream perturbations would not be pertinent in such a narrow domain (1). We thus doubled $L_z$ and its corresponding number of collocation points. This should not change things much in the quasi-2D case $(\epsilon_{3D}, \epsilon_{2D}) = (10^{-5}, 10^{-4})$. However, with $(\epsilon_{3D}, \epsilon_{2D}) = (10^{-4}, 0)$, helical pairings are observed in the wider domain (Fig. 5). The interested readers are referred to Silvestrini et al. (1995) for more details

(1) "Spanwise correlation lengths are of the order of 3 - 5 $\delta_\omega$ ($\delta_\omega$ is the local vorticity thickness). However, the large vortices typically have lengths of order 20 $\delta_\omega$ when the irregularities along the span are ignored" (quoted from Browand & Troutt, 1985).

With such values, 2D simulations are not possible without flux limiters or artificial viscosities. With a viscosity 8 times as large, they become possible without such limiters, and Figure 7 shows the resulting vortices, in time evolution. In such a case, the code gives approximately the same results as the second-order MacCormack code SIERRA of ONERA (Lupoglazoff & Vuillot, 1992).

In 3D at the true viscosity and with the filtered structure function model described above, the advantages of the (2,4) scheme become evident. The following figures correspond to a LES at a spanwise resolution of 90 points equally spaced over the span $L_z = \pi H \approx 0.141$ m, with periodic boundary conditions. The initial condition consists of the 2D flow shown above, taken at a given instant of the steady regime, with low-amplitude white noise (of amplitude $10^{-[\ldots]}$ times the speed of sound at the surface of the propellant) on all the components of $U$. Without this perturbation, the flow would have remained 2D, which proves that the code is not "noisy". After having reached the steady regime, which took 50 hours of Cray 90 at 450 Mflop (corresponding to [...] ms of real time), time s-

FIG. 7 - Contour maps of entropy at 5 equally spaced instants.
Y. H. Qian
Department of Applied Physics,
Columbia University, New York, NY 10027, USA
S. Succi
IBM European Center for Scientific and Engineering Computing,
171 P.le Giulio Pastore, Roma, I-00144, Italy
The Lattice Boltzmann Equation (LBE) is a direct method to solve the Navier-Stokes equations on a digital computer. LBE is rooted in boolean lattice gas techniques, a sort of "minimal" molecular dynamics scheme based on the observation that the large-scale dynamics of fluid flow is largely independent of the details of the underlying microdynamics. This suggests that in order to numerically integrate the differential equations describ-

day computing architectures (increasingly faster on the floating point side); a wider degree of latitude in choosing the details of the evolution rule; a reduction of the separation in scale between the micro-world and the macro-world (i.e. the averaging operation on a suitable region of the micro-dynamical lattice needed in boolean simulations to remove statistical noise is no longer necessary).
Ever since, the method has gone from strength to strength up to the point where it can be put on a par with most advanced computational fluid dynamics (CFD) techniques for a large variety of problems, ranging from fully-developed homo-

This paper is organized as follows: first we present a cursory view of the LGCA and LBE techniques respectively. Subsequently we describe two applications of LBE to the area of fluid turbulence: three-dimensional Rayleigh-Benard convection and three-dimensional channel flow turbu-

lular automata (LGCA). Interest in LGCA originated with the seminal paper of Frisch, Hasslacher & Pomeau (1986) in which it is shown that a simple automaton living on a 2D hexagonal lattice can provide, in the limit of large scale motion, a

space are involved; (3) to get from the Boltzmann level to the Navier-Stokes continuum level, the assumption that the particle mean-free-path is much smaller than any macroscopic variation length is made. The formal procedure to achieve the hydrodynamic description of LGCA is based on a multiscale formalism using the Knudsen number
- Local interaction model, ideal for parallel processing
- Ease of implementation of extremely irregular geometries and boundary conditions

The price to be paid for these advantages reflects in the following disadvantages:

- Statistical noise
- Exponential complexity of the collision operator with increasing number of states/site
- Relatively high viscosity and therefore low effective Reynolds numbers

The issue of statistical noise is a common feature of all particle models; substantial space/time averaging is required to extract reasonably smooth hydrodynamic signals out of the LGCA microdynamics. The issue of exponential complexity is also typical of finite-state algorithms.

3 Lattice Boltzmann dynamics

Lattice Boltzmann techniques provide a way out of both of these problems. With the assumption of molecular chaos, it is possible to write the following kinetic equation:

$$N_i(\vec{x} + \vec{c}_i, t + 1) - N_i(\vec{x}, t) = \Delta_i(N), \qquad i = 1, \ldots, b \qquad (1)$$

$N_i$. The problem of noise in equation (1) is absent because $N_i$ is a real variable and no average at all is needed to recover the macroscopic fields. McNamara & Zanetti (1988) proposed to use Eq. (1) directly for hydrodynamic simulations with the $\Delta_i$ arising from the corresponding boolean models. In particular, they studied the model defined by the FHP-III rules by simulating the decay of shear and sound waves of finite wavelength [10]. The comparison between the numerical values and the Chapman-Enskog multiscale predictions shows that the hydrodynamic value is accurate to better than 5% even for a lattice as small as 4. Also the behavior of sound waves is satisfactory.

The McNamara-Zanetti approach, while fixing the problem of statistical noise, is still left with the intractable complexity of the collision operator because all b-body interactions included in the boolean collision term are still present. This makes their approach unviable in more than two dimensions.

Higuera & Jimenez (1989) [6] noticed that the Lattice Boltzmann equation can be further simplified without losing any generality in terms of hydrodynamic fidelity. The reason is that macrodynamic equations in LGCA formally arise in the double limit of small Knudsen numbers and small
Mach numbers. It is then convenient to consider the expansion of the collision term on the right side of (1) corresponding to these conditions. To do this, let us write $N_i$ as

$$N_i = N_i^{eq}(\rho, \vec{u}) + N_i^{neq}(\nabla\rho, \nabla\vec{u}), \qquad (2)$$

and further decompose $N_i^{eq}$ as

$$N_i^{eq} = N_i^{(0)} + N_i^{(1)} + N_i^{(2)} + O(M^3) \qquad (3)$$

where the upper index refers to the order in the Mach number $M$. This expansion makes it possible to express the collision operator in terms of a simple 2-body scattering matrix

$$\Delta_i(N) \approx A_{ij}\,(N_j - N_j^{eq}). \qquad (4)$$

The Higuera-Jimenez LBE marks an important breakthrough as it opens the way to practical three-dimensional simulations of fluid flows; as a matter of fact it turns a $2^b$ complex problem (where $b$ is the number of bits at each lattice site) into a $b^2$ complex one! The quasilinear LBE introduced by Higuera & Jimenez is still in a one-to-one correspondence with its underlying LGCA microdynamics. This sets a relatively strict upper bound to the Reynolds number attainable since the LBE viscosity is exactly the same that results from the corresponding LGCA. Given the fact that one is ultimately interested just in the large-scale, hydrodynamic features of the flow, at this point, this appears as an unnecessary restriction.

One is therefore naturally led to regard the LBE as a self-standing model of the Navier-Stokes equations, regardless of any underlying LGCA dynamics (Higuera, Succi, Benzi (1989)) [7].

The starting point in the definition of the 'self-standing' lattice Boltzmann equation is again the linearized kinetic equation (4). The change in perspective is however substantial: the choice of the quantities $A_{ij}$ and $N_i^{eq}$ in (4) is no longer dictated by an underlying boolean microdynamics but is rather adjusted to the macroscopic equations to be

In a similar vein, Bhatnagar, Gross & Krook (1954) used a relaxation approximation to model the effect of complicated collisions [2]. The basic formulation of lattice BGK models can then be described as a simplified Boltzmann equation starting from time evolution equation as
where $\omega$ is a relaxation parameter (collision frequency in kinetic theory). The key point here is the choice of the equilibrium state $N_i^{eq}$ so that it leads to the exact Navier-Stokes equation at hydrodynamic space and time scales. The right choice is

where $c_s$ is the speed of sound, and $t_p$ are weights depending on the square amplitude $p$ of the velocity (since particles are either at rest or move one grid site per timestep, $p$ is an index from 0-2 in 2D, 0-3 in 3D, which labels particles at rest, in motion along or in motion diagonal to the grid). Requirements of isotropy and Galilean invariance impose constraints on the weights $t_p$ which are model dependent (Qian and Orszag (1993) [11]).

A two-scale analysis in time leads to the effective hydrodynamic equations at second order of the Knudsen number (the ratio of mean free path to characteristic length):

where $|S|$ is the amplitude of the strain tensor and $C_1$ is a constant.

A nice property of LBE is that the strain tensor $S_{\alpha\beta}$ is available locally as an appropriate linear combination of the particle populations $N_i$. Other eddy viscosity models may be implemented in a similar way. The inclusion of standard wall conditions for the eddy viscosity is equally straightforward.

From a numerical point of view the LBE is basically an explicit finite-difference scheme working at the edge of the Courant-Friedrichs-Lewy condition $c\,\Delta t = \Delta x$ and bearing a significant resemblance with the Dufort-Frankel scheme. It is characterized by a favorable computation/communication ratio which is key to its amenability to parallel implementations across virtually the whole spectrum of present-day parallel computers. This favorable ratio is achieved at the expense of some extra memory and CPU overhead (the number of discrete populations exceeds the number of signif-
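
The relaxation step of a lattice BGK model can be sketched for a single lattice site as follows. The displayed evolution equation and equilibrium are not legible in the source, so this sketch assumes the widely used nine-velocity two-dimensional model (rest, axis and diagonal links, i.e. $p = 0, 1, 2$) with weights 4/9, 1/9, 1/36 and $c_s^2 = 1/3$, and a collision of the form $N_i \leftarrow N_i - \omega (N_i - N_i^{eq})$. These choices, and all function names, are assumptions and may differ in detail from the expressions used by the authors.

```python
# Nine-velocity 2D lattice: rest particle (p = 0), 4 axis links (p = 1),
# 4 diagonal links (p = 2).  ASSUMED model, not taken from the source text.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
T_P = [4.0 / 9.0] + [1.0 / 9.0] * 4 + [1.0 / 36.0] * 4
CS2 = 1.0 / 3.0  # squared speed of sound in lattice units


def moments(n):
    """Density and velocity as linear combinations of the populations."""
    rho = sum(n)
    ux = sum(ni * cx for ni, (cx, _) in zip(n, C)) / rho
    uy = sum(ni * cy for ni, (_, cy) in zip(n, C)) / rho
    return rho, ux, uy


def equilibrium(rho, ux, uy):
    """Second-order (in Mach number) equilibrium populations."""
    usq = ux * ux + uy * uy
    result = []
    for t, (cx, cy) in zip(T_P, C):
        cu = cx * ux + cy * uy
        result.append(t * rho * (1.0 + cu / CS2
                                 + cu * cu / (2.0 * CS2 * CS2)
                                 - usq / (2.0 * CS2)))
    return result


def bgk_collide(n, omega):
    """Relax the populations at one site towards equilibrium at rate omega."""
    neq = equilibrium(*moments(n))
    return [ni - omega * (ni - ne) for ni, ne in zip(n, neq)]
```

Because the equilibrium carries the same density and momentum as the incoming populations, the collision conserves both for any value of `omega`; only the non-equilibrium part is damped.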
z has been adopted).
Second, we note that d" is within the error
internal consistency checks This yields:
point per step on a IBM RS 6000 mod. 580 work-
Further work is needed to judge upon its competitiveness on a more quantitative ground.

[2] P. Bhatnagar, E.P. Gross, and M.K. Krook, Phys. Rev., 94:511, 1954.
saioli, S. Succi and R. Tripiccione, Phys. Rev. E, R29, 48, n.1, 1993
cione, Eur. J. Mech. B/Fluids, 14, n.1, 67-74,
[9] P. Moin, J. Jimenez, J. Fluid Mech., 225, 213, 1991
[Figure: plot with a linear vertical axis from 0.000 to 0.250 and a logarithmic axis, labelled AXIS 2, from 1E+00 to 1E+02; caption not recoverable.]
INTRODUCTION
It is well known that two types of transition are possible in the boundary layer: natural and 'bypass' transition (see the review of A.M. Savill [14]). The first type is observed in the artificial case of low free-stream turbulence, whereas 'bypass' transition usually takes place in real technical equipment such as aircraft and turbine engines. Theoretical investigation of both types of transition raises difficult questions of model construction and of accurate and efficient resolution in space and time.
Known models can be divided into two groups: semi-empirical models (for instance, the Savill-Launder-Younis model [15] (1995)) and models based on a reduction of the initial-boundary-value problem for the Navier-Stokes equations (the addition of an artificial body-force term adopted by Laurien E. & Kleiser L. [11] (1989); the Parabolised Stability Equations model designed by Bertolotti F.P., Herbert Th. & Spalart P.R. [1] (1992); and the 'fringe' model suggested by P.R. Spalart [17] (1993)). We now describe one model of the second type, namely the Slow and Fast disturbances interaction Model (SFM) designed by V.S. Chelyshkov [6] (1993). The model is based on the assumption that slow and fast disturbances can interact along the longitudinal coordinate in weakly non-parallel flows such as boundary layers with and without pressure gradient, jets and wakes. This idea has been developed in recent years in the papers [4, 5, 6, 8] (see also the review by V.T. Grinchenko & V.S. Chelyshkov [9]). The approach is valid for 3-D flows, but for simplicity we shall consider the 2-D boundary layer near a semi-infinite flat plate.
It is known that two scales of flow in the longitudinal coordinate (slow and fast) are possible near a flat plate. Blasius flow is a slow (weakly non-parallel) flow. Two-dimensional perturbations of Blasius flow are divided into two types: slow non-decaying perturbations, which control the boundary-layer thickness [12], and fast non-stationary perturbations [16]. Both types of perturbation must depend on the slow longitudinal coordinate, but experimental and theoretical investigations show that we can neglect this dependence for
the second type [16]. The self-interaction of fast disturbances produces both fast and slow disturbances. The latter contribute to the weakly non-parallel component of the flow. The SFM is therefore constructed as follows. Let $l$ be the distance from the leading edge to a fixed point on a flat plate, $U_\infty$ the free-stream velocity, $\rho$ the fluid density and $\nu$ the kinematic viscosity coefficient. Cartesian coordinates $(x', y')$ are introduced to describe the 2-D non-stationary flow, which depends on time $t'$. The origin of these coordinates coincides with the leading edge, and the $x'$-axis is directed along the plate. The velocity components in this coordinate system are denoted $u', v'$, and $p'$ is the pressure. We choose the non-dimensional variables using the formulae:

$$x' = l X_0, \quad y' = \delta^* y, \quad t' = l U_\infty^{-1} T, \quad u' = U_\infty u, \quad v' = U_\infty \lambda v,$$

$F = \{u, v, p\}(X_0, y, T)$

is described by Navier-Stokes and continuity equations
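$$\partial_T u + u\,\partial_{X_0} u + v\,\partial_y u = -\partial_{X_0} p + \frac{1}{K^2}\left(\partial_{yy} + \lambda^2 \partial_{X_0 X_0}\right) u,$$
$$\lambda^2\left(\partial_T v + u\,\partial_{X_0} v + v\,\partial_y v\right) = -\partial_y p + \frac{\lambda^2}{K^2}\left(\partial_{yy} + \lambda^2 \partial_{X_0 X_0}\right) v, \qquad (1)$$
$$\partial_{X_0} u + \partial_y v = 0.$$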
Equations (1) need suitable initial and boundary conditions in the flow domain, which is not yet defined. Boundary conditions

are set on the flat plate and far from the wall. Poisson equation for pressure

$F \approx F_B$, $F_B = \{u_B, v_B, p_B\}(X_0, y)$, $\mathbf{u}_B = \{u_B, v_B\}$,

satisfies Prandtl equations and the boundary conditions

Thus the physical condition that $v$ decays as $y \to \infty$ is replaced by a boundedness condition. We define

$$X_0 = 1 + X, \quad X = \lambda x, \quad T = \lambda t, \quad Re = K^2/\lambda. \qquad (5)$$
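
The relation $Re = K^2/\lambda$ in (5) can be checked with a few lines of arithmetic. The definitions of $\lambda$ and $K$ are not legible in the surviving text; the sketch below assumes $\lambda = \delta^*/l$ and $K^2 = U_\infty \delta^{*2}/(\nu l)$, which is what the scaling of equations (1) implies, so that $Re = U_\infty \delta^*/\nu$. The function name and the sample input values are hypothetical.

```python
def sfm_parameters(u_inf, nu, length, delta_star):
    """Return (lambda, K^2, Re) under the ASSUMED definitions
    lambda = delta*/l and K^2 = U_inf * delta*^2 / (nu * l)."""
    lam = delta_star / length
    k2 = u_inf * delta_star ** 2 / (nu * length)
    re = k2 / lam                      # relation (5): Re = K^2 / lambda
    return lam, k2, re


def slow_variables(x, t, lam):
    """Slow coordinate and slow time from (5): X0 = 1 + lambda*x, T = lambda*t."""
    return 1.0 + lam * x, lam * t


# Hypothetical air-like values chosen so that Re is close to 520.
lam, k2, re = sfm_parameters(u_inf=10.0, nu=1.5e-5, length=0.5, delta_star=7.8e-4)
print(lam, k2, re)
```

With these definitions $Re$ reduces to the Reynolds number based on $\delta^*$, independent of the chosen length $l$.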
$$F = F_B + F^s + F^f,$$
$$F^s = \{u^s, v^s, p^s\}(X_0, y, T), \quad \mathbf{u}^s = \{u^s, v^s\}, \qquad (6)$$
$$F^f = \{u^f, v^f/\lambda, p^f\}(x, y, t), \quad \mathbf{u}^f = \{u^f, v^f\}.$$

Here $F^s$ and $F^f$ are the vector fields describing slow and fast disturbances. We introduce the $x$-average in V

$$\bar{F} = \frac{\alpha}{2\pi} \int_{-\pi/\alpha}^{\pi/\alpha} F\, dx$$
and shall suppose that $\overline{F^f} = 0$. Substituting (6) into the initial problem (1) - (3) and discarding, as in the description of laminar flow, terms of order $O(\lambda^2)$, we obtain a system of equations and boundary conditions. We add to the nonlinear equations, and subtract from them, the $x$-average of the convective terms that contain fast disturbances. The terms of each equation can then be separated in a convenient way into two groups, and each group is set equal to zero separately. The following problem is obtained:
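$$\partial_t u^s + \frac{K^2}{Re}\left(\partial_{X_0} u_B\, u^s + (u_B + u^s)\,\partial_{X_0} u^s + \partial_y u_B\, v^s + (v_B + v^s)\,\partial_y u^s\right) - \frac{1}{Re}\,\partial_{yy} u^s + N_s(\mathbf{u}_B, \mathbf{u}^s, \mathbf{u}^f) = 0, \qquad (7)$$
$$\partial_{X_0} u^s + \partial_y v^s = 0,$$
$$\partial_x u^f + \partial_y v^f = 0,$$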
$$u^f|_{y=0} = v^f|_{y=0} = 0,$$
Re
+ dxou )a,uf+
In our opinion the SFM (7) - (14) describes near-wall flow in both cases of low and high free-stream turbulence. The equations contain no second $y$-derivative of $v^s$. That is why the physical condition that $v^s$ decays far from the wall is replaced here, as in (4), by a boundedness condition, and the solution of problem (7) - (14) will not be uniformly valid. The relationship
$$\partial_t v^s = O(\lambda)$$
is the condition for the validity of the model. This relationship cannot be established a priori, but it seems acceptable because $F^s$ depends only weakly on time. The natural conditions of disturbance decay far from the wall, as in (2), have to be imposed on the "fast part" of the flow field. Replacing one of the Navier-Stokes equations by the Poisson equation allows us to construct time discretization schemes without the need for fractional steps. This also makes it possible to extend the solution algorithm to the 3-D problem.
The values of the velocity components are unknown at the boundaries orthogonal to the wall. We cannot impose periodicity conditions at these boundaries because the flow is weakly non-parallel. Following the idea of boundary-layer coherent structures [2], we shall suppose that the flow is close to periodic in the longitudinal direction and
To remove the slight arbitrariness in these boundary conditions we shall construct solutions that depend on both longitudinal coordinates in a special way.
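Direct methods are applied to discretize the problem (7)-(14). The known forms of the dependence of the perturbations on the longitudinal coordinate are used to choose the trial functions:

[expansions of $u^s$ and $v^s$ as sums over $j = 0, \ldots, N$; not legible in the source]

$$p^s = p^s(y, t), \qquad \eta = y/\sqrt{X_0}, \qquad (15)$$

$$\{u^f, v^f, p^f\}(x, y, t) = \sum_{|k| \le K,\ k \ne 0} \{u_k, v_k, p_k\}(y, t)\, \exp(i \alpha k x). \qquad (16)$$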
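
The Fourier representation (16) of the fast disturbances can be illustrated by evaluating the sum at a fixed $y$ and $t$. The sketch below assumes that the field is real, so that the amplitudes of negative harmonics are the complex conjugates of the positive ones; this symmetry, the function names and the sample amplitudes are assumptions for illustration. It also checks numerically that the $x$-average over one period $2\pi/\alpha$ vanishes, as required of the fast part.

```python
import cmath
import math


def fast_field(amplitudes, alpha, x):
    """Sum over 0 < |k| <= K of u_k * exp(i*alpha*k*x) for a real field.

    amplitudes[k - 1] is the complex amplitude u_k for k = 1..K; the
    negative harmonics are taken as u_{-k} = conj(u_k) (ASSUMED symmetry).
    """
    total = 0.0
    for k, u_k in enumerate(amplitudes, start=1):
        total += 2.0 * (u_k * cmath.exp(1j * alpha * k * x)).real
    return total


def x_average(amplitudes, alpha, samples=64):
    """Discrete average of the field over one period 2*pi/alpha."""
    period = 2.0 * math.pi / alpha
    values = [fast_field(amplitudes, alpha, period * j / samples)
              for j in range(samples)]
    return sum(values) / samples
```

A single harmonic with amplitude 0.5 reproduces $\cos(\alpha x)$, and any combination of harmonics with $k \ne 0$ averages to zero over a period.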
In (15) $\nu_0 = 0$, and the power indices $\nu_j$ ($j > 0$) are selected on the basis of the exponential decay of vorticity far from the wall, e.g. $\nu_1 = 1$, $\nu_2 = 1.887$, $\nu_3 = 2.867$, $\nu_4 = 3.8, \ldots$. We can now expand the first two terms in (6) in Taylor series in $X$, substitute the result into (7)-(14) and discard terms of order $o(X^2)$ in (11). Using (15), (16) and expanding the variable $X^2$ in a Fourier series in (11), we can separate the longitudinal coordinate by projecting the equations under consideration onto two systems of test functions: $X^k$, $k = 0, 1, \ldots$, and $\exp(i \alpha m x)$, $m \ne 0$.
The sequences $X_0^{-\nu_j}$ and $X^k$ are not orthogonal to each other on the interval over which they vary. This leads to numerical difficulties for the slow part of the solution when $N$ is large, because a matrix of Hilbert type has to be inverted.
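
The difficulty with Hilbert-type matrices mentioned above can be made concrete with the classical example. The sketch below is an illustration under an assumption: it uses the Gram matrix of the plain monomials $1, X, \ldots, X^{n-1}$ on $[0, 1]$ (the Hilbert matrix), not the actual projection matrix of the SFM, whose entries are not given in the text. Exact rational arithmetic shows how quickly the determinant collapses as the size grows, which is why inverting such matrices in floating point becomes unreliable.

```python
from fractions import Fraction


def monomial_gram(n):
    """Gram matrix of 1, X, ..., X^(n-1) on [0, 1]: entry (i, j) = 1/(i + j + 1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


for size in range(2, 7):
    print(size, float(determinant(monomial_gram(size))))
```

The determinants shrink by several orders of magnitude with each added basis function, so even a modest number of non-orthogonal functions leads to a nearly singular system.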
The next stage of approximation is the representation of the solution in the coordinate orthogonal to the wall, on the interval $[0, \infty)$. Far from the wall, the coefficients of the velocity and pressure fields of the fast disturbances behave asymptotically as $\exp(-\alpha k y)$ for near-wall modes, where $k > 0$ is the Fourier harmonic number. For the class of problems at issue it is therefore convenient to approximate the solution in the coordinate $y$ by exponential polynomials (EP) orthogonal on the semi-axis with unit weight. Some computational and/or algorithmic advantage may be offered by the EP $E_{n,k}(y) = \exp(-ky)\,P(1 - 2\exp(-y))$ obtained by orthogonalizing the exponential sequence in inverse order, starting from some number $n$ [3]. Here $P$ are Jacobi polynomials. These polynomials are used to represent the solution in the coordinate orthogonal to the wall, and the sequence $E_{n,k}$ is supplemented by unity to approximate the vertical velocity component of the slow disturbances. The final projection onto phase space is carried out by the Bubnov-Galerkin method, which allows one to use the 'boundary functions' [13] to satisfy the boundary conditions at the wall. For precise numerical integration, Gauss quadrature formulae derived from the properties of the EP are applied, so $3n/2$ points are used in the algorithm.
The described spatial approximation yields a triangular matrix as the discrete analogue of the Laplacian, which allows explicit time schemes to be employed. A variant of the Runge-Kutta method was therefore adopted for time integration.
The next stage of discretization is described in detail in [4]. A collocation method is preferable for 3-D flow modelling. A variant of the collocation method, namely a combined direct method, is suggested in [7] for near-wall flow simulation.
26-6
The level of flow vorticity far from the wall y = 0 and the inflow boundary conditions define the
influence of free stream turbulence in the D-domain. Indeed, recent experiments [19] show
that high free stream vorticity ahead of a flat plate changes the Blasius profile and excites fast
oscillations near the nose part of the plate. So both time-undamped slow perturbations
and fast disturbances develop due to changes of the inflow conditions at the boundary
of the D-domain.
We shall consider here the simpler case of exponentially small free stream turbulence.
In this case we suppose that the influence of time-undamped slow perturbations is
small for natural transition and that the slow part of the disturbed flow is one-dimensional in the
boundary layer coordinates (q, X0), so N = 0 in (15). We omit the terms of order
O(X) in equations (11), so periodicity conditions are valid for fast disturbances at the
boundaries orthogonal to the wall. Such simplifications lead to an initial-boundary value
problem which has no functional arbitrariness in space. We also suppose that
modes of the continuous spectrum are not excited, and our algorithm is constructed in such
a way that the disturbed flow damps far from the wall in accordance with the asymptotics of
near-wall modes.
Physical parameters Re = 520 and α = 0.30s set the 2-D flow domain. Simulation
parameters are K = 7, n = 32. These parameter values yield a dynamic system which
has 409 degrees of freedom. The simulation was performed for the interval 0 ≤ t ≤ 20000.
Initial values of the amplitudes were determined from the solution of the Orr-Sommerfeld
eigenvalue problem. The values correspond to the initiation of a Tollmien-Schlichting wave
with phase velocity equal to 0.396. The disturbance development picture is divided into
two parts. At first (t less than 10000) the travelling wave regime with increasing
amplitude arises. When the oscillation energy reaches some value the single-wave regime
harmonics, which have oscillations with frequencies corresponding to the peaks. One can see
that apart from the main travelling wave, which has phase velocity equal to 0.566, other
oscillations exist. Among these oscillations the largest energy belongs to the oscillation with
convective speed equal to 0.809 U∞, which is excited by the second x-Fourier harmonic.
It is of interest that each space scale has its own number of oscillation frequencies. It also
appears that the near-wall travelling wave phase velocity practically coincides with the near-wall
propagation velocity of perturbations in a channel [10]. In contrast with the result
of work [13] we have found that the phase velocities of pressure and friction are equal to
each other near the wall. The decay rate of the skin friction x-Fourier harmonics f,(ak) is
shown in Fig. 4 for simulation time t = 20000. One can see that the decay is
rapid enough, so the seventh harmonic is about 200 times smaller than the first.
Direct numerical simulation experience leads to the conclusion that the non-dimensional
time necessary to obtain fully developed flow is usually very long. Curiously,
the corresponding physical time is rather short. Let T be the dimensional time, so
$T = \nu \, Re \, t / U_\infty^2$.
If U∞ = 1 m/s and the fluid is water, then the physical simulation time is 10.4 s in
the case examined. This time differs greatly from the computer time necessary
for 2-D modelling. Simulation of a 3-D boundary layer is a more difficult problem, and the
statistically steady solution has not been obtained up to now in this case (see, for
instance, [15]).
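The quoted physical time can be checked with a short calculation. The sketch below assumes a kinematic viscosity of water of about 1.0e-6 m^2/s (a typical room-temperature value, not stated in the text) and uses the values Re = 520, t = 20000 and U∞ = 1 m/s given above; the function name is illustrative only.

```python
def physical_time(t_nondim, reynolds, nu, u_inf):
    """Dimensional time T = nu * Re * t / U_inf**2 (seconds for SI inputs)."""
    return nu * reynolds * t_nondim / u_inf**2


NU_WATER = 1.0e-6  # m^2/s, assumed kinematic viscosity of water

# Values taken from the text: Re = 520, t = 20000, U_inf = 1 m/s
T = physical_time(t_nondim=20000.0, reynolds=520.0, nu=NU_WATER, u_inf=1.0)
print(f"physical simulation time: {T:.1f} s")
```

Under these assumptions the result agrees with the 10.4 s quoted in the text.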
CONCLUSION
1. A new mathematical model based on the Navier-Stokes equations has been developed.
The model can be effective for the quantitative description of a class of weakly
non-homogeneous flows. The model was tested on the flow stability problem
near a flat plate.
2. To verify our model approach and discretization algorithms we have carried out
long-time DNS of disturbed Blasius flow for various but moderate numbers of degrees of
freedom.
3. We have found that a balance must be observed between the numbers of functions taken
in the orthogonal directions. If K is the number of Fourier harmonics taken in the
longitudinal direction and n is the number of exponential polynomials taken in the direction
orthogonal to the wall, then n = n(K) is required for successful execution of our algorithms.
It is essential to note that increasing K requires increasing n.
4. Our experience of near-wall flow modelling leads to the conclusion that numerical
solution breakdown, the so-called 'turbulence arising', does not correspond to real
physical phenomena in the boundary layer.
5. In our opinion we have found a statistically steady state of flow near a flat plate. This
flow is a time-organized structure which has a background of quasi-periodic oscillations
with incommensurable frequencies.
References
1. Bertolotti F.P., Herbert Th. & Spalart P.R. Linear and non-linear stability of the
Blasius boundary layer // J. Fluid Mech., 1992, 242, 441.
2. Cantwell B.J. Organized motion in turbulent flows // Ann. Rev. Fluid Mech., 1981,
13, 457.
8. Chelyshkov V.S. Wave regime in the boundary layer // NAS of Ukraine. Institute
of Hydromechanics. Annual report for 1994. Kiev, 1995, 41.
9. Grinchenko V.T., Chelyshkov V.S. Direct numerical simulation of boundary layer
transition. In: Near-Wall Turbulent Flows, Ed. So, Speziale and Launder. Elsevier,
1993, 889.
K. Becker
Department EF 11
Daimler-Benz Aerospace Airbus GmbH
D-28183 Bremen
S. Rill
Hochschule Bremen
Hünefeldstr. 1-5
D-28199 Bremen
Germany
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
Figure 1: Multigrid sequence with local refinement (levels shown: 1st local sub-block level,
global fine level, global medium level, global coarse level).
meshes and of clustering grid points in a "quasi unstructured" way by scattering sub-blocks
and even further refined blocks in regions of discretization errors. It is envisaged to use
this method for solution adaptive mesh refinement if the regions of sub-block refinement are
determined automatically during the iteration by suitable sensor functions.

4.1 Surface and Interior Point Definition
When a sub-block is created, between each two mesh points on a coarse grid line an
intermediate fine grid point has to be introduced.
On any component's surface, this new point has to lie on the surface. This means that the
new point has to be constructed using the original surface definition. However, this causes
severe problems if the surfaces are defined by external CAD means, for example. Therefore,
most often special interpolation procedures are used which create local surface
approximations from the existing coarse mesh points. The single approaches differ by the
quality of surface representation. For aerodynamics, the criteria of absolute distance to the
real CAD surface and waviness of the interpolated surface play the major role. For the
moment, we don't want to stress this problem: we simply use Coons' local patches.
The definition of interior fine mesh points is not that constrained. As long as Euler meshes
are considered, those mesh points can be constructed using simple trilinear interpolation of
the coarse cells in the field.
For the very dense Navier-Stokes meshes, in the vicinity of a curved surface, intersections of
field mesh lines with the true boundary are very likely to occur with trilinear interpolation.
Therefore the filling algorithm has been changed to the use of Coons' representation for each
mesh plane parallel to the surface, not only the surface planes. This guarantees smooth
behaviour of the mesh in the whole sub-block, especially in the boundary layer mesh.
Additionally, it avoids any intersection of mesh lines or planes with fixed surfaces. Because
this approach is that robust, fast and easy, we adopted it also for the Euler meshes.

4.2 Communication between Sub-blocks and Coarse Blocks
In general, sub-blocks cover only part of the computational domain. Boundary conditions on
their outer block boundaries must be defined such that there is no algorithmic influence on
the overall flow solution. Within the multigrid context, flow variables are interpolated from
the coarse mesh. If the sub-block boundary touches the coarse block boundary, the same
boundary condition is applied. Wall, symmetry or similar conditions are thus treated
correctly. Special things have to be done if the sub-block boundary lies inside the coarse
block. Boundary values of the sub-block cannot be set as fixed Dirichlet type conditions
because this conflicts with the mixed type nature of the flow equations. The interpolated
values serve only as initial guess and the values are updated using the original flow
equations themselves on the fine mesh. Therefore at least one row of guard cells has to be
created around the sub-block which contains the flux integral information needed for the
application of the cell vertex discretization at the real sub-block boundary. This procedure
is quite the same as is applied between two adjacent blocks of the original non-refined mesh.
In addition to this, conservativeness has to be ensured across the sub-block boundaries. In
our code, this is achieved by replacing the flux integrals along coarse cell faces at the
sub-block boundary location: the coarse mesh integral is replaced by the sum of the
participating fine mesh integrals.
This type of communication between sub-blocks and blocks is managed with the help of the
face group concept. Each block has at least one face group. This group of six faces consists
of the minimum/maximum index planes (boundaries of the computational domain) of the
block. For each sub-block that is added to the coarse block, a new face group is defined. It
contains those
Figure 2: Sub-blocks within a mesh block (Sub-Block 1, Sub-Block 2) - schematic view.
segments of coarse mesh planes that coincide with the block boundaries of the respective
local sub-block. So this face group is the hull of the sub-block inside the coarse block. The
respective topological description data are used to drive the communication of flow variables
and other relevant data within the flow solver.
If two sub-blocks within a coarse block or across the boundaries of two coarse blocks are
adjacent to each other, then communication should be allowed directly between those
sub-blocks. The simplest way is to transfer data from a sub-block to the respective face
group of the coarse block and from there to the neighbouring sub-block. However, this path
of communication contains interpolation errors and should thus be replaced by the immediate
transfer of data from one sub-block to its neighbour. Within the topological description
data, this problem could be easily solved because sub-blocks are treated in the same way as
usual blocks.
If a new sub-block is constructed, the topological data are updated automatically. Boundary
conditions and connections to adjacent sub-blocks are detected and included in the
description. This makes the fully adaptive incorporation of new sub-blocks into an existing
multi-block mesh relatively easy once the respective coarse mesh face group boundaries are
known. One major technical difficulty is the generality of sub-block to sub-block
connections. Up to now, two sub-blocks of a coarse block are only allowed to touch each
other if it is with one full face. Touching only with part of a face would require new
segmentation of the respective faces and can easily result in very complex face
segmentations. On the other hand, the above restriction hinders an effective treatment of
diagonal refinement. For the moment the drawback of full face touching has to be overcome
by resizing respective sub-blocks. However, part-of-face touching is under development.
Several topologically different sub-block configurations have been tested. Because the
sensor evaluator may suggest quite general addition of sub-blocks, it might be necessary to
have such arrangements run quite robust.
For example, if we have a four block finest mesh, in a first adaptation step sub-blocks
might be suggested only for three blocks. This leads to different finest levels on different
blocks within the multi-grid cycles. Additionally, consecutive sub-blocking during subsequent
adaptation loops has to be allowed, which means that sub-sub-sub-...blocks can occur. Such
and similar conditions have been investigated concerning the convergence behaviour and the
quality of solution, especially at the junction of refined and non-refined regions. No
specific problem has been detected with the Euler flow solver. However, with the
Navier-Stokes solver it turned out that the implementation of the turbulence model has a
great impact. In practice, the Baldwin-Lomax model used requires wall distance information.
This information is very difficult to obtain in general multi-block meshes if it is not
evaluated in a preprocessing step.

5 SENSOR EVALUATION
The evaluation of any sensor field always means scanning the field for a pre-specified range
of values that are considered to indicate deficiencies of solution accuracy. We can
distinguish between sensors that depend on the flow solution itself and sensors that are
defined by purely geometrical quantities. Mathematical analysis of discretization leads to
certain guidelines concerning the mesh. One of those rules is that one should use smooth and
orthogonal meshes. Measures of those quantities can thus be used to determine "bad" regions
within an existing grid. On the other hand, the flow itself shall drive the mesh in order to
properly resolve special features like shocks, stagnation regions, boundary layers or shear
layers. The analysis of respective sensors leads to suggestions for enhanced grid density
regions.

5.1 Flow Independent Sensors
Within the BRITE/EURAM Euromesh project, a palette of geometrical quality measures has
been developed. Now we use these measures for a priori qualification of meshes,
Figure 3: Local truncation error estimate for a NACA0012 Euler case - computational space
and physical space - left: τ(continuity equation), right: τ(2nd momentum equation).
accuracy is a too high level of local truncation error. This error describes in principle how
well the nonlinear operators of the Navier-Stokes equations are approximated by the discrete
differentiation and integration rules on a specific mesh. It must be remembered that there is
no locality in the relation between this error and the global truncation error of the
solution itself, i.e. the solution error can occur at quite different locations than the
local truncation error [Klim95]. This is especially due to the transport character of the
equations.
Truncation error estimates can be extracted directly from the multi-grid cycles: specific
differences between medium and coarse mesh residuals in a three level computation yield an
estimate of the local truncation error τ [Bra77]. This estimate for all equations of the
Euler or Navier-Stokes system is used to define the locally refined (fine) mesh level.
In detail, if the continuous equation
and represent this equation on the next coarser grid with mesh size 2h, then we end up with
the multigrid coarse grid correction equation
$A_{2h} u_{2h} = I_h^{2h} (f_h - A_h \tilde{u}_h) + A_{2h} I_h^{2h} \tilde{u}_h$,    (5)
which contains the local truncation error estimate on mesh 2h relative to mesh h:
$\tau_h^{2h} = A_{2h} I_h^{2h} \tilde{u}_h - I_h^{2h} A_h \tilde{u}_h$.    (6)
Under the assumptions $A_h \approx A$ and $\tilde{u}_h \approx u$ this yields
$\tau_h^{2h} \approx A_{2h} u - A u$,    (7)
which is the local truncation error $\tau_{2h}$ on mesh 2h. Fig. 3 gives an impression of the
distribution and the levels of local truncation error for a 2D transonic test case. During
the studies it has been found very useful to have presentations of the estimate in physical
as well as in computational domain. Interestingly, the errors for the single equations seem
to be complementary to each other. Near the nose of the airfoil, τ(continuity equation)
suggests refinement in other parts of the flow field than τ(momentum equations), for
example. For our investigations, the L1-norm over all equations was taken to drive the
refinement. More detailed studies can be found in [Lau95].

5.3 Sub-block Definition
In the context of structured sub-block refinement, a strategy has to be developed by which
the location and extension of local sub-blocks can be determined. The evaluation of any
sensor defines a set of "bad" points or cells that appear as clouds in the index space of
each structured block. On the one hand, the sub-blocks have to cover those clouds. On the
other hand, the size of the sub-blocks corresponds to the numerical effort and thus has to be
as small as possible. The strategy to define reasonable sub-blocks is as follows:
- Find a first bad point (I,J,K).
- Set IMIN=IMAX=I-index of bad point; same with J and K.
- Trace the surroundings (IMIN-RADIUS, IMAX+RADIUS; ...) of the current
  (IMIN,IMAX; JMIN,JMAX; KMIN,KMAX) area for more bad points.
- If any more bad point has been identified, enlarge the respective MIN/MAX values and
  restart search.
- If no more bad point can be found, define the sub-block from the current MIN/MAX values.
The user-given tolerance value RADIUS has a strong influence on the size of the sub-blocks.
It also defines the minimum distance between two sub-blocks within one block. In order to
avoid that very large sub-blocks are suggested which look more like a global mesh
refinement, the maximum size of sub-blocks must be bounded. Also, it may happen that many
small sub-blocks are created if any singular bad point is taken into account. This can be
hindered by a minimum bound for the number of points within a sub-block.
Fig. 4 shows the index cube representation of suggestions for sub-blocks within a coarse
block. In the first case, a RADIUS of 5 was chosen whereas in the second case the RADIUS
value was 3, resulting in one more sub-block of smaller size.
Figure 4: Suggestions for new sub-blocks - left: RADIUS=5, right: RADIUS=2.

5.4 Adaptation Cycle
Mesh enrichment via sub-blocks should run automatically within the flow solution process.
However, for the development of such a method it is reasonable to combine the single
elements of code in a more loose form. The adaptation cycle has been split into 4 steps:
- Start calculation on a reasonably fine mesh and store the results (mesh, flow solution,
  local truncation error),
- Run the sub-block suggestion code and store MIN/MAX indices for each coarse block,
- Generate the enriched mesh which contains the previous mesh and the new sub-blocks,
- Restart the flow solver using interpolated values as starting solution for the new
  sub-blocks.
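The sub-block search strategy of Section 5.3 can be sketched in a few lines. This is only an illustration of the listed steps, not the ADAPTOR implementation: the function name, the use of index tuples, the choice of the smallest remaining point as "first bad point" and the sample point cloud are all assumptions, and the bounds on maximum and minimum sub-block size mentioned in the text are left out.

```python
def suggest_sub_blocks(bad_points, radius):
    """Group bad index points into MIN/MAX index boxes.

    bad_points: iterable of index tuples, e.g. (I, J, K).
    radius:     search tolerance in index units (the RADIUS parameter).
    Returns a list of (min_indices, max_indices) tuples.
    """
    remaining = set(bad_points)
    boxes = []
    while remaining:
        # Find a first bad point (smallest index tuple, for reproducibility).
        seed = min(remaining)
        remaining.remove(seed)
        lo, hi = list(seed), list(seed)
        enlarged = True
        while enlarged:  # restart the search after every enlargement
            enlarged = False
            for p in sorted(remaining):
                # Trace the surroundings of the current MIN/MAX area.
                inside = all(lo[d] - radius <= p[d] <= hi[d] + radius
                             for d in range(len(p)))
                if not inside:
                    continue
                remaining.remove(p)
                for d in range(len(p)):
                    if p[d] < lo[d]:
                        lo[d] = p[d]
                        enlarged = True
                    elif p[d] > hi[d]:
                        hi[d] = p[d]
                        enlarged = True
        # No more bad points nearby: define the sub-block from MIN/MAX values.
        boxes.append((tuple(lo), tuple(hi)))
    return boxes


# Hypothetical 2D cloud of bad points in index space
cloud = [(2, 2), (3, 4), (5, 5), (20, 20), (21, 22)]
boxes_wide = suggest_sub_blocks(cloud, radius=2)
boxes_narrow = suggest_sub_blocks(cloud, radius=1)
print(boxes_wide)
print(len(boxes_narrow))
```

For this sample cloud a RADIUS of 2 merges the points into two boxes, while a RADIUS of 1 leaves five single-point boxes, which mirrors the remark that a smaller RADIUS produces more and smaller sub-blocks.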
This cycle can be run until the maximum number of refinement levels has been reached. It is
assumed that each time only the relatively finest level can be refined.

6 NUMERICAL RESULTS
The sub-block concept described above has been implemented in 3D. However, for cost reasons
and for first validation purposes it is reasonable to begin with 2D Euler and Navier-Stokes
flows. The basis of 3D Euler and Navier-Stokes investigations on local refinement and
adaptation was a wing/body combination.

6.1 2D Test Cases
Local mesh refinement has been tested in 2D, first: RAE2822 test case 9 with a free stream
Mach number of 0.734, angle of attack of 2.54 degrees and Reynolds number of 6.5 million.
The Navier-Stokes calculation should serve as a preliminary test to show the effect and
effectiveness of local refinement. Refinement was done by hand using sub-blocks which
covered the whole upper surface including the supersonic region and extended slightly on the
lower surface near the nose of the airfoil. Fig. 5 shows the resulting pressure
distributions for different meshes and the experimental values. We have chosen four
different meshes as there were
(N1) standard fine C-mesh with 241 x 77 mesh points, about 30 points normal to the wall in
the boundary layer,
(N2) mesh (N1) coarsened once by omitting every second point, with 131 x 39 mesh points,
(N3) mesh (N2) with sub-blocks and
(N4) mesh (N1) with sub-blocks.
Figure 5: Pressure distribution RAE2822 for meshes (N1),..,(N4) - comparison with experiment
(cp versus x/c).
The RAE2822 test case 9 has often been used for validation purposes. Problems with the
suction peak on the upper wing nose have always been reported, as is the case with the
present results. Current computations have been made for fully turbulent flow.
If (N1) is assumed to be a mesh of usual fineness, the (N1) result should be the target for
adaptation. Results produced with the coarser mesh (N2) obviously show up high level
numerical errors. If (N2) is refined locally as described above, which is (N3), the result is
already very close to the target (N1). However, the computing time is only about 40 p.c. of
the (N1) computation, as can be seen from Fig. 6. Additional local refinement for (N1),
which is (N4), yields again a solution which is closer to the experiment both near the nose
and for the pressure gradient in front of the shock. For cost reasons, a target computation
for a globally refined (N1) mesh has not been performed. The experiment has been used
instead. In the (N4) case, convergence of the lift coefficient is reached at only minor
additional expense compared to (N1).
Fig. 6 shows the convergence behaviour of the method for the different meshes. The residuals
for all cases drop down with CPU time very quickly. There are no specific observations in
the case of embedded sub-blocks being present. However, the current implementation of the
Baldwin-Lomax turbulence model in the MELINA flow solver may cause problems if the
sub-block cuts the mesh within the boundary layer. If such a sub-block does not extend down
to the wall surface, then the wall distance needed for the turbulence model does not have
the right values and may thus lead to bad results or even non-convergence of the overall
algorithm. This state of the flow solver hinders automatic adaptation in any complex case at
the moment.

6.2 3D Wing/Body Test Case
The application of the sensor analysis implemented in the ADAPTOR code [LauMau95] to the F4
wing/body Navier-Stokes test case showed up nice properties. As can be seen from Figs. 7, 8,
with the current base mesh the τ-error is concentrated in the vicinity of the configuration.
It clearly detects
- the body nose region as being not properly resolved,
- the wing nose region as a spurious entropy production region,
- the shock region as being insufficiently resolved for steep gradients,
- the sonic line as being sensitive to numerical errors,
- the trailing edge and wake region as being sensitive because of rapidly changing flow
  including free shear layers and
- the boundary layer near the wall where pre-adaptation of the mesh to the boundary layer
  profiles is only possible up to a certain extent.
This makes us hope that automatic recognition of deficiencies in discretization is possible,
and adaptation will reduce the overall local truncation error.
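The τ-error sensor used here rests on the relative truncation error of equations (5) to (7). A minimal sketch of that estimate is given below for a 1D model problem; everything specific in it is an assumption made for illustration and is not taken from the MELINA or ADAPTOR codes: the operator A = -d²/dx², its central-difference discretization, plain injection as the restriction I, and the function names.

```python
import math


def apply_operator(u, h):
    """Interior values of A_h u = -u'' using second-order central differences."""
    return [-(u[i - 1] - 2.0 * u[i] + u[i + 1]) / h**2
            for i in range(1, len(u) - 1)]


def relative_truncation_error(u_fine, h):
    """tau = A_2h (I u_h) - I (A_h u_h), with injection as restriction I."""
    u_coarse = u_fine[::2]                      # I u_h: keep every second point
    a2h_iu = apply_operator(u_coarse, 2.0 * h)  # coarse interior points j = 1, 2, ...
    ah_u = apply_operator(u_fine, h)            # fine interior points i = 1, 2, ...
    # Coarse interior point j sits at fine index 2j, i.e. entry 2j - 1 of ah_u.
    return [a2h_iu[j - 1] - ah_u[2 * j - 1]
            for j in range(1, len(u_coarse) - 1)]


n = 64                                           # fine intervals on [0, 1]
h = 1.0 / n
u = [math.sin(i * h) for i in range(n + 1)]      # smooth stand-in for the fine solution
tau = relative_truncation_error(u, h)

# Leading-order Taylor prediction for this model operator: -(h^2 / 4) * u''''
predicted = [-(h**2 / 4.0) * math.sin(2 * j * h) for j in range(1, n // 2)]
worst = max(abs(t - p) for t, p in zip(tau, predicted))
print(f"max |tau - prediction| = {worst:.3e}")
```

In this model problem the estimate grows where the fourth derivative of the solution is large, which is the kind of local indicator a refinement sensor can threshold.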
Figure 7: F4 wing/body configuration - continuity equation truncation error estimate for
surface and symmetry plane.
Figure 8: F4 wing/body configuration - truncation error estimate of x-momentum equation for
spanwise mesh plane.
Figure 9: F4 wing/body configuration - Mach contours and sub-blocks at spanwise mesh plane.
Figure 10: F4 wing/body configuration - cp (versus x/c) at 63 p.c. (left) and 52 p.c. (right)
wing span; -.- fine reference; ..... non-refined; - - - with embedding.
The ADAPTOR run with the medium grid truncation error estimate of all 5 flow equations leads
to multiple sub-blocks. It mainly suggests a block in the vicinity of the surface along the
whole span of the wing, starting somewhere below and behind the wing and extending over the
upper surface again behind the trailing edge. A second block covers the off-surface region
around the wing nose and extends about the supersonic region. Parts of the sub-blocks can be
seen in Fig. 9, where a spanwise cut with local Mach contours is shown. The pressure
distributions at two mid-wing cuts show quite a good improvement compared to the coarse mesh
solution (Fig. 10). The suction peak as well as the pressure roof top gradient and the shock
position are in good agreement with the fine mesh reference solution. And the locally
refined mesh has only about 40 p.c. of the points of the global fine mesh.

7 CONCLUSIONS
Mesh enrichment based on a structured sub-block approach has been considered as an effective
way to improve numerical solution of flow equations. Tools have been defined and strategic
provisions have been made to test this approach under industrial constraints. Up to now, the
main procedures have been set up. Results for locally refined meshes have been calculated
for 2D and 3D Euler and Navier-Stokes test cases. The next step will be the full integration
of the adaptation into the flow solver and the validation and improvement of the overall
process.
Because of the problems with turbulence model implementation in Navier-Stokes we'll first
try to sort out the automatic adaptation problems, sensor analysis, etc. on the basis of the
Euler equations. More general sub-block to sub-block connections are under development,
which allow a more cost-effective resolution of diagonal flow features.
A lot of tests have been run with the ADAPTOR code, and a lot of changes of the evaluating
strategy have been necessary in order to find a reasonable suggestion for sub-blocks. The
cost saving of more than 50 p.c. which we have achieved with the current examples is already
quite good under industrial conditions. Ongoing work will be concentrated on making
adaptation fully automatic, robust, and efficient.

8 ACKNOWLEDGEMENTS
The basis of this work has been partly conducted as BRITE/EURAM area 5, CEC funded, applied
research. Recent results have been obtained within the IMT area 3, CEC funded, ECARP
project. We are grateful for this support.

References
[Bec93] Becker, K., Anmann, P.: "The Interactive Grid Generation System INGRID -
Version 5.0", DA-report, December 1993.
[Bra77] Brandt, A.: "Multi-Level Adaptive Solutions to Boundary Value Problems",
Mathematics of Computation, Vol. 31, No. 138, pp. 333-390, April 1977.
[Klim95] Klimetzek, F.: "Fehlerestimatoren für Strömungsberechnungsverfahren", Teil I
(Literaturauswertung, 1993, 93-038) und II (Anwendung und Beurteilung, 1994, 94-091),
Technical Report, Daimler-Benz AG.
[Lau95] Lauke, Th.: "Adaption von Rechennetzen zur Steigerung der Genauigkeit von
3D-Strömungssimulationen", Diplomarbeit, Technical University of Berlin, June 1995.
[LauMau95] Lauke, Th., Mauch, H.: "ADAPTOR User's Manual", Daimler-Benz Aerospace
Airbus, June 1995.
[RilBec93] Rill, S., Becker, K.: "MELINA - A Multi-Block, Multi-Grid 3D Euler Code with
Local Sub-Block Technique for Local Mesh Refinement", ICAS Paper 92-4.3.R, ICAS Conf.,
Beijing, Sept. 1992.
Stefano Sibilla
Aermacchi S.p.A.
Dipartimento di Aerodinamica
Via Foresio, 1 21040 Venegono Superiore (VA) Italy
and
Marcello Vitaletti
IBM Semea S.p.A.
E.C.S.E.C.
Piazza G. Pastore, 6 00144 Roma Italy
SUMMARY
Specific algorithms have been developed for numerical solution of Euler equations on
multi-block structured grids of general topology; these algorithms involve determination of
convective and dissipative fluxes, residual collection from fine grid levels during multigrid
cycles and time step evaluation. They must be properly integrated with residual and flow
variable averaging when the internal boundary condition is introduced.
The influence of block subdivision on the bow-shock in front of a blunt-nosed body is
analysed with different multiblock algorithms; a structured and a locally unstructured
topology are also compared.
Results show that no additional error is introduced in multiblock solutions if internal block
boundary conditions are applied at each stage and edge/corner boundary cell contributions to
flow quantities are properly taken into account.

LIST OF SYMBOLS
a speed of sound
Cd drag coefficient
CFL Courant number
CP,, stagnation pressure coefficient
D dissipative flux
E specific energy
H specific enthalpy
P pressure
e convective flux
4 flow quantity vector
R residual
S cell face area vector
u, v, w velocity components
V cell volume
VC" control volume
Cartesian coordinates
specific heat ratio
time step
numerical viscosity coefficient
spectral radius
pressure sensor
curvilinear coordinates
density

1. INTRODUCTION
Multiblock methods consist in the decomposition of complex computational domains into
simpler subdomains, which can be more easily handled in the management of the simulation and
in the subdivision of the computational task on different processors.
Structured grid blocks can be generated in these subdomains, in order to combine the
efficiency and simplicity of CFD algorithms developed for single-block structured grids with
the geometric flexibility needed to describe topologically complex regions.
The main difficulty in multiblock methods lies in the correct treatment of block interfaces,
which are located in the flow region and represent a numerical boundary condition with no
reference to the physical problem: their presence can introduce errors in the solution which
can either prevent complete convergence to the exact solution or impose constraints on the
grid generation.
IBM has developed a parallel multiblock framework called PARAGRID [1,2] which supports a
suitable data structure for the management of data communication between adjacent blocks.
The computation is performed in parallel mode at block level, thus allowing exploitation of
workstation clusters and/or multi-processor systems.
A structured multiblock Euler solver had been previously implemented within this framework
[3]; its results were generally good as far as the overall solution quality and global
aerodynamic coefficient evaluation were concerned. Problems were nevertheless detected in
the convergence rate and in the solution quality at the interfaces between adjacent blocks;
moreover, only "structured" block topologies were solved consistently with the original
structured algorithm: this means, for example, that only internal edges shared by four blocks
or corners shared by eight blocks were allowed; for all other block topologies, approximate
corrections were introduced.
Some solution algorithms were therefore modified in order to account for the presence of
locally unstructured topologies at block boundaries; these algorithms were designed for
application in a parallel environment, minimizing the number of data exchanges between
adjacent blocks, and therefore the communications between computational nodes.

2. NUMERICAL SCHEME
and are solved through a cell-vertex finite volume space discretization [4]: flow quantity
values located at cell corners represent average values of flow quantities in the control
volume made of all the cells (e.g. 8 for an internal node of a structured grid) sharing that
node.
Convective fluxes through the control volume surface, which are represented by the second
term in the left hand side of (3), are computed as the sum of the contributions of all the
cell faces which form the control volume surface itself; face values are taken as the average
of the values at the corners of the face.
Such a scheme is equivalent to a second-order accurate central difference on a Cartesian
grid; such discretization leads to odd-even decoupling, allowing numerical oscillations, and
provides no intrinsic numerical dissipation to damp these oscillations and other non-linear
instabilities. A dissipative term, based on first- and third-order differences of the flux
variables and scaled on the local spectral radii of the flux Jacobians, is introduced in the
form of an added flux term [5].
For a control volume centered on the grid point i,j,k equation (3) takes the semi-discretized
form
where
[equation not legible in the source: a sum of three terms]
has been introduced, and [symbol not legible] values are available in the whole extended domain.
To minimize data exchange needs in the computation of spectral radii at block interfaces, the product of cell contributions (6) and of cell volumes
[equation not legible in the source]
is computed in each core region, and [...]
Modified time step (11) is directly introduced into the time-discretized form of (4)
[equation not legible in the source]
from which updated values of flow quantities are obtained.
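The odd-even decoupling mentioned in Section 2, and the reason an added dissipative flux is needed, can be illustrated on a 1-D periodic grid. The sketch below is a toy under stated assumptions, not the scheme of the paper: it uses a scalar variable with unit spacing, a constant coefficient `eps4` in place of the spectral-radius scaling, and only the third-difference part of the dissipative flux (the first-difference part is omitted). All function names are hypothetical.

```python
import numpy as np


def central_difference(u):
    """Second-order central difference (u[i+1] - u[i-1]) / 2, periodic, unit spacing."""
    return 0.5 * (np.roll(u, -1) - np.roll(u, 1))


def third_difference_flux(u):
    """Third difference at face i+1/2: u[i+2] - 3 u[i+1] + 3 u[i] - u[i-1]."""
    return np.roll(u, -2) - 3.0 * np.roll(u, -1) + 3.0 * u - np.roll(u, 1)


def dissipation(u, eps4):
    """Net dissipative flux d[i+1/2] - d[i-1/2], with d = eps4 * third difference.

    eps4 is a hypothetical constant; the paper scales this term with the
    local spectral radii of the flux Jacobians instead.
    """
    d = eps4 * third_difference_flux(u)
    return d - np.roll(d, 1)


n_points = 16  # even, so that the sawtooth mode is periodic
sawtooth = (-1.0) ** np.arange(n_points)

# The central operator does not see the sawtooth mode at all ...
convective_residual = central_difference(sawtooth)
# ... whereas the added flux term responds to it with 16 * eps4 * u,
# so subtracting it from u damps the oscillation.
dissipative_residual = dissipation(sawtooth, eps4=1.0 / 32.0)
```

With `eps4 = 1/32`, a single update `u - dissipative_residual` halves the sawtooth amplitude, while the central difference alone leaves it untouched.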
6. CONCLUSIONS
[...] A. Ecer and J. Hauser, pp. 111-122, North-Holland, 1992.
2. Paoletti S., Po[...] F., Vitaletti M., "Paragrid - a Parallel Multiblock Environment for Computational Fluid Dynamics", proc. of VECPAR '93 - 1st International Meeting on Vector and Parallel Processing, Porto, 1993.
3. Dellagiacoma F., Vitaletti M., Jameson A., Martinelli L., Sibilla S., Visintini L., "Flo67p: a Multi-block Version of Flo67 Running within Paragrid", in Parallel Computational Fluid Dynamics: New Trends and Advances, ed. A. Ecer et al., pp. 199-206, North-Holland, 1995.
4. Jameson A., "A Vertex Based Multigrid Algorithm for Three Dimensional Compressible Flow Calculations", ASME Symposium on Numerical Methods for Compressible Flows, Anaheim, 1986.
5. Jameson A., "Transonic Flow Calculations", Princeton University MAE Report 1651, 1984.
6. Jameson A., Schmidt W., Turkel E., "Numerical Solutions of the Euler Equations by Finite Volume Methods Using Runge-Kutta Timestepping Schemes", AIAA Paper 81-1259, 1981.
7. Malfa E., "IEPG - TA15 Results of Aermacchi Euler Code Around the Wing-Body-Canard WB(C)-1 Configuration in Transonic Flow Condition", Aermacchi Report 275-TA15-021, 1991.
8. Mavriplis D.J., "Accurate Multigrid Solution of the Euler Equations on Unstructured and Adaptive Meshes", AIAA Journal, Vol. 28, No. 2, February 1990, pp. 213-221.
9. Sibilla S., Vitaletti M., "Cell-Vertex Multigrid Solvers in the PARAGRID Framework", proc. of Parallel Computational Fluid Dynamics '95, Pasadena, 1995.
Table 1 Time and memory requirements for the examined multiblock algorithms.
Figure 1 Example of unstructured local topology in a multiblock structured grid: edge shared by five blocks.
[Figure 2 panel legends: a) core; o fine grid node; x fine grid cell. b) core; o coarse grid node; x fine grid cell.]
Figure 2 Algorithm for multigrid "driver residual" computation: a) contribution of fine grid nodes to fine grid cell values (14); b) contribution of fine grid cell values to coarse grid nodal "driver residuals".
Figure 3 Blunt nosed body flow at Mach 2: Flow features and grid topologies.
Figure 4 Iso-pressure plot of the flow around a blunt nosed body (Mach=2, α=0): a) single-block computation; b) multiblock computation on grid "A".
Figure 5 Convergence history of maximum density residual for multiblock solution of the flow around a blunt nosed body (Mach=2, α=0): data exchange performed at the end of each multigrid cycle.
Figure 6 Convergence history of maximum density residual for multiblock solution of the flow around a blunt nosed body (Mach=2, α=0): behaviour of different strategies for data exchange between grid blocks at CFL=8.
Figure 7 Convergence history of maximum density residual for multiblock solution of the flow around a blunt nosed body (Mach=2, α=0): comparison between single-block and multiblock computations at CFL=6.
Figure 9 Iso-pressure plot of the flow around a blunt nosed body (Mach=2, α=0): multiblock solution on a 7-block grid with locally unstructured topology.
[Figure 10 legend: structured formulation (dotted line); unstructured formulation.]
Figure 10 Convergence history of density residual on the locally unstructured edge of the 7-block grid around a blunt nosed body for solution of the flow at Mach=2 and α=0: behaviour of different artificial dissipation models.
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
[...] the various numerical choices are described; the computation of a NACA airfoil in pitching motion is then presented, as well as that of a 3D sphere-cone oscillating about its centre of gravity.
2.1 Unsteady equations
Consider an airfoil in pitching motion, fitted with a mesh $\Omega(t)$. In this study the domain $\Omega(t)$, assumed non-deformable, moves with respect to an absolute frame $R_0$.
On $\Omega(t)$, the Euler equations are written in conservation-law form:
[equation not legible in the source]
with $W = (\rho, \rho\vec{V}, \rho E)$ the conservative variables ($\rho$: density, $\vec{V}$: absolute velocity, $E$: total energy) and
$$F = \begin{pmatrix} \rho\,\vec{V}_r\cdot\vec{n} \\ \rho\vec{V}\,(\vec{V}_r\cdot\vec{n}) + p\,\vec{n} \\ \rho E\,(\vec{V}_r\cdot\vec{n}) + p\,\vec{V}\cdot\vec{n} \end{pmatrix}$$
$\vec{V}_e$ is the entrainment (mesh) velocity and $\vec{V}_r$ the relative velocity. Unlike other approaches that use relative velocities, the computational variables here are the absolute variables, that is, the absolute velocities expressed in the absolute frame $R_0$. This is a classical approach which, compared with the fixed-mesh equations, requires a modification of the numerical fluxes, which now involve the entrainment velocity $\vec{V}_e$, and a computation of metrics that vary in time.
To discretize the fluxes we use upwind methods; Vinokur gives a detailed analysis of them in [7].
2.2 Flux discretization
It can be verified that the flux Jacobian has the expression:
[Jacobian matrix not legible in the source]
with $v_n = \vec{V}\cdot\vec{n}$ and $v_{en} = \vec{V}_e\cdot\vec{n}$. The eigenvalues of the flux Jacobian are the following:
$\lambda = \vec{V}_r\cdot\vec{n}$ (multiplicity 3)
$\lambda = \vec{V}_r\cdot\vec{n} + c$ (multiplicity 1) (3)
$\lambda = \vec{V}_r\cdot\vec{n} - c$ (multiplicity 1)
The eigenvectors of the flux Jacobian are exactly the same as for the fixed-mesh equations.
To discretize the fluxes, the van Leer splitting can be extended to the unsteady Euler equations. Van Leer splits the flux in the form $f = f^+ + f^-$, where $f^+$ (resp. $f^-$) has positive (resp. negative) eigenvalues.
Introducing the relative normal Mach number $M_{rn} = \vec{V}_r\cdot\vec{n}/c$, we have:
if $M_{rn} > 1$, $f^+ = f$;
if $M_{rn} < -1$, $f^- = f$;
and for $|M_{rn}| < 1$, $f^+$ has the expression:
[equation not legible in the source]
2.3 Unsteady metrics
During the motion of the airfoil the mesh moves and, consequently, the normals to the interfaces must be computed at every instant. By assumption the mesh does not deform, so the volumes do not change. They are computed once and for all at the initial time $t_0$.
The average of $\vec{n}$ over a time step $\Delta t$ has to be evaluated; this is done by considering the instant $t_{n+1/2}$. We then have $\vec{n}(t_{n+1/2}) = R(t_{n+1/2})\cdot\vec{n}(t_0)$, where $R$ is the rotation matrix of the motion taken at instant $n+1/2$. The flux computation also requires the entrainment velocity at the cell interfaces at instant $t_{n+1/2}$. Knowing the coordinates at time $t=0$ of the centre $I$ of an interface, the entrainment velocity is given by:
[equation not legible in the source]
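The unsteady metrics of Section 2.3 can be sketched for the 2-D pitching case. This is a minimal illustration under assumptions that are not taken from the paper: the rotation is rigid about a fixed centre G, the pitch angle is counted positive counter-clockwise, and the entrainment velocity is the standard rigid-body expression (angular rate crossed with the vector GI), since the paper's own formula is not legible. The function names are hypothetical.

```python
import numpy as np


def rotation_matrix(alpha):
    """2-D rotation matrix for a pitch angle alpha in radians (assumed CCW-positive)."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, -s], [s, c]])


def midstep_metrics(n0, gi0, alpha_mid, alpha_dot_mid):
    """Face normal and entrainment velocity at t_{n+1/2} for a rigid rotation about G.

    n0            : face normal at the initial time t0
    gi0           : vector from the rotation centre G to the face centre I at t0
    alpha_mid     : pitch angle at t_{n+1/2}
    alpha_dot_mid : pitch rate at t_{n+1/2}
    """
    rot = rotation_matrix(alpha_mid)
    normal = rot @ n0    # n(t_{n+1/2}) = R(t_{n+1/2}) n(t0)
    gi = rot @ gi0       # current position of I relative to G
    # Rigid-body velocity (angular rate) x GI, written out for an in-plane rotation.
    v_e = alpha_dot_mid * np.array([-gi[1], gi[0]])
    return normal, v_e


# Quarter turn with a pitch rate of 2 rad per unit time (illustrative values).
normal_mid, v_e_mid = midstep_metrics(
    n0=np.array([0.0, 1.0]),
    gi0=np.array([1.0, 0.0]),
    alpha_mid=np.pi / 2.0,
    alpha_dot_mid=2.0,
)
```

Because the mesh is rigid, only the normals and interface velocities are updated each step; lengths and volumes computed at $t_0$ are reused unchanged.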
$$F(W;\vec{n}) = \begin{pmatrix} 0 \\ p\,\vec{n} \\ p\,\vec{V}_e\cdot\vec{n} \end{pmatrix}$$
where $p$ is the pressure at the interface, computed by extrapolation and possibly corrected by a characteristic relation.
2. Inflow-outflow boundary conditions.
We adopt the formulation proposed by Collercandy [9], extended here to the moving-mesh equations.
Five characteristics of slope $\lambda$, $\lambda \in \{v_{rn},\ v_{rn} + c,\ v_{rn} - c\}$, reach the interface at instant $n+1$. Depending on the sign of the slope $\lambda$, the associated characteristic variable is computed from an exterior or an interior state.
More precisely, the eigenvalues are computed from a mean state $W_m = \frac{1}{2}(W_{interior} + W_{exterior})$, which allows the eigenvalues to be evaluated and their sign to be known. The characteristic variables associated with these eigenvalues are then computed with both the interior and the exterior states.
If $\lambda$ is negative, the exterior characteristic variable is chosen; otherwise the interior characteristic variable is taken.
2.5 Time discretization
The time accuracy of the schemes used for unsteady flows is an important point. Several schemes of first or second order in time are described and their stability properties are briefly recalled. The equation to be discretized in time is the following:
[equation (5) not legible in the source]
Implicit Runge-Kutta scheme of Iannelli-Baker
This is a Runge-Kutta scheme with two implicit stages [10].
$(I - \alpha\,\Delta t\,A^n)\,\Delta W_1 = g(W^n)\,\Delta t$ (6)
$(I - \alpha\,\Delta t\,A^n)\,\Delta W_2 = g(W^n + \beta\,\Delta W_1)\,\Delta t$ (7)
and $W^{n+1} = W^n + \gamma_1\,\Delta W_1 + \gamma_2\,\Delta W_2$ (8)
with $\alpha = \frac{2-\sqrt{2}}{2}$, $\beta = 2(3\sqrt{2} - 4)$, (9)
$\gamma_1 = \frac{6-\sqrt{2}}{8}$, $\gamma_2 = \frac{2+\sqrt{2}}{8}$ (10)
This scheme is unconditionally stable and can be run at cfl numbers much larger than those usable with a first-order implicit scheme, of the order of 400, while still giving results of accuracy equivalent to that of an explicit calculation.
2.6 Validation cases
2.6.1 NACA64A010 airfoil
The airfoil chosen is a NACA64A010, with the following experimental conditions (Fig. 1):
M: 0.796
Pressure: 203321 Pa
Frequency: 34.4 Hz
The law of motion of the airfoil is given by:
[equation not legible in the source]
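The two-stage implicit Runge-Kutta scheme of equations (6) to (10) can be sketched on a scalar model problem. This is a minimal sketch under assumptions that are not confirmed by the source: equation (5) is taken to have the form dW/dt = g(W), $A^n$ is taken to be the Jacobian dg/dW evaluated at $W^n$, and $\gamma_2$ is taken as $(2+\sqrt{2})/8$, the value for which $\gamma_1 + \gamma_2 = 1$. The function names are hypothetical.

```python
import math

ALPHA = (2.0 - math.sqrt(2.0)) / 2.0
BETA = 2.0 * (3.0 * math.sqrt(2.0) - 4.0)
GAMMA1 = (6.0 - math.sqrt(2.0)) / 8.0
GAMMA2 = (2.0 + math.sqrt(2.0)) / 8.0  # assumed value, so that GAMMA1 + GAMMA2 = 1


def implicit_rk_step(w, dt, g, dg_dw):
    """One step of the two-stage scheme for a scalar equation dW/dt = g(W)."""
    lhs = 1.0 - ALPHA * dt * dg_dw(w)    # same implicit operator in both stages
    dw1 = dt * g(w) / lhs                # stage 1, equation (6)
    dw2 = dt * g(w + BETA * dw1) / lhs   # stage 2, equation (7)
    return w + GAMMA1 * dw1 + GAMMA2 * dw2  # update, equation (8)


def integrate(w0, t_end, n_steps, g, dg_dw):
    """Advance from t = 0 to t_end with n_steps equal steps."""
    w, dt = w0, t_end / n_steps
    for _ in range(n_steps):
        w = implicit_rk_step(w, dt, g, dg_dw)
    return w


def decay(w):
    """Model right-hand side g(W) = -W, with exact solution exp(-t)."""
    return -w


def decay_jacobian(w):
    """Jacobian of the model right-hand side."""
    return -1.0


exact = math.exp(-1.0)
err_coarse = abs(integrate(1.0, 1.0, 20, decay, decay_jacobian) - exact)
err_fine = abs(integrate(1.0, 1.0, 40, decay, decay_jacobian) - exact)
observed_order = math.log2(err_coarse / err_fine)  # close to 2 for a second-order scheme
```

The implicit operator is identical in both stages, which is why the second-order scheme costs about the same per step as a first-order implicit scheme once that operator has been built.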
The airfoil was defined in an AGARD report [11] and has been computed by many authors [8].
For each unsteady test, the evolution of the lift and of the moment is presented. The computations with the different time-integration approaches are compared with experiment and with a reference explicit computation at cfl = 0.8.
Figures 2 and 3 show the computations with the first-order and second-order approaches in time, for a cfl of 400. For the moment, the first-order approach does not give the same result as the reference explicit computation, whereas the second-order approach gives an identical result.
At cfl = 100, with the first-order method, the computing cost is divided by 17 compared with an explicit method at cfl = 1 (400 time steps per cycle against 40 000 for the explicit method). The cost of the second-order implicit Runge-Kutta method is practically identical to that of the first-order method, since the implicit matrix is the same in both stages.
In conclusion, the second-order implicit method in time is the most suitable for unsteady computations.
2.6.2 Aérospatiale sphere-cone
A 3D sphere-cone body has also been computed (Fig. 4); this body oscillates about its centre of gravity G. The free-stream Mach number is 7. The maximum pitch angle is $\alpha_0 = 1°$. The law of motion is given by:
$\alpha(t) = \alpha_0 \sin(k\,\tau)$
with $k$ = [...] = 0.386. An unsteady computation at cfl = 100 was carried out with the implicit Runge-Kutta method (Fig. 4). A comparison was made with another computation performed by Aérospatiale [?]. A difference of 8% is observed on one coefficient (symbol not legible in the source), whereas $C_{m\dot{\alpha}} + C_{mq}$ is identical in the two computations (Fig. 5).
3. Turbulence modelling
3.1 Two-equation transport model
The $(k, \varepsilon)$ turbulence model of Jones-Launder [13] is implemented in the FLU3M code. The two transport equations for $\rho k$ and $\rho\varepsilon$ are written:
$\partial_t(\rho k) + \mathrm{div}(\rho k\,\vec{V}) = \tau_R : \nabla\vec{V} + \mathrm{div}\big((\mu + \frac{\mu_t}{\sigma_k})\nabla k\big) - \rho\varepsilon + D_k$ (11)
$\partial_t(\rho\varepsilon) + \mathrm{div}(\rho\varepsilon\,\vec{V}) = C_{\varepsilon 1}\frac{\varepsilon}{k}\,\tau_R : \nabla\vec{V} + \mathrm{div}\big((\mu + \frac{\mu_t}{\sigma_\varepsilon})\nabla\varepsilon\big) - C_{\varepsilon 2}\,f_2\,\rho\frac{\varepsilon^2}{k} + D_\varepsilon$ (12)
The turbulent viscosity coefficient $\mu_t$ has the expression:
$\mu_t = C_\mu\,f_\mu\,\rho\frac{k^2}{\varepsilon}$ (13)
The following choice was made for the model coefficients: $C_{\varepsilon 1} = 1.57$, $C_{\varepsilon 2} = 2.$, $\sigma_k = 1.$, $\sigma_\varepsilon = 1.3$, $C_\mu = 0.09$. The terms $D_k$ and $D_\varepsilon$ are additional terms linked to the low-Reynolds formulation and intended to represent the damping of turbulence near walls. In the paper by Jones and Launder [13], these terms are expressed in a boundary-layer frame. For the solution of the Navier-Stokes equations, we considered the following expressions for $D_k$ and $D_\varepsilon$:
$D_k = -2\mu\,\|\nabla\sqrt{k}\|^2$
[expression of $D_\varepsilon$ not legible in the source]
The quantities $f_2$ and $f_\mu$ are also linked to the damping of turbulence near walls. They are functions of the turbulent Reynolds number $Re_t$:
$Re_t = \frac{\rho k^2}{\mu\varepsilon}$
$f_2 = 1 - 0.3\exp(-Re_t^2)$
$f_\mu = \exp\big(-\frac{2.5}{1 + Re_t/50}\big)$
Since these damping functions involve neither the distance to the wall nor the wall friction, the turbulence model can be programmed independently of the application considered, which is an important advantage for a code handling complex multi-domain applications such as FLU3M.
The $(k, \varepsilon)$ turbulence model just described can be combined in the FLU3M code with either a single-species or a two-species formulation. In the latter, the flow considered is the turbulent compressible flow of a non-reacting mixture of two species, each species being assumed to be a perfect gas with constant specific heats. The transport equations are discretized in the same way for the "single-species $(k, \varepsilon)$" system and for the "two-species $(k, \varepsilon)$" system. The convective fluxes are discretized using extensions of Roe's Riemann solver to systems of coupled equations. The eigenvalues of the Jacobian matrix are: $\lambda_1 = U$ (multiplicity $neq - 2$), $\lambda_2 = U + c$ (multiplicity 1), $\lambda_3 = U - c$ (multiplicity 1), where $neq$ denotes the total number of equations (7 for a single species and 8 for two species). The quantity $c$ is a modified speed of sound given by:
$c^2 = \gamma\big(\frac{p}{\rho} + \frac{2}{3}k\big)$
The ratio of specific heats $\gamma$ is assumed constant in the single-species formulation, whereas in the two-species formulation it depends on the partial densities and on the specific heats of the two species.
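The wall-damping functions, the eddy viscosity of equation (13) and the modified speed of sound can be evaluated pointwise from local flow quantities, which is what makes the model independent of wall distance. The sketch below assumes the reconstructed forms $Re_t = \rho k^2/(\mu\varepsilon)$, $f_2 = 1 - 0.3\exp(-Re_t^2)$, $f_\mu = \exp(-2.5/(1 + Re_t/50))$ and $c^2 = \gamma(p/\rho + 2k/3)$; the function names are hypothetical and do not come from FLU3M.

```python
import math

C_MU = 0.09  # model constant quoted in the text


def turbulent_reynolds(rho, k, eps, mu):
    """Turbulent Reynolds number Re_t = rho k^2 / (mu eps)."""
    return rho * k * k / (mu * eps)


def f_2(re_t):
    """Damping function applied to the destruction term of the eps equation."""
    return 1.0 - 0.3 * math.exp(-re_t ** 2)


def f_mu(re_t):
    """Damping function applied to the eddy viscosity."""
    return math.exp(-2.5 / (1.0 + re_t / 50.0))


def eddy_viscosity(rho, k, eps, mu):
    """mu_t = C_mu f_mu rho k^2 / eps, using only local quantities."""
    re_t = turbulent_reynolds(rho, k, eps, mu)
    return C_MU * f_mu(re_t) * rho * k * k / eps


def modified_sound_speed(gamma, p, rho, k):
    """Sound speed entering the eigenvalues U + c and U - c of the coupled system."""
    return math.sqrt(gamma * (p / rho + 2.0 * k / 3.0))


# Far from walls Re_t is large and both damping functions tend to 1.
mu_t_high_re = eddy_viscosity(rho=1.0, k=1.0, eps=1.0, mu=1.0e-5)
```

None of these functions takes a wall distance or a wall shear stress as input, so the same routine can be called in every block of a multi-domain mesh.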
In the two-species formulation, this dependence on the partial densities and on the specific heats of the two species takes the following form:
[equation not legible in the source]
The Roe numerical flux is written:
[equation not legible in the source]
where $\Lambda^R$ denotes the diagonal matrix of eigenvalues, and $P^R$ and $(P^R)^{-1}$ denote the transformation matrices. The superscript R indicates that the quantities are computed from Roe averages.
Second-order accuracy in space is obtained with the MUSCL approach applied to the primitive variables. The viscous fluxes are evaluated with a central discretization in space. In the two-species formulation, diffusion between the species is taken into account through a Lewis number $Le$ and a turbulent Lewis number $Le_t$. Convergence is accelerated by an implicit phase and by local time stepping. The implicit phase relies on a linearization of the van Leer fluxes for the convective fluxes, a linearization similar to Coakley's for the viscous fluxes, a simplified linearization of the negative part of the source terms, and an ADI inversion of the implicit matrix.
As examples, results obtained with the "single-species $(k, \varepsilon)$" formulation are presented here, first for a shock wave/boundary layer interaction in a two-dimensional channel, then for an axisymmetric afterbody.
The first configuration corresponds to an experiment [14] carried out at ONERA in a symmetric nozzle. The Mach number upstream of the interaction is 1.45. Figure 6, which shows the computed iso-Mach lines, displays the classical λ-shock structure in the interaction region and the strong thickening of the boundary layer resulting from the interaction with the shock. Figure 7 compares the wall pressure distribution with experiment. The pressure plateau obtained by the computation in the interaction region is smaller than in the experiment, which corresponds to a slight underestimation of the size of the separated region. On this point the result is comparable to results obtained earlier with other codes using the $(k, \varepsilon)$ model on the same configuration.
The second configuration is an axisymmetric afterbody fitted with a nozzle. The external flow conditions are as follows: Mach number 4.18, stagnation temperature and pressure equal to 325 Kelvin and 10 bar respectively. The stagnation pressure of the jet is higher, at 42.3 bar. The Reynolds number based on the critical quantities of the jet and on the base radius is $1.15 \times 10^7$. The computational domain is divided into three sub-domains: a sub-domain D1 for the external flow, a sub-domain D3 for the nozzle exit and the jet, and an intermediate sub-domain D2 containing the base region. The total number of mesh points is 13,879. Strong refinements are introduced near the walls. For example, the cell size near the outer wall of the afterbody is $10^{-4}R$. On the upstream boundary of the external sub-domain D1, profiles taken from the experimental data are imposed for the velocity and the turbulent quantities, whereas on the upstream boundary of sub-domain D3 the imposed profiles come from a preliminary computation of the flow in the nozzle.
Figure 8, which shows the solution as iso-Mach lines, displays the classical barrel shape of the jet and the shock wave located in the jet. A comparison with experiment [15] is shown in figures 9 and 10, as profiles of turbulent kinetic energy and of pitot pressure in two sections located downstream of the base at distances of 0.59R and 6R. The experimental points were obtained by laser velocimetry and with a Pitot tube. Although the experimental data for the turbulent kinetic energy cover only the outer part of the flow, the agreement appears satisfactory.
4. DDLU factorization
4.1 Description of the algorithm
The von Neumann linear stability analysis of the approximate ADI factorization reveals an unconditional instability in 3D (cf. Ying [21]). Even if the non-linear terms play a stabilizing role, as codes using this approach tend to show, the triple ADI factorization remains penalized by a large operation count and, above all, by a severe cfl restriction due to the factorization error in $\Delta t^3$. A DDLU-type factorization was therefore developed to improve the efficiency of the implicit algorithm. The first DDLU decomposition methods for the implicit matrix were proposed simultaneously by Jameson and Turkel [17] and by Steger and Warming [20] in 1981. Whereas alternating-direction techniques replace the implicit operator by an operator factored along the mesh directions, the LU method decomposes it into two triangular matrices, upper and lower. Jameson and Turkel show that such a system is well conditioned if the matrices are diagonally dominant. They therefore proposed a decomposition of the Jacobian matrices of the implicit matrix that increases the diagonal. In the original scheme, the
implicit system is written in the form:
[equation not legible in the source]
The Jacobian matrices $\hat{A}^+$, $\hat{B}^+$ and $\hat{C}^+$ (respectively $\hat{A}^-$, $\hat{B}^-$ and $\hat{C}^-$) are constructed so that they have only positive (respectively negative) eigenvalues, that is:
[equation not legible in the source]
In the decomposition of Jameson and Turkel [17], the aim is to increase the diagonal dominance:
[equation (25) not legible in the source]
[...] matrices L and U while avoiding non-vectorizable recurrences. The point-to-point recurrence of the ADI factorization becomes a plane-to-plane recurrence in the DDLU factorization.
In addition, the diagonal sweep involves, around the current point, points that were updated at the previous step. They can therefore be added to the right-hand side: there is thus no block to invert.
[...] symmetric over-relaxation iterative approach with oblique sweep:
[equation not legible in the source]
4.2 Application
The DDLU algorithm is applied here, in its inviscid version, to the test case of a lenticular fuselage with a necked-down section. The Mach number is 4.5 and the incidence is 10°. The mesh consists of 42 x 27 x 44 points. Figure 11 shows the solution obtained with the DDLU scheme. The solution given by the ADI scheme is identical. Figure 13 shows the convergence history of the implicit density residuals. The CFL number is ramped up to 500 for the DDLU and SSOR formulations and up to 100 for the ADI formulation. The implicit DDLU scheme converges faster than the implicit ADI scheme. The SSOR algorithm converges faster than the DDLU algorithm, but its inner iterations, about a dozen, increase the computing time, which becomes comparable to that of the ADI scheme.
Table 1 gives the computing time per point and per iteration, and the number of 3D arrays needed to store the implicit matrix, for the ADI and DDLU decompositions.
The DDLU version is 2.3 times faster than the ADI version and, in addition, the LU scheme requires nearly 1.5 times less memory to store the implicit matrices. Only a single vector $D_{ijk}$ is stored at each point, whereas the ADI factorization requires memory at each point for three blocks $D^{\xi}_{ijk}$, $D^{\eta}_{ijk}$ and $D^{\zeta}_{ijk}$. Moreover, the ADI factorization uses temporary arrays during the inversion.
[Figure: FIACRE, Mach 4.5]
Fig. 13: Comparison of the convergence rates of the algorithms
5. Conclusion
Three recent developments in the FLU3M code have been presented.
The unsteady Euler equations (on a moving, non-deformable mesh) were discretized with a second-order implicit Runge-Kutta scheme in time combined with van Leer fluxes. The 2D and 3D validation cases presented show the accuracy and speed of the method, which is as accurate as an explicit computation but 70 times faster.
The implementation of the Jones-Launder k-ε model, including for a two-species gas, was then described. We use the Roe solver to solve the complete system of Navier-Stokes equations coupled with the transport equations for k and ε. Although satisfactory results have been obtained, notably on an afterbody case, studies on the application of the model remain to be carried out; in particular, the initial field of the variables k and ε must be determined to start the computation, and relaminarization phenomena may appear.
Finally, the implicit DDLU decomposition algorithm reduces the memory and computing-time costs of each iteration. This algorithm can be extended to the equations with the k-ε model.
Acknowledgements: The developments relating to the k-ε model, as well as those relating to unsteady flows, were supported by Aérospatiale Espace et Défense and by CNES. The work on the DDLU algorithm [...]
Table 1
Algorithm        CPU time (μs)   Memory
ADI Euler 3D     49              225
DDLU Euler 3D    21              155
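The speed and memory ratios quoted in Section 4.2 follow directly from Table 1, and the cost figures quoted for the unsteady validation case can be checked in the same way. The short sketch below only redoes that arithmetic; the last quantity, the speed-up extrapolated to cfl = 400, rests on the assumption (not stated in the paper) that the cost per implicit step does not change with the cfl number.

```python
# Values from Table 1 (3-D Euler, per point and per iteration).
adi = {"cpu_us": 49.0, "arrays_3d": 225}
ddlu = {"cpu_us": 21.0, "arrays_3d": 155}

cpu_speedup = adi["cpu_us"] / ddlu["cpu_us"]         # about 2.33
memory_ratio = adi["arrays_3d"] / ddlu["arrays_3d"]  # about 1.45

# Unsteady airfoil case: 40 000 explicit steps per cycle against 400 implicit
# steps at cfl = 100, for a quoted cost reduction of 17.
step_ratio = 40_000 / 400                            # 100 times fewer steps
implicit_step_cost = step_ratio / 17.0               # about 5.9 explicit steps each

# Assumption: same cost per implicit step at cfl = 400, hence 4 times fewer steps.
speedup_at_cfl_400 = 17.0 * (400 / 100)              # 68
```

Under that assumption the extrapolated factor of 68 is of the same order as the "70 times faster" quoted in the conclusion.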
[12] F. Coron, F. Ruffino, "Prévision de l'aérodynamique instationnaire autour de lanceurs pour l'étude de l'aéroélasticité", 31ème Colloque AAAF d'Aérodynamique Appliquée, March 1995.
FIG. 1 - First-order implicit scheme in time (cfl=100)
FIG. 2 - Second-order implicit Runge-Kutta scheme in time (cfl=400)
[Figure label: DDLU 3D computation]
Having failed to find a complete package for grid generation and solver, we then had to decide whether to develop our own codes. The idea of developing a 3-D grid generator was not relished, whereas writing an unstructured grid Euler solver (and later a Navier-Stokes solver) appeared to be quite feasible. Thus IAR, with a view to first acquiring a grid generation package, contacted vendors and evaluated several unstructured grid generators including I-DEAS [8] (mainly used for structural analysis) and GFEM [18]. These codes were eventually rejected as they were either very cumbersome to use or very slow in generating fairly simple grids.
Finally, the code ICEM [10] was evaluated and was found to be quite promising. A copy of this software was also obtained for evaluation. It has a good user interface, preprocessing and postprocessing. Its CAD software can build complicated wire frame surface models efficiently, and can take point data in PLOT3D format and IGES files from other CAD systems. This grid generation package supports point, line and volume sources for density control and can generate 3D unstructured meshes efficiently. The Octree method is used, which refines the grid by subdividing the tetrahedron into eight smaller tetrahedra until a satisfactory grid density has been reached. Some examples of these grids are shown later.
The idea of generating structured layers of tetrahedra near the surface will be pursued with the vendor ICEM, or IAR may develop its own capability using advancing normals as was done for 2-D. In the meantime it will be used solely in its unstructured form, which may be acceptable for Euler solutions.
4. Development of a 3-D Euler Solver and Validation
Rather than trying to acquire a commercial 3-D solver, IAR decided it was more suitable to develop a 3-D Euler solver from our existing 2-D solver, especially since the code would eventually have to be made into an unsteady version. To make the existing 2-D code, FJSOLV, into a 3-D version FJ3SOLV was relatively easy, as in the 2-D code the logic is set up so that it is driven by edges of a triangle, with the flux across the edge being computed once and added/subtracted to the total flux balance for the triangles on each side of the edge. The same principle was used for 3-D, but now the edge is a 'face' of a tetrahedron. In the far field there is no vortex correction as in 2-D and the Riemann invariants alone are used.
To validate the new Euler code FJ3SOLV, we first considered the RAE 2822 airfoil spread out as a straight wing between two solid walls. A boundary condition of no normal flow was imposed at the walls so that the flow should be two-dimensional with no variation across the span. The ICEM grid generation package was used to generate a grid, and two views are shown in Fig 3a. This grid was then used to generate the solution shown in Fig 3b. Note that the grid was not refined about the shock wave at about 70% chord on the upper surface and so produced a result that was quite smeared out around the shock wave. On refinement of the grid around the shock, shown in Fig 4a, a much improved shock was obtained and good two-dimensionality was shown, with little spanwise variation in the pressure. The accuracy of the airfoil pressure distribution is demonstrated in Fig 4b, where the solution at one spanwise station (FJICEM), obtained by interpolation from nearby values, is compared to a completely 2-D solution. The latter solution was obtained with the 2-D solver FJSOLV with 30 points on each of the upper and lower surfaces (called FJDJ30 on the figure); it was also run without a vortex correction and with roughly the same far field distance as in the 3-D case. The solution using FJ3SOLV took 8 hours on the SGI Power Challenge computer and used about 299,000 grid cells.
Having achieved success with the '3-D' airfoil, a more challenging case of some practical interest was next considered. The M-100 wing-body configuration (ref 19) had been used when checking the RAMPANT code. It is a good case as the experimental data is quite reliable and it has been used as a test case by Grumman [...]
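The edge-driven logic described for FJSOLV and FJ3SOLV, in which each flux is computed once and then added to one neighbouring cell and subtracted from the other, can be sketched as follows. This is an illustration only: the connectivity, the flux values and the function name are hypothetical, boundary edges are left out, and nothing here is taken from the actual IAR codes.

```python
import numpy as np


def accumulate_edge_fluxes(n_cells, edges, edge_flux):
    """Sum fluxes into per-cell balances, visiting every interior edge once.

    edges     : sequence of (left_cell, right_cell) index pairs
    edge_flux : function(edge_index) -> flux leaving left_cell through that edge
    """
    balance = np.zeros(n_cells)
    for index, (left, right) in enumerate(edges):
        flux = edge_flux(index)   # computed once per edge (or per face in 3-D)
        balance[left] += flux     # outflow for the cell on one side ...
        balance[right] -= flux    # ... is inflow for its neighbour
    return balance


# Three cells in a row sharing two interior edges (hypothetical connectivity).
edges = [(0, 1), (1, 2)]
fluxes = [2.0, 5.0]
balance = accumulate_edge_fluxes(3, edges, lambda i: fluxes[i])
```

Because every interior flux enters the balances twice with opposite signs, the balances sum to zero, so the scheme is conservative by construction; moving to 3-D only changes the list of edges into a list of tetrahedron faces.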
[...] a distance of 0.03 (chord=1) from the airfoil. Within this window the grid was fixed relative to the airfoil and movement of the grid was only allowed outside of the window.
Next the code was tested on a NACA0012 falling in free air with M=0.8, α=0 and a downward velocity of 0.08 (relative to unit freestream). The trend in CN with increasing time was compared to the actual steady state result for the equivalent angle of attack. This grid was completely unstructured and the window was now fixed at about 1.4 units from the body. Fig 7a shows the initial grid and also the grid after a plunge of about 1.6 units (for a chord length of 1). The CN development is shown in Fig 7b, where it can be seen that the result looks quite accurate, as the normal force CN is tending asymptotically to the true steady state value.
The next computation was for a more realistic (store) type of body, an ogive-cylinder-ogive as shown in Fig 8a, with an airfoil/pylon as the parent body. For a freestream Mach number of 0.4, a reduced frequency of 0.8 and a maximum velocity of 0.064, this 'store' was moved down and up, in a cyclic manner, a distance of 0.16 units (based on an airfoil/pylon chord length of 1) to check the physical consistency of the results. Figs 8b and 8c show the CN and Cm developments with time for three cycles. The results look quite physical: the lift first increases as the store moves downward (seeing an upwash from the fluid), then as the gap increases the lift decreases as the 'channel' effect above the body becomes less noticeable. When the body returns upward, the lift at first decreases as the body sees a downwash from the fluid but finally increases as the channel effect becomes stronger. The first window in this case was set at a distance of 0.03 from the body, which basically covered only the structured layers of the grid. A second window for fixing the grid was also set around the wing/pylon. This enables good grids to be maintained near the bodies, where it is felt to be necessary to achieve an accurate solution. Shown in Fig 8a are the initial grid before the store starts to move, the grid at the bottom of the store's cycle and the grid after one complete cycle. A third window was also used in this case so that the grid was only allowed to move within a distance of 4 units from the centre of the airfoil/pylon. The grid was fixed outside this window, allowing for greater efficiency in grid movement. The CPU time for this case on the Power Challenge was about 5 hours.
This is the current status of the unsteady development of the program. Several more tests will be performed to check accuracy, and then different schemes for moving the grid and for integrating in time will be studied. Implicit time marching schemes will be investigated so that larger time steps can be taken. These enhancements will be very useful in the future development of the 3-D version of the code.
6. Conclusions
All the pieces are now in place to complete the development of an unsteady calculation applied to the prediction of the store trajectory after release from the aircraft. A suitable 3-D grid generator has been identified in ICEM and a 3-D Euler solver has been developed in-house. To optimize the development of the 3-D unsteady version of the final code, a 2-D version has first been developed and presented here. Lessons learned from this development will be incorporated into the 3-D version at a later date. The six degrees of freedom (6DOF) equations defining the motion, given the aerodynamic forces as computed from the Euler code, will be incorporated into the package to provide a complete trajectory. Another future development will be to move from the Euler formulation to a Navier-Stokes one, where structured grid layers near the surface will be especially beneficial.
References
1. Fox, J.H., Donegan, T.L., Jacocks, J.L., and Nichols, R.H. (1991): "Computed Euler Flowfield for a Transonic Aircraft with Stores", Journal of Aircraft, Vol. 28, pp. 389-396.
2. Lijewski L.E. and Suhs H.E. [...]
Solution of the Euler- and Navier-Stokes Equations
on Hybrid Grids
Martin Galle
DLR Institute of Design Aerodynamics
Lilienthalplatz 7
38108 Braunschweig
Germany
1. SUMMARY
A three dimensional finite volume scheme is presented. The scheme is based on the employment of hybrid grids, containing tetrahedral as well as prismatic cells.

The application of hybrid grids offers the possibility to combine the flexibility of tetrahedral meshes with the accuracy of regular grids. An algorithm to compute an auxiliary grid of control volumes for the entire computational domain was formulated. The dual mesh technique guarantees conservation in the entire flow field even at interfaces between prismatic and tetrahedral domains and enables the employment of an accurate upwind flow solver. Convergence to the steady state can be accelerated by a multigrid algorithm based on the agglomeration of control volumes. The formulation of such an algorithm is presented.

The code is tested on several viscous and inviscid cases for transonic and subsonic flows.

As it is feasible to stretch hexahedral cells in one or two directions without losing grid quality, structured grids are appropriate for the simulation of high Reynolds number flows. The major drawback of structured schemes is related to the generation of suitable grids for complex geometries. Grid generation normally is an iterative process [2] that requires a high level of user interaction. Though much effort has been spent in the last years to develop powerful tools, the generation of appropriate structured grids for complex geometries appears to be much more time consuming than the flow simulation.

A possibility to circumvent this bottleneck is the unstructured approach [3,4]. The flow simulation is performed on a grid consisting of tetrahedral cells instead of hexahedra. As tetrahedral cells offer a high flexibility, the discretization of complex three dimensional domains can be done almost automatically [5], with less user interaction than required for generating structured grids. The weak point of the unstructured approach is the generation of grids for high Reynolds number flows. The efficient simulation of such flows requires extremely stretched cells. The edges of tetrahedral cells of high aspect ratio are connected under very acute angles. This may cause numerical errors when the fluxes for corner nodes of such cells are evaluated. Hence, convergence and even solution accuracy can deteriorate.

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

where

$$\vec{W} = (\rho,\ \rho u,\ \rho v,\ \rho w,\ \rho E)^T$$

is the vector of the conserved quantities. $V$ denotes an arbitrary control volume with the boundary $\partial V$ and the outer normal $\vec{n}$.

The normal and tangential stresses depend on the derivatives of the velocity and on the dynamic viscosity $\mu$:

$$\tau_{zz} = 2\mu\frac{\partial w}{\partial z} - \frac{2}{3}\mu\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right), \qquad \tau_{xy} = \mu\left(\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\right)$$

$$p = (\kappa - 1)\,\rho\left(E - \frac{u^2 + v^2 + w^2}{2}\right) \qquad (4)$$

From equation (1) the temporal change of the conservative variables $\vec{W}$ can be derived as the negative sum of the fluxes over the boundaries of the control volume, divided by its volume. If the boundary is divided into n faces, the fluxes are summed over the faces, each face contributing an inviscid and a viscous flux.

4. DATA STRUCTURE
The dual mesh technique is perfectly suited to be utilized in a scheme that is based on hybrid grids. From the initial grid an auxiliary grid of control volumes is generated. For a vertex based scheme, where the flow variables are stored in the nodes of the initial grid, each node is surrounded by a control volume. The boundaries of the control volumes are determined by the midpoints of cells, cell faces and edges of the initial grid. This strategy results in non overlapping auxiliary cells that fill the physical space without gaps. Figure 1 depicts such an auxiliary grid (dashed lines) for an initial hybrid grid (solid line). As can be seen from the figure, the auxiliary grid is defined even at interfaces between the different cell types. Hence, focusing on the fluxes crossing the boundaries of the control volumes, conservativity can be guaranteed in the entire flow domain.
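The construction of the auxiliary grid can be illustrated with a minimal sketch. It is a two-dimensional simplification on triangles only (the paper works with tetrahedra and prisms in three dimensions), and the function name `median_dual` and its data layout are assumptions made for this illustration, not the author's code. Each control volume boundary piece runs from an edge midpoint to a cell midpoint, and the normal vectors of all pieces belonging to one edge are accumulated on that edge.

```python
def median_dual(points, triangles):
    """Return (node_areas, edge_normals) of the dual mesh of a 2-D triangle grid.

    Hypothetical helper for illustration: dual faces join edge midpoints
    to triangle centroids; each edge (i, j) with i < j stores the summed
    normal vector, oriented from node i towards node j.
    """
    area = [0.0] * len(points)
    edge_normal = {}
    for tri in triangles:
        (xa, ya), (xb, yb), (xc, yc) = (points[n] for n in tri)
        cx, cy = (xa + xb + xc) / 3.0, (ya + yb + yc) / 3.0
        tri_area = 0.5 * abs((xb - xa) * (yc - ya) - (xc - xa) * (yb - ya))
        # Each node receives one third of the triangle area.
        for n in tri:
            area[n] += tri_area / 3.0
        for k in range(3):
            i, j = sorted((tri[k], tri[(k + 1) % 3]))
            (xi, yi), (xj, yj) = points[i], points[j]
            mx, my = 0.5 * (xi + xj), 0.5 * (yi + yj)
            # Normal of the segment midpoint -> centroid (rotated by 90 degrees).
            nx, ny = cy - my, -(cx - mx)
            # Orient the normal from node i towards node j.
            if nx * (xj - xi) + ny * (yj - yi) < 0.0:
                nx, ny = -nx, -ny
            sx, sy = edge_normal.get((i, j), (0.0, 0.0))
            edge_normal[(i, j)] = (sx + nx, sy + ny)
    return area, edge_normal


# Unit square with a centre node, split into four triangles.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
triangles = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
areas, normals = median_dual(points, triangles)
```

In this sketch the control volumes tile the square exactly, and the edge normals around the interior node sum to zero, which is the discrete expression of a closed control volume.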
be described by the normal vectors. The sum of the normal vectors describes the size and orientation of the entire face. The resulting vector is also related to the respective edge.

Hence, both the description of the grid and the computation of the grid fluxes are based on the edges. Information on the initial grid cells is not required anymore. This strategy leads to a very efficient memory allocation of less than 100 real variables per node and a good vectorization of the flow solver.

and

$$M_F = M_L^+ + M_R^- \qquad (9)$$

where the split Mach numbers are defined as

$$M^+ = \begin{cases} M & \text{if } M \ge 1 \\ \frac{1}{4}(M+1)^2 & \text{if } |M| < 1 \\ 0 & \text{if } M \le -1 \end{cases}$$
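The split Mach numbers can be sketched in a few lines. Only the positive split appears in the text above; the negative split `split_mach_minus` below is the usual mirror-image form of the Liou and Steffen scheme [7] and is an assumption here, as are all function names.

```python
def split_mach_plus(m):
    """Positive split Mach number M+ as defined above."""
    if m >= 1.0:
        return m
    if m <= -1.0:
        return 0.0
    return 0.25 * (m + 1.0) ** 2


def split_mach_minus(m):
    """Negative split Mach number M- (assumed mirror image of M+)."""
    if m >= 1.0:
        return 0.0
    if m <= -1.0:
        return m
    return -0.25 * (m - 1.0) ** 2


def face_mach(m_left, m_right):
    """Face Mach number M_F = M_L^+ + M_R^- of equation (9)."""
    return split_mach_plus(m_left) + split_mach_minus(m_right)


# For identical left and right states the face Mach number
# reproduces the state Mach number.
m_face = face_mach(0.5, 0.5)
```

With these two splits, `split_mach_plus(m) + split_mach_minus(m)` equals `m` for every Mach number, so the face value is consistent in smooth flow.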
The coefficient $\Phi_F$ controls the dissipation of the scheme. In the original scheme of Liou, $\Phi_F$ is set to

$$\Phi_F = |M_F| .$$

For small Mach numbers the dissipative character vanishes since also $\Phi_F$ becomes small. In order to prevent the disappearance of the dissipation for small Mach numbers $\Phi_F$ is determined as proposed by Kroll and Radespiel in [8].

The values left and right of the face F are taken directly from $P_0$ and $P_1$ for first order calculations. For second order accurate calculations the independent flow variables are linearly reconstructed on the control volumes around $P_0$ and $P_1$. For the control volume of node $P_0$ it reads:

$$u_L = u_0 + \frac{1}{2}\,\Psi_0\,\nabla u_0 \cdot \vec{r}_{0,1} . \qquad (12)$$

The gradient $\nabla u_0$ of a variable $u$ is obtained by employing a Green-Gauss formula:

where $\Omega_0$ is the volume of the dual cell around $P_0$ and $\vec{S}_{0,i}$ is the normal vector of the dual mesh face F as shown in figure 4.

$$\Psi_i = \begin{cases} \min\left(1, \dfrac{u^{\min} - u_0}{u_i - u_0}\right) & \text{if } u_i < u_0 \\ \min\left(1, \dfrac{u^{\max} - u_0}{u_i - u_0}\right) & \text{if } u_i > u_0 \\ 1 & \text{if } u_i = u_0 \end{cases} \qquad (15)$$

Herein $u^{\max}$ and $u^{\min}$ denote the maximum/minimum of the values of $u$ at the nodes $P_{0..6}$ and $u_i$ denotes the reconstructed value at the faces of the control volume between $P_0$ and $P_i$. The values

5.3 Calculation of Viscous Terms
The determination of the viscous terms is also performed edge-wise. The obtained fluxes are related to the nodes associated with the respective edge. For an edge connecting the nodes $P_0$ and $P_1$ with the face vector $\vec{S}$ (figure 3) one obtains:

with $V_x$, $V_y$ and $V_z$ as described in section 5.1.

The derivatives of a flow variable $u$ have already been obtained for the second order discretization as described in section 5.2. They are the components of the gradient vectors $\nabla u = (u_x, u_y, u_z)^T$. The face values are determined by an arithmetic averaging of the respective values in the nodes $P_0$ and $P_1$.

6. TEMPORAL DISCRETIZATION
The temporal variation of the flow quantities can be written in general form for a node $P_0$ as:

A comparison with equation (6) gives for the residual $\vec{R}_0$:

where $\lambda$ denotes the maximum eigenvalue of the flux Jacobian. It can be determined as an integration over the surface of the control volume:

$$\lambda = \sum_{i=1}^{6} \left( |\vec{v}_{0,i} \cdot \vec{S}_{0,i}| + c_{0,i}\,|\vec{S}_{0,i}| \right) \qquad (20)$$
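The second order reconstruction described above can be illustrated with a small sketch. It assumes the common Green-Gauss form in which each dual face carries the arithmetic mean of the two node values, and a limiter of the Barth and Jespersen type [9] taken as the minimum over all faces. The function names and the argument layout are hypothetical and chosen for this example only.

```python
def green_gauss_gradient(u0, neighbour_values, face_vectors, volume):
    """Gradient at a node from its dual faces (assumed averaging form).

    face_vectors[i] is the outward normal vector S_{0,i} of the dual face
    between the node and neighbour i; volume is the dual cell volume.
    """
    grad = [0.0] * len(face_vectors[0])
    for ui, s in zip(neighbour_values, face_vectors):
        u_face = 0.5 * (u0 + ui)  # arithmetic mean on the face
        for d, s_d in enumerate(s):
            grad[d] += u_face * s_d
    return [g / volume for g in grad]


def limiter(u0, neighbour_values, face_values):
    """Limiter in the spirit of equation (15), minimum over all faces."""
    u_max = max([u0] + list(neighbour_values))
    u_min = min([u0] + list(neighbour_values))
    psi = 1.0
    for ui in face_values:
        if ui > u0:
            psi = min(psi, (u_max - u0) / (ui - u0))
        elif ui < u0:
            psi = min(psi, (u_min - u0) / (ui - u0))
    return psi


# Linear field u = 2x + 3y + 1 on a unit dual cell centred at the origin,
# with neighbours at (1,0), (-1,0), (0,1), (0,-1).
u0 = 1.0
neighbours = [3.0, -1.0, 4.0, -2.0]
faces = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
grad = green_gauss_gradient(u0, neighbours, faces, volume=1.0)
```

For the linear field of the example the gradient is recovered exactly and the limiter stays at one; it drops below one only where a reconstructed face value would overshoot the neighbouring node values.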
Following [11] the viscous time step has to be scaled with a factor $K_v = 0.25$:

Employing an integration around the control volume, the viscous eigenvalue $\lambda_v$ can be written as:

The selection of the start node and some strategy, which of the neighboring control volumes are to be fused with the control volume of the start node, is the only possibility to control the quality of the coarse grid. It appears that the best grid quality can be obtained when the agglomeration is marching along coarsening fronts throughout the grid. Furthermore, nodes lying on solid walls should be preferred to remain in the coarse grid. So the highest priority to become the next start node will be given to wall nodes lying on the coarsening front.

new control volume is

$$v_{0,k} = \sum_{i=0}^{4} v_{i,k-1}$$

children have to be summed up. The contributions are weighted with respect to the size of the children cells:

In this equation $V_{ch}$ denotes the sum of the volumes of the children of $P_{0,k}$. According to equation (22) $V_{ch}$ equals the coarse grid control volume $v_{0,k}$ around $P_{0,k}$, so equation (25) can also be written as:

(26)
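The transfer to the coarse grid can be sketched as follows. The sketch assumes that the coarse control volume is the sum of its children and that a coarse value is the volume weighted mean of the children values, which is how the weighting "with respect to the size of the children cells" is read here. The function names are hypothetical.

```python
def coarse_volume(child_volumes):
    """Volume of an agglomerated control volume: sum over its children."""
    return sum(child_volumes)


def restrict_weighted(child_values, child_volumes):
    """Volume weighted mean of the children values (assumed restriction)."""
    total = coarse_volume(child_volumes)
    return sum(u * v for u, v in zip(child_values, child_volumes)) / total


# Five children fused into one coarse control volume.
volumes = [1.0, 2.0, 1.0, 0.5, 0.5]
values = [2.0, 4.0, 6.0, 2.0, 6.0]
v_coarse = coarse_volume(volumes)
u_coarse = restrict_weighted(values, volumes)
```

A weighting of this kind leaves a constant field unchanged, which is a basic consistency requirement for any restriction operator.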
For fine grid nodes which are also existing in the coarser grid, the corrections are directly added.

The node $P_2$ in figure 8 does not exist in the coarser grid k+1. The control volume of this node has been agglomerated with the control volume around $P_0$. If the corrections are assumed to be constant over the coarse grid control volume, one can write:

(30)

with $\vec{S}_{(0,i),k+1}$ representing the normal vector related to the edge between node $P_0$ and $P_i$. The correction in node $P_2$ can then be obtained as:

where $\vec{r}_{0,2}$ denotes the vector from $P_0$ to $P_2$.

When the solution also on the finest grid is corrected, the next iteration n starts from the corrected solution of iteration n-1.

8. NUMERICAL RESULTS
8.1 Laminar Flow over a Flat Plate
The simulation of a laminar flow over a flat plate was used to validate the formulation of the viscous term evaluation. Figure 9 depicts boundary planes of the three dimensional hybrid grid. The grid consists of three layers with 60x40 points each. The points are connected to form prismatic cells in the lower part of the grid and tetrahedral cells in the upper part.

The flow is coming from the left side parallel to the boundary planes. In the rear part of the lower plane a no slip plate is located. As can be seen from the pressure distribution on the boundary planes in figure 9, the beginning of the plate is characterized by a flow stagnation.

Fig. 9: Hybrid grid for flat plate and isobars

In order to create a tetrahedral grid the prismatic cells are subdivided. Hence, the point distribution is identical in both grids. Figure 10 presents the tetrahedral grid and the respective solution.

Fig. 10: Tetrahedral grid for flat plate and isobars

In figure 11 the convergence history is presented. For the calculation on the hybrid grid a convergence of 2.5 orders of magnitude is obtained after about 1600 iterations. The calculation was performed in the single grid mode in order to make it comparable to the results obtained on a pure tetrahedral grid. As one can see from figure 11 the convergence is worse for the tetrahedral case. This may be due to the disturbance of the solution caused by the diagonal edges near the wall. Furthermore, because of the higher number of edges more computational work is required for
Fig. 16: Convergence history for the simulation of subsonic flow around ONERA M6 wing (horizontal axis: seconds)

9. CONCLUSIONS
A finite volume scheme based on hybrid grids is presented. The employed grids consist of prismatic cells near body surfaces and tetrahedral cells connecting the prismatic domains and the outer boundaries. The use of prismatic cells offers the possibility to resolve viscous dominated flows such as boundary layers efficiently and accurately by applying high aspect ratio cells in the respective areas. Due to the use of tetrahedral cells, grids become quite flexible and the generation of grids, even for complex configurations, is relieved considerably compared to structured approaches.

In the preprocessing an auxiliary mesh of control volumes is computed from the initial grid. The auxiliary mesh covers the entire computational domain and can be used in both the tetrahedral and the prismatic domains. In the flow solver part of the scheme an edge based data structure is utilized, so the cell structure given by the initial grid becomes unnecessary. The feasibility of employing hybrid grids even for three dimensional flow calculations is presented.

The multigrid algorithm based on the agglomeration of control volumes is a natural extension of the dual mesh technique. It fits perfectly to the edge based data structure and results in a small memory requirement.

The calculation of inviscid fluxes is demonstrated to be efficient and accurate. Shocks are captured nicely by the employed upwind flow solver. Also the formulation to calculate the viscous fluxes has proved its accuracy. At the interface between the prismatic and the tetrahedral region in some cases wiggles in the flow solution occur. Those wiggles will be subject to more detailed investigations.

Though the multigrid formulation works nicely for the case presented here, it still has to be improved, as the gain for cases with high aspect ratio cells is less than one would expect. In order to enable the simulation of viscous flows also around three dimensional geometries the next step will be the implementation of a suitable turbulence model. With the improvement of the multigrid algorithm even the simulation of high Reynolds number flows is expected to become feasible for complex geometries, like flapped wings or complete aircraft configurations.

Finally a grid adaption algorithm, either based on local refine-

10. ACKNOWLEDGMENT
The three dimensional hybrid grids that are used in this paper have been generated by J.W. van der Burg and J.E.J. Maseland of NLR within the framework of the NLR/DLR Cooperation CFD for complete Aircraft.

11. BIBLIOGRAPHY
[1] J.W. Sloof and W. Schmidt, editors. Computational Aerodynamics Based on the Euler Equations. AGARD-AG-325, 1994.
[2] A. Ronzheimer, O. Brodersen, R. Rudnik, A. Findling, and C.-C. Rossow. A new interactive tool for the management of grid generation processes around arbitrary configurations. In N.P. Weatherill, P.R. Eiseman, J. Hauser, and J.F. Thompson, editors, Numerical Grid Generation in Computational Fluid Dynamics and Related Fields, pages 441-452, 1994.
[3] A. Jameson, T.J. Baker, and N.P. Weatherill. Calculation of inviscid transonic flow over a complete aircraft. AIAA-86-0103, 1986.
[4] T.J. Barth. A 3-D upwind solver for unstructured meshes. AIAA-91-1548-CP, 1991.
[5] C. Gumbert, R. Lohner, and P. Parikh. A package for unstructured grid generation and finite element flow solvers. AIAA-89-2175, 1989.
[6] T. Minyard and Y. Kallinderis. A parallel Navier-Stokes method and grid adapter with hybrid prismatic/tetrahedral grids. AIAA-95-0222, 1995.
[7] M.S. Liou and C.J. Steffen. A new flux splitting scheme. Computers and Fluids, 107(1):23-39, 1993.
[8] N. Kroll and R. Radespiel. An improved flux vector split discretisation scheme for viscous flows. DLR-FB 93-53, 1993.
[9] T.J. Barth and D.C. Jesperson. The design and application of upwind schemes on unstructured meshes. AIAA-89-0366, 1989.
[10] A. Jameson. Transonic flow calculations. MAE Report 1651, Princeton University, Princeton, New Jersey, 1983.
[11] D.J. Mavriplis. Accurate multigrid solution of the Euler equations on unstructured and adaptive meshes. AIAA-88-3706-CP, 1988.
[12] D.J. Mavriplis and A. Jameson. Multigrid solution of the two-dimensional Euler equations on unstructured triangular meshes. AIAA-87-0353, 1987.
[13] R.H. Bailey. A multigrid algorithm for the solution of the Navier-Stokes equations on unstructured meshes for laminar flows. DLR-IB 129-92/10, 1992.
[14] K. Riemslagh and E. Dick. Multistage Jacobi relaxation in multigrid methods for steady Euler equations. Submitted November 1993 to International Journal of Computational Fluid Dynamics.
[15] M.H. Lallemand, H. Steve, and A. Dervieux. Unstructured multigridding by volume agglomeration: current status. Computers and Fluids, 21:397-433, 1992.
[16] V. Venkatakrishnan and D.J. Mavriplis. Agglomeration multigrid for the three-dimensional Euler equations. AIAA-94-0069, 1994.
[17] V. Venkatakrishnan. A perspective on unstructured grid flow solvers. ICASE report 95-3, 1995.
P. Brenner
Aérospatiale Espace & Défense
BP2 78133 Les Mureaux CEDEX, France

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
2- the flow varies very abruptly from high supersonic to low subsonic, and the pressure ratio encountered across strong shocks can reach several thousand,
3- the relative motion of the various bodies is entirely arbitrary (complex rotation, large translation...),
4- the flow, like the position of the stages, can evolve rapidly, and acoustic or shock propagation phenomena often dominate the evolution of the aerodynamic field.

The first constraint led us to use an unstructured formulation, which offers great flexibility from an ergonomic point of view for the definition of the meshes.

The second led us to choose a finite volume method of Godunov type (second order), which is very robust and accurate. Note that in the formulation used, the flow quantities are located at the centres of gravity of the cells.

The third rules out any technique based on mesh deformation. Indeed, when the motions are large and arbitrary, a deformation can modify the control cells locally so strongly that the twisting turns the cells inside out, down to negative volumes.

As for methods using adaptive remeshing, they seemed to us too heavy and restrictive from an unsteady point of view when one wishes to ensure the conservativity of a system (or the entropy increase, for example).

We therefore opted for a conservative mesh overlapping technique. To account for the relative motion of the meshes attached to the moving bodies, we use a mixed Euler-Lagrange (A-L-E) flux formulation, which simplifies the method and above all ensures the conservativity of the system. Indeed, although the meshes are rigid, this technique makes it possible to work in a single reference frame, unlike purely Eulerian formulations, which require one reference frame per mesh and then the introduction of inertial forces that are treated as source terms and harm global conservativity.

Note that for an upwind flux method, A-L-E modifies the algorithm only in a minor way, since it suffices to take the velocity of the faces into account in the upwinding. Finally, the interface between meshes is treated like the faces of any cell, without any change of reference frame.

As for the last constraint, it led us to eliminate implicit integration methods, which damp a large part of the acoustics and, at best, smear the propagating shocks. We therefore developed an explicit integration method that nevertheless takes the specific character of the local flow into account: when the phenomena are fast in a cell, the number of iterations on that cell is large; otherwise it is small. This is an adaptive time integration method, which can be regarded as a consistent and conservative local time stepping technique.

In the description that follows, we emphasize the problems of conservativity, consistency and stability that determined the choice of the algorithms used.

NUMERICAL METHOD

The finite volume (F.V.) method rests on solving the equations in integral form, that is, a balance of the conservative quantities is made over a control element. Note that, strictly speaking, the balance must hold for any control element belonging to the domain studied.

1- GENERAL EQUATIONS

The Euler equations in integral form for compressible multi-gas three-dimensional flows, with a moving control element, can be written in the following form:

where $\partial\Omega$ is the boundary enclosing the control element $\Omega$, $\rho$ is the density, V is the fluid velocity in the Galilean reference frame of the computation, U is the velocity of the boundary $\partial\Omega$, Et is the specific total energy, Cv the specific heat at constant volume and N the number of moles per unit mass.

The first equation concerns the variation of the control volume. Since the meshes are rigid, it is useful only at the interface between meshes, as we shall show below.

The two equations for Cv and N are included to simulate the mixing of gases assumed to be perfect and thermodynamically perfect. This very simple model is valid only if the medium is non-reacting.

2- AERODYNAMIC SOLVER

Although it is difficult to decouple the spatial discretization from the temporal discretization, we shall proceed in this way in order to highlight the path we followed.
of first order (on regular or irregular meshes), so the reconstruction is indeed of second order.

2- whatever the integration scheme, the method oscillates strongly near discontinuities (shocks, large variations in cell size...).

The simplest solution to the first problem is to use a more elaborate time scheme (explicit Heun scheme, implicit Euler...).

To resolve the second difficulty, on the other hand, the solution we use is related to that of Van Leer, which consists in limiting the slopes as follows:

Consider, for simplicity, the case of a multidimensional linear convection equation, at the (constant) velocity $\vec{C}$, for the scalar variable a:

$$\frac{\partial a}{\partial t} + \vec{C}\cdot\overrightarrow{\mathrm{grad}}(a) = 0 \qquad (6)$$

which is integrated in time over cell i by a one-step method, after a finite volume discretization, in the form:

where $\Delta t$ is the time elapsed between instants n and n+1 and $\hat{a}_{ij}$ is the value of a that determines the flux at the centre of gravity of the face Sij between n and n+1.

We wish to build a scheme with locally bounded variation, that is, one satisfying the following constraint:

Note that such a scheme is positive; after some manipulation, one deduces the sufficient but not necessary conditions:

$$\mathrm{CFL} = \frac{1}{2}\,\frac{\Delta t}{\Omega_i}\sum_j \left|\vec{C}\cdot\vec{S}_{ij}\right| \qquad (9d)$$

This formulation of the CFL number agrees with the usual one-dimensional formulation and with that of Godunov in the multidimensional case for meshes of regular parallelepipeds.

Thanks to these conditions, we easily recover the results established in one dimension for regular meshes when integrating with explicit Euler:
- the first-order Godunov method is stable at CFL equal to 1,
- the minmod limiter is stable at CFL equal to 2/3,
- the first Van Leer limiter, which corresponds to condition (9c), is stable at CFL equal to 1/2...

Although the preceding reasoning stems from the linearization of a scalar equation, the main interest of this set of constraints lies in the fact that:
1- it can be used in multiple dimensions,
2- it applies to many time schemes,
3- it is not tied to the type of reconstruction,
4- it is local, so it allows the stability of local time stepping methods to be studied.

In our code, we use the limiter corresponding to CFL equal to 1/2, limiting the gradient globally (for each variable) by multiplying it by a coefficient that allows (9a), (9b) and (9c) to be satisfied.

We note in passing that the use of such a limiter can sometimes (rarely if the mesh is regular) reduce the accuracy of the fluxes to first order. For example, for the limiter corresponding to CFL equal to 1/2, second order is effective only when the maximal polyhedron whose vertices are the centres of the neighbours j of i contains both the centres of gravity of the faces Sij and their mirror images with respect to the centre of gravity G of the element i considered.

The two-dimensional case below illustrates this condition: the maximal polygon is the triangle (1,2,3). The centre of gravity of the face SG1 and its mirror image, the point (a), are both contained in this polygon. In contrast, although the centre of gravity of SG3 satisfies the condition, its mirror image, the point (c), does not, and for SG2 it is the opposite: only the mirror image (b) lies in the triangle (1,2,3). This mesh configuration can therefore make the numerical scheme inconsistent because of the global limiting.
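The cell-wise CFL number of (9d) is easy to evaluate once the face vectors of a cell are known. The following is a minimal sketch under the assumption that each face is given by its area vector and that the convection velocity is constant; the names `cfl_number` and `max_time_step` are illustrative and do not come from the paper.

```python
def cfl_number(dt, cell_volume, velocity, face_vectors):
    """CFL number of one cell following (9d)."""
    flux_sum = 0.0
    for s in face_vectors:
        flux_sum += abs(sum(c * s_d for c, s_d in zip(velocity, s)))
    return 0.5 * dt * flux_sum / cell_volume


def max_time_step(cfl_limit, cell_volume, velocity, face_vectors):
    """Largest time step for which the CFL number stays below cfl_limit."""
    return cfl_limit / cfl_number(1.0, cell_volume, velocity, face_vectors)


# One-dimensional check: a cell of length 0.5 with two unit faces.
# The usual definition gives c * dt / dx = 2 * 0.1 / 0.5 = 0.4.
cfl_1d = cfl_number(0.1, 0.5, (2.0, 0.0, 0.0),
                    [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
```

The one-dimensional case shows why the factor 1/2 appears: both faces of the cell contribute the same flux magnitude, so the sum counts the convection speed twice.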
We shall not give the proof, but we can say that this condition is sufficient without always being necessary: it depends on the aerodynamic field studied.

Finally, the limiting we use does not guarantee monotonicity of the scheme in one dimension. When studying, for example, an expansion into vacuum, this characteristic can become penalizing: it is then difficult to keep acceptable CFL numbers without dropping to first order. To overcome this problem, we developed a "monotonizing" procedure, which consists in taking, among $\hat{a}_{ij}$ and $\hat{a}_{ji}$, the value closest to $a_i$ (hence the value closest to first order), and we proceed in the same way on side j. Clearly, the reconstruction becomes monotone in this way and, moreover, on a regular mesh this correction is only of third order for the fluxes, so the overall result remains of second order when the global limiter is not active.

2.2- Temporal discretization

We stressed earlier that the unlimited second-order scheme is unstable when integrated in time with an explicit Euler method. To remedy this drawback, and to increase the stability of the limited scheme (and thus work with a CFL greater than 1/2), we implemented an explicit second-order integration scheme.

Indeed, linear stability analysis by Fourier analysis shows that the classical second-order integration schemes are stable for a CFL equal to 1 when a second-order upwind spatial discretization with a centred slope is used.

To obtain a relatively cheap scheme, the computational cost of the various algorithms involved in the spatial discretization must be taken into account. Approximately, the solution of the Riemann problem represents 15% of the total cost, the gradient computation 30%, the global limiting about 30% and the local limiting about 5%, the remainder being hard to itemize.

We therefore see that recomputing the gradients and re-limiting them globally should be avoided if possible. On the other hand, it is acceptable to redo a local limiting and to recompute the fluxes.

It must also be borne in mind that the equations to be solved are not linear: all linearized schemes are equivalent for the analysis.

We therefore chose the Heun scheme, but with a single gradient computation. As for the global limiting, a simplified procedure allows it to be grouped with the local limiting without loss of efficiency.

We recall that the Heun scheme consists in computing the fluxes (and the gradients) at instant n; with this approximation, the state at instant n+1 is computed, and hence the fluxes at instant n+1. The variation between instants n and n+1 then corresponds to half the sum of the fluxes previously computed at n and n+1. This scheme is very stable for non-linear phenomena.

The Heun scheme has many good numerical qualities; unfortunately, like any explicit scheme, it is practically unusable when the duration to be simulated is long.

If one analyses the phenomena involved in a stage separation, one immediately notices that they are quasi-steady over almost the whole computational field. Only a few regions are swept by fundamentally unsteady flows. It is therefore worthwhile to use small time steps in these zones, whereas large time steps are sufficient elsewhere.

We therefore developed a local time stepping technique that is conservative, consistent and stable: adaptive time integration (Ref. 5 and 8).

In each cell, one works with the time step closest to the maximum admissible explicit time step.

Let $\Delta t_{min}$ be the smallest time step over the whole domain. To simplify the management of the different temporal classes, the time steps are ordered in powers of 2, in proportion to $\Delta t_{min}$. That is, if the admissible time step in cell i is $Dt_i$, it is transformed into:

$$\Delta t_i = \Delta t_{min}\,2^{L_i} \qquad (10)$$

where $L_i$ is the temporal level of cell i such that:

$$\Delta t_{min}\,2^{L_i} \le Dt_i < \Delta t_{min}\,2^{L_i+1} \qquad (11)$$

Between two cells, we adopt the principle that the interface takes the highest temporal level.

For the method to be conservative, the flux integrals on either side of the interface Sij must be identical. It is therefore sufficient to define the flux on Sij uniquely at every instant. Then, if for example Li=Lj+1, two iterations are made in cell j for a single one in i. Thus in j the flux integral is the sum of the integrals over $[t,\ t+\Delta t_j]$ and $[t+\Delta t_j,\ t+2\Delta t_j]$, and in i it is the single integral over $[t,\ t+\Delta t_i]$, with $\Delta t_i = 2\Delta t_j$.
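The Heun scheme and the power-of-two ordering of the time steps in (10) and (11) can be sketched as follows. The sketch works on a scalar state and a generic right-hand side `f`; both, like the function names, are assumptions made for illustration, and the single gradient computation used in the paper is not modelled.

```python
def heun_step(f, u, dt):
    """One Heun step: average of the fluxes at n and at the predicted n+1."""
    k1 = f(u)                 # flux at instant n
    u_pred = u + dt * k1      # predicted state at instant n+1
    k2 = f(u_pred)            # flux at instant n+1
    return u + 0.5 * dt * (k1 + k2)


def time_level(dt_allowed, dt_min):
    """Temporal level L_i satisfying inequality (11)."""
    level = 0
    while dt_min * 2 ** (level + 1) <= dt_allowed:
        level += 1
    return level


def local_time_step(dt_allowed, dt_min):
    """Time step actually used in the cell, equation (10)."""
    return dt_min * 2 ** time_level(dt_allowed, dt_min)


# Admissible time steps of four cells, in units of the smallest one.
levels = [time_level(dt, 1.0) for dt in (1.0, 2.5, 8.0, 7.9)]
steps = [local_time_step(dt, 1.0) for dt in (1.0, 2.5, 8.0, 7.9)]
```

Rounding each admissible step down to a power of two means that a cell at level L performs exactly one iteration while a neighbour at level L-1 performs two, which is what allows the flux integrals on both sides of an interface to be matched.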
experimental (the mean value corresponds to an equiprobable distribution of the time steps)

Note that these values are closely tied to the algorithms used, and that unstructured formulations are well suited to this type of technique.

3- MESH OVERLAPPING

The basic idea is very simple (Ref. 6 and 7): consider that a mesh M1 behaves like a mask that moves and partially covers another mesh M2. The interface between the meshes is created naturally: it is the intersection surface formed by hollowing out of the second mesh M2 the volume occupied by the mask M1. The cells of M1 therefore undergo no modification. In M2, on the other hand, three types of cells are present (cf. figure 1):
1- fully covered cells,
2- partially covered cells,
3- fully uncovered cells.
The cells of the second category are thus modified, since part of the faces that make them up is covered and new faces corresponding to the outer boundary of the mask are created. These new faces form the interface between the (cut) cells of M2 and the (unmodified) cells of M1.

3.1- Computation of the intersections

To simplify the geometric problem, we assumed that all element faces are planar. Quadrilateral faces are therefore treated as two triangular faces, so that only one type of facet remains: the triangle.
The intersection algorithm reduces to two steps:
1- determine the level of coverage of each face of the mesh by the mask,
2- determine the part of the outer boundary of the mask that closes each of the cut cells of M2.
The two steps are in fact identical from an algorithmic point of view when they are reformulated as follows:
1- For each face of M2, determine the part contained in each of the cells of M1. The sum of these parts gives the coverage of the faces.
2- For each face forming the boundary of M1, determine the part contained in each of the cut cells (a cell being considered cut if one of its faces is partially or totally covered and if at least one of its faces is not totally covered).
We thus see that the same basic algorithm is involved: determining the area of a triangle contained in a polyhedron.

To solve this problem, the simplest approach is to work in the plane of the triangular face. One determines the polygonal trace of the polyhedron in this plane (a simple matter, since the polyhedron is made of triangles) and then the part common to the triangle and this polygon. This last operation requires only the oriented segments that make up the trace. Any algorithm that determines how the segments are chained together must be avoided: it is unnecessary and excessively costly.

Computing the covered area is not sufficient; its centre of gravity must also be determined. One therefore determines the centre of gravity of the cut interfaces of M2 and that of the pieces of the boundary of M1 that close the cut cells.

The volumes and centres of gravity of the cut cells of M2 are then computed with the following formulas (Green):

$$\Omega_i = \frac{1}{3}\sum_j \overrightarrow{OM_{ij}}\cdot\vec{S}_{ij} \qquad (21)$$

$$\overrightarrow{OG} = \frac{1}{4\,\Omega_i}\sum_j \overrightarrow{Og_{ij}}\,\left(\overrightarrow{OM_{ij}}\cdot\vec{S}_{ij}\right) \qquad (22)$$

where:
- Mij is any point of the planar face Sij, oriented towards the outside of i,
- j is either a "natural" neighbour (hence another cell of M2) or a covering neighbour (hence a cell of M1 that is a neighbour through the M1-M2 interface),
- G is the centre of gravity of i and gij the centre of gravity of Sij.

3.2- Optimization of the number of operations

For the method to be usable, the computing time of the intersections must be at most of the same order of magnitude as the computing time of one iteration of the aerodynamic solver.

Let N be the number of cells; then the number of facets on the boundary of M1 is of the order of $N^{2/3}$, and the number of cut cells is of the same order.

To determine the part of the boundary surface that closes each cut cell, about $N^{2/3}$ operations are needed per cell. For all the cells, of the order of $N^{4/3}$ operations are therefore needed. The aerodynamic solver requires of the order of N operations per iteration. The computation of the intersections must therefore be optimized.

The solution we adopted consists in determining, on a Cartesian grid (i,j,k) containing N cells, which cells the various facets forming the boundary of M1 belong to. Then, for each cut cell, its position in the grid is determined, and hence which facets may close it. The preconditioning of the facets requires about N operations, and the computation per cut cell is of the order of one operation, that is, about $N^{2/3}$ operations for all the cells.

The overall cost is N operations (the preconditioning is more expensive than the intersection computation!) and is therefore compatible with

recomputes all the intersections (in this way the volume balance is satisfied exactly without using Aq).

This technique divides the cost of the intersection computations by a very large factor (of the order of one hundred).

Finally, the problem of heavily covered cells remains to be treated.

Indeed, when the covering of a cell by the "mask" leads to very small volumes, the CFL condition (9d) becomes too penalizing, since the time step must tend to zero. The solution consists in merging these cells with sufficiently uncovered neighbouring cells. In this way, the assembly forms a sufficiently
I'a6rodynamique. dkouverte et de ses associes constitue une "macro-
maille" dont le volume est assez grand pour ne plus
3.3- Priorit6 de mailhges p6naliser le pas de temps.
Nous utilisons comrne critkre le taux de couverture:
L'utilisation d'un masque et dun maillage masque une maille doit ttre assemblk lorsque sont volume est
manque de souplesse. En effet, les mailles du maillage couvert A plus de 70%. D'autre part elle sera
masque sont par exemple mieux adapttks au calcul assemblCe avec le voisin qui possikie avec elle en
dune couche limite autour du corps lie h ce maillage commun l'interface la plus grande et qui est d h u v e r t
que les mailles du masque. A plus de 30%en volume.
I1 est donc intbressant de dCfinir des zones prioritaires Lorsque l'assemblage n'est pas possible directement,
que le masque ne peut couvrir mais qui au contraire, une procedure iterative est mise en oeuvre: on
couvrent le masque. assemblera alors par I'intermddiaire dune cellule qui
La figure 2 nous montre le rtsultat d'une telle elle mtme est assemblk (...).
strat6gie. Sa mise en oeuvre ne pose pas de problbme Lorsque I'assemblage n'est pas possible du tout, on
particulier. evince du calcul les mailles incriminks.
Les flux A la fronti&resont calculCs de la mtme facon Nous prdsenterons ici des cas de calcul illusttant les
que les flux entre deux mailles appartenant au m t m e possibilitbs de la mCthode.
maillage: puisque nous travaillons en non structure, la Les figures 3 et 4 montrent le type dapplications
topologie importe peu donc une interface entre trait&s &e au code de calcul FLUSEPA.I1 s'agit de
mailles sera trait& toujom de la mtme facon. que ces simulations tridimensionnelles. Le Mach exteme est
mailles appartiennentou non au mtme maillage. compris entre 5 et 6.
Quant A la premibre equation du systkme (1) Pour la separation d'Ctage de missile sous incidence
concernant la variation de volume, elle permet, (figure no 3), la phiode simulk est d'environ 150 ms
lorsque les mouvements sont lents, d'eviter de et la dur& du calcul est de 12 heures en pas de temps
recalculer les intersections aprb chaque iteration global. Notons que ce type de simulation nkessite
aerodynamique:' aussi bien le calcul de I'koulement externe que de
- pour chaque maille coup& i, on Cvalue I'increment I'koulement inter Ctage puisque ils interagissent tr&s
de volume A % dQ au mouvement relatif des fortement entre eux. Dautre part, dans I'inter Ctage.
maillages, les pressions peuvent devenir importantes (lorsque la
- on determine I'increment relatifmaximum sur mutes section de passage vers I'exdrieur est faible). De ce
ces mailles fait, I'koulement dans la tuyhre peut Ctre fortement
dkollC: il faut donc imp6rativementle calculer aussi.
(23) AIma
0
1 I
= m p ( do. / 0;)
11
Pour le largage d'acctlerateurs (figure no 4). la
pCriode simulCe est d'environ 1.1 seconde. Les
oh designe le volume initial, maillages comportent environ 100 000 mailles et la
- lorsque AIrna est faible, on fait le bilan durk du calcul est denviron 40 heures en temporel
adaptatif (sur Cray YMP). Le gain de temps par
volumique pour tenir compte des dkplacements sans rapport A un schema A pas de temps global est
remettre A jour les caractCristiquesdes interfaces, d'environ un facteur 20. Afin de souligner la
- lorsque AImm est grand (ou bien lorsque la robustesse de la mCthode. nous prkisons que les
somme des AI,, calculks depuis la dernikre phCnomencs rencontrks lors de cette simulation sont
fortement instationnaires (acoustique...) et que dans
remise A jour des intersections est grande), on
32-9
les zones de chws fons, les rappons de pression son1 "Temporal Adaptive Euler/Navier-Stokes Algorithm
de I'ordre de 4 OOO. Involving Unstructured Dynamic Meshes"
AIAA Journal, Vol30, n"8.1992
Notons pour finir que des etudes de validation (aussi
bien bidimensionnelles que tridimensionnelles) (6) S.L. HANDCOCK
comprenant des comparaisons avec des mesures "Finite difference equations for PISCES 2 DELK. a
exrnmentales ont dtk m e n h aveC sucds. coupled Euler-Lagrange continuum mechanics
Afin de ruuire wtre expod, nous ne les prbenterons computer program"
pas ici. Physics Internatonal Technical Memo - TCAM 76-2,
1916
CONCLUSIONS
(7) P. BRENNER
Les ddmarches thbriques que nous avons men& "Three-Dimensional Aerodynamics with Moving
nous prouvent aussi bien la consislanceque la stabilitk Bodies Applied to Solid Propellant"
lindaire de la mdthode sur des maillages AIAApaper91-2304,1991.
multidimensionnels non reguliers en ,espace et en
temps. L'exphimentatjon numdrique nous a dkmontr6 (8) P. BRENNER
le bon comportement des schdmas lors de la "Numerical Simulation of Three-Dimensional and
r~%lutimde syskmes non lineaires. Unsteady Aerodynamics About Bodies in Relative
Quant A la pkision de la mdlhode, elle est d'ordre 2 Motion Applied to TSTO Separation"
en espace et en temps sur les maillages d'hexatdres AIM paper 93-5142.1993.
srmctuds dguliers et d'ordre 1 au moins ailleurs.
Nous notemns finalemem que le potentiel d'dvolution
du code est imponant puisque, par exemple. il est
envisageable dadapter le maillage par ddformation
(dtant donnde notre formulation A-L-E.). par
enrichissement (nous travaillons en non suuctun5) ou
gdce au chevauchement d'un maillage localement
adaptk 2 I'koulemenl...
EJiF&mS
(1) S.K.GODOUNOV,A. ZABRODINE,
M. IVANOV, A.KRA&O, G. PROKOPOV
"R6solution numdrique des p r o b l h e s
multidimensionnels de la dynamique des gaz"
Editions MIR - Moscou
(2) M. POLLET
"Comparison of transport schemes for Navier-Stokes
equations. application lo mket propullion"
7h International Conference on Numerical Methods
in Laminarand Turbulent flow, Stanford USA.1991.
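
As an illustration of the Green formulas (21) and (22) of Section 3.1, the following minimal sketch computes the volume and centre of gravity of a closed cell bounded by planar triangular facets. It is not taken from FLUSEPA: the function name `cell_volume_and_centroid`, the vertex/triangle data layout and the choice of the first triangle vertex as the point $M_{ij}$ are assumptions made for the example. The triangles are assumed to be oriented so that their area vectors point out of the cell.

```python
def cell_volume_and_centroid(vertices, triangles, origin=(0.0, 0.0, 0.0)):
    """Volume and centre of gravity of a closed cell (hypothetical helper).

    vertices  : list of (x, y, z) points
    triangles : list of (ia, ib, ic) vertex indices, oriented outwards
    origin    : the reference point O of formulas (21)-(22); any point works
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    volume = 0.0
    moment = [0.0, 0.0, 0.0]
    for ia, ib, ic in triangles:
        pa, pb, pc = vertices[ia], vertices[ib], vertices[ic]
        n = cross(sub(pb, pa), sub(pc, pa))
        s = (0.5 * n[0], 0.5 * n[1], 0.5 * n[2])                    # area vector S_ij
        g = tuple((pa[k] + pb[k] + pc[k]) / 3.0 for k in range(3))  # face centroid g_ij
        w = dot(sub(pa, origin), s)                                 # OM_ij . S_ij with M_ij = pa
        volume += w / 3.0                                           # formula (21)
        og = sub(g, origin)
        for k in range(3):
            moment[k] += w * og[k]                                  # numerator of (22)
    centroid = tuple(origin[k] + moment[k] / (4.0 * volume) for k in range(3))
    return volume, centroid


# Unit tetrahedron with outward-oriented faces.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
vol, cog = cell_volume_and_centroid(verts, faces)
```

Because the facets of a closed cell have area vectors that sum to zero, the result does not depend on the reference point O, which is why `origin` can be left at its default.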
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
Assume that the computational domain is discretized in intervals with the centers denoted by i-1, i, i+1, ... and the cell faces by i-1/2, i+1/2, .... Then, the numerical flux at the interface i+1/2 can be approximated according to Liou [1,2]

Equilibrium of the state (L) is obtained if the flux between (L) and (R) is obtained by full upwinding. Also, the state (R) is in equilibrium if an upwind flux is used for the interface. These requirements can be fulfilled by defining the speed of sound at the shock
(4)

quantities are extrapolated to yield left and right states. The extrapolation function is designed such that the accuracy is limited to first order at discontinuities in order to guarantee shock capturing without spurious oscillations. Unfortunately, we find that the two flux vector split approaches described above should not be combined with the same extrapolation functions.
The AUSM scheme works well with the van Albada limiter function
(7)
where $\Delta_+ = u_{i+1} - u_i$, $\Delta_- = u_i - u_{i-1}$.
We extrapolate the primitive variables and the total enthalpy using equ. (7). Extrapolation of the latter quantity is needed in the energy flux in order to allow steady state solutions with constant energy. Also, the parameter $\epsilon$ is made large if the contravariant velocity is smaller than a certain fraction of the speed of sound. Doing this, clipping within boundary layers and false interpolation values of the contravariant velocity components are avoided. Typical results of limiter applications for high Reynolds number viscous flows are shown in Fig. 3. The pressure contours at the rear part of the RAE 2822 airfoil at transonic flow conditions show oscillations near the edge of the turbulent boundary layer if limiting of the Cartesian velocity components is applied in the traditional manner. These oscillations disappear if the limiting operator is switched off for small values of the Mach number in the contravariant coordinate direction. More technical details of the limiter can be found in Ref. [5].
Unfortunately, we find that the application of the van Albada limiter with the HCUSP scheme yields some preshock oscillations. Hence, we use the limiting function described in [4] for the HCUSP scheme. That function has also been extended to avoid limiting in low Mach number regions, by which the accuracy of the re-

$F_c = \frac{1}{2} \left[ M_{1/2} \, (\text{sum of advected quantities}) - \Phi \, (\text{difference of advected quantities}) \right]$

where $\Phi$ is a function [5] of the spectral radii, $\lambda$, in the coordinate directions i and j so that

$\Phi = |M_{L,R}|$ for $\lambda_i \gg \lambda_j$
$\Phi = \delta$ for $\lambda_i \ll \lambda_j$

Typical values of $\delta$ used in the present work are $\delta = 1/4$. This adaptive dissipation formulation makes sure that boundary layers are not numerically smeared but there is sufficient damping of modes in the direction of long cell sides. A similar formulation has been implemented into the HCUSP scheme.
The capabilities of the present discretization schemes for perfect gas flows with shocks and shear layers are assessed by computations of transonic and hypersonic two-dimensional flows. Fig. 5 compares distributions of pressure coefficient, total pressure loss and grid convergence of the aerodynamic coefficients for transonic inviscid flow over the NACA 0012 airfoil. AUSM+ and HCUSP yield comparable shock resolution whereas the hybrid AUSM appears to be more dissipative at the shock. The HCUSP scheme generates more entropy at the leading edge, and lift and drag values converge somewhat more slowly with grid density as compared to AUSM. On the other hand, HCUSP is more rapid with respect to the residual convergence as compared to AUSM. Typical convergence rates of the multigrid method described below are 0.90 for HCUSP and 0.94 for AUSM.
The resolution of very strong shocks and hypersonic shear layers is shown in Fig. 6. The Mach number contours obtained for inviscid flow around a blunted wedge demonstrate almost perfect shock capturing within one cell for AUSM+ and HCUSP whereas hybrid AUSM needs one interior point for this case. The resolution of the thermal boundary layer which is displayed on the right part of Fig. 6 is similar for HCUSP and AUSM. We note that both schemes give much better results compared to a conventional central-difference scheme with a single scalar viscosity (not shown here).

4. OPERATOR SPLITTING AND IMPLICIT TREATMENT OF THE CHEMICAL SOURCE TERMS

For flows with nonequilibrium chemistry additional conservation equations with chemical source terms occur, which render the system of equations stiff if the time scale of the chemical reactions is significantly smaller than the fluid mechanics time scale. A simplified form of the conservation equation is given by

$\frac{\partial}{\partial t} W = -F + S$

$S = (0, 0, 0, 0, 0, S_1, \ldots, S_n)^T$ , F = discr. flux

The full set of equations used for reacting flow is given in Refs. [11, 12]. In order to overcome the time step limitations due to small chemical time scales we employ implicit discretization of the source terms,

$\frac{W^{n+1} - W^n}{\Delta t} = -F^n + S^{n+1}$

Using a linearization of the source term at time level (n) one obtains a point-implicit update of the solution vector W for the time level (n+1),

$\left[ \frac{I}{\Delta t} - \left( \frac{\partial S}{\partial W} \right)^n \right] \left( W^{n+1} - W^n \right) = -F^n + S^n$ (8)

The Jacobian matrix has no entries in the first five rows because these equations have no source terms. Hence, the update of equ. (8) can be broken up into a fully explicit update for $W_f = (\rho, \rho u, \rho v, \rho w, \rho E)$ followed

schemes can be combined with multigrid algorithms in order to accelerate convergence to steady-state, according to Ref. [8].
Coarse meshes for the multigrid are obtained by eliminating alternate points in each coordinate direction. Both the solution and the residuals are restricted from fine to coarse meshes. A forcing function is constructed so that the solution on a coarse mesh is driven by residuals collected on the next finer mesh. The corrections obtained on the coarse mesh are interpolated back to the fine mesh. This multigrid scheme is now widely used in the CFD community and it works quite well for a wide range of subsonic and transonic flow problems.
However, a number of modest modifications of the original multigrid scheme are necessary for high Mach number flows with strong shocks and strong variations of viscosity and conductivity coefficients. We employ a special set of Runge-Kutta coefficients which are optimized for damping with upwind discretization and residual smoothing [14]. Courant numbers of about 5 are used in the present work, which is about twice the explicit stability limit. Strong variations of viscosity and conductivity occur in hypersonic viscous flows. Typical time scales of the viscous diffusion process may be much smaller than the convection time scale, which puts a severe restriction on the time step if purely explicit time integration is sought. This problem may be circumvented by locally adjusting the coefficient of the implicit residual smoothing scheme [15], such that the original time step based upon the inviscid flux vector is recovered,

$\Delta t = \mathrm{CFL} \, \frac{V}{|\lambda_\xi| + |\lambda_\eta| + |\lambda_\zeta|}$
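
The point-implicit source treatment of Section 4 can be sketched for a single cell as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `solve_linear` and `point_implicit_step` are hypothetical, the source term and its Jacobian are supplied by the caller, and the linear relaxation source in the usage example is invented for the test. The update solves $[I - \Delta t \, \partial S / \partial W] \, \Delta W = \Delta t \, (-F^n + S^n)$, which follows from linearizing $S^{n+1}$ about time level n.

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def point_implicit_step(w, flux, source, source_jacobian, dt):
    """One step of dW/dt = -F + S(W) with the source term linearized at level n."""
    s = source(w)
    jac = source_jacobian(w)
    n = len(w)
    lhs = [[(1.0 if i == j else 0.0) - dt * jac[i][j] for j in range(n)]
           for i in range(n)]
    rhs = [dt * (s[i] - flux[i]) for i in range(n)]
    dw = solve_linear(lhs, rhs)
    return [w[i] + dw[i] for i in range(n)]


# Usage: one source-free "flow" equation and one stiff relaxation equation.
k = 1000.0                                    # relaxation rate, so k*dt >> 1
source = lambda w: [0.0, -k * (w[1] - 0.5)]
jacobian = lambda w: [[0.0, 0.0], [0.0, -k]]
w_new = point_implicit_step([1.0, 2.0], [0.2, 0.0], source, jacobian, dt=1.0)
```

The first row of the Jacobian is empty, so the first unknown receives a purely explicit update, while the stiff second unknown relaxes towards its equilibrium value without a time step restriction. This mirrors the remark that equations without source terms can be advanced explicitly.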
with Shocks. AIAA Paper 95-0466, 1995.
5. Radespiel, R. and Kroll, N.: Accurate Flux Vector Splitting for Shocks and Shear Layers. J. Comput. Phys., Vol. 121, No. 1, (1995).
6. Wada, Y. and Liou, M.S.: A Flux Splitting Scheme with High Resolution and Robustness for Discontinuities. AIAA Paper 94-0083, 1994.
7. Coquel, F. and Liou, M.S.: Hybrid Upwind Splitting (HUS) by a Field-by-Field Decomposition. NASA TM 106843, 1995.
8. Jameson, A.: Multigrid Algorithm for Compressible Flow Calculations. MAE Report 1743, Princeton University, October 1985. Text of Lecture given at 2nd European Conference on Multigrid Methods, Cologne.
9. Van Leer, B.: Flux-Vector Splitting for the Euler Equations. Lecture Notes in Physics 170, pp. 507-512, (1982).
10. Roe, P.L.: Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes. J. Comp. Phys. 43, pp. 357-372, (1981).
11. Brenner, G.: Numerische Simulation von Wechselwirkungen zwischen Stößen und Grenzschichten in chemisch reagierenden Hyperschallströmungen. DLR-FB 94-04, (1994).
12. Brück, S.; Radespiel, R.; Schwamborn, D.: Extension of the Euler/Navier-Stokes Code CEVCATS to Inviscid Nonequilibrium Flows. DLR-Internal Report 223-95 A03, (1995).
13. Kroll, N.: Private communication, (1994).
14. Radespiel, R. and Swanson, R.C.: Progress with Multigrid Schemes for Hypersonic Flow Problems. J. Comput. Phys., Vol. 116, (1995).
15. Radespiel, R. and Kroll, N.: Multigrid Schemes with Semicoarsening for Accurate Computations of Hypersonic Viscous Flows. DGLR Report 90-6, (1991).
16. Blazek, J.: Verfahren zur Beschleunigung der Lösung der Euler- und Navier-Stokes-Gleichungen bei stationären Über- und Hyperschallströmungen. DLR-FB 94-35, (1994).
17. Atkins, H.: A Multiple-Block Multigrid Method for the Solution of the Three-Dimensional Navier-Stokes Equations. DLR-FB 90-45, (1990).
18. Weilmuenster, K.J.; Gnoffo, P.A.; Greene, F.A.: Navier-Stokes Simulations of the Shuttle Orbiter Aerodynamic Characteristics with Emphasis on Pitch Trim and Body Flap. AIAA Paper 93-2814, (1993).

Fig. 1 AUSM+ concept of exact shock capturing
Fig. 2 CUSP shock concept of single interior point (states L and R satisfy jump conditions; L is supersonic)
Fig. 3 Effect of false interpolation by traditional limiters for AUSM (panels: Mach number independent limiter, Mach number dependent limiter; $\alpha$ = 2.79°, Re = 6.5 E6)
Fig. 4 Cell with high aspect ratio in two dimensions ($\lambda_i$, $\lambda_j$: the spectral radii of the inviscid flux Jacobians in i and j, $\lambda_i \gg \lambda_j$)
[Figure: pressure coefficient Cp versus x/c for the schemes compared (HCUSP among them), and grid convergence of the force coefficients versus 1/N, N = no. cells]
Fig. 5 Pressure, total pressure loss and grid convergence of total force for transonic inviscid airfoil flow
[Figure: Mach contours and Stanton number St versus x on grids 32x24 and 64x48; $M_\infty$ = 10, $\alpha$ = 0°, $T_w/T_\infty$ = 10, Re = 10000]
Fig. 6 Mach contours and wall heat flux for hypersonic flow over blunted wedge
[Figure: residual versus multigrid cycles; curves for multigrid with 3 levels and for single grid]
Fig. 9 Convergence histories for flow of dissociating air over 2D cylinder
[Figure annotations from a panel whose caption is not recoverable: 1 - bow shock, 4 - 2nd reattachment, 5 - 2nd separation]
Fig. 11 Grid convergence study for HALIS and comparison with wind tunnel data in ONERA S4MA
Fig. 12 Grid convergence for heat flux along windward side of HALIS forebody at flight trajectory point
Fig. 13 Pressure distributions along windward symmetry line of HALIS
[Figure: residual versus cycles for coarse and medium grids]
Fig. 14 Convergence histories for HALIS forebody computations
* Laboratoire d'Analyse Numérique, CNRS URA 189, Université Paris VI, 75252 Paris Cedex 05
† Division de l'Aérodynamique théorique 1, ONERA, B.P. 72, 92322 Châtillon Cedex, France.
robust in practice but have the main drawback of ignoring the structure of the relaxed solution, in particular the linear waves (contact discontinuities) that compose it. Respecting linear waves is crucial for viscous flows, and violating this requirement makes flux splitting methods unsuitable for their simulation. The Godunov-type approach satisfies this requirement at the price of a more complex approximation, but it gains a decisive advantage over the flux splitting approach for viscous problems. However, these methods may exhibit various stability defects in the capture of nonlinear waves (shock and rarefaction waves).
The hybrid upwinding approach (HUS methods, "Hybrid Upwind Splitting") combines the two approaches cited above so as to retain only the properties deemed suitable for the simulation of hyperenthalpic viscous flows. In the code, the upwind scheme resulting from the hybridization of the Osher scheme and the van Leer scheme has been implemented. The Osher method and a van Leer-type method have also been implemented in the code.

2. Modelling and balance equations

In this study, we consider an ideal mixture of perfect gases made up of $n_s$ species, of which $n_m$ are molecular species. In the case of air, the five main species N2, O2, NO, N and O are taken into account. The translational and rotational modes and the electronic mode are always considered to be in equilibrium and are therefore characterized by a single temperature T, whereas the vibrational modes may depart from equilibrium. We assume that, among the $n_m$ molecular species, $n_v$ of them, $n_v \le n_m$, have their vibrational modes out of equilibrium (N2, O2 and possibly NO for air). We are interested in the two-dimensional evolution of this mixture, which is governed by the following system of conservation laws:

$\partial_t u + \mathrm{div} \left( f(u) - \mathcal{D}(u) \, \mathrm{grad} \, u \right) = \Omega$. (1)

f denotes the perfect-fluid fluxes. The dissipative phenomena are modelled here by the tensor $\mathcal{D}$. The source term $\Omega$ accounts for the nonequilibrium phenomena. In what follows, $\mathcal{U}$, an open set of $R^p$ with $p = n_s + n_v + 3$, denotes the state space. The unknown $u : R^+ \times R^2 \to \mathcal{U}$ has the expression:

where E is the total energy of the mixture per unit mass and $v = (u_1, u_2)$ the mass-averaged velocity of the mixture. $e_{v,\beta}$ denotes the vibrational energy per unit mass of the molecular species $\beta$.
The gas pressure is given by Dalton's law:

$p = \sum_{\alpha=1}^{n_s} \rho_\alpha \frac{R}{M_\alpha} T$ (3)

where R denotes the universal gas constant and $M_\alpha$ is the atomic mass of species $\alpha$.
This system is supplemented by a general thermodynamic closure relation such that the mixture pressure satisfies:

(4)

where $\kappa_{tr} = \gamma_{tr} - 1$. $e_\alpha$ and $h_\alpha^0$ denote respectively the energy of the internal modes in equilibrium with the translational temperature and the heat of formation of species $\alpha$ per unit mass.
The detailed expressions of the source terms and of the tensor of dissipative phenomena can be found in previous papers [4], [12]. We only recall that the chemistry model chosen is that of Gardiner [9]. It involves 17 reactions, comprising fifteen dissociation reactions and two exchange reactions. The equations for the vibrational energies may include Translation-Vibration (T-V), Vibration-Vibration (V-V) or Vibration-Dissociation (V-D) energy exchanges. The T-V energy exchange rate is modelled by a Landau-Teller model, the relaxation times between species being given by the semi-empirical law of Millikan and White [14].
For the mixture viscosity, the viscous stress tensor uses the model of Armaly and Sutton [2], the viscosity of each species being determined by Blottner's relation [3]. The species diffusion velocity obeys Fick's law with a binary diffusion coefficient. The heat fluxes of the mixture and of vibration are assumed to follow Fourier laws. The detailed expressions of the thermal conductivity coefficients are given in [4].

3. Numerical method
and non-admissible of the rarefaction curves issued from these two states. The ordering of the rarefaction curves adopted in this work corresponds to the natural ordering (u - c, u, u + c). Such a path, denoted $\Phi(u_L, u_R)$ in what follows, exists under general thermodynamic conditions as long as there is no cavitation. It is composed of two sub-paths of genuinely nonlinear (VNL) type, denoted $\Phi_1$ and $\Phi_3$, and of one sub-path, $\Phi_2$, of linearly degenerate (LD) type associated with the contact discontinuity. Once constructed, this path completely defines the numerical flux associated with the Osher-Solomon scheme:

We refer to [5] for the detailed expression of this flux. We concentrate here on the algorithm for constructing the path $\Phi(u_L, u_R)$ that we have associated with the gas mixture of interest. We refer to Abgrall et al. [1] for another procedure.
Let $u_L^*$ and $u_R^*$ denote the states separated by the contact discontinuity propagating at velocity $u^*$. These states are constructed by solving the following problem, which expresses the conservation of the Riemann invariants and the continuity of pressure and velocity across the contact discontinuity.

$u^* + \frac{2}{\gamma_L - 1} c_L^* = u_L + \frac{2}{\gamma_L - 1} c_L$, (16)

It is easy to see that solving the preceding problem can be reduced to finding $u^*$, solution of the equation

expressing the continuity of pressure and velocity at the contact discontinuity. Once solved, this problem leads to the determination of the other quantities. Here, we have:

The map $u \mapsto p_R^*(u) - p_L^*(u)$ is strictly increasing and admits at most one root, denoted $u^*$. To compute this root, it is useful to introduce $\bar u_L$ and $\bar u_R$ defined by

and to note [5] that $u^*$ can be expressed as a convex combination of these two particular velocities. There thus exists a real $z^* \in [0, 1]$ such that

$u^* = z^* \bar u_L + (1 - z^*) \bar u_R$. (20)

Note that $\bar u_L - \bar u_R > 0$ except precisely when cavitation occurs. Using (20), the problem of finding the root of equation (19) can then be reformulated as follows. Find the real $z^* \in [0, 1]$ solution of the equation:

where we have set

When $\gamma_L \ne \gamma_R$, the preceding equation admits no explicit root, and its determination requires a Newton-type iterative procedure. To optimize its convergence rate, we propose to replace the solution of problem (21) by that of the following equivalent problem, which is very well conditioned. Find the real $z^* \in [0, 1]$ solution of the equation g(z) = 0 where:

To initialize the Newton algorithm, we define

(23)

It can be verified that the closest value to the root $z^*$ is given by

$z_{init} = \max(z_1, z_2)$ if ... < 1, $\min(z_1, z_2)$ otherwise. (24)
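
The search for the contact velocity described above can be sketched for the simplified case of two perfect gases with constant but different $\gamma_L$ and $\gamma_R$, connected to the contact by isentropic simple waves. Everything below is an assumption made for illustration: the function name `contact_velocity`, the use of the vacuum velocities $u_L + 2c_L/(\gamma_L - 1)$ and $u_R - 2c_R/(\gamma_R - 1)$ as the two particular velocities of (20), the logarithmic residual g(z), the initial guess and the bisection safeguard. None of these is claimed to reproduce the paper's equations (21) to (24), which are not legible in this copy.

```python
import math


def contact_velocity(uL, cL, pL, gamL, uR, cR, pR, gamR, tol=1e-10, max_iter=50):
    """Contact velocity u* and parameter z* for two isentropic simple waves."""
    u_left = uL + 2.0 * cL / (gamL - 1.0)    # left curve reaches vacuum here
    u_right = uR - 2.0 * cR / (gamR - 1.0)   # right curve reaches vacuum here
    span = u_left - u_right
    if span <= 0.0:
        raise ValueError("cavitation: the two curves do not intersect")
    aL = 2.0 * gamL / (gamL - 1.0)
    aR = 2.0 * gamR / (gamR - 1.0)
    kL = math.log(pL) + aL * math.log((gamL - 1.0) * span / (2.0 * cL))
    kR = math.log(pR) + aR * math.log((gamR - 1.0) * span / (2.0 * cR))

    def g(z):
        # log(p_L*) - log(p_R*) evaluated at u = z*u_left + (1 - z)*u_right
        return kL + aL * math.log(1.0 - z) - kR - aR * math.log(z)

    def dg(z):
        return -aL / (1.0 - z) - aR / z

    # Initial guess: the exact root when aL == aR (an assumption of this sketch).
    x = max(-30.0, min(30.0, 2.0 * (kL - kR) / (aL + aR)))
    z = 1.0 / (1.0 + math.exp(-x))
    lo, hi = 0.0, 1.0
    for _ in range(max_iter):
        gz = g(z)
        if abs(gz) < tol:
            break
        if gz > 0.0:
            lo = z                      # g decreases with z: root lies to the right
        else:
            hi = z
        z_new = z - gz / dg(z)          # Newton step
        if not (lo < z_new < hi):
            z_new = 0.5 * (lo + hi)     # bisection safeguard
        z = z_new
    return z * u_left + (1.0 - z) * u_right, z


u_sym, z_sym = contact_velocity(0.0, 1.0, 1.0, 1.4, 0.0, 1.0, 1.0, 1.4)
u_star, z_star = contact_velocity(0.1, 1.2, 2.0, 1.4, -0.2, 1.0, 1.0, 1.3)
```

Working with the logarithm of the two pressures keeps the residual of order one over the whole interval, which is one simple way of obtaining a well-conditioned scalar problem in z. When the two exponents coincide, the initial guess is already the root and no iteration is needed.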
coincides with the root $z^*$ of the equation considered. The algorithm (22)-(24) generally converges in at most 3 iterations for a stopping test of ... on the relative error $|p_L^*/p_R^* - 1|$.

3.4 Van Leer splitting method

The extension of the van Leer method to the multi-species, multi-temperature Euler equations has been the subject of several studies (see in particular [11]). The proposed extensions lead to a family of schemes with one degree of freedom, which parametrizes the splitting of the energy flux. A particular flux splitting must be chosen so as to ensure that the Jacobian matrices of the split fluxes, $\pm \nabla f^\pm(u)$, have only real, non-negative eigenvalues. However, these studies show that such a property is difficult to guarantee for every u of the state space when the thermodynamics is nonlinear in T. In the CELHYO code, we have favoured the member of this family that preserves the constancy of total enthalpy across a stationary shock. This scheme, briefly described below, has proved entirely satisfactory in practical applications.
Introducing the Mach number M = v/c, the split fluxes reduce to $f^+(u) = f(u)$, $f^-(u) = 0$ when $M \ge 1$ and, symmetrically, to $f^+(u) = 0$, $f^-(u) = f(u)$ when $M \le -1$. For $|M| \le 1$, these fluxes are defined by the following expressions:

and this holds for any path $\Phi$ connecting $u_L$ and $u_R$ in the state space. This property is the basis of the technique for hybridizing the Osher-Solomon and van Leer methods [7], which we extend below to mixtures of gases in chemical and thermal nonequilibrium.

3.5 Field-by-field hybrid upwinding method

The introduction of hybrid upwinding was motivated by an analysis of the respective advantages and defects of the Osher-Solomon and van Leer schemes. While the van Leer method proves very robust in capturing nonlinear waves (shocks and rarefactions), it is on the other hand very inaccurate in resolving linear waves (contact discontinuities). This lack of accuracy makes it unsuitable for the viscous flow equations considered here. Conversely, the Osher-Solomon method allows by construction the exact resolution of stationary contact discontinuities. Alongside this advantage, however, it suffers from a lack of robustness in capturing strong nonlinear waves.
The advantages and defects inherent in the two methods are thus disjoint and complementary. The hybridization technique aims to exploit this complementarity by combining the robustness of the van Leer method in resolving nonlinear waves with the accuracy of the Osher-Solomon scheme in resolving linear waves. Each of the three sub-paths composing the Osher-Solomon path is therefore associated either with the van Leer method or with the Osher method, depending on whether the sub-path considered is nonlinear. In what follows, we write $VNL(\Phi) = \Phi_1 \cup \Phi_3$ and $LD(\Phi) = \Phi_2$. The numerical flux resulting from the hybridization then takes the following expression
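
Returning to the flux splitting of Section 3.4: for a single perfect gas in one dimension, a van Leer-type splitting whose energy flux is the split mass flux times the total enthalpy can be sketched as below. This is an illustration only, not the multi-species, multi-temperature splitting of CELHYO. The function name `van_leer_split`, the restriction to one species with constant $\gamma$, and the specific energy splitting are assumptions of this sketch; that splitting was chosen here because it keeps the total enthalpy constant in steady solutions, which is the property the text asks for.

```python
import math


def van_leer_split(rho, u, p, gamma=1.4):
    """Split 1D Euler fluxes (f_plus, f_minus) for a single perfect gas.

    Each flux is a tuple (mass, momentum, energy). The energy component is
    the split mass flux times the total enthalpy H (assumption of this sketch).
    """
    c = math.sqrt(gamma * p / rho)                          # speed of sound
    mach = u / c
    h_tot = gamma * p / ((gamma - 1.0) * rho) + 0.5 * u * u  # total enthalpy
    full = (rho * u, rho * u * u + p, rho * u * h_tot)
    zero = (0.0, 0.0, 0.0)
    if mach >= 1.0:
        return full, zero
    if mach <= -1.0:
        return zero, full

    def part(sign):
        f_mass = sign * 0.25 * rho * c * (mach + sign) ** 2
        f_mom = f_mass * ((gamma - 1.0) * u + sign * 2.0 * c) / gamma
        return (f_mass, f_mom, f_mass * h_tot)

    return part(1.0), part(-1.0)


f_plus, f_minus = van_leer_split(1.0, 100.0, 1.0e5)   # subsonic state
s_plus, s_minus = van_leer_split(1.0, 500.0, 1.0e5)   # supersonic state
```

For a subsonic state the two parts sum to the physical flux, the forward mass flux is non-negative and the backward one non-positive. For a supersonic state the splitting degenerates to full upwinding, as stated in the text.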
corresponding to nozzle no. 2. Its total length is 3.42 m and the throat radius is 0.005 m. The conditions in the chamber correspond to a reduced total enthalpy of 260 and a pressure of 430 bar. The wall temperature is 300 K. The wall is assumed to be fully catalytic up to a distance of 0.5 m downstream of the throat, and non-catalytic beyond.
The computational domain was divided into eight parts. In the first domain, the flow in the convergent and in the region close to the throat is computed. These results are then used to determine the solution in the subsequent zones of the hypersonic flow. Each domain contains 89x85 points.
A laminar computation and a turbulent computation, denoted (1) and (2) respectively, were carried out. For the turbulent case, the turbulence model used is the algebraic Baldwin-Lomax model, and the transition point is located 0.5 m downstream of the throat. The results presented were obtained after 6000 iterations in the first domain. For the other domains, about 600 to 200 iterations were needed, depending on the domain. The Courant number can reach a value of 500 (global time step) in the last domains of the divergent. In all cases, the maximum residuals decrease by at least 10 orders of magnitude.
Figure 1 shows the complete mesh used for the nozzle. The results of the laminar and turbulent computations are presented in figures 2 to 6. The Mach number field is shown in figure 2 for the laminar flow and reveals a wave that disturbs the flow near the nozzle axis. This wave originates at an inflection point of the geometry. The next figure shows the temperature distributions along the axis for the laminar and turbulent cases; no notable effect of turbulence on these distributions can be observed. The transverse Mach number distributions at the nozzle exit for the laminar and turbulent cases are shown in figure 4. The Mach number is found to be quite uniform in the core of the flow.

5.2 External flow computations

Two series of computations were carried out around a hyperboloid-flare configuration. This geometry was proposed for test case number 4 of

the stagnation pressure being 441 bar (i.e. the conditions of test case number 4):
$T_\infty$ = 187 K;
$T_{v,N_2}$ = 4078 K; $T_{v,O_2}$ = 2485 K;
$\rho_\infty$ = 1.557 $10^{-3}$ kg/m3; $u_\infty$ = 3934 m/s; $T_w$ = 300 K;
$c_{N_2}$ = 0.7254; $c_{O_2}$ = 0.1354; $c_{NO}$ = 0.0895; $c_N$ = $10^{-\dots}$; $c_O$ = 0.0497.
The wall is assumed to be fully catalytic.
The second computation corresponds to flight conditions on a geometry scaled up from the previous one by a factor of 1.4:
$T_\infty$ = 268 K;
$\rho_\infty$ = 2.608 $10^{-3}$ kg/m3; $u_\infty$ = 5083 m/s; $T_w$ = 1000 K;
$p_\infty$ = 201.5 Pa.
The wall is again assumed to be fully catalytic.
The same mesh is used for both computations, which take the difference in scale into account. It contains 401x110 points in total. Three subdomains were used in order to reduce computing time and memory requirements. The domains overlap over four points. These domains (nose, intermediate region and flare region) contain 80x110, 123x110 and 206x110 points respectively. For the nose and the intermediate region, the explicit quadratic residuals decrease by 8 orders of magnitude after 2000 iterations. The Courant number reaches 10 for the nose and 70 for the second zone. In the flare region, the residuals decrease by 5 orders of magnitude after 20000 iterations with a Courant number of 10. Note that the residuals do not reach a plateau and keep decreasing when the computations are continued. This slow convergence is due to the presence of a large recirculation zone in the flare region.
The results are presented in figures 5 to 17. Figures 5 to 9 show contours of the Mach number and of the pressure for both computations. In the flare region, the separation zone is well defined for both computations (figures 5 to 7). Figure 6 shows an enlargement of this zone for the flight case. Viscous effects are significant because of the small nose radius. Slight oscillations are visible on the pressure contours and correspond to mesh jumps in the shock region. The pressure reaches a maximum value of 22432 Pa for the wind tunnel case and 63073 Pa for the flight case.
Workshop. La longueur totale de la maquette est de L’icoulement est relativement fig6 derrikre le choc,
0.1114 m et l’angle entre le volet et l’axe est de 43.6 comme le montrent les profils de tempirature obtenus
degris. Le premier calcul correspond aux conditions pour le premier calcul (figure 10). La distance du choc
de l’icoulement dans la tuyire no 2 de la soufflerie F4; est dans ce cas &galea 3.7 m. Les figures suivantes
l’enthalpie totale riduite itant igale A 122 et la p r e s montrent des valeurs B la paroi pour les deux calculs.
The Stanton number along the body is presented in figure 11 for the wind-tunnel calculation. The maximum value (0.284) corresponds to a heat flux of $1.43 \times 10^{7}$ W/m2. The next four figures show, in the flap region, the Stanton numbers and the friction coefficients for the two calculations.

The separation zone measures about $1.1 \times 10^{-[...]}$ m for the wind-tunnel case and 2 m for the flight case. A secondary vortex can be observed in the flight case on the hinge line with the flap. Finally, we show the convergence curves in the intermediate region and in the flap region (figures 16 and 17).

References

[1] Abgrall R., Fezoui L., Talandier J.: Extension of Osher's Riemann Solver to Chemical and Vibrational Nonequilibrium Gas Flows, INRIA Report 1221, May 1990.

[4] Coquel F., Flament C., Joly V., Marmignon C.: Viscous Nonequilibrium Flow Calculations, Computing Hypersonic Flows, Volume 3, ed. Bertin J.J., Périaux J., Ballmann J., Birkhäuser, Boston, 1993.

[5] Coquel F., Joly V.: Développement d'un Code de Calcul d'Ecoulements Hypersoniques en Déséquilibre Chimique et Vibrationnel, Rapport interne ONERA non publié, Juillet 1991.

[6] Coquel F., Joly V.: Développement d'un Code de Calcul d'Ecoulements Hypersoniques en Déséquilibre Chimique et Vibrationnel, Rapport interne ONERA non publié, Octobre 1992.
[Plots for the laminar (1) and turbulent (2) nozzle calculations; plot content not recoverable.]

Fig. 6 - Mach number field (flap)
Fig. 7 - Mach number field (flight case)
Fig. 8 - Pressure field (nozzle case). Iso-pressure contours, second domain: range 0 Pa to 11000 Pa, step 1000 Pa, maximum value 10599 Pa; third domain: range 0 Pa to 20000 Pa, step 1000 Pa, maximum value 19651 Pa.
Fig. 9 - Pressure field (flight case). Iso-pressure contours, second domain: range 0 Pa to 30000 Pa, step 2000 Pa, minimum value 201 Pa, maximum value 29457 Pa; third domain: range 0 Pa to 60000 Pa, step 2000 Pa, minimum value 201 Pa, maximum value 55600 Pa.
Fig. 10 - Temperature distributions along the axis
Fig. 11 - Distribution of the Stanton numbers at the wall
Fig. 12 - Distribution of the Stanton numbers (nozzle case, flap)
Fig. 13 - Distribution of the Stanton numbers (flight case, flap)
Fig. 14 - Distribution of the friction coefficients (nozzle case, flap)
Fig. 15 - Distribution of the friction coefficients (flight case, flap)
Fig. 16 - Residual decay curve
Fig. 17 - Residual decay curve
A. Pentaris
S. Tsangaris
Laboratory of Aerodynamics, National Technical University of Athens
PO Box 64070, 15710 Zografou, Athens, Greece
Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

SUMMARY

In this paper, an implicit projection methodology for the solution of the two-dimensional, time-dependent, incompressible Navier-Stokes equations is presented. The basic principle of this method is that the evaluation of the time evolution is split into intermediate steps. The computational method is based on the approximate factorization technique. The coupled approach is used to link the equations of motion and the turbulence model equations. The standard k-ε turbulence model is used. The current methodology, which has been tested extensively for steady problems, is now applied to the numerical simulation of unsteady flows. Several cases were tested, such as plane or axisymmetric channels, a backward-facing step and the flow behind a square cylinder.

1. INTRODUCTION

The numerical prediction of unsteady incompressible flow fields has always been one of the most challenging areas of fluid dynamics. The primary difficulty is in finding a satisfactory way to link changes in the velocity field to changes in the pressure field. This interaction must be accomplished in such a manner as to ensure that the divergence of the velocity vanishes at each level of physical time. The most common solution to this problem is the use of an artificial compressibility methodology or a projection methodology.

The projection method for the solution of the time-dependent Navier-Stokes equations was introduced independently by Chorin (Ref 1) and Temam (Ref 2). Subsequently, an explicit version of the method was presented by Fortin et al (Ref 3). The projection method is an interpretation of a fractional-step method as adapted to the unsteady Navier-Stokes equations (Ref 4).

The procedure of the physical time level increment is split into two steps. Following the decomposition of Chorin (Ref 1), a tentative velocity field is first calculated from the discretized momentum equations without the pressure gradient. At the second step, the velocity components at the new time level are evaluated by correcting the tentative solution in order to satisfy the incompressibility constraint.

The solution algorithm we use in the present study is the approximate factorization technique. This is an implicit algorithm which was initially developed by Beam and Warming (Ref 5) for compressible flows but has been successfully used for incompressible steady flows as well (Ref 6, 7). Regarding the mathematical model, a projection method is developed, which uses a Poisson equation for the explicit pressure derivation, while the numerical algorithm involves only the momentum equations.

Concerning the turbulence model there are plenty of options. The standard k-ε model with the wall functions equations (Ref 8) was selected because it is well tested and widely used, in spite of its disadvantages. In addition, small values of y-plus are not required, so coarse grids can be used near the walls and thus large time steps are possible. It is expected that this turbulence model will sometimes perform poorly, especially in the recirculation zones.

The objective of this paper is to describe a new projection methodology developed for collocated grids and to present predictions for several test cases where the unsteadiness is either forced or inherent.

2. THE GOVERNING EQUATIONS

The full form of the momentum equations is used, where all variables are in non-dimensional form. Concerning the turbulent flows, the high-Reynolds number (Ref 8) form of the k-ε model is used.

This formulation requires the use of the wall functions to bridge the viscous and boundary layers in proximity to the solid wall. This approach is strictly valid only for attached shear layers and may perform poorly in the recirculation zones. In addition this model is valid under the hypothesis of equilibrium and may not perform satisfactorily in unsteady flows.

On the other hand, experimental observations showed that the general behaviour of the boundary layer and the structure of the turbulence are not fundamentally affected by the unsteadiness of the flow (Ref 9, 10, 11). From these observations it is well founded to suppose that the hypotheses used in calculation methods for the steady case are still valid for the unsteady case.

The reference quantities are some reference velocity $u_{ref}$, a reference length $L_{ref}$, a reference density $\rho_{ref}$ and a reference kinematic viscosity $\nu_{ref}$. The reference value for the time is defined as $t_{ref} = L_{ref}/u_{ref}$ and for the pressure is the product $p_{ref} = \rho_{ref} u_{ref}^2$.
The reference quantity for the turbulent kinetic energy is $u_{ref}^2$ and for the dissipation rate $u_{ref}^3/L_{ref}$.

Performing a generalised coordinates' transformation from the physical $(x, y, t)$ to the computational $(\xi, \eta, \tau)$ domain, the following non-dimensional form of the equations is obtained (Ref 12):

$$\partial_\tau Q + \partial_\xi F + \partial_\eta G + \alpha E + K = \partial_\xi V + \partial_\eta W + \alpha C + D$$

where $\alpha = 0$ for the two-dimensional equations, $\alpha = 1$ for the axisymmetric equations, and the subscripts $x, y, \xi, \eta, \tau$ denote differentiation. For convenience we express the above equation in the following form:

$$\partial_\tau Q + K = [F(u, v)]$$

where

$$[F(u, v)] = \partial_\xi V + \partial_\eta W + \alpha C + D - \partial_\xi F - \partial_\eta G - \alpha E$$

In equation (1), $Q$ is the vector of the conservative variables:

$$Q = \frac{1}{J}\,[u,\ v,\ k,\ \varepsilon]^T$$

[The expressions for the inviscid flux vectors $F$, $G$ and $E$ are illegible in the source.]

$V$, $W$, $C$ are the viscous fluxes:

[equations illegible in source]

$D$ is a vector that contains the source terms of the $k$ and $\varepsilon$ equations:

[equation illegible in source]

and, finally, $K$ is a matrix that contains the pressure derivatives of the momentum equations.

In the expressions above, $\xi, \eta$ are the curvilinear coordinates, connected to the Cartesian ones $x, y$ through the generalised coordinates' transformation:

[equation illegible in source]

and $J$ is the Jacobian of the transformation:

[equation illegible in source]

In addition, $U$, $V$ are the contravariant velocities along the $\xi, \eta$ directions respectively, given by the following relations:

[equations illegible in source]

$Re$ is the Reynolds number and $\tilde{G}$ is the kinetic energy production term:

$$\tilde{G} = 2\left[(u_x)^2 + (v_y)^2\right] + (u_y + v_x)^2$$

The stresses are:

[equations illegible in source]

Finally, for the turbulence model equations are:

[equations illegible in source]

where $\nu_l$ is the kinematic viscosity and $\nu_t$ is the turbulent viscosity, which is given by the relation:

$$\nu_t = Re\, C_\mu \frac{k^2}{\varepsilon}$$

The constants are:

$$C_\mu = 0.09,\quad C_1 = 1.44,\quad C_2 = 1.92,\quad \sigma_k = 1.0,\quad \sigma_\varepsilon = 1.3$$

For the above model the concept of wall functions has been employed. The central idea is that the flow in the region near the wall can be assumed to behave as a one-dimensional Couette flow. This is a reasonable assumption except for regions of high pressure gradient, separation or reattachment. Once this assumption is made, it is rather easy to arrive at exact or semi-empirical relations (Ref 8, 14), which link the shear stresses and the other variables at the wall to the values of velocity, turbulence energy, etc. at the outer edge of the Couette layer, where the first interior grid point is located.
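As a small illustration of the turbulent viscosity relation of the k-ε model quoted in this section, the following sketch evaluates the non-dimensional relation $\nu_t = Re\, C_\mu k^2/\varepsilon$ with $C_\mu = 0.09$. The function name and the sample values of $k$ and $\varepsilon$ are hypothetical; only the relation, the constant and the Reynolds number 38000 (the backward-facing step case) come from the text.

```python
C_MU = 0.09  # model constant quoted in the text


def turbulent_viscosity(k, eps, reynolds, c_mu=C_MU):
    """Non-dimensional eddy viscosity nu_t = Re * C_mu * k**2 / eps.

    k, eps   : non-dimensional turbulent kinetic energy and dissipation rate
    reynolds : Reynolds number built from the reference quantities
    """
    if eps <= 0.0:
        raise ValueError("the dissipation rate must be positive")
    return reynolds * c_mu * k * k / eps


# Hypothetical sample values, chosen only to exercise the formula.
nu_t = turbulent_viscosity(k=0.01, eps=0.001, reynolds=38000.0)
print(nu_t)
```

Because $k$ appears squared and $\varepsilon$ in the denominator, an overestimate of $k$ or an underestimate of $\varepsilon$ inflates $\nu_t$ quickly, which lowers the effective Reynolds number seen by the mean flow.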
3. NUMERICAL ALGORITHM
[...] (1) the implicit, factored, finite difference scheme of Beam and Warming (Ref 5) is used. The temporal derivative in equation (1) is approximated via a generalized time differencing:

[equation illegible in source]

which takes the form:

$$\frac{\Delta Q^n}{\Delta\tau} = \frac{\theta}{1+\xi}\,\partial_\tau Q^{n+1} + \frac{1-\theta}{1+\xi}\,\partial_\tau Q^{n} + \frac{\xi}{1+\xi}\,\frac{\Delta Q^{n-1}}{\Delta\tau} + O[\,\cdot\,] \qquad (2)$$

where $\Delta$ and $\nabla$ are the forward and backward differencing operators, respectively, the superscript $n$ denotes the time instant and $O$ denotes the order of the truncation error.

After substituting (1) into (2) and performing calculations the following relation is derived:

$$\frac{Q^{n+1} - Q^n}{\Delta\tau} = \frac{\theta}{1+\xi}[F(u,v)]^{n+1} + \frac{1-\theta}{1+\xi}[F(u,v)]^{n} - \frac{\theta}{1+\xi}K^{n+1} - \frac{1-\theta}{1+\xi}K^{n} + \frac{\xi}{1+\xi}\,\frac{\Delta Q^{n-1}}{\Delta\tau}$$

and

[equation illegible in source]

where $Q^*$ is an intermediate, or tentative, flow field. Using equation (1) the above relation is written in the form:

[equation illegible in source]

and after some simple calculations and assuming $K^{n+1} = K^*$ the following is obtained:

[equation (4) illegible in source]

Equation (4) imposes the condition $1 + \xi - [...] \neq 0$. Thus we use $\theta = 1$ and $\xi = 0.5$, which leads to the second order three point backward scheme. Equation (3) is actually the same as (2), except that it contains equation (1) without the pressure gradients, and $\Delta Q^n = Q^* - Q^n$.

A non-linear expression, eq. (3), occurs for the time increment $\Delta Q^n$ of the vector of conservative variables (Ref 12, 14). In order to derive a linear algebraic system of equations, a linearization of the viscous and inviscid fluxes must be performed. The inviscid fluxes, which are functions of $Q$, are linearized using a Taylor series expansion, for example:

$$\Delta F^n = A^n \cdot \Delta Q^n + O(\Delta\tau^2)$$

where $A^n = \partial F^n/\partial Q^n$ is the Jacobian matrix of the vector $F^n$.

The above linearization of the inviscid fluxes ensures the second order time accuracy of the scheme. In order that this accuracy is retained in the corresponding linearization of the viscous fluxes, it must be taken into account that the latter are functions of all of $Q$, $Q_\xi$, $Q_\eta$, for example:

$$V^n(Q, Q_\xi, Q_\eta) = V_1^n(Q, Q_\xi) + V_2^n(Q, Q_\eta)$$

The substitution of the linear expressions of the flux vectors into the original non-linear equation for $\Delta Q^n$ leads to a strongly coupled system of equations in both spatial directions. This coupled system is solved by the Approximate Factorization Technique (Ref 5, 14), which leads to the following two tridiagonal systems, one for each of the two directions $\xi, \eta$ (the prefactors and the unknown vectors are illegible in the source):

$$\left\{ I + [\ldots]\left[\partial_\xi(A - P + R_\xi) - \partial_{\xi\xi}R - \alpha N_1 + \theta_\xi H\right]\right\}[\ldots]$$

$$\left\{ I + [\ldots]\left[\partial_\eta(B - Y + S_\eta) - \partial_{\eta\eta}S - \alpha(N_2 + N_3 - T) + \theta_\eta H\right]\right\}[\ldots]$$

where $A, B, P, Y, R, S, N_1, N_2, N_3, T$ and $H$ are Jacobian matrices (Ref 14), and

$$\text{R.H.S.} = \frac{\Delta\tau}{1+\xi}\left[\partial_\xi(-F + V)^n + \partial_\eta(-G + W)^n + \alpha(C - E)^n + D^n\right] + [\ldots]$$

where $JQ$ is the vector of conservative variables in the physical domain, $D_e$ are the artificial dissipation terms (Ref 14), and $\theta_\xi$, $\theta_\eta$ are weighting functions (Ref 15) used to add the Jacobian matrix $H$ in both sweeps.

The Poisson equation

Equation (4) leads to the following relations ($\theta = 1$):

[equations (6) illegible in source]

Assuming that the continuity equation is satisfied at the $n+1$ time instant:

[equation illegible in source]

the first two of (6) are combined to give the Poisson equation:

[equation (7) illegible in source]

The spatial derivatives in the above system of equations are approximated by three point central second order differencing expressions. So the solution of the system of equations (5) requires the inversion of two block tridiagonal systems, one in each direction. On the other hand, the use of central differences on collocated grids leads to the necessity of adding external artificial dissipation terms, so that the stability is retained and high frequency oscillations are removed from the solution. In the present work only explicit terms $D_e$ are used in (5). These terms are a blended second and fourth order non-linear model which is widely used in compressible flows (Ref 16, 17, 18, 19) and was used for the first time in incompressible flows by Pentaris et al (Ref 14), where it is proved that the presence of the second order dissipation terms does not affect the spatial accuracy of the method.

The definition of the time step

Although the solution method is implicit, the actual stability of the scheme is not independent of the time step used. In this work small time steps are used, which help the fast convergence of the Poisson equation. When a problem with oscillating flow rate is to be simulated, the Navier-Stokes equations must be integrated for as many cycles as are needed to reach a periodic steady state, if such a state exists. In the periodic steady state, of period $T$, the solutions at time instants $t$ and $t + T$ must reach a specified convergence criterion, which in the present work is $1 \times 10^{-[...]}$. With the present method this criterion is reached at the second period, because 10000 time intervals are used per period. Using fewer time intervals per period, more iterations are needed for the convergence of the Poisson equation. In addition more periods are necessary to reach the above criterion and thus the total computational cost is increased.

When a problem with steady upstream conditions is solved, where the Poisson equation converges rapidly, it is essential that the time step be as large as possible. Then the time step is defined as:

[equation illegible in source]

where the last part of equation (7) vanishes for unsteady flows because $\Delta\tau$ = const in the entire domain. In the equation above, $\vec{n}$ is the outward unit vector normal to the boundary $A$ which encloses the solution domain.

Concerning the other variables, the veloc[...]
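The choice $\theta = 1$, $\xi = 0.5$ in the generalized time differencing can be checked on a scalar model problem. The sketch below assumes the standard two-parameter Beam and Warming form, $(1+\xi)(y^{n+1} - y^n) - \xi(y^n - y^{n-1}) = \Delta\tau[\theta f^{n+1} + (1-\theta) f^n]$, applied to $dy/d\tau = \lambda y$. The test equation, the exact start-up step and all function names are illustrative assumptions, not part of the paper.

```python
import math


def generalized_step(y_n, y_nm1, lam, dt, theta=1.0, xi=0.5):
    """Advance dy/dt = lam*y by one step of the two-parameter scheme."""
    rhs = (1.0 + xi) * y_n + xi * (y_n - y_nm1) + dt * (1.0 - theta) * lam * y_n
    return rhs / (1.0 + xi - theta * dt * lam)


def integrate(lam, t_end, n_steps, theta=1.0, xi=0.5):
    """Integrate from y(0) = 1 to t_end with n_steps equal steps."""
    dt = t_end / n_steps
    y_nm1 = 1.0
    y_n = math.exp(lam * dt)  # exact first step to start the three-point scheme
    for _ in range(n_steps - 1):
        y_n, y_nm1 = generalized_step(y_n, y_nm1, lam, dt, theta, xi), y_n
    return y_n


def observed_order(theta, xi, lam=-1.0, t_end=1.0):
    """Estimate the temporal order from the errors with 100 and 200 steps."""
    exact = math.exp(lam * t_end)
    e_coarse = abs(integrate(lam, t_end, 100, theta, xi) - exact)
    e_fine = abs(integrate(lam, t_end, 200, theta, xi) - exact)
    return math.log2(e_coarse / e_fine)


order_backward_3pt = observed_order(theta=1.0, xi=0.5)  # three point backward
order_euler = observed_order(theta=1.0, xi=0.0)         # backward Euler
print(order_backward_3pt, order_euler)
```

With $\theta = 1$ and $\xi = 0.5$ the update reduces to $(3y^{n+1} - 4y^n + y^{n-1})/(2\Delta\tau) = f^{n+1}$, the three point backward formula, and the observed order is close to two. Setting $\xi = 0$ recovers backward Euler with an order close to one.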
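The two-step idea behind the projection method (a tentative velocity field, a Poisson equation for the pressure, then a correction that enforces continuity) can be illustrated on a much simpler setting than the one used in the paper. The sketch below assumes a doubly periodic, uniform Cartesian grid and solves the Poisson equation with NumPy's FFT. It does not reproduce the curvilinear, approximately factored solver described here, and every name in it is illustrative.

```python
import numpy as np


def _wavenumbers(shape, length):
    """Angular wavenumbers for a periodic box of side `length`."""
    ny, nx = shape
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=length / nx)[None, :]
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=length / ny)[:, None]
    return kx, ky


def divergence(u, v, length=2.0 * np.pi):
    """Spectral divergence du/dx + dv/dy of a periodic velocity field."""
    kx, ky = _wavenumbers(u.shape, length)
    div_hat = 1j * kx * np.fft.fft2(u) + 1j * ky * np.fft.fft2(v)
    return np.real(np.fft.ifft2(div_hat))


def project(u_star, v_star, dt, length=2.0 * np.pi):
    """Correct a tentative field so that its divergence vanishes.

    Solves lap(p) = div(u*)/dt, then sets u = u* - dt * grad(p).
    """
    kx, ky = _wavenumbers(u_star.shape, length)
    u_hat = np.fft.fft2(u_star)
    v_hat = np.fft.fft2(v_star)
    div_hat = 1j * kx * u_hat + 1j * ky * v_hat
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0               # avoid 0/0; the mean pressure is arbitrary
    p_hat = -div_hat / (dt * k2)
    p_hat[0, 0] = 0.0
    u_new = np.real(np.fft.ifft2(u_hat - dt * 1j * kx * p_hat))
    v_new = np.real(np.fft.ifft2(v_hat - dt * 1j * ky * p_hat))
    return u_new, v_new, np.real(np.fft.ifft2(p_hat))


# Hypothetical tentative field with non-zero divergence (odd grid size).
n = 33
x = 2.0 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x)
u_star = np.sin(X) * np.cos(Y) + 0.3 * np.cos(2.0 * X)
v_star = np.cos(X) * np.sin(Y)

u_new, v_new, p = project(u_star, v_star, dt=1.0e-3)
div_before = np.abs(divergence(u_star, v_star)).max()
div_after = np.abs(divergence(u_new, v_new)).max()
print(div_before, div_after)
```

The corrected field is divergence-free to round-off, which is the property the projection step is meant to provide at each physical time level. On a bounded, non-periodic domain the Poisson problem additionally needs boundary conditions for the pressure, which this periodic sketch sidesteps.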
Some representative results of several test cases are shown in this section. It must be mentioned that all the quantities used are dimensionless. The dimensionless Reynolds, Strouhal and Womersley numbers are defined as:

[equations illegible in source]

respectively, where $\omega_{ref}$ is the reference cyclic frequency.

Finally it must be noted that all the results have been tested for various grids and are independent of the grid density.

One-dimensional oscillatory flow

In order to check the reliability of the present method, it was initially developed for one-dimensional flows and was tested on an oscillatory channel flow (Ref 21). In this problem the back pressure of the channel oscillates according to:

$$p_b(t) = p_0 + p_e \sin(Str \cdot t)$$

An analytic solution to this problem can only be obtained if the pressure perturbation $p_e$ is small compared to the mean back pressure $p_0$. In this work these parameters are $p_e = 0.1$ and $p_0 = 1$. The Strouhal number, based on the time mean inflow velocity $u_0$ and the channel length $l$, $Str = \omega_{ref}\, l/u_0$, is chosen to be equal to 10.

The analytic solution for the velocity and the pressure gradient for the developed part of the channel is given by Moore (Ref 22). The Strouhal number is equal to 10 and the Reynolds number is equal to 1.6.

A 15x29 grid is used for the current test case, with 4b length and 1b height. The lower boundary is a solid wall and the upper one is a symmetry axis.

One cycle of the inflow velocity oscillation is split into 10000 time intervals and the dimensionless time step obtained is:

$$dt = \frac{2\pi}{Str \cdot 10000} = 2\pi \cdot 10^{-5}$$

In Fig 2 the developed velocity profiles at different physical time instants are presented. As can be seen, the numerical results coincide with the analytic solution. In Fig 3 the velocity as a function of time at three different distances from the wall, and the pressure gradient in the developed part as a function of time, are presented. The agreement between the numerical results and the analytic solution is excellent. It is clear that the unsteady motion is predicted well after one fourth of the first period, and this is one reason for the use of small time steps.

Periodic flow in axisymmetric channel

The third test case under consideration is the periodic Stokes flow in a circular tube, extensively presented and analysed by many researchers (Ref 23, 24, 25, 26). In the present paper the Reynolds number,
based on the radius $a$ of the tube and the maximum inflow velocity $u_0$, is considered to be equal to 0.1, in order to approximate the Stokes flow. At $x = 0$ the imposed velocity profile is (Ref 26):

$$u(t) = u(y) \cdot \cos(Str \cdot t), \qquad v(t) = 0$$

where $u(y)$ is equal to unity except in the near-wall region, where it approaches zero parabolically. For the present case we select the typical Womersley number $W = a\sqrt{\omega_{ref}/\nu_{ref}} = \sqrt{30}$ and the Strouhal number becomes $Str = a\,\omega_{ref}/u_0 = 300$. The time step used is $2.094 \cdot 10^{-6}$.

A 45x40 grid is used, with 1.2a length and 1a height. The lower boundary is a solid wall and the upper one is a symmetry axis. Solutions for the above relations are given by Goldberg et al (Ref 26), in their Table I.

In Fig 4 the comparisons between the semi-analytic solution and the numerical results provided by the current method are given, for the u-velocity component, at four instants of the physical time. The agreement of the current numerical results with the semi-analytic solution is very good at all the time instants. The discrepancies that occur in the centreline velocity at $\omega t = 0$ and $\omega t = \pi$ are due to the semi-analytic solution (Ref 26).

The main reason that this test case is examined is that the results provided by the analytic solution concern the entire flow field along the tube, in contrast to the flow between the two parallel plates, where results were available only for the developed part of the flow. In addition the Strouhal number is much larger than it was in the previous test case.

Unsteady flow behind a square cylinder

The unsteady flow behind a square cylinder is presented in this paragraph. The objective is to examine the reliability of the methodology when the unsteadiness of the flow is due to the viscosity of the flow and not to an external cause.

The Reynolds numbers examined, based on the uniform inflow velocity $u_0$ and the square side $a$, are 100, 250, 500 and 750. Three different grids were used, with 100x56, 200x110 and 145x111 points. The 200x110 grid is shown in Fig 5. The points inside the square are blocked. The positions of the cylinder and of all the boundaries are those shown in Fig 5, and are the same for all the grids. The upper and lower boundaries are considered to be symmetry axes.

In Fig 6 the Strouhal numbers $Str = f a/u_0$ predicted for all the grids and for several time steps are shown. Comparisons are made to other experimental data and numerical results. The agreement is very good. It can be seen that the results are slightly affected by the grid density or the time step used. On the other hand, the disagreement between the experimental data presented in Fig 6 shows the uncertainty and the sensitivity of the flow.

In Fig 7 the vorticity isolines are presented for Reynolds numbers 100 and 250. In Fig 8 the time history of the v-velocity behind the cylinder and the corresponding power spectrum are presented. It must be mentioned that for Reynolds numbers 100 and 250 the flow is periodic. For larger Reynolds numbers the flow becomes transitional or turbulent, and the time histories of the velocity and the pressure show a chaotic behaviour.

Unsteady turbulent flow behind a backward-facing step

In the present paper a numerical investigation of the coherent vortices in turbulence behind a backward-facing step (Ref 32) is presented. The ratio of the channel height $W$ to the step height $H$ is 2.5. The geometry and the inflow velocity profile $U(y)$ are the same as in the experiments of Eaton and Johnston (Ref 33). A 250x50 grid is used, a detail of which is shown in Fig 9. The total length of the channel is 50 step heights. Both the lower and the upper boundaries are solid surfaces. The Reynolds number based upon the step height $H$ and the maximum inflow velocity $u_0$ is 38000. The time step used is 0.0075.

In the first run the original k-ε model was used. The flow that occurred was steady. The recirculation length was 7.1H. The main reason that a steady flow was predicted is the overestimate of the turbulent viscosity, which indirectly reduces the Reynolds number. Thus a second run was performed using a modified relation for the turbulent viscosity:

[equation illegible in source]

where

$$[\ldots]\ -(y^+ - y_0^+)/A^+\,]\,)^2$$

is a function proposed by Miner et al (Ref 34) in order to reduce the turbulent viscosity near the wall. The constants are $f_0 = 0.04$, $y_0^+ = 8$ and $A^+ = 26$.
Figure 1. Time evolution of velocity (left) and pressure (right) in the one-dimensional flow. Comparison with analytic solution.
Figure 2. Longitudinal velocity profiles (U/Uref) at several time instants, in the developed region of the two-dimensional channel. Analytic solution compared with the current method.
Figure 3. Time evolution of the velocity at three radial positions (left) and time evolution of the pressure gradient (right) in the developed region of the channel. Comparison with analytic solution.
Figure 4. Longitudinal velocity component along the circular tube (x/a) for one cycle of the flow. Analytic solution compared with the current method.
Figure 5. The 200x110 grid for the flow around a square cylinder.
Figure 6. Strouhal number as function of the Reynolds number. Comparison with experimental data and numerical results.
Figure 7. Vorticity isolines for Reynolds numbers 100 (up) and 250 (down). Grid 200x110.
Figure 8. Time history of the v component of the velocity behind the square cylinder (left) and the corresponding power spectrum analysis (right).
Figure 10. Unsteady flow in a backward facing step. (a) pressure isolines, (b) vorticity isolines.
Figure 12. Time mean kinetic energy profiles versus experimental data.
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
However, there is a further problem encountered when adapting the grid during an unsteady computation. The conventional steady technique is to adapt the grid instantaneously and interpolate the solution to the new grid. This is less suitable for unsteady flows since many adaptions are required over several periods of motion, and repeated interpolation may result in a gradual loss of accuracy. Unstructured adaptive grids have been developed for unsteady flows, see for example [5, 6], and regions of high gradients are simply enriched with extra points. However, an interpolation step is still required, and this has been shown to lead to a conservation loss, even for unstructured grids, [5].

and the flux across the face simply FA,. This general flux vector is split into a forward part $F^+$ associated with positive moving waves only, i.e. all eigenvalues of the Jacobian of $F^+$ are $\ge 0$, and a backward part $F^-$ associated with negative moving waves only, i.e. all eigenvalues of the Jacobian of $F^-$ are $\le 0$. At each cell face a pair of states is defined, and a single numerical flux is derived from this pair. The split flux components are, see Van-Leer [7] and Parpia [8],
repeatedly calculated during an unsteady computation, we require a method of grid generation which is simple, and which gives the speeds algebraically rather than having to evaluate numerical differences between grid positions on successive time levels. It was thus decided to use the transfinite interpolation method originally described by Gordon and Hall [10]. For the vector function

$$f(\eta, \xi) = [x(\eta, \xi),\; y(\eta, \xi)] \qquad (16)$$

which is known only on certain lines of the region

$$\eta_1 \le \eta \le \eta_2, \qquad \xi_1 \le \xi \le \xi_2 \qquad (17)$$

transfinite interpolation gives the interpolated function $f(\eta, \xi)$ throughout the region by a direct algebraic mapping. The general transfinite interpolation method results in a recursive algorithm, see Eriksson [11]. However, for a C-grid the inner and outer boundaries are lines of constant $\xi$ where $\eta$ is known. Defining one normal derivative only at the inner boundary, the algorithm reduces to $\xi$ direction interpolation only,

$$f(\eta, \xi) = \psi^0(\xi)\, f(\eta, \xi_1) + \psi^1(\xi)\, \frac{\partial}{\partial \xi} f(\eta, \xi_1) + \psi^2(\xi)\, f(\eta, \xi_2) \qquad (18)$$

Here $\psi^{0,1,2}$ are the blending functions in the $\xi$ direction. The function $f$ actually represents a transformation from $(\eta, \xi)$ space to $(x, y)$ space. The grid points are indexed by $i$ and $j$ in the $\eta$ and $\xi$ directions respectively, and then each $i$ and $j$ line are defined as constant $\eta$ and $\xi$ lines respectively. The variables are normalised such that

$$0 \le \eta,\; \xi,\; \psi^{0,1,2} \le 1. \qquad (19)$$

The boundaries $f(\eta, 0)$ and $f(\eta, 1)$ are known at imax discrete points, i.e. $f_i(0)$ and $f_i(1)$. The value of $\xi$ at each constant $\xi$ line is then defined as $j/jmax$. The blending functions $\psi^0$ and $\psi^2$ control the spacing in the $\xi$ direction, and $\psi^1$ controls how far the normal direction affects the line direction. The most effective blending functions have been found to be

$$\xi = \frac{j}{jmax}$$

[remaining blending function definitions not recoverable from the source]

Figure 1(a) shows the grid near a NACA0012 aerofoil, resulting from the above interpolation, using imax = 129 (99 points on the aerofoil surface, 15 in the wake either side), jmax = 30, st = 1.2, and the outer boundary is 20 chords away. Figure 1(b) shows the corresponding variation of $\eta$ and $\psi^2$. (Grid points i = 13 - 117, j = 1 - 13 are shown).

By differentiating (25) with respect to time the grid speeds can be obtained analytically (blending functions assumed constant, and outer boundary fixed)

[equation not recoverable from the source]

Hence, grid positions are calculated by interpolation of the boundary positions, and grid speeds by interpolation of boundary speeds, the interpolation being the same in each case.

4 GRID ADAPTION

The grid is to be adapted, according to the solution, so that grid points are clustered in regions of high gradients. Adaption is normally performed in $(x, y)$ space. However, while this gives suitable grids for steady computations, the grid positions, and hence more importantly grid speeds, would not be available algebraically for unsteady computations. Only numerical values of $dx/dt$ and $dy/dt$ could be evaluated between different adapted grids during an unsteady computation, and these could cause problems of grid distortion and crossover when grid points move along highly curved lines.

Adaption is achieved here by writing the interpolation function in a more general form and adapting the interpolation parameters instead of the physical coordinates, such that grid positions are available algebraically.

Since each $i$ line is a constant $\eta$ line, we can move points along an $i$ line by simply varying $\xi$ (or $\psi$) along that line. The line remains unchanged, only the distribution of points along it is altered. Adaption in the $\xi$ direction is thus achieved by letting the blending functions be variant in $\eta$ as well as $\xi$, and so we have,
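The reduced interpolation of equation (18) can be illustrated with a minimal Python sketch. The blending functions used here are assumptions (the forms preferred in the paper did not survive extraction); they are chosen only so that the interpolant returns the inner boundary at $\xi = 0$ and the outer boundary at $\xi = 1$. The circular inner boundary, the function names `blending` and `tfi`, and the `stretch` parameter are likewise hypothetical. Because the interpolation is linear in the boundary data, applying the same routine to boundary speeds gives grid speeds, provided the blending functions are held constant.

```python
import math

def blending(xi, stretch=3.0):
    """Return (psi0, psi1, psi2) at xi in [0, 1].

    Assumed forms, not the paper's: psi2 is an exponential stretching
    from 0 to 1, psi0 = 1 - psi2, and psi1 is the cubic Hermite slope
    weight, which vanishes at both ends.
    """
    psi2 = (math.exp(stretch * xi) - 1.0) / (math.exp(stretch) - 1.0)
    psi0 = 1.0 - psi2
    psi1 = xi * (1.0 - xi) ** 2
    return psi0, psi1, psi2

def tfi(inner, normal, outer, jmax):
    """Interpolate boundary data in the xi direction, as in equation (18).

    inner, normal, outer: lists of (x, y) pairs, one per i index, holding
    f(eta_i, 0), df/dxi(eta_i, 0) and f(eta_i, 1).
    Returns grid[i][j] = (x, y) for j = 0..jmax.
    """
    grid = []
    for (xa, ya), (xn, yn), (xb, yb) in zip(inner, normal, outer):
        line = []
        for j in range(jmax + 1):
            p0, p1, p2 = blending(j / jmax)
            line.append((p0 * xa + p1 * xn + p2 * xb,
                         p0 * ya + p1 * yn + p2 * yb))
        grid.append(line)
    return grid

# Inner boundary: unit circle (a stand-in for the aerofoil); outer: radius 20.
n = 8
angles = [2.0 * math.pi * i / n for i in range(n)]
inner = [(math.cos(a), math.sin(a)) for a in angles]
normal = [(math.cos(a), math.sin(a)) for a in angles]   # outward normals
outer = [(20.0 * math.cos(a), 20.0 * math.sin(a)) for a in angles]
grid = tfi(inner, normal, outer, jmax=30)

# Grid speeds from the same interpolation applied to boundary speeds.
# The inner boundary translates with speed (0, 0.1); a pure translation
# leaves its normals unchanged, and the outer boundary is fixed.
inner_speed = [(0.0, 0.1)] * n
zero = [(0.0, 0.0)] * n
speeds = tfi(inner_speed, zero, zero, jmax=30)
```

In this sketch the grid speed decays from the boundary speed at the inner surface to zero at the fixed outer boundary, with no differencing of grid positions between time levels.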
distribution in this direction (previously $\eta_i$ was the same on every $j$ line). The inner and outer boundaries, $f(\eta, 0)$ and $f(\eta, 1)$, are determined in terms of $\eta$, so that they are known at any point, not just the specified points $f_i(0)$. The interpolation is then

$$f_{i,j} = \psi^0_{i,j}\, f(\eta_{i,j}, 0) + \psi^1_{i,j}\, \frac{\partial}{\partial \xi} f_i(0) + \psi^2_{i,j}\, f(\eta_{i,j}, 1). \qquad (28)$$

By adapting $\eta$ and $\psi^2$ instead of $x$ and $y$ the grid positions are still available algebraically. This means that the grid speeds are also available algebraically, which is essential for efficient unsteady adaption.

4.1 Adaption in Each Direction

Instead of computing a completely new grid due to adaption, it is desirable to change only a small region of the grid where adaption is required.

Adaption in the $j$ direction is achieved by varying $\psi^2$ along each $i$ line. For adaption in the $i$ direction $\eta$ is changed along each $j$ line to give the required distribution.

where $2.0 \le f_\eta, f_{\psi^2} \le 5.0$, and $\Delta\eta_0$, $\Delta\psi_0$, $\Delta s_{\eta 0}$ and $\Delta s_{\psi 0}$ are the initial spacings.

Consider, for example, the variation of $\eta$ along the aerofoil surface, $\psi^2 = 0$. An intermediate variable, $\zeta$, is defined so that $\eta = \eta(\zeta)$ where clearly $0 \le \zeta \le 1$. A uniform distribution of $\zeta$ is used, and then $\eta(\zeta)$ is defined to give the required distribution of points. Figure 3(a) shows the initial distribution of $\eta$ along the aerofoil for 99 points on the aerofoil. This is the unadapted distribution of points on the aerofoil.

For a solution where adaption is required in the $\eta$ direction, if for example a normal shock is present, $\Delta\eta$ is defined at that point using equation (34), and a cosine variation in $\eta$ is then used to return to the unadapted distribution of $\eta$ in as few points as possible. Figure 3(b) shows the variation of $\eta$ along the aerofoil surface for the flow considered in the next section, when normal shocks are present at approximately 0.64 chord on the upper surface and 0.32 chord on the lower. This simple sampling and adaption procedure is performed for each line in each direction.
Figure 6 shows the smoothed variation of $\eta$ and $\psi^2$, and the corresponding grid. Figure 4(b) shows the surface pressure coefficient computed on the adapted grid. The improved shock capturing is clear.

In many steady flow adaption procedures the grid is allowed to adapt gradually by effectively progressing with the solution, until a time-asymptotic grid and solution are reached. The grid adaption here is only applied once, as this will be the case when an unsteady solution is periodically sampled as in section 7. The adaption only has one 'chance' to compute a suitable grid at each adaption point.

6 UNSTEADY EULER METHOD

with local time-stepping that is used for steady computations. This approach also means that the grid generation routine only needs to be called once every real time-step, to calculate the grid positions and speeds at the next time level.

6.1 Consideration of Cell Area Changes

If the cell areas at each time level or stage are simply calculated using the instantaneous physical coordinates of the cell faces a numerical error is introduced which will increase with time. The cell areas must therefore satisfy a geometric conservation law of the same integral form as the mass conservation law [15],
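The integral statement behind the geometric conservation law is that the change in cell area over a time step equals the net area swept by the cell faces. The Python sketch below checks that identity for a single quadrilateral cell. It is an illustration only: how the law is discretised inside the multi-stage scheme of the paper is not shown in the surviving text, and the function names `polygon_area` and `swept_areas` and the vertex coordinates are hypothetical.

```python
def polygon_area(points):
    """Signed area from the shoelace formula (positive if counter-clockwise)."""
    total = 0.0
    for k in range(len(points)):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % len(points)]
        total += x0 * y1 - x1 * y0
    return 0.5 * total

def swept_areas(old, new):
    """Signed area swept by each face as the vertices move from old to new.

    Face k joins vertex k to vertex k+1. The swept region is the
    quadrilateral (old_k, new_k, new_k+1, old_k+1); with this ordering an
    outward moving face of a counter-clockwise cell sweeps a positive area.
    """
    n = len(old)
    return [polygon_area([old[k], new[k], new[(k + 1) % n], old[(k + 1) % n]])
            for k in range(n)]

# One cell at two successive time levels (illustrative coordinates).
old = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
new = [(0.1, -0.05), (1.2, 0.1), (1.1, 1.3), (-0.1, 0.9)]

area_old = polygon_area(old)
area_gcl = area_old + sum(swept_areas(old, new))  # area advanced by face sweeps
area_new = polygon_area(new)                      # area from the new coordinates
```

If the face speeds used in the flux evaluation are defined as swept area divided by the time step, the area advanced this way stays consistent with the moving geometry, which is the property the conservation law is meant to guarantee.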
time steps the velocity of each point required for that point to reach its position at the next adapted grid is imposed on each point, and the grid moves gradually between each adapted state.

If $k$ is the adaption index ($k = 0, \ldots, nadapt$, where $nadapt = nt/nsamp$ and $nt$ is the number of real time steps per period), then $\eta^{(k)}_{i,j}$, $\psi^{2\,(k)}_{i,j}$ is the grid point distribution at adaption $k$. To move the grid points from one distribution to the next over $nsamp$ time steps we calculate the speed of each point, in $(\eta, \psi^2)$ space

[equation not recoverable from the source]

The grid speeds are obtained by differentiating equation 28 with respect to time,

[equation not recoverable from the source]

undisturbed flow speed. The scheme was run at a CFL number, based on $\tau$, of 1.4, and local time stepping was used to accelerate convergence within each real time step. There were 180 real time steps per period and the same grid data was used as previously, 129 x 30 points, with 99 points on the aerofoil. In the adaptive computation $nsamp$ was 10 and so $nadapt$ was 18.

Figure 7 shows normal force and moment (about quarter chord) coefficient loops obtained by the implicit method, adaptive and non-adaptive, and from experiment [16]. The coefficient loops are quite similar, but the adaptive $C_n$ loop is slightly narrower, and the $C_m$ loop has larger 'steps', than the standard solution. The instantaneous pressure distributions are shown in figure 8. The improved shock capturing with the adaptive grid is clear. Figure 9 shows the near aerofoil adaptive grid at each of the incidences considered in figure 8.
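The gradual movement between adapted states can be sketched in Python as follows. The expression for the point speed did not survive extraction, so a constant speed over the nsamp steps is assumed here, which is the simplest reading of the description. The function name `adaption_speeds`, the normalised period and the sample distributions are hypothetical; nt = 180 and nsamp = 10 are the values quoted in the text.

```python
def adaption_speeds(current, target, nsamp, dt):
    """Constant speed in (eta, psi2) space carrying each point from its
    current parameter value to the target value in nsamp real time steps
    (assumed form, see the lead-in)."""
    return [(t - c) / (nsamp * dt) for c, t in zip(current, target)]

nt = 180              # real time steps per period (from the text)
nsamp = 10            # real time steps between adaptions (from the text)
nadapt = nt // nsamp  # adaptions per period
dt = 1.0 / nt         # assumed: period normalised to 1

# Illustrative eta distributions along one line at adaptions k and k+1.
eta_k = [0.0, 0.25, 0.50, 0.75, 1.0]
eta_k1 = [0.0, 0.35, 0.55, 0.70, 1.0]

speed = adaption_speeds(eta_k, eta_k1, nsamp, dt)

# March nsamp real time steps; the points arrive at the next adapted state.
eta = list(eta_k)
for _ in range(nsamp):
    eta = [e + s * dt for e, s in zip(eta, speed)]
```

The physical grid speeds would then follow from these parameter speeds through the time derivative of the interpolation, without any interpolation of the flow solution between grids.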
Currently, only a fairly crude grid redistribution technique is employed. Future work will include developing a more sophisticated method, along with extending the adaptive technique into three dimensions. The method should be equally simple, the only difficulty arising from the third dimension being that the boundary definition will involve determining spline equations for surfaces rather than lines.

References

[1] Salas, M. D. (ed.), "Accuracy of Unstructured Grid Techniques Workshop", (NASA Langley Research Centre, Hampton, VA), Jan. 1990, (NASA Proceedings to be published).

[2] Williams, A. L. and Fiddes, S. P., "Solution of the 2-D Unsteady Euler Equations on a Structured Moving Grid", Bristol University Aero. Eng. Dept. Report 453, 1992.

[3] Catherall, D., "Adaptivity Through Mesh Movement", in Proceedings of European Forum, Recent Developments and Applications in Aeronautical CFD, Bristol, 1993.

[4] Patel, M. K., Pericleous, K. A. and Baldwin, S., "The Development of a Structured Mesh Grid Adaption Technique For Resolving Shock Discontinuities in Upwind Navier-Stokes Codes", Int J Numer Meth Fluids, Vol. 20, 1995.

[5] Morgan, K., "Unstructured Mesh Methods", in Proceedings 16, Rutherford Appleton Laboratory EASE Community Club in CFD New Opportunities and Directions in Aeronautical CFD, April 1994.

[6] Webster, B. E., Shephard, M. S., Rusak, Z., and Flaherty, J. E., "Unsteady Compressible Airfoil Aerodynamics Using an Adaptive Time-Discontinuous GLS Finite Element Method", AIAA Paper 93-0339.

[7] Van-Leer, B., "Flux-Vector Splitting for the Euler Equations", Lecture Notes in Physics, Vol. 170, 1982, pp. 507-512.

[8] Parpia, I. H., "Van-Leer Flux-Vector Splitting in Moving Coordinates", AIAA J, Vol. 26, January 1988, pp. 113-115.

[9] Anderson, W. K., Thomas, J. L. and Van-Leer, B., "Comparison of Finite Volume Flux Vector Splittings for the Euler Equations", AIAA J, Vol. 24, September 1986, pp. 1453-1460.

[10] Gordon, W. J. and Hall, C. A., "Construction of Curvilinear Coordinate Systems and Applications of Mesh Generation", Int J Numer Meth Eng, 1973, Vol. 7, pp. 461-477.

[11] Eriksson, L. E., "Generation of Boundary-Conforming Grids Around Wing-Body Configurations Using Transfinite Interpolation", AIAA J, Vol. 20, No. 10, 1982, pp. 1313-1320.

[12] (Anonymous) "Test Cases for Inviscid Flow Field Methods", AGARD Agardograph AR-211, 1985.

[13] Allen, C. B., "Central-Difference and Upwind-Biased Schemes for Steady and Unsteady Euler Aerofoil Computations", Aero. J, Vol. 99, 1995.

[14] Jameson, A., "Time Dependent Calculations Using Multigrid, with Applications to Unsteady Flows Past Airfoils and Wings", AIAA Paper 91-1596.

[15] Thomas, P. D. and Lombard, C. K., "Geometric Conservation Law and its Application to Flow Computations on Moving Grids", AIAA J, Vol. 17, October 1979, pp. 1030-1037.

[16] (Anonymous) "Compendium of Unsteady Aerodynamic Measurements", AGARD-R-702, 1982.
Fig.4. Surface $C_p$, (a) Non-Adaptive and (b) Adaptive, NACA0012, M = 0.8, α = 1.25°.

Fig.5. Near Aerofoil Adapted Grid (a) $(\eta, \psi^2)$ and (b) $(x, y)$
4 The DLR-τ-code

In order to simplify notation we describe the details of our numerical method in two space dimensions. The extension to three space dimensions follows by straightforward considerations based on the 2-d case.

4.1 Finite volume approximation

We consider conforming triangulations $T^h$ consisting of tetrahedra (triangles in two-d) in the sense of Ciarlet [5] and define a discrete control volume $\sigma_i(t)$ as the volume of the barycentric subdivision of $T^h$ enclosing the node $x_i = (x_i, y_i)^T$, and bounded by the straight line segments $l_{ij}^k$, $k = 1, 2$, connecting the midpoint of the edge with the point [symbol illegible in the source]. The geometry of the control volumes is shown in figure 1. Figure 2 shows the boundary of a control volume

Figure 2: Boundary of a control volume in 2-d

[beginning of sentence lost] $\frac{1}{|\sigma_i(t)|} \int_{\sigma_i(t)} u(x, t)\, dx$, the Navier-Stokes equations (1) can be re-written in the form

[equation not recoverable from the source]

where $N(i) := \{ j \mid \partial\sigma_i \cap \partial\sigma_j \ne \emptyset \}$ is the set of indexes of nodes neighbouring node $x_i$. Since the line integrals are not defined if $u$ is discontinuous, two numerical flux functions are introduced, namely $E, S : R^4 \times R^4 \times R^2 \to R^4$, approximating the convective and viscous fluxes, respectively, and satisfying the fundamental consistency conditions

[consistency conditions not recoverable from the source]
where the first error term is due to the quadrature rule while the second error term depends on the functions $\tilde{u}_i$. Using cell average values, i.e. $\tilde{u}_i \equiv \bar{u}_i$, results in a first order approximation, i.e. $q = 1$, due to the weak approximation property of the cell average operator, see [22], [23]. To increase the approximation order a recovery function $\tilde{u}_i$ is sought on $\sigma_i$ which approximates $u$ at least with order $O(h^2)$. It is easily seen that linear polynomials recover $u$ up to this order.

4.2 Recovery algorithms

If $b_i$ denotes the barycentre of the control volume $\sigma_i$ then a linear polynomial

$$\tilde{u}_i(x, t) = \sum_{|\alpha| \le 1} a_\alpha(t)\, (x - b_i)^\alpha \qquad (2)$$

has to be recovered on $\sigma_i(t)$ such that it satisfies the recovery condition

[equation not recoverable from the source]

Then, defining $a_{(0,0)} := \bar{u}_i$, the polynomial (2) will certainly satisfy the recovery condition. However, the isotropic recovery of the gradients does not take care of shocks in the solution and will thus lead to instabilities. According to the TVD methodology a slope limiter $\Phi_i$ has to be introduced such that the recovery polynomial is written in the form

$$\tilde{u}_i(x, t) = \bar{u}_i + \Phi_i\, (a_{(1,0)}, a_{(0,1)})^T \cdot (x - b_i).$$

We have good experience in using the limiter described by Barth and Jespersen in [4], but convergence to steady state is enhanced if one adds a modification as suggested by Venkatakrishnan in [31].

A simple ENO-type recovery can also be described in terms of the linear interpolants on the surrounding meta-triangles. The linear recovery polynomial on the box $\sigma_i$ is then chosen to be the one linear polynomial on the surrounding meta-triangles for which the modulus of the gradient is minimal, i.e. for which
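As an illustration of the limited linear recovery, the following Python sketch implements the limiter of Barth and Jespersen [4] in its commonly stated form for a single control volume. It is a sketch under stated assumptions, not the DLR-τ-code implementation: the function names, the choice of evaluation points and the sample data are hypothetical, and the Venkatakrishnan modification is not included.

```python
def barth_jespersen(u_bar, grad, centre, neighbour_means, points):
    """Limiter value Phi in [0, 1] for one control volume.

    u_bar: cell average; grad: unlimited gradient (gx, gy);
    centre: barycentre of the control volume;
    neighbour_means: cell averages of the neighbouring control volumes;
    points: locations where the recovered polynomial is evaluated.
    """
    u_min = min([u_bar] + neighbour_means)
    u_max = max([u_bar] + neighbour_means)
    phi = 1.0
    for x, y in points:
        delta = grad[0] * (x - centre[0]) + grad[1] * (y - centre[1])
        if delta > 0.0:
            phi = min(phi, (u_max - u_bar) / delta)
        elif delta < 0.0:
            phi = min(phi, (u_min - u_bar) / delta)
    return phi

def recover(u_bar, grad, centre, phi, point):
    """Limited linear recovery: u_bar + Phi * grad . (x - b)."""
    return (u_bar
            + phi * (grad[0] * (point[0] - centre[0])
                     + grad[1] * (point[1] - centre[1])))

# Illustrative data for one control volume and its three neighbours.
u_bar = 1.0
neighbour_means = [0.9, 1.05, 1.2]
grad = (1.0, 0.0)
centre = (0.0, 0.0)
points = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.5)]

phi = barth_jespersen(u_bar, grad, centre, neighbour_means, points)
values = [recover(u_bar, grad, centre, phi, p) for p in points]
```

With the limiter applied, the recovered values at the evaluation points stay between the smallest and largest neighbouring cell averages, so no new extrema are created near a discontinuity.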
radial basis functions, for example the well-known thin-plate spline. First experiments with this kind of recovery functions in [22], [23] showed an impressive increase in accuracy. Although recovery of radial basis functions is much too expensive as compared to polynomial recovery, the techniques developed in [3] could very well provide a framework in which these more complicated functions could be competitive with polynomial algorithms.

In recovery with radial basis functions a recovery function of the form

[equation not recoverable from the source]

is sought for the $l$-th component, where $A(\sigma_i) f := \frac{1}{|\sigma_i|} \int_{\sigma_i} f(x)\, dx$ denotes the cell average operator. The radial function $\Phi$ is assumed to satisfy the fundamental condition of being conditionally positive definite and $\pi_k$, $k = 1, \ldots, N$, denote a basis of the space of polynomials of a certain degree, which depends on the radial function $\Phi$ chosen. The number of nodes $N$ in the recovery stencil is another quantity which has to be chosen. Numerical experience gained so far has indicated that polynomial-based ENO stencil selection criteria work well also in the case of radial basis functions.

Using the well known thin plate spline

[equation not recoverable from the source]

which, by construction, is able to reproduce linear polynomials, amounts to using at least four control volumes in the stencil. In an ENO-like manner one can think of the stencil selection according to figure 4, where the control volumes were chosen to be triangles.

If on each of the four stencil sets a radial basis recovery function is computed, the one with the smallest total variation norm is selected and assigned to the control volume. First results of this procedure are reported in section 6.

Meanwhile, radial basis functions with compact support are being constructed. We mention the class of Wu functions as designed in [36] and the very recent developments of Wendland [34]. These functions are unconditionally positive definite and thus do not need the polynomial augmentation as the thin plate spline. Furthermore, their compact support makes them very attractive for practical purposes. Whether these functions can be competitive in run-time to polynomial-based recovery algorithms is the contents of future research on ENO approximations.

4.4 Time stepping schemes and parallelism

The DLR-τ-code was originally supplied with an explicit Runge-Kutta time stepping algorithm designed by Shu and Osher in [21] which respects the TVD-properties of the spatial discretisation, see [24]. However, these schemes are limited in CFL number by 1, which is a dramatic upper bound for application in an adaptive framework where grid cells can be very small. In the meantime other Runge-Kutta schemes with up to five stages are in use and show satisfying behaviour, especially when used in a multigrid environment for steady problems. For the computation of unsteady flows, such as pitching airfoils, the restrictions due to the CFL condition are still too strong. One way to overcome the limitations of explicit time stepping schemes is the use of parallel computers, which is easy in the case of finite volume approximations because domain decomposition is natural. A grid partitioner was developed in connection with an intelligent load balancing algorithm to redivide and redistribute grid patches depending on the load of the processors used. In that framework a parallel computation is easily done in an environment consisting of a cluster of workstations running PVM or P4 while the machines are still occupied by other users.

In figure 5 the grid of a channel with forward facing step is shown. The grid partitioner has divided the grid into 59 patches which contain nearly the same number of nodes. The possible speedup is documented in the diagram of figure 6 where speedup vs. number of processors is shown for an Intel PARAGON. The flow is the supersonic test case by Woodward and Colella [35] as discussed also in section 6.1. As can be seen from figure 6 the present approach towards parallelism through domain decomposition leads to a very efficient method.

Figure 5: Grid partitioning

Figure 6: Speedup on an Intel PARAGON

For use on conventional machines an implicit time stepping scheme according to

[equation not recoverable from the source]

was designed. The numerical flux functions are evaluated at the time 'n + 1' whereby a linearisation is necessary which leads to a linear system of equations in the form

$$A_{ii}\, \Delta u_i + \sum_{j \in N(i)} B_{ij}\, \Delta u_j = b_i, \qquad i = 1, \ldots, I,$$

where $\Delta u_i = u_i^{n+1} - u_i^n$ and $A_{ii}, B_{ij} \in R^{4 \times 4}$. Consequently, for each time step a linear system

$$A y = b \qquad (3)$$

has to be solved, where $A$ is a large sparse non-symmetric matrix. For the solution of the system (3) the GMRES algorithm developed by Saad and Schultz [19], [20] is used. Therefore, the system is recast as the minimisation problem defined by $f(y) = \| b - A y \|_2^2$, and we choose an arbitrary initial vector $y_0$. Starting with $m = 0$ the residual

$$r_m = \min_{y = y_0 + z,\; z \in K_m} f(y)$$

is computed, where $K_m := \mathrm{span}\{ r_0, A r_0, A^2 r_0, \ldots, A^{m-1} r_0 \}$ denotes the $m$-th Krylov subspace and $r_0 = b - A y_0$. Now we increase $m$ until $r_m$ is below a given tolerance. Then we compute the optimal approximate solution

$$y_m = \arg\min_{y = y_0 + z,\; z \in K_m} f(y).$$

Considering the fact that the expense to calculate the residual increases with the Krylov subspace dimension, it is efficient to limit this dimension. If this limit is reached before that of the tolerance, the approximation $y_m$ has to be calculated and used as the initial value during a repetition. This technique is called "GMRES with restart".

Since the convergence rate of an iterative method depends on the condition number of the matrix $A$, an incomplete LU-factorisation is used as a preconditioner in order to decrease the condition number. Hereby the incomplete LU-factorisation is a pair of a lower left ($L$) and an upper right ($U$) matrix satisfying the following three conditions:

1. $L_{ii}$ represents the unit matrix for all $i$,
2. $L_{ij} = U_{ij} = A_{ij}$, if $A_{ij}$ is a null matrix,
3. $(LU)_{ij} = A_{ij}$, if $A_{ij}$ is not a null matrix

and the linear system is transformed into

[equation not recoverable from the source]

5 Adaptive concepts

5.1 Refinement algorithms

Over the years experience was gained with several different refinement/coarsening strategies for triangulations. This work is documented in [26], [27], [11] and [13]. Numerical experience indicated that, at least for reliable Euler grids, a version of the isotropic red-green refinement as described in [14] gives superior grids. In this refinement strategy triangles which have to be refined are red-refined according to figure 7. Remaining triangles with two hanging nodes are also red-refined before green refinement turns the triangulation again into a conforming one. Note that at the beginning of each refinement cycle the previous green refinements are removed in order to keep the triangulations stable, i.e. in order to avoid too small angles occurring after several adaptation cycles.

In a corresponding re-coarsening strategy several
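The restart strategy for GMRES described in section 4.4 can be sketched in plain Python as below. This is a minimal, unpreconditioned illustration on a small dense matrix, assuming an Arnoldi process with Givens rotations; it is not the solver of the DLR-τ-code, which works on the sparse block system and applies the incomplete LU preconditioner. The function names, the parameter `m_max` (the limit on the Krylov dimension) and the test matrix are hypothetical.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gmres_restart(A, b, y0, m_max=5, tol=1e-10, max_restarts=50):
    """Restarted GMRES for a small dense system A y = b (lists of lists)."""
    y = list(y0)
    for _ in range(max_restarts):
        r = [bi - ai for bi, ai in zip(b, matvec(A, y))]
        beta = math.sqrt(dot(r, r))
        if beta < tol:
            break
        V = [[ri / beta for ri in r]]   # orthonormal Krylov basis
        H = []                          # rotated Hessenberg columns
        cs, sn = [], []                 # Givens rotation coefficients
        g = [beta]                      # rotated right-hand side
        for m in range(m_max):
            w = matvec(A, V[m])
            h = []
            for v in V:                 # modified Gram-Schmidt
                hij = dot(w, v)
                h.append(hij)
                w = [wi - hij * vi for wi, vi in zip(w, v)]
            h_next = math.sqrt(dot(w, w))
            for k in range(m):          # apply the previous rotations
                t = cs[k] * h[k] + sn[k] * h[k + 1]
                h[k + 1] = -sn[k] * h[k] + cs[k] * h[k + 1]
                h[k] = t
            denom = math.hypot(h[m], h_next)
            c, s = (h[m] / denom, h_next / denom) if denom else (1.0, 0.0)
            cs.append(c)
            sn.append(s)
            h[m] = c * h[m] + s * h_next
            H.append(h)
            g.append(-s * g[m])         # |g[m+1]| is the current residual norm
            g[m] = c * g[m]
            if abs(g[m + 1]) < tol or h_next == 0.0:
                break
            V.append([wi / h_next for wi in w])
        # Back substitution for the coefficients in the Krylov basis.
        dim = len(H)
        z = [0.0] * dim
        for i in range(dim - 1, -1, -1):
            acc = g[i] - sum(H[j][i] * z[j] for j in range(i + 1, dim))
            z[i] = acc / H[i][i]
        # Update the iterate; it becomes the initial value of the next cycle.
        y = [yi + sum(z[j] * V[j][i] for j in range(dim))
             for i, yi in enumerate(y)]
    return y

# Small non-symmetric test system; m_max=2 forces several restarts.
A = [[4.0, 1.0, 0.0, 0.0],
     [-1.0, 4.0, 1.0, 0.0],
     [0.0, -1.0, 4.0, 1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0, 4.0]
y = gmres_restart(A, b, [0.0] * 4, m_max=2)
residual = math.sqrt(sum((bi - ai) ** 2 for bi, ai in zip(b, matvec(A, y))))
```

Limiting the Krylov dimension keeps the storage and orthogonalisation cost per cycle bounded, at the price of more cycles; a preconditioner such as the incomplete LU-factorisation reduces the number of cycles needed.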
shown, see [14], that a refined mesh can always be completely coarsened up to its initial state.

Note that in order to keep the process of coarsening conservative it is necessary to use interpolation procedures respecting conservation. Examples of such procedures are given in [14].

is much easier to implement this is the error indicator of our choice.

The additional problem occurring with the Navier-Stokes equations lies in the second derivatives inherent in the diffusive fluxes. Although we are currently not able to prove error bounds, it seems possible to approximate the second derivatives in the computation of the residual in a measure-theoretic way by sampling the jumps of the first derivatives across the edges in normal direction. This type of error indicator was inspired by the work of C. Johnson et al. on the adaptive streamline diffusion finite element method, see [7], [12], and developed by Göhner and Warnecke [9]. We are currently investigating this type of indicator for compressible flow [29].
References
and
-LR+1 ; RS0.01 [I] R. Abgrall - On essentially non-oscillatory
; else. schemes on unstructured meshes: Analysis and
implementatiou. J. Conp. Phys. 114, 45-58,
where R := ( X I - +
i)2 (x2 - I1) a' The initial func-
(1994)
tion is a cone of unit height which is rotated around
the origin under the action of the differential equa- [Z] R. Abgrall - Design of an essentially nonoscilla-
tion. Measuring the remaining cone height after tory reconstruction procedure on finite-element-
37-10
type meshes. 1CA.W Rtport No.91-84, (1991). [15] D. IIietel. A . Meister, Th. Sonar - On the eoni-
Revised version. INRIA Repod N o 2.942, parisoii of two different implementations of an
(1994) implicit third-order E N 0 scheme of box type for
the Computation of unsteady compressible flow.
[3] R. Abgrall, Th. Sonar - On the use of MiihlLach in prepnratton
expansious in the recovery step of EN0 meth-
ods. DLR Interner Berrcht IB 229-95 A 34, [I61 D.J. Mavriplis - A three dimensional multi-
(1995) grid Reynolds-averaged Navier-Stokes solver for
unstructured meshes. ICASE report no. 94-29
[4] T.J. Barth, D.C. Jespersen - The design and
(1994)
application of upwind schemes on unstructured
meshes. AIAA paper 89-0366, (1989). [l7] A. Meister - Ein Beitrag zum DLR-r-Code:
Ein explizites und implizites Finite-Volu-
[5] P.G. Ciarlet - The finite element method for el- men-Verfahren zur Berechnung instationarer
liptic problems. North-Holland, 2nd edt. (1987) Stroinungeii auf unstrukturierten Gittern. DLR
[a] L.J. Durlofsky, B. Engquist, S. Osher - Triangle Interner Berichl IB 223-94 A 36, Gottrngen,
Based adaptive stencils for the solution of hy- (1994)
perbolic conservation laws. J. Comp. Phys. 98, [18] A. Meister - Development of an implicit finite
64-79, (1992) volume scheme for the computation of unsteady
[7] K . Eriksson, C. Johnson. Adaptive finite ele- Bow fields on unstructured moving grids. lo QP-
ment methods for parabolic problems I. A h e a r pear in Proceedings of the ICFD Conference on
model problem. Chalmers Uniuertty of Tech- Numerical Methods for Fluid Dynamrcs, 02-
nology, Department of Mathematics, prepnnt ford, (1995)
8891 (1988). [I91 Y. Saad, M. H. Schulz- GhfRES: A generalized
[8] M. Galle - Solution of the Euler and Navier- minimal residual algorithm for solving nonsym-
Stokes equations on hybrid grids. This volume, paper no. 30
[9] U. Göhner, G. Warnecke - A second-order finite difference error indicator for adaptive transonic flow computations. Num. Math. 70, 129-161, (1995)
[10] D. Hänel, R. Schwane - An implicit flux-vector splitting scheme for the computation of viscous hypersonic flow. AIAA paper 89-0274 (1989)
[11] V. Hannemann, D. Hempel, Th. Sonar - Adaptive computation of compressible flow fields with the DLR-τ-code. In: Numerical Methods for the Navier-Stokes Equations, F.-K. Hebeker, R. Rannacher, G. Wittum (Eds.), Notes on Numerical Fluid Mechanics, Volume 47, Vieweg Verlag, 101-110, (1994)
[12] P. Hansbo, C. Johnson - Adaptive streamline diffusion methods for compressible flow using conservation variables. Comp. Methods Appl. Mech. and Engrg. 87, 267-280, (1991)
[13] D. Hempel - Dynamic adaption of triangular grids. DLR Interner Bericht IB 223-95 A 38, (1995)
[14] D. Hempel - Isotropic refinement and recoarsening in 2 dimensions. DLR Interner Bericht IB 223-95 A 35, (1995)
metric linear systems. SIAM J. Sci. Stat. Comp. 7, 856-869, (1986)
[20] Y. Saad - Krylov subspace techniques, conjugate gradients, preconditioning and sparse matrix solvers. von Karman Institute of Fluid Dynamics, Lecture series 1994-05, (1994)
[21] C.-W. Shu, S. Osher - Efficient implementation of essentially non-oscillatory shock-capturing schemes. J. Comp. Phys. 77, 439-471, (1988)
[22] Th. Sonar - Multivariate Rekonstruktionsverfahren zur numerischen Lösung hyperbolischer Erhaltungsgleichungen. Habilitationsschrift, TH Darmstadt, (1995). Also: DLR Forschungsbericht 95-02, (1995)
[23] Th. Sonar - Optimal recovery using thin plate splines in finite volume methods for the numerical solution of hyperbolic conservation laws. DLR Interner Bericht IB 223-94 A 42, (1994)
[24] Th. Sonar - On the design of an upwind scheme for compressible flow on general triangulations. Numerical Algorithms 4, 195-149, (1993)
[25] Th. Sonar - On the construction of essentially non-oscillatory finite volume approximations to hyperbolic conservation laws on general triangulations: Polynomial recovery, accuracy, and stencil selection. submitted: Journal of Computational Physics, (1995)
L. P. Ruiz-Calavera
INTA, Aerodynamics Division, Fluid Dynamics Department,
Carretera de Ajalvir Km. 4.5, 28850 Torrejon de Ardoz
SPAIN
N. Hirose
NAL, Computational Science Division,
7-44-1 Jindaiji-Higashi, Chofu-shi, Tokyo 182,
JAPAN
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
different parameters that control the calculation.

The following presents a brief description of the scheme and its parallel implementation, together with some results.

2. NUMERICAL SCHEME
Among the different schemes which have been developed to solve the unsteady 3-D Euler equations [1-10], the very popular one of Jameson [10] has been selected for this study. In the following a brief description of the implementation made here is given. More details can be found in [11].

2.1 Governing Equations
The flow is assumed to be governed by the three-dimensional time-dependent Euler equations, which for a moving domain $\Omega$ with boundary $\partial\Omega$ may be written in integral form as:

(1)

where $U$ is the vector of conservative flow variables; $(F, G, H)$ are the three components of the Euler flux vector; $\mathbf{v}_\Omega$ is the velocity of the moving boundary; and $\mathbf{n}$ is the unit exterior normal vector to the domain.

Here $\rho$, $p$, $(u, v, w)$ and $E$ respectively denote the density, pressure, Cartesian velocity components of the flow, and specific total energy.

In order to close the system of equations (1) a sixth equation is needed, which is obtained from the thermodynamic relationships for a perfect gas

$p = (\gamma - 1)\,\rho\,\left[E - \tfrac{1}{2}\,(u^2 + v^2 + w^2)\right]$   (3)

2.2 Spatial Discretization
The domain around the wing is divided into an O-H mesh of hexahedral cells, for which the body-fitted curvilinear coordinates $\xi$, $\eta$, $\zeta$ respectively wrap around the wing profile (clockwise), normal and away from it, and along the span. Figure 1 shows an example. Individual cells are denoted by the subscripts i,j,k respectively corresponding to each of the axes in the transformed plane $\xi$, $\eta$, $\zeta$.

The integral equation (1) is applied separately to each cell. Assuming that the independent variables are known at the center of each cell; calculating the flux vector as the average of the values in the cells on either side of the face; and taking the mesh velocities as the average of the velocities of the four nodes defining the corresponding face, the following system of ordinary differential equations (one per cell) results:

where the convective operator $Q_{i,j,k}$ is a function of $U_{i,j,k}$, $U_{i+1,j,k}$, $U_{i-1,j,k}$, $U_{i,j+1,k}$, $U_{i,j-1,k}$, $U_{i,j,k+1}$ and $U_{i,j,k-1}$. Schemes constructed in this manner reduce to central difference schemes on Cartesian meshes, and are second order accurate if the mesh is sufficiently smooth.

This formulation is inherently non-dissipative (ignoring the effect of numerical boundary conditions), so that dissipative fluxes $D_{i,j,k}$ have been added

(6)

The well known model of Jameson [12] is used. The idea of this adaptive scheme is to add 4th order viscous terms throughout the domain to provide a base level of dissipation sufficient to prevent non-linear instabilities, but not sufficient to prevent oscillations in the neighborhood of shock waves. In order to capture shock waves additional 2nd order viscosity terms are added locally by a sensor designed to detect discontinuities in pressure. To avoid overshoots near the shock waves produced by the combined presence of the 2nd and 4th order terms, the latter are cut off in that area by an appropriate switch.

For the dissipative flux across the face separating cells i,j,k and i+1,j,k we have (for the other faces similar expressions apply):
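The blending of 2nd and 4th order dissipation described for Jameson's model can be illustrated with a short one-dimensional sketch. It is not the authors' code: the pressure sensor `nu` is the form commonly quoted for Jameson-type schemes and is an assumption here, the function name `jst_coefficients` is hypothetical, and the default constants `k2 = 1.0`, `k4 = 2/64` are simply one of the parameter pairs that appear in the figure legends of this paper.

```python
import numpy as np

def jst_coefficients(p, k2=1.0, k4=2.0 / 64.0):
    """Return (eps2, eps4) at the faces i+1/2 of a 1-D pressure array p.

    Assumed sensor (standard Jameson form, not taken from the text):
        nu_i = |p[i+1] - 2 p[i] + p[i-1]| / (p[i+1] + 2 p[i] + p[i-1])
    """
    p = np.asarray(p, dtype=float)
    nu = np.zeros_like(p)
    nu[1:-1] = np.abs(p[2:] - 2.0 * p[1:-1] + p[:-2]) / (
        p[2:] + 2.0 * p[1:-1] + p[:-2]
    )
    # Face i+1/2 looks at the sensors of cells i-1, i, i+1 and i+2;
    # values beyond the ends of the array are repeated edge values.
    nu_pad = np.pad(nu, (1, 2), mode="edge")
    n_faces = p.size - 1
    stencil = np.stack([nu_pad[s:s + n_faces] for s in range(4)])
    eps2 = k2 * stencil.max(axis=0)        # 2nd order term, active near shocks
    eps4 = np.maximum(0.0, k4 - eps2)      # 4th order term, switched off there
    return eps2, eps4

# Smooth region followed by a pressure jump between cells 3 and 4.
pressure = np.array([1.0, 1.0, 1.0, 1.0, 10.0, 10.0, 10.0, 10.0])
eps2, eps4 = jst_coefficients(pressure)
print("eps2:", eps2)
print("eps4:", eps4)
```

Away from the jump the sensor vanishes, so only the background 4th order coefficient remains; at the faces next to the jump the 2nd order coefficient dominates and the `max(0, ...)` switch removes the 4th order term, which is the behaviour the text attributes to the scheme.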
The dissipation coefficients $\epsilon^{(2)}$ and $\epsilon^{(4)}$ are calculated as

$\epsilon^{(2)}_{i+\frac{1}{2},j,k} = k^{(2)} \max(\nu_{i+2,j,k},\ \nu_{i+1,j,k},\ \nu_{i,j,k},\ \nu_{i-1,j,k})$   (8)

$\epsilon^{(4)}_{i+\frac{1}{2},j,k} = \max(0,\ k^{(4)} - \epsilon^{(2)}_{i+\frac{1}{2},j,k})$   (9)

which is second order accurate in time and can be shown [11] to have good diffusion and dispersion error characteristics and less computational cost per time step than other schemes with a lesser number of stages.

2.4 Residual Averaging
This explicit time-integration scheme has a time step limit that is controlled by the size of the smallest cell. Even though the CFL number of the 5-stage Runge-Kutta scheme is of the order of 4, the resulting $\Delta t$'s are usually too small for practical applications. This restriction can be relaxed by using a technique of residual averaging [13] which gives an implicit character to the time-integration scheme. Before each time-step the residuals $R_{i,j,k} = Q_{i,j,k} - D_{i,j,k}$ are replaced by modified residuals $R'_{i,j,k}$ which are calculated by means of an ADI method:

(12)

which is the discrete form (consistent with the numerical scheme here employed) of the Geometric Conservation Law as formulated by Thomas and Lombard [17]. It states that the cell volumes must be advanced in time in the same way as the fluid variables (even if they could be calculated analytically at each time step) to prevent grid-motion-induced errors in the numerical solution.

2.6 Boundary Conditions
The following boundary conditions are imposed:

a) Kinematic boundary condition on the
the fine one which respectively correspond to CFLs of 150 and 240) is given in Table 1.

5. CONCLUDING REMARKS
A time-accurate Euler code to calculate unsteady transonic flow about oscillating wings has been prepared and implemented in the NWT parallel supercomputer. The achieved performance has shown the feasibility of using this type of computationally expensive methods in an engineering environment. The influence of different parameters on unsteady computations has been studied.

[9] Brenneis, A.; Eberle, A.; "Evaluation of an Unsteady Implicit Euler Code Against Two and Three-Dimensional Standard Configurations"; AGARD CP-507, Paper 10; 1992
[10] Jameson, A.; Venkatakrishnan, V.; "Transonic Flows about Oscillating Airfoils using the Euler Equations"; AIAA Paper 85-1514, 1985
[11] Ruiz-Calavera; "Calculation of Unsteady Transonic Aerodynamic Loads on Wings Using the Euler Equations"; INTA OAT/TN0/4510/005/INTA/95, 1995
Fig. 2: Mean Part. 17.5% semispan (abscissa x/c)
Fig. 3: Mean Part. 82.5% semispan (abscissa x/c)
Fig. 4: First Harmonic. 17.5% semispan (abscissa x/c)
Fig. 7: First Harmonic. 82.5% semispan (abscissa x/c)
Fig. 8: Lift Coefficient. 1st Harmonic (abscissa 2z/b)
Fig. 10: Imaginary Part. First Harmonic. 92.5%
Legend entries in the plots: k2=0.5, k4=1/64; k2=1.0, k4=2/64; k2=1.5, k4=3/64; grids 80x16x30 and 160x32x30.
Flow chart of the parallel strategy: OVERLAP; FLUXES (parallel for k, vector for i); OVERLAPFIX; INVERT i (parallel for k, vector for j, sequential for i); INVERT j (parallel for k, vector for i, sequential for j); MOVE ARRAYS (k to j); INVERT k (parallel for j, vector for i, sequential for k); MOVE ARRAYS.
Plot against the number of PEs (4 to 32): No Residual Averaging; With Residual Averaging.
Abstract. Research at the University of Glasgow, based around implicit methods for solving the Euler and Reynolds' Averaged Navier-Stokes equations and to be reported in this paper, has targeted advanced CFD methods for tackling the complex flow fields of interest to aerospace vehicle designers. The requirements for this application are for efficient, high resolution schemes which can be ported to various MPP systems and implemented with robustness to give fast turn round times at competitive cost. It is recognised that the most demanding topics concern unsteady viscous flows and thus time accuracy and efficiency are pursued as a high priority. This paper then reviews the work, ongoing and planned, by the team at Glasgow in code developments embracing future computing environments and including some results not previously published. The example test cases used in the performance and sensitivity studies include the transonic flow results on the RAE 2822 aerofoil and ONERA M6 wing selected by AGARD. The computing environments to which the codes port include workstations, either used singly or clustered to provide a parallel computing domain, and also integrated distributed memory supercomputers such as CRAY T3D and Intel Hypercube systems. The paper outlines these technologies also.

1 Introduction

Aerodynamics has been established as a foundation technology for the design of aerospace vehicles. Good application of aerodynamics will lead to substantial economic benefits for future aircraft designs. Particularly important target areas include drag reduction to improve direct operating costs and better prediction of steady and unsteady loads on aircraft to overcome structural conservatism at the time of freezing the design. For the majority of aircraft, this requires the particular capability of predicting the phenomena of shock waves and flow separation. This can be achieved through a better understanding of the fluid mechanics of flow interactions using either experimental techniques or computational methods. Wind tunnel testing at simulated conditions, particularly for flutter, for example, is becoming increasingly expensive. On the other hand, with the rapid developments in computer hardware and computational techniques, the topic of computational fluid dynamics is reaching maturity as a viable way of providing design solutions.

A reasonable simulation of the fluid dynamics of high Reynolds' number can be obtained by solving the Reynolds' averaged Navier-Stokes (RANS) equations. Increasing computer power now makes the solution of these equations feasible. The level of turbulence model used needs to be a compromise between a simple eddy viscosity model such as Baldwin Lomax and a more complex second moment closure model. In this work the former is used, but the codes are starting to use the more general k-ω two equation model.

To satisfy the general requirements for a code suited for aircraft design, it should be accurate, efficient and robust and usable on future computer architectures. The general approach chosen by the University of Glasgow CFD Team in this work is to use high order upwind differencing to provide accuracy and robustness and to mostly use implicit methods to provide efficiency [5] [9]. Unstructured grids are also being considered by the Team as a way forward for dealing with geometric complexity, but there are developmental difficulties in tackling viscous flows near boundaries and they call for high memory. Geometric complexity using structured meshes can be accommodated using multi-block grids, which lend themselves to distributed memory computing architectures using a multidomain approach. The combination of an implicit approach on a structured grid for wall turbulent flows provides an efficient code, particularly for unsteady flows.

There exists a considerable variety of computer architectures from which to choose. The general consensus, however, is that competitively priced distributed memory massively parallel processors (MPPs) will provide the Teraflops facility (or greater) that will be required to tackle CFD solutions using the RANS model for flows over complete aircraft configurations. A number of vendors promise production of such Teraflops facilities in the near future, although the cost is likely to be beyond the means of all but the largest organisations. Also there needs to be a further investment in adapting the majority of existing codes to use it. There is a trend to provide a similar architecture at a much lower cost using workstation clusters.

Lecturer, Department of Aerospace Engineering, University of Glasgow, Glasgow, G12 8QQ, UK
Professor, Department of Aerospace Engineering, University of Glasgow, Glasgow, G12 8QQ, UK
On the sort of broad-bandwidth networks that are planned for the future in Corporate networks, the type of powerful high memory workstations that are being used for detailed CAD/CAM design work in the engineering industry in the daytime are amenable to be turned loose at off peak times to provide a powerful high-memory system.

Creating the parallel computing environment for the Glasgow CFD Team has proven to be an interesting case history that it is appropriate to relate as a contribution towards the theme of this conference. Before 1990, computer systems within the Universities in the UK had undergone a major upgrade each seven years, funded by the University Funding Council, and this generally enabled the acquisition of a useful multi-user mainframe. Numerically intensive computer users then had access to off peak cycles through batch facilities. Before 1994 at Glasgow, for example, the sizeable University central Computing Service operated a CMS environment on an IBM 3090 150E vector facility for scientific work, with the help of a technological agreement with the vendor, as well as VME and VMS environments on sizeable ICL and DEC facilities, respectively. From another initiative the University also acquired a 32 transputer distributed memory Meiko Computing Surface, along with a systems manager, on which some early experience on parallel computing was developed by the CFD Team members to complement time awarded by peer review on National Facilities such as CRAY-XMP and YMP vector multi-processors. The Team's work could be classed at this stage in the category of high performance computing (HPC).

In 1994, the Funding Council support changed to a system of IT support on an annual basis. At the same time the University adopted an IT strategy to distribute the monies involved thinly to all Departments whilst providing a core support for: the overall Campus Network including a FDDI backbone (later an ATM backbone); and a UNIX cluster for core Computing (with a cost imposed on groups who used cycles above a threshold which was set at a low level). The implication of this University strategy was the need for HPC users to prise a proportion of their Department's allocation of funds and add it to other initiatives to secure the computing environment that they needed. Also at about the same time at National level, resources targeted for research (managed by EPSRC) were used to purchase a CRAY T3D with 320 DEC-Alpha nodes and following bids this was placed at the Edinburgh Parallel Computing Centre at Edinburgh University. This facility was designated for the exclusive use of a limited number of University Consortia to tackle Grand Challenge problems only.

With this background, the team then fronted two main initiatives to achieve an acceptable computing resource for its ambitions to develop state-of-the-art code that might be useful for aircraft designers. The first was to develop a University Consortium (finally, this included the Universities of Bristol, Glasgow, Oxford and Swansea and UMIST) that proposed a topic on Physically and Geometrically Complex Aerodynamic Flows for Aircraft Flight to use the Cray T3D facility. The title was changed to the shortened form Computation of Complex Aerodynamic Flows or CCAF Project after the proposal was accepted [2]. The other initiative was to develop a Consortium of Departments within the University to bid for resources under the New Technologies Initiative (NTI) for the development of a High Performance Parallel Computing facility from Spare Capacity on a Network of Workstations (see http://www.aero.gla.ac.uk/ResearcWW for full details). NTI was developed by the Joint Information Systems Committee from funds that the Committee had secured themselves from the Higher Education Funding Councils to promote pilot studies towards developing state-of-the-art computing capabilities across the Universities. When the funds were awarded the University project was designated the HNW Project. These projects (both now have a year's maturity) give access to a world class resource to the CFD Team. These projects will now be described separately.

One target area of application for the CCAF project is towards the study of aeroelasticity at the edges of the flight envelope, an area in which the non-linearity of the problem poses considerable uncertainties and is likely to reveal interesting new mechanics. The challenge is to be in a position to complement experimental and analytical studies of these complex physical phenomena using facilities as powerful as the EPCC Cray T3D. Electrodynamics radiation is also included in the programme because of the commonality in grids and solution techniques and the opportunity to widen the application base of the project. The resource awarded is modest (around 64,000 processor hours per year), but with development being done on local computing environments and production tests done on the National facility, the resource is useful.

Two main computational approaches are being pursued in CCAF: structured grid work is at a more mature stage, parallelisation of multi-block codes and dealing with boundary layers is straightforward but dealing with geometric complexity is problematic; unstructured grid work copes well with geometric complexity but partitioning causes problems. The project includes comparisons between codes developed in order to determine the best future strategy. In the area of aeroelasticity, there is a dearth of experimental data of the quality and appropriateness for CFD validation. Nevertheless the Consortium has identified a suitable unsteady test case involving the AGARD LANN swept wing to provide an appropriately challenging common test case. The Glasgow Team is involved particularly with the development of a multi-block structured grid flow code meshed with a structural code made available from industry and uses on average 1,600 processor hours of T3D resource per month on this. Some preliminary results are reported below.

At the other end of the cost scale, the HNW cluster project was awarded 8 man years of effort by JISC over a period of 3 years. The six collaborating Departments in the University of Glasgow provided funds to purchase equipment and software for a pilot facility, which could also be used as a demonstrator for a dedicated cluster as well as a base for testing different
cluster technologies. Following a stringent selection process and based on the company's strong interest in the cluster technology, six Silicon Graphics Indys with MIPS R4400 processors, 64 MB memory and 17 inch monitors were selected and purchased. These were assembled together in one laboratory and connected using grade 5 UTP cabling to a 10baseT Ethernet switch, which is the standard presently used by the University, and this itself was connected to the network. Using PVM 3.3 message passing, excellent performance was achieved using the Team's CFD codes, with little latency [6]. An upgrade to ATM switching on the UTP cabling is planned in the near future to improve communication speed, as well as a multi-cluster activity with an adjoining University linked to the local ATM based Metropolitan Area Network (MAN) called ClydeNet.

From other research projects, ten more Indys have recently been added to the Departmental 10baseT network. The combined resource is available generally as a computing domain to users given an account. Apart from PVM being installed on the cluster as the message passing software for the parallel implementations, alternatives for users include MPI (CHIMP and LAM versions) as well as Oxford Parallel BSP. Clusters in other Departments in the University are beginning to be set up in a similar way.

Because of the heterogeneous nature of the user base of the cluster, a resource management system was required to optimise use of these cluster resources. The public domain software NQS, CONDOR and DQS and demonstration versions of the supported software CODINE(TM) and LSF(TM) were obtained and assessed. LSF(TM), written by Platform Computing Inc. of Toronto, had the best ingredients for the University based project, particularly a multi-cluster capability, and has been selected by a number of Industries, particularly some Aerospace Industries, as a means of managing the cluster resource. A University agreement, which included technological exchanges towards the future development of LSF(TM), made available a multi-platform site license to explore its use in a University environment and particularly this presently unique facility of managing multiclusters. The experience to date in its implementation is that improved load balancing, and hence a considerably better use of cycles, is made by now submitting jobs to the domain, rather than to a specific workstation. The software identifies the best resource for a job and carries it out transparently to the user. If a user wishes to reclaim use of a machine for interactive work, the part of the job being done on that machine is automatically checkpointed and migrated to another machine with spare capacity. PVM is embedded in the software so that it provides an ideal system for queuing and implementing parallel programmes at low cost. LSF(TM) provides excellent user interfaces, which help system managers of clusters to improve their service to users.

With continued development of the cluster technology, there is evidence that this type of affordable computing could be a norm in design offices within the Aerospace Industry. With this background on the technology used at Glasgow, Sections 2 and 3 of the paper outline the discretisation methodologies for the two and three dimensional codes and provide some new examples, and Section 4 discusses the parallel coding methodology used.

2 Two-Dimensional Method

The two dimensional thin-layer Reynolds' Averaged Navier-Stokes equations in generalised curvilinear co-ordinates $(\xi, \eta)$ with $\eta$ normal to the surface can be denoted in non-dimensional conservative form by

where $w$ denotes the vector of conserved variables, $f$ the convective streamwise flux, $g$ the convective normal flux and $s$ the normal viscous flux.

One implicit step, updating the primitive variables $P$, can be written as

$\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_\xi^n}{\partial P} + \Delta t\,\frac{\partial R_\eta^n}{\partial P}\right)\delta P = -\Delta t\,(R_\xi^n + R_\eta^n)$   (2)

where $R_\xi$ and $R_\eta$ are terms arising from the spatial discretisation in the $\xi$ and $\eta$ directions respectively and

In the present work the spatial terms are discretised using Osher's flux approximation with MUSCL interpolation and the Von Albada limiter for the convective terms and central differencing for the viscous fluxes. The Baldwin-Lomax turbulence model is employed to provide a turbulent contribution to the viscosity but this is not linearised in time in the present work, i.e. turbulence contributions only appear on the right-hand side of equation (2). This has been found not to degrade the stability properties of the methods examined in this paper.

The alternating direction implicit version of equation (2) is

where

$R_{\xi\eta} = -\Delta t\,(R_\xi^n + R_\eta^n).$

The ADI factorisation which appears on the left hand side of equation (3) has been widely used to approximate a solution to the system (2) because the banded structure of each of the factors makes it relatively easy to solve. However, the solution of the ADI system is not an exact solution of equation (2), and in practice the factorisation error (the error introduced by solving equation (3) rather than equation (2)) leads to a practical limit on the time step and introduces another source of error into the calculation. This motivates the use of a preconditioned conjugate gradient solution of the unfactored system.
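The size of the factorisation error can be checked numerically on small stand-in matrices. The sketch below is an illustration only: it assumes the usual approximate-factorisation form $(M + \Delta t A_\xi)\,M^{-1}\,(M + \Delta t A_\eta)$ with $M$ standing in for $\partial w/\partial P$ and $A_\xi$, $A_\eta$ for the two flux Jacobians, and the random matrices and the name `adi_factorisation_error` are hypothetical, not taken from the paper.

```python
import numpy as np

def adi_factorisation_error(M, A_xi, A_eta, dt):
    """Return (factored - unfactored) left-hand-side operators."""
    unfactored = M + dt * (A_xi + A_eta)
    # (M + dt A_xi) M^{-1} (M + dt A_eta), using a solve instead of an inverse
    factored = (M + dt * A_xi) @ np.linalg.solve(M, M + dt * A_eta)
    return factored - unfactored

rng = np.random.default_rng(0)
n = 6
M = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # stand-in for dw/dP
A_xi = rng.standard_normal((n, n))                  # stand-in for dR_xi/dP
A_eta = rng.standard_normal((n, n))                 # stand-in for dR_eta/dP

for dt in (0.1, 0.2, 0.4):
    err = adi_factorisation_error(M, A_xi, A_eta, dt)
    # Expanding the product leaves exactly one extra term, of order dt^2.
    expected = dt**2 * A_xi @ np.linalg.solve(M, A_eta)
    print(dt, np.linalg.norm(err), np.allclose(err, expected))
```

Under these assumptions the difference between the factored and unfactored operators is exactly $\Delta t^2 A_\xi M^{-1} A_\eta$, so doubling the time step quadruples the error, which is consistent with the practical time step limit mentioned in the text.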
Conjugate gradient methods find an approximation to the solution of a linear system by minimising a suitable residual error function in a finite dimensional space of potential solution vectors. Several algorithms are available including BiCG, CGSTAB, CGS and GMRES. These methods were tested in [3] and it was concluded that the choice of method is not as crucial as the preconditioning. However, the CGS method was found to be the quickest of the three methods that do not require re-orthogonalisation and is used here. CGS has the additional advantage that the transpose of the matrix on the left-hand side of equation (2) is not required, reducing implementation difficulties. The CGS algorithm was derived in [10] and is restated in [12].

Denoting the linear system to be solved at each time step by

$Ax = b$   (4)

we seek an approximation to $A^{-1} \approx C^{-1}$ which yields a

between increasing the CFL number to minimise the number of implicit steps and reducing the CFL number to minimise the number of CGS steps at each implicit step. The comparison of the pressure distribution with experiment for various levels of convergence is shown in figure 2 and shows good agreement with experiment.
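For readers unfamiliar with CGS, the preconditioned iteration for $Ax = b$ can be sketched as follows. This is the textbook form of the conjugate gradient squared algorithm, not the authors' implementation; the function name `cgs` and the callback `apply_Cinv` (which applies $C^{-1}$ to a vector) are hypothetical, and the small tridiagonal system with a diagonal preconditioner in the demonstration merely stands in for the flow Jacobian and the ADI preconditioner.

```python
import numpy as np

def cgs(A, b, apply_Cinv, x0=None, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient squared; returns (x, iterations)."""
    x = np.zeros_like(b, dtype=float) if x0 is None else np.array(x0, dtype=float)
    r = b - A @ x
    b_norm = np.linalg.norm(b) or 1.0
    if np.linalg.norm(r) <= tol * b_norm:
        return x, 0
    r_shadow = r.copy()              # fixed shadow residual; no transpose of A needed
    rho_old = 1.0
    p = np.zeros_like(r)
    q = np.zeros_like(r)
    for it in range(1, max_iter + 1):
        rho = r_shadow @ r
        if rho == 0.0:
            raise RuntimeError("CGS breakdown: rho = 0")
        if it == 1:
            u = r.copy()
            p = u.copy()
        else:
            beta = rho / rho_old
            u = r + beta * q
            p = u + beta * (q + beta * p)
        p_hat = apply_Cinv(p)        # preconditioner application
        v = A @ p_hat                # first matrix-vector product
        alpha = rho / (r_shadow @ v)
        q = u - alpha * v
        u_hat = apply_Cinv(u + q)    # preconditioner application
        x = x + alpha * u_hat
        r = r - alpha * (A @ u_hat)  # second matrix-vector product
        rho_old = rho
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
    return x, max_iter

# Small non-symmetric test system with a diagonal (Jacobi) preconditioner.
n = 30
A = 4.0 * np.eye(n) - np.eye(n, k=1) - 2.0 * np.eye(n, k=-1)
b = A @ np.ones(n)
x, iterations = cgs(A, b, lambda v: v / np.diag(A))
print(iterations, np.linalg.norm(A @ x - b))
```

Each iteration costs two products with $A$ and two applications of the preconditioner, and only products with $A$ itself appear, which is the property noted in the text: the transpose of the left-hand-side matrix is never required.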
time of around twenty-five percent.

3 Three-dimensional Extensions

The extension of the method to three-dimensions is complicated by two considerations. First, computer storage becomes a limiting factor due to the need to store large Jacobian matrices. Secondly, the ADI factorisation in three-dimensions is significantly worse than in two-dimensions, making its use as a preconditioner less favourable. This fact however means that there are increased gains to be made in three dimensions by the use of an alternative to ADI.

One step of the method considered can be written as

where

$R_{xyz} = -\Delta t\,(R_x + R_y + R_z).$

This two factor step can be loosely described as unfactored in each spanwise slice and approximately factored in the spanwise direction. A stability analysis [7] has shown that the method has similar stability properties to the two factor ADI method in two-dimensions, representing a significant improvement on the behaviour of the three factor method in three-dimensions. The linear system resulting from the first factor in equation 7 has a more complicated structure than the block pentadiagonal systems which are encountered for each factor in the three factor method. However, this system can be solved using a direct generalisation of the method described for two dimensions above, i.e. we solve the system

by the CGS method where

$A = \left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_x}{\partial P} + \Delta t\,\frac{\partial R_y}{\partial P}\right),$

$C = \left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_x}{\partial P}\right)\left(\frac{\partial w}{\partial P}\right)^{-1}\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_y}{\partial P}\right)$   (10)

and

$b = -\Delta t\,(R_x + R_y + R_z).$   (11)

followed by the solution of a block pentadiagonal system for the updates

$\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_z}{\partial P}\right)\delta P = x.$

The two factor method has substantially reduced memory requirements compared with the fully unfactored method. For the third order spatial discretisation there are 13 non-zero 5 by 5 blocks for the rows in the unfactored matrix associated with any one grid cell. This means that the number of floating point numbers which must be stored for the coefficient matrix for a mesh with $N$ cells is $325N$. Since $N$ can be of the order of one million for flows around basic wings, this implies that even if we can solve the linear system efficiently, storage requirements will be a limiting factor. For the two factor method only the matrix for one spanwise slice or one line in the spanwise direction need be stored at any one time. This has the effect of reducing the matrix storage requirements at any one time in the calculation to $\max(225\,N_{slice},\ 125\,N_{line})$ where $N_{slice}$ is the number of grid points in a spanwise slice and $N_{line}$ is the number of grid points in the spanwise direction. Since $N_{line} N_{slice} = N$ it can be seen that the storage requirements have been reduced substantially (by around two orders of magnitude for the test case examined in this paper).

As a test case we shall consider flow over the ONERA M6 wing in transonic conditions. The experimental data for this wing is available in [13] with several previous computational results including those in [11]. The flow problem we consider here has a free stream Mach number of 0.84, an incidence of 6.06° and a Reynolds number of 11 million. For this case 250 explicit steps were required before PUN was used with a CFL number of 10. The residual is reduced about 4 orders of magnitude from its initial value. This was also observed for flows over aerofoils in [8] and was due to small oscillations in the pressure at the far field.

The comparison of the computed pressure distribution with the experimental results of [13] at six spanwise slices is shown in figure 3. Good agreement is obtained for the flow except for the position of the shock and the very last station at 99% span. This has also been observed in [11] for this test case. Shock induced separation occurs after the strong shock near the tip and the Baldwin Lomax model is known to be inadequate for this phenomenon. In [11] the Johnson-King model was also implemented, which significantly improved the results. Figure 3 shows that mesh refinement in the streamwise direction has very little effect on the solution apart from sharpening the strong shock near the tip. However, refinement in the spanwise direction improves not only the resolution of the tip of the C-H grid, and hence the pressure distribution before the shock close to the tip, but also the strength of the first shock in the mid span region. This can be more clearly seen from the upper wing surface pressure contours shown in figure 4.

4 Parallel Implementation

A detailed description of the parallel implementation of the 2 and 3-D methods can be found in [6]. In the present section we summarise the main features and give sample results.

The major obstacle to an efficient parallel implementation of the AF-CGS method is the inherently sequential nature of the ADI procedure. This was overcome in [1] by using a transposition of the data to allow complete ADI sweeps to proceed independently on each processor. We use this approach here although extra communication is required for the present method because of the matrix-vector products required in the CGS algorithm.

The computational space is mapped onto the nodes by grouping complete mesh lines in both the $\xi$ and the $\eta$ directions onto a single node. Care has to be taken to make sure that $\xi$ lines on either side of the wake cut are mapped to the same processor. The computation then falls into three phases.
First, the matrix is generated and the factors are put in upper triangular form. The next
phase is the multiplication of a vector by the matrix and finally we have multiplication
of a vector by the preconditioner which reduces to back substitution on the triangular
factors of the ADI factorisation. For each phase data is held on a node for complete lines
in one direction in the mesh and the entire computation relating to that direction is
completed. The data is then communicated so that information for complete lines in the
other direction is held on a single node and the computation for that direction proceeds.
The parallel code was also implemented on a cluster of Silicon Graphics Indy workstations
at the University of Glasgow. The message passing was accomplished by using PVM version
3.3. The comparison of algorithm speeds (time in μsec/grid point/time step) on the SGI
cluster is shown in table
[table not recovered from the source]
The three-dimensional algorithm has two distinct phases. First, there is the generation
and solution of the large linear system arising from each spanwise slice of the mesh.
Secondly, there is the solution of the banded linear systems arising from the second
factor in the spanwise direction.
The first phase is split between processors in two ways. First, the spanwise sections are
split into groups. Each group is then assigned to a set of processors with each spanwise
slice in the group being treated in a similar way to the two dimensional algorithm
described above by those processors. The communication between the different groups of
processors, each treating a different set of spanwise slices, is simply that which would
be required by an explicit method so that the contributions to the residual (or the
right-hand-side of the linear system) from the spanwise fluxes at the interfaces between
the spanwise groupings can be evaluated. Since there is significantly less communication
involved at this stage than is required to solve a spanwise slice in parallel, it is clear
that the most efficient partition of the problem will arise when as large a number of
spanwise groups as possible is used. For a fixed number of total processors this will
reduce the number of processors which operate on a spanwise section.
The second phase of the calculation involves assigning complete spanwise lines in the mesh
to single processors. Again, a transposition of the data is used so that the calculation
involving a single line can proceed on a single processor without further communication.
Once the updates are available a second transposition is used to restore storage by
spanwise slices for the next time step.
The method has been implemented in parallel on a range of machines. The algorithm speeds
for the Cray T3D and the SGI cluster are given in table 2 for grids with 140000 grid
points for the T3D and roughly half this number on the SGI cluster. The parallel
efficiencies will increase when the grid is refined, however a high parallel efficiency
has been obtained on 128 nodes, even for this relatively small problem. Excellent
efficiency is obtained on the SGI cluster.

No. of nodes    Explicit time steps       Implicit time steps
                speed    efficiency       speed    efficiency
[illegible]     417      1.00             1510     1.00

5 Conclusions
The programmes that are providing a world class computing environment for the development
of CFD codes at the University of Glasgow were described. A high quality access to the 320
processor EPCC Cray T3D was obtained through forming the CCAP consortium on the problem
targeted in this report. At the other end of the cost scale, the development and
description of a parallel environment based on the spare capacity on workstations mounted
on a quality network under the H N W project was described. It was predicted that this
latter type of computing environment would be a standard within the design offices of
Aerospace Companies in the future.
An implicit method for simulating three-dimensional compressible and viscous flow
developed to run on a distributed memory parallel environment is outlined. The AF-CGS
method is based on a two-dimensional approach which consists of an iterative solution of
the linear system by the conjugate gradient squared algorithm with preconditioning by the
alternating direction implicit factorisation. The FUN (factored-unfactored) method tackles
three dimensional flows and builds on the two dimensional method by factoring the linear
system into a factor arising from spanwise slices in the mesh and a block penta-diagonal
factor arising from strips in the spanwise direction. The more complicated factors arising
from the spanwise slices are solved by the two dimensional method. This approach yields a
method which has similar properties to the 2d ADI method, a situation which is
substantially better than
a three dimensional version of ADI. A study concerning the optimisation of the AFCGS code
using RAE 2822 Case 9 was carried out. Three levels of mesh sequencing were used to obtain
a starting solution on a fine mesh of 257 x 65. Then the optimal CFL number used was
increased to 100, and the overall time to converge to within 0.25 per cent of the fully
converged lift value was reduced by a factor of 5. When applied to unsteady flows AFCGS
was shown to allow for larger time steps and a reduced computational cost when compared
to ADI.
The FUN code was tested through the prediction of the flows over the ONERA M6 Wing using
the Cray T3D. Even for the relatively coarse grid tested, parallel efficiencies of 75 per
cent were achieved using 128 nodes. Improved efficiencies will be achievable using finer
grids. The comparisons with the experiment using a Baldwin-Lomax turbulence model were
found to be satisfactory, but improvements are expected when the k-ω turbulence model is
implemented.
Future work includes the development of a multi-block approach and the testing of the
unsteady 3d code and its coupling with a structural code to tackle aeroelasticity cases.
Work is underway on multiblock extensions of the methodology presented.

ACKNOWLEDGEMENTS
This work has been carried out with the support of the Ministry of Defence, the
Engineering and Physical Sciences Research Council, British Aerospace and the Joint
Information Systems Committee of the Joint Higher Education Funding Councils under the
following grants: EPSRC/MOD GW47371, DRA/MOD/BAeFRNlC/407, EPSRC GR/K42264, NW65.
The authors would like to thank Mark Woodgate for obtaining the three dimensional results
shown in this paper and Dr Ian Glover for his contribution to the early part of the work.
The work on the cluster has been carried out by Bill McMillan, Dr Xiaokun Zhou and Angus
McCuish. The mesh generation subroutines were supplied by Dr A. L. Gaitonde of Bristol
University.

REFERENCES
[1] T. Chyczewski, F. Marconi, R. Pelz, and E. Churchitser, 'Solution of the Euler and
Navier-Stokes equations on a parallel processor using a transposed/Thomas ADI algorithm',
in 11th AIAA Computational Fluid Dynamics Conference. AIAA, (1993).
[2] B.E. Richards et al, 'Computation of complex aerodynamic flows - CCAP project',
Technical report, Technical Annex to Proposal to EPSRC (unpublished).
[3] K.J. Badcock, 'Newton's method for laminar aerofoil flows', Aerospace Engineering
Report 30, Glasgow University, Glasgow, UK, (1993).
[4] K.J. Badcock and A.L. Gaitonde, 'An unfactored method with moving meshes for solution
of the Navier-Stokes equations for flows about aerofoils', submitted for publication,
August 1994, (1994).
[5] K.J. Badcock and B.E. Richards, 'Implicit time stepping methods for the Navier-Stokes
equations', in 12th AIAA CFD conference, San Diego. AIAA, (1995).
[6] K.J. Badcock and B.E. Richards, 'Implicit time stepping methods for the Navier-Stokes
equations', to appear in AIAA Journal, (1995).
[7] K.J. Badcock, I.C. Glover, and B.E. Richards, 'Convergence acceleration for viscous
aerofoil flows using an unfactored method', in Second European conference on CFD, pp.
333-341. ECCOMAS, (1994).
[8] K.J. Badcock, I.C. Glover, and B.E. Richards, 'A preconditioner for steady
two-dimensional turbulent flow simulation', submitted for publication, May 1994, (1994).
[9] L. Dubuc, K.J. Badcock, X. Xu and B.E. Richards, 'Preconditioners for high speed flows
in aerospace engineering', to appear in Numerical Methods for Fluid Dynamics V. Institute
for Computational Fluid Dynamics, Oxford, (1995).
[10] P. Sonneveld, 'CGS: A fast Lanczos-type solver for nonsymmetric linear systems', SIAM
Journal on Scientific and Statistical Computing, 10, 36-52, (1989).
[11] R. Radespiel, C. Rossow, and R.C. Swanson, 'Efficient cell-vertex multigrid scheme
for the three-dimensional Navier-Stokes equations', AIAA Journal, 28, 1464-1472, (1990).
[12] M. Vitaletti, 'Solver for unfactored schemes', AIAA Journal, 29, 1003-1005, (1991).
[13] V. Schmitt and F. Charpin, 'Pressure distributions on the ONERA-M6-Wing at transonic
Mach numbers', Technical Report AR-138, AGARD, (1979).
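The AF-CGS idea summarised in the conclusions, a conjugate gradient squared (CGS) iteration preconditioned by an approximate (ADI-type) factorisation, can be illustrated on a small model problem. The sketch below is not the authors' code: the model operator (a nonsymmetric tridiagonal stencil in each of two directions), the pseudo time step `dt`, the dense NumPy solves and all function names are assumptions chosen for brevity.

```python
import numpy as np


def cgs(matvec, b, precond, tol=1e-10, max_iter=200):
    """Preconditioned CGS for A x = b, starting from x = 0.

    matvec(v) returns A v; precond(v) returns an approximation to A^{-1} v.
    Returns the solution estimate and the number of iterations used.
    """
    x = np.zeros_like(b)
    r = b.copy()              # residual for the zero initial guess
    r_tilde = r.copy()        # fixed shadow residual
    rho_old = 1.0
    u = p = q = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        rho = r_tilde @ r
        if rho == 0.0:
            raise RuntimeError("CGS breakdown: rho = 0")
        if it == 1:
            u = r.copy()
            p = u.copy()
        else:
            beta = rho / rho_old
            u = r + beta * q
            p = u + beta * (q + beta * p)
        p_hat = precond(p)                 # preconditioner application
        v_hat = matvec(p_hat)              # matrix-vector product
        alpha = rho / (r_tilde @ v_hat)
        q = u - alpha * v_hat
        u_hat = precond(u + q)
        x = x + alpha * u_hat
        r = r - alpha * matvec(u_hat)
        rho_old = rho
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
    return x, max_iter


def model_problem(n=8, dt=1.0, skew=0.3):
    """Build A = I + dt (Lx + Ly) and the two factors I + dt Lx, I + dt Ly."""
    one_d = (2.0 * np.eye(n)
             - (1.0 + skew) * np.eye(n, k=-1)
             - (1.0 - skew) * np.eye(n, k=1))
    eye = np.eye(n)
    lx = np.kron(one_d, eye)   # operator acting along the first direction
    ly = np.kron(eye, one_d)   # operator acting along the second direction
    big_eye = np.eye(n * n)
    return big_eye + dt * (lx + ly), big_eye + dt * lx, big_eye + dt * ly


a, fx, fy = model_problem()
b = np.ones(a.shape[0])


def adi(v):
    """Approximate factorisation preconditioner: solve with each factor in turn."""
    return np.linalg.solve(fy, np.linalg.solve(fx, v))


x, iterations = cgs(lambda v: a @ v, b, adi)
print(iterations, np.linalg.norm(a @ x - b))
```

Each call to `adi` performs one solve per direction. In the paper this step reduces to back substitution on triangular factors that are generated once; the dense solves here merely stand in for that.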
[Figure 3 plot not reproduced: two panels (panel labels garbled in the source), horizontal axis x/c.]
Figure 3. Comparison of computed pressure distribution with experiment for ONERA M6 wing:
solid line 129 x 33 x 33, dashed line 129 x 33 x 97, dotted line 257 x 33 x 33.
[Figure on page 39-9 not reproduced; panel labels: 129 x 33 x 33 grid, 129 x 33 x 97 grid.]
is anyway worthwhile to use the hybrid grids for that reason. Now I
would like to make another comment concerning Dr. Kroll's remarks. I
said that I essentially agree with what he said, but he should be
aware of the fact that the Program Committee of an AGARD meeting has
constraints that organizers of large conferences don't have. For
example, we are almost not allowed to have parallel sessions. And
parallel sessions would have been necessary. I proposed parallel
sessions, but I immediately had 10 opponents in the group, so it was
impossible to make it. That is, namely, because we need a Technical
Evaluator who cannot attend all sessions at the same time, of course.
Something else, also, is that we have some, let us say, political
constraints, in the sense that it is important to AGARD that all NATO
countries can participate in such conferences, to present the status
of the research in their country, and that is a constraint other
conferences don't have.
B. Masure, STREHNA, France
The Technical Evaluator said that many papers did not address the
problem of the accuracy assessment. I ask the Technical Evaluator to
tell us what exactly an accuracy assessment for a code is.
meshes than more conventional schemes. They also may require some
more computer time, at least for the same number of grid points.
However, because the number of grid points required for a certain
accuracy level may be less, we may still gain something. I don't
know what the balance is. You may wish to comment on that.
One further remark, on adaptive grids. We might wonder what is more
efficient - to implement a highly sophisticated higher order scheme
with the best thinkable multi-dimensional upwinding with only one
grid point in the shock wave, or to have an adaptive grid scheme
with, for example, two or three grid points in the shock. I am not
sure which of the two is more efficient. I will stop here. This is
just a little bit of provocation in order to get you out of your
seats, so to speak. Who would like to shoot at this or anything
else?
P.E. Rubbert, Boeing Commercial Airplane Group, U.S.
It is important to speak to how good do you have to be, what is the
target. Not just faster, but how fast, etc. One of the things I
seem to detect at this Conference is that many of the speakers had in
their mind a different definition of the decimal point than I do. I
heard talk about working hard on grid generation to reduce the time
from three weeks to maybe one week or maybe one day. My experience
in using CFD in an airplane design environment is that when you are
talking about designing a wing, it wasn't too many years ago that
that involved a sequence of about 75 full blown CFD runs, part
analysis, part inverse design, etc. One day turnaround was
unacceptable. We do not want to take 75 days to design wings. The
decimal point belongs in terms of hours, not days. In our old design
environment, our target was to get three turnarounds in an 8 hour day
in the design environment. The challenge is now to reduce cycle time
even more. So I think it is worth saying that some of the targets
that I hear people setting for themselves will produce a capability
which is not really acceptable and useable in a real airplane company
environment.
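The turnaround argument above is simple arithmetic, and a small sketch makes the scale explicit. The figure of 75 runs and the target of three turnarounds per working day come from the comment; the function name and the assumption that the runs are strictly sequential are illustrative.

```python
import math


def design_working_days(n_runs, runs_per_day):
    """Working days needed for n_runs sequential CFD runs."""
    return math.ceil(n_runs / runs_per_day)


print(design_working_days(75, 1))  # one-day turnaround
print(design_working_days(75, 3))  # three turnarounds in an 8 hour day
```

With one run per day the 75-run wing design takes 75 working days; at three turnarounds per day it takes 25.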
J.W. Slooff, NLR, Netherlands
Thank you for that comment, and it reminded me that I forgot to
mention one aspect in relation to high order schemes and accuracy.
We are not looking for infinite improvements in accuracy. What we
need is, for a given accuracy that we want to obtain, but not
necessarily want to exceed, the highest efficiency, the shortest
preprocessing turnaround time and the lowest CPU cost. Higher order
methods usually have their greatest benefit if you require very high
accuracy. If you have lower accuracy requirements they may not be so
well suited for the purpose. In industry, you probably will agree
with me, different levels of accuracy are needed in different phases
of the design process. Industry is not always looking for the
highest accuracy. That is something we also have to bear in mind in
considering higher order methods.
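The trade-off described here can be made concrete with a textbook error model, which is an assumption and not something stated in the discussion: the discretisation error of a scheme of order p on a mesh of spacing h is taken as C h^p, and the work as (cost per point) times (number of points) in d dimensions. The constant C, the cost factors and the function names below are hypothetical.

```python
def points_needed(target_error, order, dims=3, constant=1.0):
    """Grid points N such that constant * h**order equals target_error, with h = N**(-1/dims)."""
    h = (target_error / constant) ** (1.0 / order)
    return (1.0 / h) ** dims


def relative_work(target_error, order, cost_per_point, dims=3):
    """Work estimate: cost per grid point times number of grid points."""
    return cost_per_point * points_needed(target_error, order, dims)


if __name__ == "__main__":
    # Hypothetical cost factors: the fourth order scheme costs 10x more per point.
    for err in (1e-1, 1e-2, 1e-3):
        second = relative_work(err, order=2, cost_per_point=1.0)
        fourth = relative_work(err, order=4, cost_per_point=10.0)
        print(err, second / fourth)  # ratio > 1 means the fourth order scheme is cheaper
```

Under these made-up factors the second order scheme is cheaper at the loosest tolerance and the fourth order scheme is cheaper at the tightest one, which is the qualitative point being made; real constants and cost ratios would have to be measured.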
P. Rubbert, Boeing Commercial Airplane Group, U.S.
The subject of accuracy - another thing I did not hear at the
Conference was any discussion of CFD with respect to the environment
in which we use it. I think it is very important that we learn how
to think about what we want from CFD in the presence of wind tunnel
analysis and the other tools that we have for doing airplane design.
For example, the question of accuracy. I heard many times people
setting goals like we would like CFD to be able to calculate drag at
this level of accuracy, and so forth, but the way we really do it in
industry is that we don't depend on any one tool to give us the total
answer. The total answer is arrived at by utilizing all of the
information at your disposal; the information that CFD provides, the
information that wind tunnels provide, your previous experience, etc.
Integrate that all together into a judgement as to what something
like the drag would be. Again, when we talk about accuracy of CFD,
if it is going to take us 6 months to build a wind tunnel model and
test it, that means one thing in terms of the amount of accuracy you
need out of CFD. But I heard some discussion this week about
stereolithography methods, and things like that that could lead you
in the direction of what one might call overnight model
manufacturing. If that happens in the wind tunnel, that has a major
influence on the type of accuracy levels you would need out of CFD.
If you could rapidly get a number out of the wind tunnel, maybe you
don't need to focus so hard on CFD accuracy. I guess my point is we
have to stop looking at CFD by itself. We have to learn to look at
it with respect to the total environment.
D. Knight, Rutgers University, U.S.
I would like also to focus on this question of accuracy. As I have
often understood it, it seems to be more of a question of accuracy as
a function of resource rather than resource as a function of
accuracy. Typically, for example, if you want to compute the total
pressure recovery in an inlet in an industrial environment, the
question is how long it will take to get within a certain accuracy.
That may be 1% for the total pressure recovery, if it is a design, it
may in fact be even smaller or perhaps larger. I think that we, in the
CFD community, don't yet focus enough on the question: given a level of
accuracy of a particular type, like total pressure recovery, what is
the resource required to get that. If you are in industry and you
have a week to do a computation, can you actually predict the total
pressure within 1%, or should you not try at all. Maybe that will
take 2 weeks and that is the information that you need to know. This
also raises the question of optimal design: the optimal design of
your algorithm in terms of reconstruction of high order methods, and
also the optimal design of your grid structure within that algorithm.
That, of course, brings to the fore the question of an estimate for
the accuracy of your scheme. That is an issue that was mentioned in
a number of papers including the earlier one this morning by
Friedrichs. In the CFD community we still do not yet have a good
measure of accuracy, and how to predict that from our solution.
S.V. Ramakrishnan, Rockwell Science Center, U.S.
I have one comment on hybrid grids. From our experience in
generating such grids, I can say that most of the difficulty lies in
the region near the body surface for complex configurations. If you
can develop a structured grid near the body, you might as well
develop such a grid everywhere, because it doesn't take too much work
to develop the grid away from the body surface. Therefore, if we
GD-13
Of course, the points of the coarse grid would also be points of the
finest grid, like in multigrid techniques I would say, and for some
equations you would only discretize them on the coarse grids and
interpolate the results for the fine grids in order to save time. Is
there some research going on in that field? Maybe it is not
interesting, I don't know.
C. Marmignon, ONERA, France
We have not looked at this point.
J.W. Slooff, NLR, Netherlands
On the last question, that is the DNS, LES question, I think I saw
three hands up there.
B. Geurts, University of Twente, Netherlands
The question you raised is, as explicitly mentioned, not within the
scope of this Symposium. If you are interested in it, I would like
to refer you to some of the work at Twente where we try to use DNS as
a data base for developing subgrid models for LES which is an
intermediate step for possible extension to Reynolds averaged
turbulence modelling improvement. We are not unique in the world,
there are several groups that have similar approaches in which they
start from DNS.
P. Comte, LEGI, Institut de Mécanique de Grenoble, France
I think all the LES community has tested models in comparison with
DNS, however DNS are currently restricted to fairly low Reynolds
number flows. If we want to use LES for higher Reynolds numbers,
maybe those comparisons wouldn't be that relevant.
N. Kroll, DLR, Germany
I just want to make a short comment on chemically reacting flows. In
my opinion the most severe problem is the stiffness of the discrete
system. I think you cannot overcome this problem by using different
mesh types. You have to develop efficient algorithms to overcome
that stiffness problem.
J. Jimenez, Escuela Superior de Ingenieros Aeronauticos, Spain
The question of the relationship between DNS and modelling is
something that has been considered for several years. It is a
question of what to expect. You cannot expect DNS to give you a
model. That has to be done by modellers. What DNS gives you is
"ground truth". It gives you what the real flow is doing, and it
gives you constraints on which models work and which ones do not.
This has been practiced extensively now, at the CTR in Stanford, at
Twente, as reported in this meeting, and at many other places. There
are cases in which DNS is almost the only data available, as in the
case of stress balances in separated flows, which are difficult to
measure and difficult to model, but which have been computed with
DNS. You can use those data to check whether a particular model
works or not and, if it does not, it is up to the modeller to come up
with a better one. That last step was, of course, outside the scope
of the present meeting. DNS can do this, and it can give you some
ideas of how to improve your model, but it cannot produce a model by
itself. It is as difficult to get good models out of DNS as it has
GD-15
8. Author(s)/Editor(s): Multiple
9. Date: April 1996
10. Author's/Editor's Address: Multiple
11. Pages: 488
12. Distribution Statement: There are no restrictions on the distribution of this document.
Information about the availability of this and other AGARD
unclassified publications is given on the back cover.
13. Keywords/Descriptors
Computational fluid dynamics Turbulent flow
Design Aeroelasticity
Algorithms Fluid dynamics
Aerodynamics Chemical reactions
Parallel processing Computerized simulation
Parallel programming Computation
Computer architecture Grids (coordinates)
Unsteady flow
14. Abstract
The papers prepared for the AGARD Fluid Dynamics Panel (FDP) Symposium on “Progress and
Challenges in CFD Methods and Algorithms”, which was held 2-5 October 1995 in Seville,
Spain are contained in this Report. In addition, a Technical Evaluator’s Report aimed at
assessing the success of the Symposium in meeting its objectives, and an edited transcript of the
General Discussion held at the end of the Symposium are also included.
Papers presented during nine sessions addressed the following subjects:
- parallel computing;
- advanced spatial discretization techniques;
- unstructured, hybrid and overlapping grids;
- adaptive meshes;
- fast implicit and iterative solvers;
- large eddy and direct numerical simulations of turbulent flows;
- chemically reacting flows;
- unsteady aerodynamics.
NATO / OTAN
AGARD
7 RUE ANCELLE, 92200 NEUILLY-SUR-SEINE, FRANCE
Telefax (1)47.38.57.99, Telex 610 176
DISTRIBUTION OF UNCLASSIFIED AGARD PUBLICATIONS
AGARD has not held stocks of publications. From 1993, AGARD holds limited quantities of
the publications associated with Lecture Series and Special Courses, and of AGARDographs
and Working Group reports, organised and published from 1993 onward. Enquiries should be
addressed to AGARD by letter or by fax at the address given above. Please do not
telephone. Initial distribution of all AGARD publications is made to NATO member nations
through the National Distribution Centres listed below. Further copies are sometimes
available from these centres (except in the United States). If you wish to receive all
AGARD publications, or just those relating to certain Panels, you may ask to be included
on the distribution list of one of these centres. AGARD publications may be purchased from
the Sales Agencies listed below, in photocopy or microfiche form.
NATIONAL DISTRIBUTION CENTRES
GERMANY
Fachinformationszentrum Karlsruhe
D-76344 Eggenstein-Leopoldshafen 2
BELGIUM
Coordonnateur AGARD-VSL
Etat-major de la Force aérienne
Quartier Reine Elisabeth
Rue d'Evere, 1140 Bruxelles
CANADA
Directeur, Services d'information scientifique
Ministère de la Défense nationale
Ottawa, Ontario K1A 0K2
DENMARK
Danish Defence Research Establishment
Ryvangs Allé 1
P.O. Box 2715
DK-2100 Copenhagen Ø
SPAIN
INTA (AGARD Publications)
Carretera de Torrejón a Ajalvir, Pk.4
28850 Torrejón de Ardoz - Madrid
UNITED STATES
NASA Headquarters
Code JOB-1
Washington, D.C. 20546
FRANCE
O.N.E.R.A. (Direction)
29, Avenue de la Division Leclerc
92322 Châtillon Cedex
GREECE
Hellenic Air Force
Air War College
Scientific and Technical Library
Dekelia Air Force Base
Dekelia, Athens TGA 1010
ICELAND
Director of Aviation
c/o Flugrad
Reykjavik
ITALY
Aeronautica Militare
Ufficio del Delegato Nazionale all'AGARD
Aeroporto Pratica di Mare
00048 Pomezia (Roma)
LUXEMBOURG
See Belgium
NORWAY
Norwegian Defence Research Establishment
Attn: Biblioteket
P.O. Box 25
N-2007 Kjeller
NETHERLANDS
Netherlands Delegation to AGARD
National Aerospace Laboratory NLR
P.O. Box 90502
1006 BM Amsterdam
PORTUGAL
Estado Maior da Força Aérea
SDFA - Centro de Documentação
Alfragide
2700 Amadora
UNITED KINGDOM
Defence Research Information Centre
Kentigern House
65 Brown Street
Glasgow G2 8EX
TURKEY
Milli Savunma Başkanlığı (MSB)
ARGE Dairesi Başkanlığı (MSB)
06650 Bakanliklar-Ankara
The United States National Distribution Centre does NOT hold stocks of AGARD publications.
Requests for photocopies should be made direct to the NASA Center for Aerospace
Information (CASI) at the address below. Change of address notifications should also be
sent to CASI.
SALES AGENCIES
NASA Center for Aerospace Information (CASI)
800 Elkridge Landing Road
Linthicum Heights, MD 21090-2934
United States
ESA/Information Retrieval Service
European Space Agency
10, rue Mario Nikis
75015 Paris
France
The British Library
Document Supply Division
Boston Spa, Wetherby
West Yorkshire LS23 7BQ
United Kingdom
Requests for microfiches or photocopies of AGARD documents (including requests to CASI)
should include the word AGARD and the AGARD serial number (for example AGARD-AG-315).
Collateral information such as the title and publication date is desirable. Note that
AGARD Reports and AGARD Advisory Reports should be specified as AGARD-R-nnn and
AGARD-AR-nnn respectively. Full bibliographical references and abstracts of AGARD
publications are given in the following journals:
Scientific and Technical Aerospace Reports (STAR)
published by NASA Scientific and Technical Information Division
NASA Headquarters (JTT)
Washington D.C. 20546
United States
Government Reports Announcements and Index (GRA&I)
published by the National Technical Information Service
Springfield
Virginia 22161
United States
(also available online in the NTIS bibliographic database, and on CD-ROM)