AGARD
ADVISORY GROUP FOR AEROSPACE RESEARCH & DEVELOPMENT
7 RUE ANCELLE, 92200 NEUILLY-SUR-SEINE, FRANCE


AGARD CONFERENCE PROCEEDINGS 578

Progress and Challenges in CFD
Methods and Algorithms
(Progrès réalisés et défis en méthodes et algorithmes CFD)

Papers presented and discussions recorded at the 77th Fluid Dynamics Panel Symposium
held in Seville, Spain, 2-5 October 1995.

NORTH ATLANTIC TREATY ORGANIZATION

Published April 1996


Distribution and Availability on Back Cover
AGARD-CP-578

North Atlantic Treaty Organization

Organisation du Traité de l'Atlantique Nord

The Mission of AGARD

According to its Charter, the mission of AGARD is to bring together the leading personalities of the NATO nations in the
fields of science and technology relating to aerospace for the following purposes:

- Recommending effective ways for the member nations to use their research and development capabilities for the
common benefit of the NATO community;

- Providing scientific and technical advice and assistance to the Military Committee in the field of aerospace research
and development (with particular regard to its military application);

- Continuously stimulating advances in the aerospace sciences relevant to strengthening the common defence posture;
- Improving the co-operation among member nations in aerospace research and development;

- Exchange of scientific and technical information;

- Providing assistance to member nations for the purpose of increasing their scientific and technical potential;

- Rendering scientific and technical assistance, as requested, to other NATO bodies and to member nations in
connection with research and development problems in the aerospace field.

The highest authority within AGARD is the National Delegates Board consisting of officially appointed senior
representatives from each member nation. The mission of AGARD is carried out through the Panels which are composed of
experts appointed by the National Delegates, the Consultant and Exchange Programme and the Aerospace Applications
Studies Programme. The results of AGARD work are reported to the member nations and the NATO Authorities through the
AGARD series of publications of which this is one.

Participation in AGARD activities is by invitation only and is normally limited to citizens of the NATO nations.

The content of this publication has been reproduced directly from material supplied by AGARD or the authors.

Published April 1996


Copyright © AGARD 1996
All Rights Reserved

ISBN 92-836-0026-6

Printed by Canada Communication Group


45 Sacré-Cœur Blvd., Hull (Québec), Canada K1A 0S7

Progress and Challenges in CFD Methods
and Algorithms
(AGARD CP-578)

Executive Summary
Computational Fluid Dynamics (CFD) now plays an essential role in the design of aerospace vehicles.
The ability of numerical methods to accurately simulate complex external and internal aerodynamic
flows is crucial to the success of these methods in the design process, and for airplanes leads to
improved performance, agility and maneuverability.

In the last decade, considerable progress has been made in the development of numerical methods
related to CFD. As a result, various promising CFD schemes and algorithms have been developed.
However, they are not currently used in industrial codes. At the same time, new developments in
computer hardware and architectures have led to significant advances in parallel computing and
multiprocessing. These topics, which are considered likely to constitute pacing items and new
challenges in CFD in the near future, formed the framework for the program for this Symposium.

The following subjects were addressed: parallel computing, advanced spatial discretization techniques,
unstructured, hybrid and overlapping grids, adaptive meshes, fast implicit and iterative solvers, large
eddy and direct numerical simulations of turbulent flows, chemically reacting flows and unsteady
aerodynamics. Interesting and new aspects of techniques involving these subjects were discussed,
substantiating their extended potential and improved capabilities. Several important directions of
research such as aerodynamic shape optimization and multidisciplinary analysis and design were
identified, which should be the subject of intensive advanced research in the near future.

The Symposium provided a very valuable opportunity for exchange of information about recent
developments and achievements. It can, therefore, be expected to significantly contribute to future
important progress in the advancement of numerical techniques used in the design of aerospace vehicles
and other flying objects.

Jean-André Essers
Programme Committee Chairman

Progrès réalisés et défis en méthodes
et algorithmes CFD
(AGARD CP-578)

Synthèse
Computational fluid dynamics (CFD) now plays an essential role in the design of aerospace vehicles.
The ability of numerical methods to simulate complex internal and external aerodynamic flows
accurately is essential to the success of these methods in the design process and, for aircraft,
makes it possible to improve performance, agility and maneuverability.

Over the last decade, considerable progress has been made in the development of numerical methods
related to CFD. As a result, various promising CFD algorithms and methods have been developed.
However, they have not been integrated into industrial codes. At the same time, new developments in
computer hardware and architectures have enabled appreciable advances in parallel computing and
multiprocessing. These topics, which are considered likely to constitute the milestones and new
challenges of CFD in the near future, formed the framework of the programme for this symposium.

The following subjects were examined: parallel computing, advanced spatial discretization techniques,
unstructured, hybrid and overlapping grids, adaptive meshes, fast implicit and iterative solvers,
large eddy simulation and direct numerical simulation of turbulent flows, chemically reacting flows
and unsteady aerodynamics.

Relevant discussions took place on new and interesting aspects of techniques related to these
subjects, confirming their extended potential and improved capabilities. Several important
directions for research, such as aerodynamic shape optimization and multidisciplinary analysis and
design, were identified as requiring intensive advanced research in the near future.

The symposium provided an invaluable opportunity to exchange information on recent achievements and
developments. It should therefore make a significant contribution to important future progress in
numerical techniques for the design of aerospace vehicles and other flying objects.

Jean-André Essers
Programme Committee Chairman

Contents

Page

Executive Summary iii

Synthèse iv

Recent Publications of the Fluid Dynamics Panel viii

Fluid Dynamics Panel x

Reference

Technical Evaluation Report T
by N. Kroll

KEYNOTE SESSION
Chairman: J.A. Essers

The Present Status, Challenges, and Future Developments in Computational Fluid Dynamics 1
by A. Jameson (Invited)

CFD Research in the Changing U.S. Aeronautical Industry 2
by P.E. Rubbert (Invited)

Parallel Computing in Computational Fluid Dynamics 3
by D.D. Knight (Invited)

Portable Parallelization of a 3D Euler/Navier-Stokes Solver for Complex Flows 4
by B. Eisfeld, H. Ritzdorf, H. Bleecke and N. Kroll

A Parallel Spectral Multi-Domain Solver Suitable for DNS and LES Numerical Simulation of Incompressible Flows 5
by A. Pinelli and A. Vacca

On Improving Parallelism in the Transonic Unsteady Rotor Navier Stokes (TURNS) Code 6
by A.M. Wissink, A.S. Lyrintzis and R.C. Strawn

Development of a Parallel Implicit Algorithm for CFD Calculations 7
by F. Dias d'Almeida, F.A. Castro, J.M.L.M. Palma and P. Vasconcelos

Experiments with Unstructured Grid Computations 8
by S.V. Ramakrishnan, K.Y. Szema, C.L. Chen, V.V. Shankar and S.R. Chakravarthy

A Second-Order Finite-Volume Scheme Solving Euler and Navier-Stokes Equations on Unstructured Adaptive Grids with an Implicit Acceleration Procedure 9
by M. Delanaye, Ph. Geuzaine, J.A. Essers and P. Rogiest

Un Schéma Cinematique d'Ordre 2 Préservant les Positivités pour les Equations d'Euler Compressibles sur Maillages non Structures Auto-Adaptatifs 10
by Ph. Villedieu, J.L. Estivalezes and J.J. Hylkema

A Meshless Technique for Computer Analysis of High Speed Flows 11
by T. Fischer, E. Oñate and S. Idelsohn

Cancelled 12

Numerical Simulation of Internal and External Gas Dynamic Flows on Structured and Unstructured Adaptive Grids 13
by U.G. Pirumov, I.E. Ivanov and I.A. Kryukov

An Investigation of the Effects of the Artificial Dissipation Terms in a Modern TVD Scheme on the Solution of a Viscous Flow Problem 14
by R.D. Briggs and S. Shahpar

Cancelled 15

A Flux Filter Scheme Applied to the Euler and Navier Stokes Equations 16
by A. Vinckier, J. Jacobsen and S. Wagner

Implicit Multidimensional Upwind Residual Distribution Schemes on Adaptive Meshes 17
by H. Paillère, J.-C. Carette, E. Issman, E. van der Weide, H. Deconinck and G. Degrez

Multidimensional Upwind Dissipation for 2D/3D Euler/Navier-Stokes Applications 18
by P. Van Ransbeeck and Ch. Hirsch

A PCG/E-B-E Iteration for High Order and Fast Solution of 3-D Navier-Stokes Equations 19
by A. Rustem Aslan, U. Gülçat and A. Misirlioglu

Convergence Acceleration of the Navier-Stokes Equations through Time-Derivative Preconditioning 20
by C.L. Merkle, S. Venkateswaran and M. Deshpande

Practical Aspects of Krylov Subspace Iterative Methods in CFD 21
by T.H. Pulliam, S. Rogers and T. Barth

Hexahedron Based Grid Adaptation for Future Large Eddy Simulation 22
by J.J.W. van der Vegt and H. van der Ven

Parallel Algorithms for DNS of Compressible Flow 23
by M. Streng, H. Kuerten, J. Broeze and B. Geurts

A Straightforward 3D Multi-Block Unsteady Navier-Stokes Solver for Direct and Large-Eddy Simulations of Transitional and Turbulent Compressible Flows 24
by P. Comte, J.H. Silvestrini and E. Lamballais

Applications of Lattice Boltzmann Methods to Fluid Dynamics 25
by S.A. Orszag, Y.H. Qian and S. Succi

Transition in the Case of Low Free Stream Turbulence 26
by V.T. Grinchenko and V.S. Chelyshkov

Structured Adaptive Sub-Block Refinement for 3D Flows 27
by K. Becker and S. Rill

Multiblock Structured Grid Algorithms for Euler Solvers in a Parallel Computing Framework 28
by S. Sibilla and M. Vitaletti

Ameliorations Récentes du Code de Calcul d'Ecoulements Compressibles FLU3M 29
by L. Cambier, D. Darracq, M. Gazaix, Ph. Guillen, Ch. Jouet and L. Le Toullec

The Computation of Aircraft Store Trajectories using Hybrid (Structured/Unstructured) Grids 30
by D.J. Jones, F. Fortin, D. Hawken, G.F. Syms and Y. Sun

Solution of the Euler- and Navier-Stokes Equations on Hybrid Grids 31
by M. Galle

Simulation du Mouvement Relatif de Corps Soumis à un Ecoulement Instationnaire par une Méthode de Chevauchement de Maillages 32
by P. Brenner

Efficient Numerical Simulation of Complex 3D Flows with Large Contrast 33
by R. Radespiel, J.M.A. Longo, S. Brück and D. Schwamborn

Méthodes de Décentrement Hybrides pour la Simulation d'Ecoulements en Déséquilibre Thermique et Chimique 34
by F. Coquel, V. Joly and C. Marmignon

A Projection Methodology for the Simulation of Unsteady Incompressible Viscous Flows using the Approximate Factorization Technique 35
by A. Pentaris and S. Tsangaris

Adaption by Grid Motion for Unsteady Euler Aerofoil Flows 36
by C.B. Allen

Adaptive Computation of Unsteady Flow Fields with the DLR-T-Code 37
by O. Friedrich, D. Hempel, A. Meister and Th. Sonar

Parametric Studies of a Time-Accurate Finite-Volume Euler Code in the NWT Parallel Computer 38
by L.P. Ruiz-Calavera and N. Hirose

Parallel Implicit Upwind Methods for the Aerodynamics of Aerospace Vehicles 39
by K.J. Badcock and B.E. Richards

General Discussion GD

Recent Publications of
the Fluid Dynamics Panel
AGARDOGRAPHS (AG)
Computational Aerodynamics Based on the Euler Equations
AGARD AG-325, September 1994
Scale Effects on Aircraft and Weapon Aerodynamics
AGARD AG-323, July 1994
Design and Testing of High-Performance Parachutes
AGARD AG-319, November 1991
Experimental Techniques in the Field of Low Density Aerodynamics
AGARD AG-318 (E), April 1991
Techniques expérimentales liées à l'aérodynamique à basse densité
AGARD AG-318 (FR), April 1990
A Survey of Measurements and Measuring Techniques in Rapidly Distorted Compressible Turbulent Boundary Layers
AGARD AG-315, May 1989
Reynolds Number Effects in Transonic Flows
AGARD AG-303, December 1988
REPORTS (R)
Parallel Computing in CFD
AGARD R-807, Special Course Notes, October 1995
Optimum Design Methods for Aerodynamics
AGARD R-803, Special Course Notes, November 1994
Missile Aerodynamics
AGARD R-804, Special Course Notes, May 1994
Progress in Transition Modelling
AGARD R-793, Special Course Notes, April 1994
Shock-Wave/Boundary-Layer Interactions in Supersonic and Hypersonic Flows
AGARD R-792, Special Course Notes, August 1993
Unstructured Grid Methods for Advection Dominated Flows
AGARD R-787, Special Course Notes, May 1992
Skin Friction Drag Reduction
AGARD R-786, Special Course Notes, March 1992
Engineering Methods in Aerodynamic Analysis and Design of Aircraft
AGARD R-783, Special Course Notes, January 1992
Aircraft Dynamics at High Angles of Attack: Experiments and Modelling
AGARD R-776, Special Course Notes, March 1991
ADVISORY REPORTS (AR)
Aerodynamics of 3-D Aircraft Afterbodies
AGARD AR-318, Report of WG17, September 1995
A Selection of Experimental Test Cases for the Validation of CFD Codes
AGARD AR-303, Vols. I and II, Report of WG-14, August 1994
Quality Assessment for Wind Tunnel Testing
AGARD AR-304, Report of WG-15, July 1994
Air Intakes of High Speed Vehicles
AGARD AR-270, Report of WG13, September 1991
Appraisal of the Suitability of Turbulence Models in Flow Calculations
AGARD AR-291, Technical Status Review, July 1991
Rotary-Balance Testing for Aircraft Dynamics
AGARD AR-265, Report of WG11, December 1990
Calculation of 3D Separated Turbulent Flows in Boundary Layer Limit
AGARD AR-255, Report of WG10, May 1990
Adaptive Wind Tunnel Walls: Technology and Applications
AGARD AR-269, Report of WG12, April 1990
CONFERENCE PROCEEDINGS (CP)
Aerodynamics of Store Integration and Separation
AGARD CP-570, February 1996
Aerodynamics and Aeroacoustics of Rotorcraft
AGARD CP-552, August 1995
Application of Direct and Large Eddy Simulation of Transition and Turbulence
AGARD CP-551, December 1994
Wall Interference, Support Interference, and Flow Field Measurements
AGARD CP-535, July 1994
Computational and Experimental Assessment of Jets in Cross Flow
AGARD CP-534, November 1993
High-Lift System Aerodynamics
AGARD CP-515, September 1993
Theoretical and Experimental Methods in Hypersonic Flows
AGARD CP-514, April 1993
Aerodynamic Engine/Airframe Integration for High Performance Aircraft and Missiles
AGARD CP-498, September 1992
Effects of Adverse Weather on Aerodynamics
AGARD CP-496, December 1991
Manoeuvring Aerodynamics
AGARD CP-497, November 1991
Vortex Flow Aerodynamics
AGARD CP-494, July 1991
Missile Aerodynamics
AGARD CP-493, October 1990
Aerodynamics of Combat Aircraft Controls and of Ground Effects
AGARD CP-465, April 1990
Computational Methods for Aerodynamic Design (Inverse) and Optimization
AGARD CP-463, March 1990
Applications of Mesh Generation to Complex 3-D Configurations
AGARD CP-464, March 1990
Fluid Dynamics of Three-Dimensional Turbulent Shear Flows and Transition
AGARD CP-438, April 1989
Validation of Computational Fluid Dynamics
AGARD CP-437, December 1988
Aerodynamic Data Accuracy and Quality: Requirements and Capabilities in Wind Tunnel Testing
AGARD CP-429, July 1988
Aerodynamics of Hypersonic Lifting Vehicles
AGARD CP-428, November 1987
Aerodynamic and Related Hydrodynamic Studies Using Water Facilities
AGARD CP-413, June 1987
Applications of Computational Fluid Dynamics in Aeronautics
AGARD CP-412, November 1986

Fluid Dynamics Panel

Chairman:
M. C. DUJARRIC
Future Launchers Office
ESA Headquarters
8-10 rue Mario Nikis
75015 Paris - France

Deputy Chairman:
Professor C. CIRAY
Aeronautical Eng. Department
Middle East Technical Univ.
Inonu Bulvari PK:06531
Ankara - Turkey

PROGRAMME COMMITTEE

Prof. J.A. ESSERS (Chairman)
Université de Liège,
Institut de Mécanique,
Service d'Aérodynamique
rue Ernest Solvay 21
4000 Liège - Belgium

Prof. H. DECONINCK
von Karman Institute for Fluid Dynamics
Chaussée de Waterloo 72
1640 Rhode-Saint-Genèse - Belgium

Prof. R.J. KIND
Dept. of Mechanical & Aerospace Eng.
Carleton University
Ottawa, Ontario K1S 5B6 - Canada

Prof. A. BONNET
Dept. Aérodynamique, Ecole Supérieure
de l'Aéronautique et de l'Espace
10 Avenue Edouard Belin, BP 4032
31055 Toulouse Cedex - France

ICA O.P. JACQUOTTE
Direction de la Recherche et de la
Technologie - DRETISTRDTE6
4 Avenue de la Porte d'Issy
00460 Armées - France

M. R. LACAU
Aérospatiale Missiles (E/ECN)
Centre des Gâtines
91370 Verrières le Buisson - France

Dr. Ing. H. KÖRNER
Direktor - Institut für Entwurfsaerodynamik
der DLR
Lilienthalplatz 7
D-38108 Braunschweig - Germany

Dr. A.G. PANARAS
P.O. Box 64053
Athens 157 10 - Greece

Mr. M. BORSI
ALENIA Aeronautica
Corso Marche 41
10146 Torino - Italy

Prof. Ir. J.W. SLOOFF
National Aerospace Laboratory NLR
Anthony Fokkerweg 2
1059 CM Amsterdam - Netherlands

Prof. Dr. T. YTREHUS
Division of Applied Mechanics
The University of Trondheim
The Norwegian Inst. of Technology
N-7034 Trondheim - NTH - Norway

Prof. A.F. de O. FALCAO
Depart. Engenharia Mecanica
Instituto Superior Tecnico
1096 Lisboa Codex - Portugal

Dr. R. CORRAL
Departamento de Mecanica de Fluidos
Industria de Turbopropulsores (ITP)
Carretera Torrejon Ajalvir, Km 3.5
28850 Torrejon de Ardoz (Madrid) - Spain

Prof. J. JIMENEZ
Escuela Tecnica Superior de Ingenieros Aeronauticos
Departamento de Mecanica de Fluidos
Plaza del Cardenal Cisneros 3
28040 Madrid - Spain

Prof. Dr. U. KAYNAK
TUSAS - Havacilik ve Uzay San. A.S.
P.K. 18 Kavaklidere, 06690
Ankara - Turkey

Prof. D.I.A. POLL
Head of the College of Aeronautics
Cranfield University - Cranfield
Bedford MK43 0AL - U.K.

Prof. B. CANTWELL
Stanford University
Dept. of Aeronautics & Astronautics
Stanford, CA 93405 - U.S.A.

Dr. L.P. PURTELL
Mechanics & Energy Conversion Division
Code 333 - Office of Naval Research
800 North Quincy Street
Arlington, VA 22217-5660 - U.S.A.

PANEL EXECUTIVE
Mr. J.K. MOLLOY

Mail from Europe:
AGARD-OTAN
Attn: FDP Executive
7, rue Ancelle
92200 Neuilly-sur-Seine
France

Mail from USA and Canada:
AGARD-NATO
Attn: FDP Executive
Unit PSC 116
APO AE 09777

Tel: 33 (1) 4738 5775

Technical Evaluation Report
AGARD Fluid Dynamics Panel Symposium on
"Progress and Challenges in CFD Methods and Algorithms"
N. Kroll
Institute of Design Aerodynamics
DLR, 38108 Braunschweig
Lilienthalplatz 7, Germany

SUMMARY
The Fluid Dynamics Panel of AGARD conducted a Symposium on "Progress and Challenges in CFD
Methods and Algorithms" in Seville, Spain, on October 2-5, 1995. The purpose of this symposium was
to identify and discuss topics which are likely to constitute pacing items and challenges in
Computational Fluid Dynamics. Sessions were devoted specifically to parallel computing, advanced
discretization schemes and advanced grid structures. Topics also include adaptive meshes, fast
iterative methods and algorithmic aspects for the computation of reacting flows and unsteady flows.
In this evaluation report an attempt is made to point out the critical issues for each particular
subject and to assess how far they were addressed by the conference papers. Some general concluding
remarks and recommendations are given.

1. INTRODUCTION
The 77th Meeting of the AGARD Fluid Dynamics Panel was held from the 2nd to the 5th of October,
1995, in Seville, Spain. The symposium was focused on "Progress and Challenges in CFD Methods and
Algorithms". The background and need for such a meeting was stated in the call for papers:

"The design of aerospace vehicles strongly depends on the ability of numerical methods to
simulate complex flow fields. In the last decade, considerable progress has been made in the
development of numerical methods related to CFD. As a result, various promising CFD schemes and
algorithms have been developed which are not yet currently used in industrial codes. At the same
time, new developments in computer hardware and architectures have led to significant advances in
parallel computing and multiprocessing."

It must also be stated that despite the recent advances CFD still suffers from deficiencies in
accuracy, robustness and efficiency for complex applications, such as complete aircraft flow
predictions. From the aeronautical industry's point of view, CFD is expected to deliver:

- detailed viscous flow analysis for complex geometries at realistic Reynolds numbers
- accurate prediction of aerodynamic data
- fast response time per flow case at acceptable total costs
- aerodynamic optimization of aircraft components/complete aircraft
- interdisciplinary analysis of aircraft (aerodynamics + structure + flight control)

In order to meet these requirements, improvements in CFD are needed in all areas.

Based on this, the aim and scope of the symposium were set by the program committee in the call for
papers as follows:

"The symposium will focus on those topics which are likely to constitute pacing items and new
challenges in CFD. Its aim is to bring together scientists and engineers working on new numerical
developments in different fields of interest to the aerospace sciences and industrial communities.

Papers may address a broad range of research fields of current interest. A list of possible topics
includes (but is not limited to) the following:

- Unstructured grid, hybrid, adaptive, multi-block and grid embedding methods and algorithms
- Implicit and iterative methods for Euler and Navier-Stokes equations, fast iterative solvers
  (multi-grid, Krylov subspace techniques)
- Numerical techniques for parallel computing and multiprocessing
- Advances in accurate capturing techniques for shock waves and contact discontinuities, TVD high
  resolution schemes, multidimensional upwinding
- Numerical algorithms and problems specifically related to the implementation of turbulence
  models and to the simulation of nonequilibrium chemically reacting flows
- Numerical accuracy assessment.

In order to limit the scope of the symposium, papers essentially devoted to grid generation
techniques and turbulence or chemistry modelling are not encouraged."

The symposium spanned three and one-half days and the program listed 38 technical papers coming
from 13 countries, of which 36 papers were presented. The program committee organized a keynote
session, eight major sessions and a general discussion at the end of the meeting. The following
table presents the topics covered by the symposium. Although many papers addressed several topics,
they are categorized in this table based on their central focus. The papers are listed in chapter 4
in the order of their presentation.

[Table 1: topics covered by the symposium; table contents not legible in the source]

It is worthwhile to note that many of the papers were concerned with parallelization of CFD
methods, use of more flexible grid structures and development of advanced discretization schemes
including adaptive methods. This may reflect the contemporary trends of CFD research in most
aeronautical companies, government research laboratories and universities. Surprisingly, except
for the keynote paper by Jameson, no technical paper addressed optimization and interdisciplinary
analysis which, in the author's opinion, are major challenges in CFD.

The evaluation undertaken in this report attempts to cover two aspects. Chapter 2 comprises
summaries of the presented papers for each topic given in table 1. It is not intended to give an
extensive review of all individual papers, but instead, for each particular subject it is aimed to
identify the critical issues and to assess how far they were addressed by the papers. In chapter 3,
concluding remarks are presented indicating to what degree the aims of the meeting and the needs of
the aerospace community were met. Furthermore, recommendations arising from the meeting are given.

2. SYNOPSIS OF THE PAPERS
With respect to the theme of the meeting "Progress and Challenges in CFD Methods and Algorithms",
in the evaluator's opinion, many papers of high quality were given, which represent the current
status of CFD, focus on unresolved issues and present new important directions of development to
overcome current deficiencies. On the other hand, many papers of lower quality were presented. Some
of them did not meet the main focus of the symposium, several others did not reflect the present
status of CFD, or were largely redoing or reinventing well established topics that have been known
in the literature for some time.

2.1 Invited papers
Keynote papers were provided by A. Jameson, P. Rubbert and D. Knight.

Jameson [1] gave an excellent overview of present status, challenges and future developments in
computational fluid dynamics. He addressed the essential requirements on numerical simulation for
their effective industrial use. Assured accuracy, acceptable computational and human costs as well
as fast turn around were identified as major issues. In his opinion, more sophisticated algorithms
are required in order to substantially reduce computational costs. Improved methods should include
higher order schemes, advanced acceleration methods, fast inversion methods for implicit schemes
and the effective exploitation of massively parallel computers. The paper reviewed modern numerical
methods
and addressed several issues in algorithm design. In particular, a unified approach to design
accurate and efficient shock capturing algorithms was presented. Some examples of state-of-the-art
calculations, which can be performed in an industrial environment, were given.

Jameson pointed out that beside the transition to more sophisticated algorithms, the present
challenge is to extend the effective use of CFD techniques to more complex applications. As key
problems, he identified turbulent flows at Reynolds numbers associated with full scale flight,
chemically reacting flows, combustion and unsteady flows. Furthermore, multidisciplinary analysis,
aerodynamic shape optimization and in the long run multidisciplinary optimization were designated
as important future target areas of CFD. In his presentation, Jameson outlined a very promising
technique for efficient three-dimensional shape optimization based on control theory. He
demonstrated a successful design of a swept-wing with very low wave drag within 40 design
iterations. In this example, the flow was modeled by the Euler equations. He mentioned that with
this technique, even in the case of three-dimensional flows, the computational requirements are so
moderate that the calculations can be performed with workstations such as the IBM RISC 6000 series.

In summary, the invited paper delivered by Jameson gave a precise outline of the scope of the
symposium and the expected outcome of the meeting.

In his presentation [2], Rubbert focused his remarks on challenges and pacing items in CFD that
extend beyond the technical ones. He pointed out that the key to developing better airplanes or
better CFD is the same, namely to analyze, understand and improve the processes by which airplanes
or CFD are created. Rubbert called the process by which CFD capabilities are created the research
engine. Such a research engine involves industry, academia and government, and the three components
interact with each other as a system. In the past this system functioned quite well, but in his
opinion, it has been almost disconnected from the customers of CFD research, namely the practicing
design engineers. Impressive results of research have been achieved, but they were not necessarily
applicable by industry. The paper pointed out many principal characteristics and attributes which
an improved, properly functioning research engine should have. The leading principles are customer
focus and customer satisfaction. Two further key factors were identified which will pace the change
of the research engine. The first is a two-way, more intensive communication between the research
community and the engineering community in industry. The second is a modification of the evaluation
system of the research work towards more industrial applicability. As stated by Rubbert, this is
the responsibility of the money givers who inhabit the research engine.

In summary, Rubbert's paper performed a general critical assessment of today's system of research
and its stage of change. His observations represent the pragmatic point of view of industry, in
which researchers' basic scientific findings receive less emphasis. This paper makes CFD
researchers sensitive to industrial needs, but some specific views of the aeronautical industry on
the status of CFD and future requirements would have been desirable.

The paper by Knight [3] presented an overview of parallel computing in computational fluid
dynamics. In the first part of the paper the basics of parallel computing were addressed, including
the introduction of the distinct levels of parallelism, the classification of parallel computer
architectures and the description of the two basic programming paradigms, namely message passing
and data parallelism. The second part focused on several key issues in the context of code
development for parallel computing. Dynamic load balancing and scalability were identified as
critical issues for complex CFD applications carried out on massively parallel computers.
Furthermore, a major concern of parallel computing is portability. Here, Knight discussed current
research activities, including the development of message passing standards (e.g. PVM, MPI) and
data parallel programming language standards (e.g. HPF). In his presentation, Knight pointed out
that in the U.S. the aerospace industry has taken a leading role in the application of parallel
computing to practical analysis and design. In the past few years several major aerospace
corporations have developed extensive networks of workstations for routine applications. Several
examples were given in the paper.

Knight's presentation provided a basic introduction into the field of parallel computing. The
fundamental terminology was explained and all critical issues were discussed. Therefore, the paper
was very helpful for the understanding and assessment of the following technical papers which dealt
with parallelization. Unfortunately, the paper did not discuss the potentials and limitations of
high performance parallel computers to tackle large scale applications or new challenges in CFD.
Results were only presented for networks of loosely coupled workstations.

Although papers on grid generation techniques were explicitly not encouraged by the call for
papers, an invited paper on status and prospects of both structured and unstructured grid
generation for complex configurations
might have been desirable. The turn around time and accuracy of the
numerical simulation of industrial applications very often depend on the
capability of the available grid generation procedure. Therefore, for the
critical assessment of numerical algorithms using structured, unstructured
or hybrid meshes, the capabilities and limits of the underlying grid
generation techniques have to be taken into account.

2.2 Parallel Computing
Parallel computing is an important means to cut down turn around time and
computational costs of large scale applications. Furthermore, it is
believed that the exploitation of massively parallel computing is the key
to tackle new grand challenges in CFD, such as multidisciplinary analysis
and optimization. In the last several years a wide variety of parallel
architectures have become available which differ in the design of the CPUs
(vector versus RISC processor), the memory organization (e.g. shared versus
distributed memory) and the communication system (hardware and software).
For the future some of the vendors promise substantial increase of
computational power in both memory and CPU. One of the main issues in
parallel computing is the design of numerical algorithms which efficiently
exploit the capabilities of the parallel hardware. Especially in the case
of distributed memory machines, this is a non trivial task. The important
aspects in designing parallel algorithms for these architectures are
partitioning of data and computation among the processors, communication at
the internal boundaries, load balancing and overhead due to communication
and extra computations. Simpler algorithms, such as explicit schemes,
parallelize quite easily and they lead to high performance on most parallel
computers. However, due to their poor convergence rates they are overall
much less efficient than implicit schemes, although the latter ones
generally perform far below the peak of the parallel machines due to the
more intensive and more global communication involved. The adjustment and
further development of more sophisticated algorithms such as multigrid and
domain decomposition methods on parallel architectures are very promising.
In contrast to explicit schemes, they provide global distribution of
information, however in a much more efficient way than traditional implicit
schemes. Further research in this direction is needed in order to
efficiently exploit the capabilities of parallel computers.

In this symposium, papers [4], [5], [6], [7], [21], [U] and [36] dealt
mainly with parallel computing and covered various aspects thereof. The
paper by Eisfeld et al. [4] stressed the issue of portability. They
described the portable parallelization of a state-of-the-art
block-structured multigrid solver for industrial CFD applications.
Portability is achieved through the use of a message passing based high
level communication library. This library supports any operation which is
necessary in parallel mode and involves communication between different
processes. Performance measurements on a large variety of computers of
different architectures demonstrated the comprehensive portability of the
code. Applications included inviscid computations for a generic aircraft
consisting of wing/body/pylon/engine and viscous calculations for a
wing-body configuration on a computational mesh with 6.6 million points.
The paper showed that the complexity of today's problems in applied
aerodynamics can be tackled with parallel computers. It also revealed the
necessity for an automatic and effective load balancing tool that allows
the mapping of an initial block structure to a higher number of processors
than the given blocks. Details on the parallel efficiency of the multigrid
method used in the applications were not given.

The papers by Wissink et al. [6], Dias d'Almeida et al. [7] and Badcock et
al. [36] focused on the parallel implementation of implicit
Euler/Navier-Stokes solvers. In [6] for example, two modifications of the
well known implicit LU-SGS scheme (Lower-Upper Symmetric Gauss-Seidel) were
presented. The first replaces the Gauss-Seidel sweep in LU-SGS with a
Jacobi-like sweep which only requires nearest neighbor communication and is
therefore easy to parallelize. The second one is a hybrid approach that
couples the global Jacobi type communication with the more efficient
Gauss-Seidel sweep on each subdomain. In both strategies multiple sweeps
are required in each subdomain in order to maintain the convergence
behavior of the baseline LU-SGS method. Both strategies have been
investigated in detail with respect to parallel speed-up, convergence rate
and computational efficiency. Inviscid calculations for 3-D hovering
helicopter blades demonstrated that the hybrid strategy is a promising
implicit scheme for parallel computers with a smaller number of powerful
processors.

The presentations delivered by Pinelli et al. [5] and Streng et al. [21]
addressed the parallelization of algorithms for DNS and LES. In [21] the
various aspects of the parallel implementation of a typical higher order
DNS solver based on domain decomposition were discussed. The intrinsic or
algorithmic efficiency has been defined (see also [4]), which deals with
the parallelizability of a given algorithm, regardless of the machine.
Based on some analysis, the authors showed that due to extra floating point
operations at inner block boundaries
the algorithmic efficiency decreases rapidly as the spatial discretization
increases, that is, as the corresponding stencil-size grows. Test
calculations on different parallel architectures indicated that the machine
efficiency is even considerably lower than the algorithmic efficiency.
Furthermore, the paper reported on first experience that has been gained
for the implementation of the DNS solver on a SGI Power Challenge Array (4
nodes each comprised of a 16 CPU shared memory parallel machine) using a
combination of fine-grained (shared memory) and coarse-grained parallelism
(explicit message passing). The results were very promising, however, this
parallelization strategy needs further investigation.

The paper by Sibilla and Vitaletti [E] did not show any parallel
computations, but it addressed several important aspects of
multiblock-structured grid algorithms in a parallel computing environment.
As in [4] the management of data communication between adjacent blocks is
provided by a parallel library (PARAGRID) which ensures that the same
average values are assigned to all replicas of the same boundary node owned
by different blocks. The paper discussed the influence of block subdivision
on accuracy and efficiency within the framework of a multigrid scheme. The
solution algorithm has been modified in order to account for the presence
of locally unstructured topologies at block boundaries (singular points).
For some test cases it could be demonstrated that the convergence of the
numerical method could only be ensured with this modification.

In conclusion, most of the papers focused on some specific algorithmic
aspects of parallel computing. Effort was essentially put in adjusting
sequential algorithms rather than developing new parallel schemes. Only a
few large scale CFD applications have been presented demonstrating the
capabilities and limits of parallel architectures for industrial CFD
applications. One of the main challenges for parallel complex applications
is the load balanced partitioning of the flow domain, which is essential
for obtaining optimal machine performance. These important issues were
hardly addressed in the conference.

2.3 Advanced Spatial Discretization Schemes

Although in the last decade extensive research has been ongoing towards the
development of accurate Euler and Navier-Stokes solvers, the improvement of
spatial discretization schemes is still a major concern in CFD. Suitable
discretization schemes are expected to offer certain properties. These are
conservation, at least second order accuracy in smooth flow regions and
sharp resolution of discontinuities and viscous shear layers. High
resolution of all physical phenomena is required on a computational mesh
with a minimum number of grid points. Furthermore, the spatial
discretization should support a robust and efficient time integration.
Recently, substantial progress has been made in this area and many
different promising approaches for the improved discretization of the Euler
and Navier-Stokes equations are known in the literature. Among these are
e.g. improved shock capturing algorithms based on flux difference and flux
vector splitting, multidimensional upwinding, residual distribution schemes
and kinetic flux splitting. These methods have been investigated in detail
for one and two-dimensional flows. Very often, however, their superiority
to conventional methods has only been demonstrated for simple test cases.
Therefore, the key issue remains the manifestation of the improved
abilities of the advanced methods for relevant 2-D and 3-D viscous flows
around more complex geometries.

At the symposium several papers [9], [10], [12], [14], [15], [16] and [17]
were specifically devoted to improvements of the spatial discretization of
Euler/Navier-Stokes solvers. The paper by Delanaye et al. [9] presented the
development of a new quadratic reconstruction finite volume scheme for
unstructured polygonal meshes. The most frequently employed linear cell
reconstruction of the flow variables requires sufficiently regular grids
for second order accuracy and it results in a first order scheme for
irregular meshes. In contrast to this, the proposed quadratic
reconstruction provides a full second order scheme even for very irregular
meshes. In order to avoid spurious oscillations in the vicinity of
discontinuities, the quadratic reconstruction is switched to a monotone
constant one with the help of a properly defined discontinuity detector.
The method is designed to deal with adaptive unstructured grids consisting
of cells with an arbitrary number of edges. Time integration is performed
by an implicit scheme based on Newton-Krylov techniques. The efficiency and
high accuracy of the numerical method were demonstrated for various 2-D
inviscid and viscous laminar computations including test cases with locally
distorted mesh. However, the method needs to be carefully investigated for
3-D complex geometries and turbulent flows at high Reynolds numbers where
highly irregular meshes are expected. Furthermore, the sensitivity of the
quadratic reconstruction with respect to implementation of wall boundary
conditions has to be investigated.

The paper delivered by Villedieu et al. [10] presented a second order
scheme based on kinetic flux splitting. The main feature of this approach
is that under a CFL like
condition density and energy can be proved to be nonnegative. This makes
the method very attractive for the calculation of flow fields with near
vacuum conditions, such as flows around hypersonic vehicles. Promising
results were shown for 2-D supersonic and hypersonic inviscid flows in
comparison with the classical Roe flux difference split scheme. First 3-D
results for a wing application were presented which do not yet allow the
assessment of the approach for 3-D more complex application. Furthermore,
detailed remarks on the convergence behavior of the method were missing.

The paper by Paillère et al. [15] reviewed recent developments [...]
extension of these schemes to Euler/Navier-Stokes equations is
straightforward provided that a conservative linearization can be found.
This can easily be achieved for triangular meshes, whereas for
quadrilateral meshes it is more difficult and still subject of ongoing
research. The paper presented various numerical examples for 2-D flows
demonstrating the ability of the residual decomposition approach. In
particular, the results indicate the improved resolution of flow
discontinuities which are not aligned with mesh lines. Unfortunately, the
issue of accurate prediction of turbulent viscous flow was not addressed in
the paper. Furthermore, no 3-D results were shown. The residual
decomposition schemes have been successfully combined with implicit methods
and solution adaptive techniques.

The paper by Van Ransbeeck and Hirsch [16] presented an alternative
approach for multidimensional upwind schemes on structured meshes. In this
framework the numerical flux is formulated using the artificial dissipation
concept. The diffusive contribution is constructed with directional terms,
whereas the antidiffusive term is designed according to the direction of
the convection speed and to variations of the solution in different mesh
directions. The paper presented a classification of first and second order
accurate schemes that have respectively minimum and zero cross diffusion.
Second order monotone schemes have been developed using the concept of
non-linear limiter functions applied to multidimensional ratios of flux
differences. Extensions of the scalar model to the Euler/Navier-Stokes
equations have been achieved through a characteristic decomposition.
Different choices for the propagation direction are possible. Promising
results were presented for 2-D and 3-D supersonic test cases showing
comparable or somewhat improved accuracy with respect to classical second
order dimensional-split upwind schemes. However, no results for subsonic
test cases such as flow over an airfoil were shown. Therefore, a
comprehensive assessment of the concept is not possible at the present
stage of research.

In summary, promising approaches were presented, which aimed at improving
the accuracy of state-of-the-art Euler/Navier-Stokes solvers. In
particular, the high order reconstruction approach and the multidimensional
upwind schemes offer properties which in theory make them superior to
standard algorithms. Numerical results for various two dimensional test
problems support this. However, in order to push the implementation of
these advanced techniques into 3-D production codes for viscous flow
calculations, further investigations are required. A critical assessment
should include sensitivity
studies with respect to grid fineness and grid regularity for transonic 2-D
and 3-D viscous flows. It should be clarified whether with these new
concepts substantial progress can be made towards accurate drag prediction
for 2-D and 3-D configurations at relevant Reynolds numbers.

2.4 Unstructured Grids, Hybrid Grids, Overlapping Grids and Meshless Techniques
The key problem of numerical simulation of complex configurations is the
construction of an appropriate grid to represent the computational domain
of interest. Grid generation is the decisive factor concerning the turn
around time of simulations for industrial applications. Essentially two
alternative strategies exist, namely structured and unstructured meshes.
Currently, block-structured body-fitted meshes are most widely used. They
have been proved to be well suited for viscous calculations and they form
the building blocks of most of the industrial state-of-the-art production
codes. However, with this approach, grid generation for complex geometries
itself is the major challenge. Various strategies are being developed to
simplify the grid generation problems. Among these are the overlapping grid
techniques where the structured grids of various blocks may overlap. The
alternative approach is to divide the computational domain into an
unstructured assembly of computational cells by using tetrahedra or general
polygonal volumes. In contrast to structured meshes, this strategy
substantially simplifies the discretization of complex geometries. On the
other hand, it complicates the design of accurate and efficient algorithms.
While in the past promising and flexible unstructured methodologies have
been developed for inviscid flows, the accurate calculation of viscous
flows using unstructured meshes is still an important issue of current
research. In particular, efficient simulation of high Reynolds number flows
requires extremely stretched cells, which in the case of tetrahedral meshes
lead to tetrahedra with acute angles. This may cause numerical errors, at
least for classical schemes currently used in industrial codes. An
interesting alternative is the use of hybrid grids consisting of
tetrahedral and hexahedral or prismatic cells. It offers the possibility of
combining the flexibility of tetrahedral meshes with the accuracy of
regular grids in the boundary layer. The ability of this approach to
simulate turbulent flows around complex 3-D geometries is still to be
verified.

Some of the before mentioned issues concerning the use of more flexible
grid structures were addressed during the meeting. The paper delivered by
Ramakrishnan et al. [E] presented the experience on unstructured grid
computations gained at Rockwell Science Center over the past several years.
One of the most important lessons they have learned from many 3-D
applications is the fact that in spite of all the advances that have been
made in the field of unstructured procedures, on comparable grid fineness
structured-grid simulations yield more accurate solutions. The authors
concluded that for inviscid flows unstructured Euler solvers have a clear
edge over their structured counterparts. This is due to the fact that the
solution of Euler equations, unlike Navier-Stokes equations, does not
require very fine meshes in the vicinity of solid bodies. Therefore,
unstructured grid generation becomes much easier to handle and several
computations for many different configurations can be carried out in a
matter of a few weeks. In the case of viscous flows, however, the stringent
resolution requirements in the wall normal direction make structured
solvers more suitable for efficient calculations. In the framework of
unstructured meshes, paper [IS] presented a generalization of the
implementation of boundary conditions which allows the specification of
interior boundaries anywhere in the computational domain. This concept
allows the effective computation of moving bodies, like in the case of
aircraft store release. However, no detailed numerical results were shown.

The paper by Galle [28] addressed the solution of Euler and Navier-Stokes
equations on hybrid grids consisting of prismatic cells near the body
surface and tetrahedral cells elsewhere. The use of prismatic cells offers
the possibility to efficiently and accurately resolve regions such as
boundary layers by applying high aspect ratio cells in the respective
areas. An upwind finite volume scheme has been implemented on an auxiliary
mesh of control volumes. This dual mesh formulation guarantees conservation
in the entire flow field and in particular at interfaces between prismatic
and tetrahedral domains. The integration in time is performed by an
explicit multistage scheme accelerated by a multigrid technique based on
agglomeration of control volumes. Promising numerical results were shown
for 3-D inviscid and 2-D viscous flows demonstrating the ability of the
method. However, further 3-D viscous calculations for more complex
geometries are required to prove the concept of the hybrid mesh approach.

The paper delivered by Brenner [29] presented a computational procedure to
simulate rocket stage separation. The Euler equations with mixing gases are
solved with an upwind finite volume method on unstructured meshes, which
may consist of a combination of tetrahedral, prismatic and hexahedral
cells. In order to simulate the motion of bodies, a conservative
overlapping grid
technique has been implemented. A temporal adaptive algorithm is used to
calculate the unsteady flow field. A very impressive application was shown,
which however did not allow a critical assessment of the method concerning
its accuracy.

The interesting concept of meshless simulation techniques for fluid flow
problems was presented by Oñate et al. [11]. According to the work of
Batina the discrete approximation of the governing equations uses a cloud
of arbitrary points. Unlike conventional meshes, no fixed connectivity
between the points is needed and therefore grid properties like regular
cells or non negative cell volumes are not required. A weighted least
square interpolation is used to construct a linear or quadratic function
from the values given at the arbitrary points in the local interpolating
domain. First examples for the solution of the 1-D convection diffusion
equation and for 2-D compressible inviscid flows were shown. For these
calculations the points generated by an unstructured mesh have been used.
It was pointed out that major difficulties of this approach are the
definition of the local interpolating domain and the selection of the most
significant points for the interpolation in each domain. The interpolation
strategy strongly influences the quality of the solution. Another drawback
is that the method is not conservative. Furthermore, it is quite difficult
to assess the accuracy of the numerical procedure if a set of arbitrary
points is used. However, since in theory the meshless approach does not
require a suitable grid of high quality and allows an efficient adaption
strategy, further research in this area is encouraged.

The papers reviewed above addressed different and in comparison to
structured meshes more sophisticated grid strategies which are expected to
improve or even enable the simulation of 3-D complex configurations.
Promising results for various, mostly inviscid test cases were shown.
However, the abilities of the advanced techniques to accurately calculate
turbulent viscous flows around more complex geometries were not
demonstrated at the conference.

2.5 Adaptive Schemes
In recent years adaptive grid methods for computational fluid dynamics have
gained popularity due to their potential to provide highly accurate
solutions on the basis of cost-effective calculations. In contrast to
global reduction of the mesh interval, very fine mesh cells are restricted
to those regions where flow features need high grid resolution; elsewhere
the computational grid may be quite coarse. Grid adaption methods can be
categorized into either point redistribution or mesh embedding/enrichment.
Point redistribution schemes maintain a constant number of points, which
are moved such that they congregate near flow features. This technique can
be easily implemented into existing structured and unstructured flow
solvers. However, it can lead to quite skewed grids, especially in the case
of structured meshes. The grid embedding technique adds points to the
existing grid. This procedure maintains the global grid accuracy outside
embedded regions and simultaneously increases the accuracy in the embedded
regions. The key issue for adaptive methods is the design of suitable error
estimators. By far the most common approach is to use physical features
such as local solution gradients. These indicators efficiently detect
high-gradient regions such as shock waves, however the global error may not
necessarily be reduced and the numerical solution may depend on the
adaption pattern. Recently more advanced, direct error estimators are used.
They are either based on the discretization error, which may be estimated
by comparing quantities calculated on two meshes of different fineness, or
on the residual error. This strategy is very promising but needs further
research, especially if it is applied to viscous flows. In conclusion,
adaptive strategies are considered as one of the pacing items of
algorithmic research. Issues which have to be clarified for complex
applications are the development of suitable adaption criteria allowing
grid independent solutions and dynamic load balancing for parallel
computing.

The issue of adaptivity has been addressed by many conference papers.
Papers [9], [15] and [34] reported on adaptive refinement in the context of
unstructured solvers based on insertion and removal of grid points. The
papers [15] and [34] presented an adaption strategy which relies on a
finite element error estimator. Whereas in [15] the application is
restricted to steady 2-D inviscid flows, Friedrich et al. [34] presented a
dynamical adaption for various 2-D unsteady flows. The error indicator,
which has been proved reliable for many inviscid calculations, is currently
being extended to the Navier-Stokes equations. First grid adaptions for
viscous flows were shown. Furthermore, the finite element residual has been
successfully used for a 3-D inviscid wing application. The adaptive
unstructured solver of [34] has been parallelized on the basis of an
intelligent dynamic load balancing procedure for performance controlled
domain decomposition. In the parallel mode an explicit time integration is
employed, whereas on a sequential computer unsteady calculations are
carried out by an implicit method using a preconditioned GMRES technique.
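
As an illustration of the feature-based indicators and the mesh embedding
strategy described in this section, the following minimal Python sketch
flags the cells of a 1-D grid whose solution jump is large relative to the
largest jump, and inserts a midpoint into each flagged cell. The function
names, the threshold of 0.5 and the tanh test profile are assumptions
chosen for illustration only; they are not taken from any of the reviewed
papers.

```python
import numpy as np


def flag_cells_for_refinement(u, threshold=0.5):
    """Flag cells whose solution jump exceeds a fraction of the largest jump.

    This is a simple gradient-type (physical feature) indicator, not a
    direct error estimator. `threshold` is a hypothetical tuning parameter.
    """
    u = np.asarray(u, dtype=float)
    jump = np.abs(np.diff(u))          # undivided difference across each cell
    peak = jump.max()
    if peak == 0.0:                    # constant solution: nothing to refine
        return np.zeros(jump.shape, dtype=bool)
    return jump >= threshold * peak


def embed_points(x, flags):
    """Insert one midpoint into every flagged cell (mesh embedding)."""
    x = np.asarray(x, dtype=float)
    midpoints = 0.5 * (x[:-1] + x[1:])[flags]
    return np.sort(np.concatenate([x, midpoints]))


# Uniform grid with a steep front at x = 0 standing in for a shock.
x = np.linspace(-1.0, 1.0, 21)
u = np.tanh(20.0 * x)

flags = flag_cells_for_refinement(u)
x_refined = embed_points(x, flags)
print(int(flags.sum()), len(x_refined))
```

Because the indicator is normalized by the largest jump, refinement
concentrates on the steepest feature and the smooth regions are left
untouched. This is the behaviour the text cautions about: high-gradient
regions are detected efficiently, but the global error is not necessarily
reduced.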

The paper by Becker et al. [U] addressed the adaptive grid refinement for
block structured solvers. In this concept, locally refined mesh blocks are
patched into the existing mesh. The additional fine subblocks are connected
with the original mesh via the multigrid technique. The level of local
truncation error is used as error indicator. Following the idea of Brandt,
truncation error estimates can be extracted directly from the multigrid
cycle. So far, the refinement procedure is set up outside the flow solver.
First results presented for 2-D and 3-D inviscid and viscous test cases
show the feasibility of the strategy of subblock refinement. However,
considerably more work is required to establish a fully automatic, robust
and accurate adaption method.

Van der Vegt et al. [20] presented a hexahedron based grid adaption
procedure. The method uses the discontinuous Galerkin finite volume
formulation with local grid enrichment. A directional grid adaption is
employed which allows subdividing of cells, independently in each of the
three local grid directions. This anisotropic grid refinement is expected
to be more efficient in capturing local flow phenomena than isotropic
refinement, since many flow features are one-dimensional. The sensor uses
primitive variables and is constructed such that it prevents regions with
discontinuities from constantly dominating the local grid refinement
procedure. The capability of the adaptive method was demonstrated by
calculations of the inviscid transonic flow around a generic delta wing.
From the author's viewpoint, the hexahedral based adaptive solver is a good
candidate for large eddy simulations (LES), because it offers the
opportunity to accurately capture viscous sublayers with successively fine
grids through local grid refinement. LES results, however, were not shown.

Grid adaptive procedures based on point redistribution were discussed in
the papers [12], [27], [33]. This technique was mainly used in the
framework of moving grids for unsteady calculations.

In conclusion, various adaptive strategies were presented. The important
issue of developing a suitable indicator for adaption was addressed.
Various error estimators have been proposed and successfully applied to
inviscid flows. However, further research is needed to establish efficient
and robust adaptive methods for viscous flows.

2.6 Fast Implicit and Iterative Solvers
As numerical flow simulations pave their way into the practical aerodynamic
design process, the need for efficient algorithms to solve the spatial
discretized Euler/Navier-Stokes equations has become very obvious. Many
solvers still used in current aerospace development programs exhibit slow
convergence towards the desired steady state solutions which leads to high
computer costs and long turn around times. Consequently, there is a
substantial amount of research work focused on methods for convergence
acceleration. Promising approaches are the multigrid time-stepping
technique and the Newton iteration with fast iterative solvers. In
structured codes multigrid techniques based on explicit multistage schemes
are widely used and they have been proved to yield good convergence rates
for many practical applications. However, for the numerical simulation of
high Reynolds number flows, the convergence of the standard multigrid
schemes considerably slows down. This is due to the stiffness of the
numerical problem, which is introduced through the high-aspect ratio cells
required for the efficient solution of such flow fields. Therefore, one of
the key issues concerning algorithmic development is the design of
appropriate multigrid components, such as smoothing and grid transfer
operators, which efficiently tackle high aspect ratio cells.

Interest in fast iterative methods has been mainly motivated by
unstructured solvers. It was shown that coupling Newton's method with
iterative solvers for the inner iteration is an effective approach for
solving the large systems of nonlinear equations arising from the
discretization of Euler and Navier-Stokes equations. An interesting feature
of Newton's method is its ability to provide superlinear asymptotic
convergence. On the other hand, efficient iterative schemes based on
Newton's iteration require excessive memory allocations for three
dimensional applications. Therefore, strategies have to be developed which
eliminate the large storage requirements but still retain the favorable
convergence characteristics of Newton's method.

The paper by Pulliam et al. [NI] gave an excellent overview of the
potentialities and drawbacks of Newton's method applied to CFD solvers. For
practical reasons, in each Newton iteration the large block banded matrix
is solved by an iterative matrix solution method. In particular, the paper
addressed the class of Krylov subspace methods known as GMRES. It presented
practical aspects and implementation issues of these methods. The main
components of the Newton-GMRES approach, such as evaluation of the
Jacobian, matrix-vector multiply and matrix preconditioning, were discussed
with respect to global convergence behavior, memory requirements and
accuracy. Trade-offs between full Newton and approximate Newton and other
pertinent approximations were investigated. The Newton-GMRES solver
was analyzed in the framework of a structured and unstructured 2-D Navier-Stokes code. In both cases very promising results were shown. Calculations with similar methods were also carried out in papers [9], [15]. It can be concluded that optimal strategies which ensure favorable convergence characteristics will lead to excessive memory requirements. No 3-D calculations with Newton-Krylov subspace techniques were presented at the conference.

The paper by Cambier et al. [26] proposed a new implicit algorithm called DDLU factorization. Compared to the classical ADI factorization, this scheme enables a reduction in both CPU time and memory. The new implicit technique was applied to a 3-D supersonic test case on a relatively coarse mesh. For a comprehensive assessment of this technique further test calculations are required.

The paper by Merkle et al. [18] was devoted to convergence acceleration of the Navier-Stokes equations through a time-derivative preconditioning of the governing equations. Using physical arguments, a generalized preconditioner was developed, ensuring convergence characteristics which are independent of the Mach number. The uniform convergence was demonstrated for a variety of applications covering a wide range of Mach numbers. In many low speed cases, the preconditioned system showed a much improved convergence rate while having no detrimental effects in regimes where the original method already worked efficiently. So, preconditioning of the governing equations may offer the possibility to develop an efficient unified flow solver for the whole Mach number regime. Further research is required to establish this approach.

At the conference none of the papers devoted to convergence acceleration addressed the key problem of computing realistic Reynolds number flows. These flows require computational meshes with very high aspect ratio or irregular cells leading to very stiff discrete equations. The development of numerical strategies to overcome the stiffness and to ensure fast convergence in these flow situations is one of the grand challenges in algorithmic research.

2.7 Turbulent Flows, LES/DNS

The key problem of accurate numerical simulation of complex flows is the description of transition and turbulence. Currently, in all industrially relevant calculations, the Reynolds averaged Navier-Stokes equations are solved, in which only statistically stationary flow is calculated and the effects of turbulence are modelled by a so-called turbulence model. However, in many cases the quality of the solution may strongly depend on the turbulence model used in the calculation, and at best questionable results may be obtained for more complex flow phenomena such as massive flow separation. The rapid increase of computer resources motivated the research on direct numerical simulation (DNS) or large eddy simulation (LES) of turbulent flows. In the case of DNS, the unsteady Navier-Stokes equations are solved directly. No turbulence model is required since all scales and turbulence motions present are resolved numerically. Due to the excessive computer resources required even for simple geometries, this simulation technique is out of the question for practical applications. However, it provides a very important methodology for turbulence research. In contrast to DNS, the large eddy simulation of turbulent flows resolves only the large scale structure of the turbulence, while the effects of smaller eddies are described by a statistical subgrid model. As the resolution of the fine scale turbulence motion is not required, far fewer grid points are needed, making LES feasible for practical problems at relevant Reynolds numbers in the near future. On the other hand, in order to ensure improved results compared to the solution of the Reynolds averaged Navier-Stokes equations (RANS), besides the establishment of a suitable subgrid model, accurate resolution of the viscous sub-layers in the near wall regions is needed. This substantially increases the number of grid points for LES compared to RANS solvers. Furthermore, since time accurate solutions are calculated in the framework of LES, significant further development of the classical CFD methods is needed. In addition to the validation of a subgrid model, more sophisticated algorithms such as adaptive grids, higher order discretizations, efficient unsteady solvers and parallel computing have to be made available for LES before this technique can be used as a tool for flow simulations.

Numerical aspects of DNS and LES were addressed by papers [5], [20], [21] and [22]. As already mentioned above, the papers [5] and [21] were devoted to the exploitation of parallel computers, whereas paper [20] presented a grid adaption method specially designed for LES. The focus of Comte et al. [22] was the investigation of subgrid scale models with a classical explicit finite difference method. The aim of the presentation was to show some examples of what can be achieved with today's supercomputers and standard codes using eddy-viscosity models.

In summary, from the papers delivered at the conference it is very difficult to estimate whether a large eddy simulation for a practical problem, such as a clean wing at a relevant Reynolds number, will become feasible in the near future. In order to reach that goal, significant research work on both algorithms and subgrid scale models is needed. A few preliminary approaches for algorithmic improvement were shown at the conference.

2.8 Chemically Reacting Flows

The effective use of CFD for viscous hypersonic reacting flows is one of the present challenges. In the past, substantial effort has been devoted to this research area and key requirements for efficient solution algorithms have been identified [30]. These are sharp capturing of strong shocks, robustness in regions of strong flow expansion, high resolution of viscous regions, efficient treatment of adverse grid and flow situations in the case of complex 3-D geometries, and effective integration of stiff equations introduced by the large chemical source terms.

The two conference papers devoted to reacting flows addressed these algorithmic issues. The paper by Radespiel et al. [30] reviewed recent progress made with flux vector splitting methods to ensure high resolution and robustness for hypersonic viscous reacting flow simulations. Two promising approaches recently published in the literature were discussed and compared. Both schemes use scalar dissipation functions, and conceptual differences appear in the resolution of shock waves. Implementation details and recommendations for their effective use for viscous flows were given. Furthermore, the capabilities of the multigrid method based on explicit multistage time-stepping schemes were investigated for reacting flows. A number of modest modifications of the standard multigrid method successfully used for subsonic and transonic flow problems were reported in order to ensure fast convergence for high Mach number flows with strong shocks. The stiffness of the equations introduced by the large chemical source terms is removed by a point implicit treatment. Various computations for different complex flow problems were presented. They impressively demonstrate that with the reported algorithmic improvements converged flow solutions for reacting flows over complex 3-D configurations are now feasible.

The paper by Coquel et al. [31] focused on the extension of a hybrid upwind splitting method to non-equilibrium flows. Based on the experience that the classical Van Leer flux vector scheme is not suitable for viscous calculations and the Roe type flux difference solvers are not robust for hypersonic flows, a new upwind approach was presented which basically combines the distinct flux vector and flux difference splitting concepts while retaining their interesting features. The proposed method is a combination of the Van Leer scheme and the Osher scheme with some modifications and extensions. The ability of the new method to resolve viscous hypersonic reacting flows was illustrated by various results including internal and external flow configurations. The time integration is performed by an unfactored implicit scheme, which in the current implementation leads to a somewhat slow convergence rate and needs to be improved for further applications.

In conclusion the two papers on reacting flows covered the key issues for developing efficient numerical tools for the simulation of complex flows. Very promising results were presented, illustrating that effective calculations in terms of both accuracy and efficiency for complex configurations are now feasible.

2.9 Unsteady Flows

For steady flows, substantial CFD capability has been achieved over the last two decades and Euler/Navier-Stokes solvers are intensively used in aerodynamic design. In contrast, although some isolated unsteady flow calculations have been carried out for various classes of problems, numerical simulation of unsteady flow fields based on Euler/Navier-Stokes equations is certainly not routine for industrial applications, due to the excessive computational effort involved in these calculations. From the algorithmic point of view, new innovative concepts are required, which substantially cut down the costs of time accurate simulations. This is especially important for viscous flow calculations, where a very fine mesh near the wall is required to resolve the boundary layer. Issues that are central to unsteady CFD are the use of efficient implicit time integration with favorable stability and accuracy characteristics, moving grids, adaptive grids with local grid refinement/coarsening and parallel computing. Moreover, for aeroelastic applications efficient coupling strategies are required.

Time accurate calculations have been addressed by several papers (e.g. [27], [32], [33], [34], [35]). The paper by Pentaris et al. [32] focused on the solution of the unsteady incompressible 2-D Navier-Stokes equations using a projection methodology developed for collocated grids. Standard numerical schemes, such as approximate factorization techniques, were employed. The numerical results presented for some test cases were encouraging; however, no remarks on the efficiency of the method were given.
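
As a hedged illustration of the point implicit treatment of stiff source terms mentioned in Section 2.8, the following minimal sketch compares an explicit and a point implicit update on a scalar model problem. The model equation du/dt = -k (u - u_eq), the rate k, the equilibrium value u_eq and all function names are assumptions chosen for illustration only; they are not taken from the symposium papers, which deal with full chemically reacting flow solvers.

```python
"""Point implicit treatment of a stiff source term on a scalar model problem.

Assumed model (not from the symposium papers): du/dt = -k * (u - u_eq).
"""


def source(u, k, u_eq):
    """Relaxation source term S(u) = -k (u - u_eq); k is a hypothetical rate."""
    return -k * (u - u_eq)


def explicit_step(u, dt, k, u_eq):
    """Forward Euler: the source term is evaluated at the old time level."""
    return u + dt * source(u, k, u_eq)


def point_implicit_step(u, dt, k, u_eq):
    """Linearize S about u and solve the local (here scalar) system
    (1 - dt * dS/du) * delta_u = dt * S(u), with dS/du = -k."""
    ds_du = -k
    delta_u = dt * source(u, k, u_eq) / (1.0 - dt * ds_du)
    return u + delta_u


def integrate(step, u0, dt, n_steps, k, u_eq):
    """Apply `step` n_steps times starting from u0."""
    u = u0
    for _ in range(n_steps):
        u = step(u, dt, k, u_eq)
    return u


if __name__ == "__main__":
    k, u_eq, u0 = 1.0e4, 1.0, 0.0  # stiff rate, equilibrium, initial state
    dt, n_steps = 1.0e-2, 20       # dt is far above the explicit limit 2/k
    print("explicit      :", integrate(explicit_step, u0, dt, n_steps, k, u_eq))
    print("point implicit:", integrate(point_implicit_step, u0, dt, n_steps, k, u_eq))
```

With these assumed values the explicit update multiplies the distance from equilibrium by 1 - k dt = -99 in every step and diverges, whereas the point implicit update multiplies it by 1/(1 + k dt) = 1/101 and relaxes to u_eq. In a multi-species solver the scalar division becomes a small dense linear solve at each grid point, which is why the treatment is local ("point") rather than global.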
The paper delivered by Allen [33] was devoted to grid adaption for unsteady inviscid airfoil flows. The solution adaptive grids are generated by a new transfinite interpolation technique. An interesting approach was presented, in which adaption is performed by adapting the interpolation parameters instead of the physical grid positions. For unsteady calculations, grid adaption is performed gradually by imposing a so-called adaption velocity onto each grid point. The grid interpolation strategy was shown to be well suited for structured moving grids. It is very flexible and requires only little CPU time. Steady and unsteady airfoil computations were presented illustrating the improved results from the adapted mesh. For the calculation, an upwind Euler solver with the dual-time implicit approach was used, which is considerably more efficient than the basic explicit solver. The paper focused on two-dimensional inviscid applications, so that the flexibility and efficiency of the proposed grid adaption strategy are still to be verified for both viscous and three-dimensional flows. A time-varying grid technique was also presented by [34]. Here, the time integration was carried out with a second order implicit scheme.

A more sophisticated moving grid technique was presented by Jones et al. [27], with the goal of computing aircraft store trajectories. The technique is based on fully unstructured or hybrid meshes. It was pointed out that the geometric conservation law has to be satisfied within the framework of moving grids in order to guarantee consistent results. So far, only two-dimensional unsteady results have been achieved.

The paper by Ruiz-Calavera et al. [35] addressed parametric studies of a time accurate Euler code for oscillating wings. A rather standard central scheme with artificial dissipation and explicit multistage time stepping scheme was used. Effects of grid density and artificial viscosity on the time accurate solutions were discussed, showing the expected behavior. The code has been implemented on a powerful parallel computer, namely the Numerical Wind Tunnel (NWT) of NAL in Japan. It was demonstrated that parallel computing is a necessary ingredient for effective three-dimensional unsteady flow calculations.

In summary a few central issues for unsteady computations were discussed by the conference papers. However, no major progress in the development of algorithms for efficient three-dimensional time accurate calculations was presented.

3. CONCLUDING REMARKS

In chapter 2 each specific subject of the meeting has already been fully commented, so that only general concluding remarks are given here.

In the evaluator's opinion the theme of the symposium "Progress and Challenges in CFD Algorithms and Methods" was too encompassing and too ambitious for a 3 1/2 day long AGARD conference. Many papers of great interest and high technical standard were delivered. They addressed specific challenges in CFD, proposed new methods or modifications to known methodologies and presented smaller or larger progress. On the other hand, however, quite a large number of papers of lower quality were presented, which either did not focus on current key issues of algorithmic research or mainly reinvented well known results. Probably this situation is very similar to all other large CFD conferences. But measured against the ambitious theme of this symposium, it has to be clearly stated that in many areas the Seville conference did not reflect the actual status of CFD and its recent progress. Considering Jameson's excellent survey paper, it is obvious that several important algorithmic developments and recent improvements were not addressed. For example, no paper was devoted to aerodynamic shape optimization and multidisciplinary analysis, topics which are increasingly important for future CFD applications in industry. Furthermore, in some areas such as unstructured grids and adaptive schemes, CFD is much further developed than reported at the conference. Since many leading experts, especially those from the U.S., did not contribute to the conference, it is hard to expect that the high demands of the symposium could be met.

Nevertheless, several important directions of algorithmic research were addressed, which are expected to improve the capability of CFD for complex applications in the industrial environment. These included parallel computing, advanced discretization techniques, fast iterative solvers and powerful acceleration techniques, adaptive schemes and flexible strategies for discretizing the computational domain. Interesting and new aspects of these techniques were discussed, substantiating their extended potentials and improved abilities. In most cases, however, the superiority of the more sophisticated methods to the well established standard schemes was only demonstrated for simplified test problems, for which the classical methods also perform quite well. Very often results were shown for 2-D inviscid and laminar viscous flows. Three-dimensional calculations were restricted in most cases to inviscid flows or simplified geometries. Only a few more realistic calculations were presented. To make a step forward, it is very important to apply the advanced methodologies to those problems,
for which the standard methods show deficiencies in terms of accuracy and efficiency or do not work at all. One of the grand challenges in CFD is the effective simulation of viscous flows at realistic, full scale Reynolds numbers for complex configurations. This problem, although ideal for testing advanced discretization and time integration schemes, was hardly tackled at the conference, even for simplified geometries. Furthermore, in order to raise the confidence level of CFD methods, careful grid refinement studies, sensitivity investigations, estimation and control of the numerical error as well as detailed code validation are required for a wide class of relevant applications. In many papers, these issues were only partly or not at all considered.

In conclusion, considerable research work is still needed to establish CFD as an effective tool in the aerodynamic design process. The most important, but probably also the most limiting factor, is turbulence modelling, a subject which was outside the scope of this symposium. With respect to algorithms, further development and improvement remain essential but have to be directed towards the real challenges in CFD, which include:

- accurate viscous flow simulation at relevant Reynolds numbers
- effective treatment of complex configurations, such as a complete aircraft
- efficient simulation of more complex flows with multiple space and time scales, such as unsteady flows or reacting flows
- large eddy simulation for practical applications
- aerodynamic shape optimization
- multidisciplinary analysis and design

The Seville symposium was a step in the right direction. For some topics, it showed some good promise but there is still considerable work to be done to meet the challenges of industrial CFD. The symposium provided a valuable forum for exchange of information about recent developments and achievements.

4. LITERATURE

1. Jameson, A., "Present Status, Challenges and Future Developments in CFD".
2. Rubbert, P., "CFD Research in the Changing U.S. Aeronautical Industry".
3. Knight, D., "Parallel Computing in CFD".
4. [...], "Portable Parallelization of a 3-D Euler/Navier-Stokes Solver for Complex Flows".
5. Pinelli, A., Vacca, A., "A Parallel Spectral Multi-domain Solver for Incompressible Navier-Stokes Equations".
6. Wissink, A., Lyrintzis, A., Strawn, R., "On Improving Parallelism in the Transonic Unsteady Rotor Navier-Stokes Code TURNS".
7. Dias D'Almeida, F., Castro, F.A., Palma, J.M.L.M., Vasconcelos, P., "Development of a Parallel Implicit Algorithm for CFD Calculations".
8. Ramakrishnan, S.V., Szema, K.Y., Chen, C.L., Shankar, V.V., Chakravarthy, S.R., "Experiments with Unstructured Grid Computations".
9. Delanaye, M., Geuzaine, Ph., Essers, J.A., "A Second Order Accurate Finite Volume Scheme Solving Euler and Navier-Stokes Equations on Adaptive Unstructured Grids with an Implicit Convergence Acceleration Procedure".
10. Villedieu, Ph., Estivalezes, J.L., "A New Positivity Preserving Second Order Accurate Kinetic Scheme for the Multidimensional Euler Equations on Unstructured Meshes".
11. Onate, E., Fischer, T., "Meshless Techniques for Computer Analysis of High Speed Flows".
12. Pirumov, U.G., Kryukov, I.A., Ivanov, I.E., "Numerical Simulation of Internal and External Gas Dynamic Flows on Structured and Unstructured Adaptive Grids".
13. Briggs, R.D., Shahpar, S., "An Investigation of the Effects of the Artificial Dissipation Term in a Modern TVD Scheme on the Solution of a Viscous Flow Problem".
14. Vinckier, A., Jacobsen, J., Wagner, S., "Multidimensional Upwinding with Flux Filters for the Euler and Navier-Stokes Equations".
15. Paillère, H., Carette, J.C., Issman, E., Van der Weide, E., Deconinck, H., Degrez, G., "Implicit Multidimensional Upwind Residual Distribution Schemes on Adaptive Meshes".
16. Van Ransbeeck, P., Hirsch, Ch., "Multidimensional Upwind Dissipation for 2-D/3-D Euler/Navier-Stokes Applications".
17. Aslan, A.R., Gulcat, U., Misirlioglu, A., "A PCG/E-B-E Iteration for High Order and Fast Solution of 3-D Navier-Stokes Equations".
18. Merkle, C.L., Venkateswaran, S., "Convergence Acceleration of Navier-Stokes Computations Through Time Derivative Preconditioning".
19. Pulliam, T.H., Rogers, S., Barth, T., "Practical Aspects of Krylov Subspace Based Iterative Methods in CFD".
20. Van der Vegt, J.J.W., Van der Ven, H., "Hexahedron Based Grid Adaption Algorithm for Future Large Eddy Simulation".
21. Streng, M., Kuerten, H., Broeze, J., Geurts, B., "Parallel Algorithms for DNS of Compressible Flow".
22. Comte, P., "A Straightforward 3-D Multi-Block Unsteady Navier-Stokes Solver for Direct and Large-Eddy Simulations of Transitional and Turbulent Compressible Flows".
23. Orszag, S.A., Qian, Y.H., Succi, S., "Applications of Lattice Boltzmann Methods to Fluid Mechanics".
24. Becker, K., Rill, S., "Structured Adaptive Sub-Block Refinement for 3-D Flows".
25. Sibilla, S., Vitaletti, M., "Multiblock Structured Grid Algorithms for Euler Solvers in a Parallel Computing Framework".
26. Cambier, L., Darracq, D., Gazaix, M., Guillen, Ph., Jouet, Ch., Le Toullec, L., "Améliorations Récentes du code de Calcul d'Ecoulements Compressibles FLU3M".
27. Jones, D.J., Fortin, F., Syms, G.F., Hawken, D., "The Computation of Aircraft Store Trajectories using Hybrid (Structured/Unstructured) Grids".
28. Galle, M., "Solution of the Euler and Navier-Stokes Equations on Hybrid Grids".
29. Brenner, P., "Simulation du Mouvement Relatif de Corps soumis à un Ecoulement Instationnaire par une Méthode de Chevauchement de Maillages".
30. Radespiel, R., Longo, J.M.A., Brück, S., Schwamborn, D., "Efficient Numerical Simulation of Complex 3-D Flows with Large Contrast".
31. Coquel, F., Joly, V., Marmignon, C., "Méthodes de Décentrement Hybrides pour la Simulation d'Ecoulements en Déséquilibre Thermique et Chimique".
32. Pentaris, A., Tsangaris, S., "A Projection Methodology for the Simulation of Unsteady Incompressible Viscous Flows Using the Approximate Factorization Technique".
33. Allen, C.B., "An Implicit Upwind Scheme with Grid Adaption for Unsteady Euler Aerofoil Flows".
34. Friedrichs, O., Gerhold, Th., Meister, A., Sonar, Th., "Adaptive Computation of Unsteady Flow Fields with the DLR-τ-code".
35. Ruiz-Calavera, L.P., Hirose, N., "Parametric Studies of a Time Accurate Finite Volume Euler Code in the N.W.T. Parallel Computer".
36. Badcock, K.J., Richards, B.E., "Parallel Implicit Upwind Methods for the Aerodynamics of Aerospace Vehicles".

The Present Status, Challenges, and
Future Developments in
Computational Fluid Dynamics
Antony Jameson
Department of Mechanical and Aerospace Engineering
Princeton University
Princeton, New Jersey 08544 USA

1. SUMMARY

This paper presents a perspective on computational fluid dynamics as a tool for aircraft design. It addresses the requirements for effective industrial use, and trade-offs between modelling accuracy and computational costs. Issues in algorithm design are discussed in detail, together with a unified approach to the design of shock capturing algorithms. Finally, the paper discusses the use of techniques drawn from control theory to determine optimal aerodynamic shapes. In the future multidisciplinary analysis and optimization should be combined to provide an integrated numerical design environment.

2. INTRODUCTION

Computational methods first began to have a significant impact on aerodynamic analysis and design in the period of 1965-75. This decade saw the introduction of panel methods which could solve the linear flow models for arbitrarily complex geometry in both subsonic and supersonic flow [58, 147, 179]. It also saw the appearance of the first satisfactory methods for treating the nonlinear equations of transonic flow [123, 122, 63, 64, 43, 54], and the development of the hodograph method for the design of shock free supercritical airfoils [U].

Computational Fluid Dynamics (CFD) has now matured to the point at which it is widely accepted as a key tool for aerodynamic design. Algorithms have been the subject of intensive development for the past two decades. The principles underlying the design and implementation of robust schemes which can accurately resolve shock waves and contact discontinuities in compressible flows are now quite well established. It is also quite well understood how to design high order schemes for viscous flow, including compact schemes and spectral methods. Adaptive refinement of the mesh interval (h) and the order of approximations (p) has been successfully exploited both separately and in combination in the h-p method [126]. A continuing obstacle to the treatment of configurations with complex geometry has been the problem of mesh generation. Several general techniques have been developed, including algebraic transformations and methods based on the solution of elliptic and hyperbolic equations. In the last few years methods using unstructured meshes have also begun to gain more general acceptance. The Dassault-INRIA group led the way in developing a finite element method for transonic potential flow. They obtained a solution for a complete Falcon 50 as early as 1982 [Z]. Euler methods for unstructured meshes have been the subject of intensive development by several groups since 1985 [110, 82, 81, 163, 14], and Navier-Stokes methods on unstructured meshes have also been demonstrated [117, 118, 11].

Despite the advances that have been made, CFD is still not being exploited as effectively as one would like in the design process. This is partly due to the long set-up and high costs, both human and computational, of complex flow simulations. The essential requirements for industrial use are:

1. assured accuracy
2. acceptable computational and human costs
3. fast turn around.

Improvements are still needed in all three areas. In particular, the fidelity of modelling of high Reynolds number viscous flows continues to be limited by computational costs. Consequently accurate and cost-effective simulation of viscous flow at Reynolds numbers associated with full scale flight, such as the prediction of high lift devices, remains a challenge. Several routes are available toward the reduction of computational costs, including the reduction of mesh requirements by the use of higher order schemes, improved convergence to a steady state by sophisticated acceleration methods, fast inversion methods for implicit schemes, and the exploitation of massively parallel computers.

Another factor limiting the effective use of CFD is the lack of good interfaces to computer aided design (CAD) systems. The geometry models provided by existing CAD systems often fail to meet the requirements of continuity and smoothness needed for flow simulation, with the consequence that they must be modified before they can be used to provide the input for mesh generation. This bottleneck, which impedes the automation of the mesh generation process, needs to be eliminated, and the CFD software should be fully integrated in a numerical design environment. In addition to more accurate and cost-effective flow prediction methods, better optimization methods are also needed, so that not only can designs be rapidly evaluated, but directions of improvement can be identified. Possession of techniques which result in a faster design cycle gives a crucial advantage in a competitive environment.

A critical issue, examined in the next section, is the choice of mathematical models. What level of complexity is needed to provide sufficient accuracy for aerodynamic design, and what is the impact on cost and turn-around time? Section 3 addresses the design of numerical algorithms for flow simulation. Section 4 presents the results of some numerical calculations which require moderate computer resources and could be completed with the fast turn-around required by industrial users. Section 5 discusses automatic design procedures which can be used to produce optimum aerodynamic designs. Finally, Section 7 offers an outlook for the future.

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

3. THE COMPLEXITY OF FLUID FLOW AND MATHEMATICAL MODELLING

3.1 The Hierarchy of Mathematical Models

Many critical phenomena of fluid flow, such as shock waves and turbulence, are essentially non-linear. They also exhibit extreme disparities of scales. While the actual thickness of a shock wave is of the order of a mean free path of the gas particles, on a macroscopic scale its thickness is essentially zero. In turbulent flow energy is transferred from large scale motions to progressively smaller eddies until the scale becomes so small that the motion is dissipated by viscosity. The ratio of the length scale of the global flow to that of the smallest persisting eddies is of the order $Re^{3/4}$, where $Re$ is the Reynolds number, typically in the range of 30 million for an aircraft. In order to resolve such scales in all three space directions a computational grid with the order of $Re^{9/4}$ cells would be required. This is beyond the range of any current or foreseeable computer. Consequently mathematical models with varying degrees of simplification have to be introduced in order to make computational simulation of flow feasible, and to produce viable and cost-effective methods.

Figure 1 (supplied by Pradeep Raj) indicates a hierarchy of models at different levels of simplification which have proved useful in practice. Efficient flight is generally achieved by the use of smooth and streamlined shapes which avoid flow separation and minimize viscous effects, with the consequence that useful predictions can be made using inviscid models. Inviscid calculations with boundary layer corrections can provide quite accurate predictions of lift and drag when the flow remains attached, but iteration between the inviscid outer solution and the inner boundary layer solution becomes increasingly difficult with the onset of separation. Procedures for solving the full viscous equations are likely to be needed for the simulation of arbitrary complex separated flows, which may occur at high angles of attack or with bluff bodies. In order to treat flows at high Reynolds numbers, one is generally forced to estimate turbulent effects by Reynolds averaging of the fluctuating components. This requires the introduction of a turbulence model. As the available computing power increases one may also aspire to large eddy simulation (LES) in which the larger scale eddies are directly calculated, while the influence of turbulence at scales smaller than the mesh interval is represented by a subgrid scale model.

Figure 1: Hierarchy of Fluid Flow Models

3.2 Computational Costs

Computational costs vary drastically with the choice of mathematical model. Panel methods can be effectively used to solve the linear potential flow equation with higher-end personal computers (with an Intel 80486 microprocessor, for example). Studies of the dependency of the result on mesh refinement, performed by this author and others, have demonstrated that inviscid transonic potential flow or Euler solutions for an airfoil can be accurately calculated on a mesh with 160 cells around the section, and 32 cells normal to the section. Using multigrid techniques 10 to 25 cycles are enough to obtain a converged result. Consequently airfoil calculations can be performed in seconds on a Cray YMP, and can also be performed on 486-class personal computers. Correspondingly accurate three-dimensional inviscid calculations can be performed for a wing on a mesh, say with 192x32x48 = 294,912 cells, in about 5 minutes on a single processor Cray YMP, or less than a minute with eight processors, or in 1 or 2 hours on a workstation such as a Hewlett Packard 735 or an IBM 560 model.

Viscous simulations at high Reynolds numbers require [...] Careful two-dimensional studies [...] boundary layer, in addition to 32 intervals between the boundary layer and the far field, leading to a total of 64 intervals. In order to prevent degradations in accuracy and convergence due to excessively large aspect ratios (in excess of 1,000) in the surface mesh cells, the chordwise resolution must also be increased to 512 intervals. Reasonably accurate solutions can be obtained in a 512x64 mesh in 100 multigrid cycles. Translated to three dimensions, this would imply the need for meshes with 5-10 million cells (for example, 512x64x256 = 8,388,608 cells as shown in Figure 2). When simulations are performed on less fine meshes with, say, 500,000 to 1 million cells, it is very hard to avoid mesh dependency in the solutions as well as sensitivity to the turbulence model.

Figure 2: Mesh Requirements for a Viscous Simulation

A typical algorithm requires of the order of 5,000 floating point operations per mesh point in one multigrid iteration. With 10 million mesh points, the operation count is of the order of $0.5 \times 10^{11}$ per cycle. Given a computer capable of sustaining $10^{11}$ operations per second (100 gigaflops), 200 cycles could then be performed in 100 seconds. Simulations of unsteady viscous flows (flutter, buffet) would be likely to require 1,000-10,000 time steps. A further progression to large eddy simulation of complex configurations would require even greater resources. The following estimate is due to W.H. Jou [90]. Suppose that a conservative estimate of the size of eddies in a boundary layer that ought to be resolved is 1/5 of the boundary layer thickness. Assuming that 10 points are needed to resolve
1-3

a single eddy, the mesh interval should then be 1/50 of the boundary layer thickness. Moreover, since the eddies are three-dimensional, the same mesh interval should be used in all three directions. Now, if the boundary layer thickness is of the order of 0.01 of the chord length, 5,000 intervals will be needed in the chordwise direction, and for a wing with an aspect ratio of 10, 50,000 intervals will be needed in the spanwise direction. Thus, of the order of $50 \times 5{,}000 \times 50{,}000$ or 12.5 billion mesh points would be needed in the boundary layer. If the time dependent behavior of the eddies is to be fully resolved using time steps on the order of the time for a wave to pass through a mesh interval, and one allows for a total time equal to the time required for waves to travel three times the length of the chord, of the order of 15,000 time steps would be needed. Performance beyond the teraflop ($10^{12}$ operations per second) will be needed to attempt calculations of this nature, which also have an information content far beyond what is needed for engineering analysis and design. The designer does not need to know the details of the eddies in the boundary layer. The primary purpose of such calculations is to improve the prediction of averaged quantities such as skin friction, and the prediction of global behavior such as the onset of separation. The main current use of Navier-Stokes and large eddy simulations is to gain an improved insight into the physics of turbulent flow, which may in turn lead to the development of more comprehensive and reliable turbulence models.

3.3 Turbulence Modelling

It is doubtful whether a universally valid turbulence model, capable of describing all complex flows, could be devised [52]. Algebraic models [30, 9] have proved fairly satisfactory for the calculation of attached and slightly separated wing flows. These models rely on the boundary layer concept, usually incorporating separate formulas for the inner and outer layers, and they require an estimate of a length scale which depends on the thickness of the boundary layer. The estimation of this quantity by a search for a maximum of the vorticity times a distance to the wall, as in the Baldwin-Lomax model, can lead to ambiguities in internal flows, and also in complex vortical flows over slender bodies and highly swept or delta wings [40, 115]. The Johnson-King model [88], which allows for non-equilibrium effects through the introduction of an ordinary differential equation for the maximum shear stress, has improved the prediction of flows with shock induced separation [148, $11].

Closure models depending on the solution of transport equations are widely accepted for industrial applications. These models eliminate the need to estimate a length scale by detecting the edge of the boundary layer. Eddy viscosity models typically use two equations for the turbulent kinetic energy $k$ and the dissipation rate $\epsilon$, or a pair of equivalent quantities [89, 178, 160, 1, 121, 35]. Models of this type generally tend to present difficulties in the region very close to the wall. They also tend to be badly conditioned for numerical solution. The $k$-$l$ model [154] is designed to alleviate this problem by taking advantage of the linear behaviour of the length scale $l$ near the wall. In an alternative approach to the design of models which are more amenable to numerical solution, new models requiring the solution of one transport equation have recently been introduced [10, 159]. The performance of the algebraic models remains competitive for wing flows, but the one- and two-equation models show promise for broader classes of flows. In order to achieve greater universality, research is also being pursued on more complex Reynolds stress transport models, which require the solution of a larger number of transport equations.

Another direction of research is the attempt to devise more rational models via renormalization group (RNG) theory [182, 155]. Both algebraic and two-equation $k$-$\epsilon$ models devised by this approach have shown promising results [116].

The selection of sufficiently accurate mathematical models and a judgment of their cost-effectiveness ultimately rests with industry. Aircraft and spacecraft designs normally pass through the three phases of conceptual design, preliminary design, and detailed design. Correspondingly, the appropriate CFD models will vary in complexity. In the conceptual and preliminary design phases, the emphasis will be on relatively simple models which can give results with very rapid turn-around and low computer costs, in order to evaluate alternative configurations and [...] could be placed on numerical predictions has forced the extensive use of wind tunnel testing at an early stage of the design. This practice was very expensive. The limited number of models that could be fabricated also limited the range of design variations that could be evaluated. It can be anticipated that in the future, the role of wind tunnel testing in the design process will be more one of verification. Experimental research to improve our understanding of the physics of complex flows will continue, however, to play a vital role.

4. CFD ALGORITHMS

4.1 Difficulties of Flow Simulation

The computational simulation of fluid flow presents a number of severe challenges for algorithm design. At the level of inviscid modeling, the inherent nonlinearity of the fluid flow equations leads to the formation of singularities such as shock waves and contact discontinuities. Moreover, the geometric configurations of interest are extremely complex, and generally contain sharp edges which lead to the shedding of vortex sheets. Extreme gradients near stagnation points or wing tips may also lead to numerical errors that can have global influence. Numerically generated entropy may be convected from the leading edge, for example, causing the formation of a numerically induced boundary layer which can lead to separation. The need to treat exterior domains of infinite extent is also a source of difficulty. Boundary conditions imposed at artificial outer boundaries may cause reflected waves which significantly interfere with the flow. When viscous effects are also included in the simulation, the extreme difference of the scales in the viscous boundary layer and the outer flow, which is essentially inviscid, is another source of difficulty, forcing the use of meshes with extreme variations in the mesh intervals. For these reasons, CFD has been a driving force for the development of numerical algorithms.

4.2 Structured and Unstructured Meshes

The algorithm designer faces a number of critical decisions. The first choice that must be made is the nature of the mesh used to divide the flow field into discrete subdomains. The discretization procedure must allow for the treatment of complex configurations. The principal alternatives are Cartesian meshes, body-fitted curvilinear meshes, and unstructured tetrahedral meshes. Each of these approaches has advantages which have led to their use. The Cartesian mesh minimizes the complexity of the algorithm at interior points and facilitates the use of high order discretization procedures, at the expense of greater complexity, and possibly a loss of accuracy, in the treatment of boundary conditions at curved surfaces. This difficulty may be alleviated by using mesh refinement procedures near the surface. With their aid, schemes which use Cartesian meshes have recently been developed to treat very complex configurations [rho, 149, 22, 9!].

Body-fitted meshes have been widely used and are particularly suited to the treatment of viscous flow because they allow the mesh to be compressed near the body surface. With this approach, the problem of mesh generation itself has proved to be a major pacing

item. The most commonly used procedures are algebraic transformations [7, 44, 46, 156], methods based on the solution of elliptic equations, pioneered by Thompson [170, 171, 157, 158], and methods based on the solution of hyperbolic equations marching out from the body [161]. In order to treat very complex configurations it generally proves expedient to use a multiblock [177, 150] procedure, with separately generated meshes in each block, which may then be patched at block faces, or allowed to overlap, as in the Chimera scheme [19, 20]. While a number of interactive software systems for grid generation have been developed, such as EAGLE, GRIDGEN, and ICEM, the generation of a satisfactory grid for a very complex configuration may require months of effort.

The alternative is to use an unstructured mesh in which the domain is subdivided into tetrahedra. This in turn requires the development of solution algorithms capable of yielding the required accuracy on unstructured meshes. This approach has been gaining acceptance, as it is becoming apparent that it can lead to a speed-up and reduction in the cost of mesh generation that more than offsets the increased complexity and cost of the flow simulations. Two competing procedures for generating triangulations which have both proved successful are Delaunay triangulation [41, 11], based on concepts introduced at the beginning of the century by Voronoi [175], and the moving front method [111].

4.3 Finite Difference, Finite Volume, and Finite Element Schemes

Associated with the choice of mesh type is the formulation of the discretization procedure for the equations of fluid flow, which can be expressed as differential conservation laws. In the Cartesian tensor notation, let $x_i$ be the coordinates, $p$, $\rho$, $T$, and $E$ the pressure, density, temperature, and total energy, and $u_i$ the velocity components. Using the convention that summation over $j = 1, 2, 3$ is implied by a repeated subscript $j$, each conservation equation has the form

$$\frac{\partial w}{\partial t} + \frac{\partial f_j}{\partial x_j} = 0. \tag{1}$$

For the mass equation

$$w = \rho, \qquad f_j = \rho u_j.$$

For the $i$ momentum equation

$$w = \rho u_i, \qquad f_j = \rho u_i u_j + p\,\delta_{ij} - \sigma_{ij},$$

where $\sigma_{ij}$ is the viscous stress tensor. For the energy equation

$$w = \rho E, \qquad f_j = (\rho E + p)\,u_j - \sigma_{jk} u_k - \kappa \frac{\partial T}{\partial x_j},$$

where $\kappa$ is the coefficient of heat conduction. The pressure is related to the density and energy by the equation of state

$$p = (\gamma - 1)\,\rho \left( E - \tfrac{1}{2} u_i u_i \right) \tag{2}$$

in which $\gamma$ is the ratio of specific heats. In the Navier-Stokes equations the viscous stresses are assumed to be linearly proportional to the rate of strain, or

$$\sigma_{ij} = \mu \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right) + \lambda\,\delta_{ij} \frac{\partial u_k}{\partial x_k},$$

where $\mu$ and $\lambda$ are the coefficients of viscosity and bulk viscosity, and usually $\lambda = -2\mu/3$.

The finite difference method, which requires the use of a Cartesian or a structured curvilinear mesh, directly approximates the differential operators appearing in these equations. In the finite volume method [112], the discretization is accomplished by dividing the domain of the flow into a large number of small subdomains, and applying the conservation laws in the integral form

$$\frac{\partial}{\partial t} \int_{\Omega} w \, dV + \int_{\partial\Omega} \mathbf{f} \cdot d\mathbf{S} = 0.$$

Here $\mathbf{f}$ is the flux appearing in equation (1) and $d\mathbf{S}$ is the directed surface element of the boundary $\partial\Omega$ of the domain $\Omega$. The use of the integral form has the advantage that no assumption of the differentiability of the solutions is implied, with the result that it remains a valid statement for a subdomain containing a shock wave. In general the subdomains could be arbitrary, but it is convenient to use either hexahedral cells in a body conforming curvilinear mesh or tetrahedrons in an unstructured mesh.

Figure 3: Structured and Unstructured Discretizations. 3a: Cell Centered Scheme. 3b: Vertex Scheme.

Alternative discretization schemes may be obtained by storing flow variables at either the cell centers or the vertices. These variations are illustrated in Figure 3 for the two-dimensional case. With a cell-centered scheme the discrete conservation law takes the form

$$\frac{d}{dt}(wV) + \sum_{\text{faces}} \mathbf{f} \cdot \mathbf{S} = 0, \tag{4}$$

where $V$ is the cell volume, and $\mathbf{f}$ is now a numerical estimate of the flux vector through each face. $\mathbf{f}$ may be evaluated from values of the flow variables in the cells separated by each face, using upwind biasing to allow for the directions of wave propagation. With hexahedral cells, equation (4) is very similar to a finite difference scheme in curvilinear coordinates. Under a transformation to curvilinear coordinates $\xi_j$, equation (1) becomes

$$\frac{\partial}{\partial t}(J w) + \frac{\partial}{\partial \xi_i}\left( J \frac{\partial \xi_i}{\partial x_j} f_j \right) = 0, \tag{5}$$

where $J$ is the Jacobian determinant of the transformation matrix $\left[ \partial x_i / \partial \xi_j \right]$. The transformed flux $J \frac{\partial \xi_i}{\partial x_j} f_j$ corresponds to the dot product of the flux $\mathbf{f}$ with a vector face area $J \frac{\partial \xi_i}{\partial x_j}$, while $J$ represents the transformation of the cell volume. The finite volume form (4) has the advantages that it is valid for both structured and unstructured meshes, and that it assures that a uniform flow exactly satisfies the equations, because $\sum_{\text{faces}} \mathbf{S} = 0$ for a closed control volume. Finite difference schemes do not necessarily satisfy
this constraint because of the discretization errors in evaluating $\partial \xi_i / \partial x_j$ and the inversion of the transformation matrix.

A cell-vertex finite volume scheme can be derived by taking the union of the cells surrounding a given vertex as the control volume for that vertex [55, 71, 139]. In equation (4), $V$ is now the sum of the volumes of the surrounding cells, while the flux balance is evaluated over the outer faces of the polyhedral control volume. In the absence of upwind biasing the flux vector is evaluated by averaging over the corners of each face. This has the advantage of remaining accurate on an irregular or unstructured mesh.

An alternative route to the discrete equations is provided by the finite element method. Whereas the finite difference and finite volume methods approximate the differential and integral operators, the finite element method proceeds by inserting an approximate solution into the exact equations. On multiplying by a test function $\phi$ and integrating by parts over space, one obtains the weak form

$$\frac{\partial}{\partial t} \int_{\Omega} \phi\, w \, d\Omega = \int_{\Omega} \mathbf{f} \cdot \nabla\phi \, d\Omega - \int_{\partial\Omega} \phi\, \mathbf{f} \cdot d\mathbf{S}, \tag{6}$$

which is also valid in the presence of discontinuities in the flow. In the Galerkin method the approximate solution is expanded in terms of the same family of functions as those from which the test functions are drawn. By choosing test functions with local support, separate equations are obtained for each node. For example, if a tetrahedral mesh is used, and $\phi$ is piecewise linear, with a nonzero value only at a single node, the equations at each node have a stencil which contains only the nearest neighbors. In this case the finite element approximation corresponds closely to a finite volume scheme. If a piecewise linear approximation to the flux $\mathbf{f}$ is used in the evaluation of the integrals on the right hand side of equation (6), these integrals reduce to formulas which are identical to the flux balance of the finite volume scheme.

Thus the finite difference and finite volume methods lead to essentially similar schemes on structured meshes, while the finite volume method is essentially equivalent to a finite element method with linear elements when a tetrahedral mesh is used. Provided that the flow equations are expressed in the conservation law form (1), all three methods lead to an exact cancellation of the fluxes through interior cell boundaries, so that the conservative property of the equations is preserved. The important role of this property in ensuring correct shock jump conditions was pointed out by Lax and Wendroff [97].

4.4 Non-oscillatory Shock Capturing Schemes

4.4.1 Local Extremum Diminishing (LED) Schemes

The discretization procedures which have been described in the last section lead to nondissipative approximations to the Euler equations. Dissipative terms may be needed for two reasons. The first is the possibility of undamped oscillatory modes. The second reason is the need for the clean capture of shock waves and contact discontinuities without undesirable oscillations. An extreme overshoot could result in a negative value of an inherently positive quantity such as the pressure or density. The next sections summarize a unified approach to the construction of nonoscillatory schemes via the introduction of controlled diffusive and antidiffusive terms. This is the line adhered to in the author's own work.

The development of non-oscillatory schemes has been a prime focus of algorithm research for compressible flow. Consider a general semi-discrete scheme of the form

$$\frac{d v_j}{dt} = \sum_{k \neq j} c_{jk}\,(v_k - v_j). \tag{7}$$

A maximum cannot increase and a minimum cannot decrease if the coefficients $c_{jk}$ are non-negative, since at a maximum $v_k - v_j \le 0$, and at a minimum $v_k - v_j \ge 0$. Thus the condition

$$c_{jk} \ge 0, \qquad k \neq j \tag{8}$$

is sufficient to ensure stability in the maximum norm. Moreover, if the scheme has a compact stencil, so that $c_{jk} = 0$ when $j$ and $k$ are not nearest neighbors, a local maximum cannot increase and a local minimum cannot decrease. This local extremum diminishing (LED) property prevents the birth and growth of oscillations. The one-dimensional conservation law

$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x} f(u) = 0$$

provides a useful model for analysis. In this case waves are propagated with a speed $a(u) = \partial f / \partial u$, and the solution is constant along the characteristics $dx/dt = a(u)$. Thus the LED property is satisfied. In fact the total variation

$$TV(u) = \int_{-\infty}^{\infty} \left| \frac{\partial u}{\partial x} \right| dx$$

of a solution of this equation does not increase, provided that any discontinuity appearing in the solution satisfies an entropy condition [ 6]. Harten proposed that difference schemes ought to be designed so that the discrete total variation cannot increase [56]. If the end values are fixed, the total variation can be expressed as

$$TV(u) = 2 \left( \sum \text{maxima} - \sum \text{minima} \right).$$

Thus a LED scheme is also total variation diminishing (TVD). Positivity conditions of the type expressed in equations (7) and (8) lead to diagonally dominant schemes, and are the key to the elimination of improper oscillations. The positivity conditions may be realized by the introduction of diffusive terms or by the use of upwind biasing in the discrete scheme. Unfortunately, they may also lead to severe restrictions on accuracy unless the coefficients have a complex nonlinear dependence on the solution.

4.4.2 Artificial Diffusion and Upwinding

Following the pioneering work of Godunov [51], a variety of dissipative and upwind schemes designed to have good shock capturing properties have been developed during the past two decades [162, 23, 98, 100, 146, 130, 56, 129, 166, 5, 68, 183, 62, 180, 13, 12, 11]. If the one-dimensional scalar conservation law

$$\frac{\partial v}{\partial t} + \frac{\partial}{\partial x} f(v) = 0 \tag{9}$$

is represented by a three point scheme

$$\frac{d v_j}{dt} = c^{+}_{j+\frac{1}{2}} (v_{j+1} - v_j) + c^{-}_{j-\frac{1}{2}} (v_{j-1} - v_j),$$

the scheme is LED if

$$c^{+}_{j+\frac{1}{2}} \ge 0, \qquad c^{-}_{j-\frac{1}{2}} \ge 0. \tag{10}$$

A conservative semidiscrete approximation to the one-dimensional conservation law can be derived by subdividing the line into cells. Then the evolution of the value $v_j$ in the $j$th cell is given by

$$\Delta x \frac{d v_j}{dt} + h_{j+\frac{1}{2}} - h_{j-\frac{1}{2}} = 0, \tag{11}$$

where $h_{j+\frac{1}{2}}$ is an estimate of the flux between cells $j$ and $j+1$. The simplest estimate is the arithmetic average $(f_{j+1} + f_j)/2$, but this leads to a scheme that does not satisfy the positivity conditions. To correct this, one may add a dissipative term and set

$$h_{j+\frac{1}{2}} = \tfrac{1}{2}(f_{j+1} + f_j) - \alpha_{j+\frac{1}{2}} (v_{j+1} - v_j).$$

In order to estimate the required value of the coefficient $\alpha_{j+\frac{1}{2}}$, let $a_{j+\frac{1}{2}}$ be a numerical estimate of the wave speed $\partial f / \partial u$,

$$a_{j+\frac{1}{2}} = \begin{cases} \dfrac{f_{j+1} - f_j}{v_{j+1} - v_j} & \text{if } v_{j+1} \neq v_j \\[2mm] \left. \dfrac{\partial f}{\partial v} \right|_{v = v_j} & \text{if } v_{j+1} = v_j. \end{cases}$$

Then

$$h_{j+\frac{1}{2}} - h_{j-\frac{1}{2}} = -\left( \alpha_{j+\frac{1}{2}} - \tfrac{1}{2} a_{j+\frac{1}{2}} \right) \Delta v_{j+\frac{1}{2}} + \left( \alpha_{j-\frac{1}{2}} + \tfrac{1}{2} a_{j-\frac{1}{2}} \right) \Delta v_{j-\frac{1}{2}},$$

where

$$\Delta v_{j+\frac{1}{2}} = v_{j+1} - v_j,$$

and the LED condition (10) is satisfied if

$$\alpha_{j+\frac{1}{2}} \ge \tfrac{1}{2} \left| a_{j+\frac{1}{2}} \right|.$$

If one takes

$$\alpha_{j+\frac{1}{2}} = \tfrac{1}{2} \left| a_{j+\frac{1}{2}} \right|,$$

one obtains the first order upwind scheme

$$h_{j+\frac{1}{2}} = \begin{cases} f_j & \text{if } a_{j+\frac{1}{2}} > 0 \\ f_{j+1} & \text{if } a_{j+\frac{1}{2}} < 0. \end{cases}$$

This is the least diffusive first order scheme which satisfies the LED condition. In this sense upwinding is a natural approach to the construction of non-oscillatory schemes. It may be noted that the successful treatment of transonic potential flow also involved the use of upwind biasing. This was first introduced by Murman and Cole to treat the transonic small disturbance equation [123].

Another important requirement of discrete schemes is that they should exclude nonphysical solutions which do not satisfy appropriate entropy conditions [95], which require the convergence of characteristics towards admissible discontinuities. This places more stringent bounds on the minimum level of numerical viscosity [113, 169, 128, 131]. In the case that the numerical flux function is strictly convex, Aiso has recently proved [2] that it is sufficient that

$$\alpha_{j+\frac{1}{2}} > \max \left\{ \tfrac{1}{2} \left| a_{j+\frac{1}{2}} \right|,\; \epsilon \, \mathrm{sign}(v_{j+1} - v_j) \right\}$$

for $\epsilon > 0$. Thus the numerical viscosity should be rounded out and not allowed to reach zero at a point where the wave speed $a(u) = \partial f / \partial u$ approaches zero. This justifies, for example, Harten's entropy fix [56].

Higher order schemes can be constructed by introducing higher order diffusive terms. Unfortunately these have larger stencils and coefficients of varying sign which are not compatible with the conditions (8) for a LED scheme, and it is known that schemes which satisfy these conditions are at best first order accurate in the neighborhood of an extremum. It proves useful in the following development to introduce the concept of essentially local extremum diminishing (ELED) schemes. These are defined to be schemes which satisfy the condition that in the limit as the mesh width $\Delta x \to 0$, local maxima are non-increasing, and local minima are non-decreasing.

4.4.3 High Resolution Switched Schemes: Jameson-Schmidt-Turkel (JST) Scheme

Higher order non-oscillatory schemes can be derived by introducing anti-diffusive terms in a controlled manner. An early attempt to produce a high resolution scheme by this approach is the Jameson-Schmidt-Turkel (JST) scheme [85]. Suppose that anti-diffusive terms are introduced by subtracting neighboring differences to produce a third order diffusive flux

$$d_{j+\frac{1}{2}} = \alpha_{j+\frac{1}{2}} \left\{ \Delta v_{j+\frac{1}{2}} - \tfrac{1}{2} \left( \Delta v_{j+\frac{3}{2}} + \Delta v_{j-\frac{1}{2}} \right) \right\}, \tag{15}$$

which is an approximation to $-\tfrac{1}{2} \alpha \, \Delta x^3 \, \partial^3 v / \partial x^3$. The positivity condition (8) is violated by this scheme. In practice it generates substantial oscillations in the vicinity of shock waves, which can be eliminated by switching locally to the first order scheme. The JST scheme therefore introduces blended diffusion of the form

$$d_{j+\frac{1}{2}} = \epsilon^{(2)}_{j+\frac{1}{2}} \Delta v_{j+\frac{1}{2}} - \epsilon^{(4)}_{j+\frac{1}{2}} \left( \Delta v_{j+\frac{3}{2}} - 2 \Delta v_{j+\frac{1}{2}} + \Delta v_{j-\frac{1}{2}} \right).$$

The idea is to use variable coefficients $\epsilon^{(2)}_{j+\frac{1}{2}}$ and $\epsilon^{(4)}_{j+\frac{1}{2}}$ which produce a low level of diffusion in regions where the solution is smooth, but prevent oscillations near discontinuities. If $\epsilon^{(2)}_{j+\frac{1}{2}}$ is constructed so that it is of order $\Delta x^2$ where the solution is smooth, while $\epsilon^{(4)}_{j+\frac{1}{2}}$ is of order unity, both terms in $d_{j+\frac{1}{2}}$ will be of order $\Delta x^3$.

The JST scheme has proved very effective in practice in numerous calculations of complex steady flows, and conditions under which it could be a total variation diminishing (TVD) scheme have been examined by Swanson and Turkel [165]. An alternative statement of sufficient conditions on the coefficients $\epsilon^{(2)}_{j+\frac{1}{2}}$ and $\epsilon^{(4)}_{j+\frac{1}{2}}$ for the JST scheme to be LED is as follows:

Theorem 1 (Positivity of the JST scheme)
Suppose that whenever either $v_{j+1}$ or $v_j$ is an extremum the coefficients of the JST scheme satisfy

$$\epsilon^{(2)}_{j+\frac{1}{2}} \ge \tfrac{1}{2} \left| a_{j+\frac{1}{2}} \right|, \qquad \epsilon^{(4)}_{j+\frac{1}{2}} = 0.$$

Then the JST scheme is local extremum diminishing (LED).

Proof: We need only consider the rate of change of $v$ at extremal points. Suppose that $v_j$ is an extremum. Then

$$\epsilon^{(4)}_{j+\frac{1}{2}} = \epsilon^{(4)}_{j-\frac{1}{2}} = 0,$$

and the semi-discrete scheme (11) reduces to

$$\Delta x \frac{d v_j}{dt} = \left( \epsilon^{(2)}_{j+\frac{1}{2}} - \tfrac{1}{2} a_{j+\frac{1}{2}} \right) \Delta v_{j+\frac{1}{2}} - \left( \epsilon^{(2)}_{j-\frac{1}{2}} + \tfrac{1}{2} a_{j-\frac{1}{2}} \right) \Delta v_{j-\frac{1}{2}},$$

and each coefficient has the required sign. $\Box$
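As a rough numerical illustration of the LED property discussed in Sections 4.4.1 and 4.4.2, the following Python sketch advances the conservative scheme $\Delta x\, dv_j/dt + h_{j+\frac{1}{2}} - h_{j-\frac{1}{2}} = 0$ with the numerical flux $h_{j+\frac{1}{2}} = \frac{1}{2}(f_{j+1} + f_j) - \alpha_{j+\frac{1}{2}} \Delta v_{j+\frac{1}{2}}$, once with $\alpha_{j+\frac{1}{2}} = \frac{1}{2}|a_{j+\frac{1}{2}}|$ (first order upwind) and once with $\alpha_{j+\frac{1}{2}} = 0$ (plain arithmetic average). The Burgers flux $f(v) = v^2/2$, the periodic mesh, the forward Euler time stepping, the CFL number of 0.5 and all function names are assumptions made for this illustration; they are not taken from the paper.

```python
import numpy as np

def flux(v):
    # Burgers flux f(v) = v^2/2, chosen only as a convenient nonlinear test case
    return 0.5 * v * v

def step(v, lam, upwind=True):
    """One forward Euler step of dx*dv_j/dt + h_{j+1/2} - h_{j-1/2} = 0.

    lam = dt/dx; the mesh is periodic (an assumption of this sketch).
    """
    vp = np.roll(v, -1)                      # v_{j+1}
    a = 0.5 * (v + vp)                       # (f_{j+1}-f_j)/(v_{j+1}-v_j) for Burgers
    alpha = 0.5 * np.abs(a) if upwind else np.zeros_like(v)
    h = 0.5 * (flux(vp) + flux(v)) - alpha * (vp - v)   # h_{j+1/2}
    return v - lam * (h - np.roll(h, 1))     # subtract h_{j+1/2} - h_{j-1/2}

def total_variation(v):
    # Discrete total variation on the periodic mesh
    return np.sum(np.abs(np.roll(v, -1) - v))

def run(upwind, n_steps, n=100, cfl=0.5):
    """Return an array of (max, min, TV) after each step, starting from a square wave."""
    x = (np.arange(n) + 0.5) / n
    v = np.where(x < 0.5, 1.0, 0.0)
    lam = cfl / np.max(np.abs(v))            # dt/dx such that max|a|*dt/dx <= cfl
    history = [(v.max(), v.min(), total_variation(v))]
    for _ in range(n_steps):
        v = step(v, lam, upwind)
        history.append((v.max(), v.min(), total_variation(v)))
    return np.array(history)

upwind_history = run(upwind=True, n_steps=200)
central_history = run(upwind=False, n_steps=1)
```

With the upwind coefficient the recorded maximum never rises, the minimum never falls, and the total variation never grows, whereas a single step with the undamped arithmetic average already creates an overshoot next to the discontinuity. The time step restriction is part of the fully discrete picture and is an addition of this sketch, since the analysis in the text is semi-discrete.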

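In order to construct $\epsilon^{(2)}_{j+\frac{1}{2}}$ and $\epsilon^{(4)}_{j+\frac{1}{2}}$ with the desired properties define

$$R(u, v) = \begin{cases} \left| \dfrac{u - v}{|u| + |v|} \right|^{q} & \text{if } u \neq 0 \text{ or } v \neq 0 \\[2mm] 0 & \text{if } u = v = 0, \end{cases} \tag{18}$$

where $q$ is a positive integer. Then $R(u, v) = 1$ if $u$ and $v$ have opposite signs. Otherwise $R(u, v) < 1$. Now set

$$Q_j = R\left( \Delta v_{j+\frac{1}{2}}, \Delta v_{j-\frac{1}{2}} \right), \qquad Q_{j+\frac{1}{2}} = \max(Q_j, Q_{j+1}),$$

and

$$\epsilon^{(2)}_{j+\frac{1}{2}} = \alpha_{j+\frac{1}{2}} Q_{j+\frac{1}{2}}, \qquad \epsilon^{(4)}_{j+\frac{1}{2}} = \tfrac{1}{2} \alpha_{j+\frac{1}{2}} \left( 1 - Q_{j+\frac{1}{2}} \right).$$

4.4.4 Symmetric Limited Positive (SLIP) Scheme

An alternative route to high resolution without oscillation is to introduce flux limiters to guarantee the satisfaction of the positivity condition (8). The use of limiters dates back to the work of Boris and Book [23]. A particularly simple way to introduce limiters, proposed by the author in 1984 [68], is to use flux limited dissipation. In this scheme the third order diffusion defined by equation (15) is modified by the insertion of limiters which produce an equivalent three point scheme with positive coefficients. The original scheme [68] can be improved in the following manner so that less restrictive flux limiters are required. Let $L(u, v)$ be a limited average of $u$ and $v$ with the following properties:

P1. $L(u, v) = L(v, u)$
P2. $L(\alpha u, \alpha v) = \alpha L(u, v)$
P3. $L(u, u) = u$
P4. $L(u, v) = 0$ if $u$ and $v$ have opposite signs: otherwise $L(u, v)$ has the same sign as $u$ and $v$.

Properties (P1-P3) are natural properties of an average. Property (P4) is needed for the construction of a LED or TVD scheme.

It is convenient to introduce the notation

$$\phi(r) = L(1, r) = L(r, 1),$$

where according to (P4) $\phi(r) \ge 0$. It follows from (P2) on setting $\alpha = 1/u$ or $1/v$ that

$$L(u, v) = \phi\left( \frac{v}{u} \right) u = \phi\left( \frac{u}{v} \right) v.$$

Also it follows on setting $v = 1$ and $u = r$ that

$$\phi(r) = r\, \phi\left( \frac{1}{r} \right).$$

Thus, if there exists $r < 0$ for which $\phi(r) > 0$, then $\phi(1/r) < 0$. The only way to ensure that $\phi(r) \ge 0$ is to require $\phi(r) = 0$ for all $r < 0$, corresponding to property (P4).

Now one defines the diffusive flux for a scalar conservation law as

$$d_{j+\frac{1}{2}} = \alpha_{j+\frac{1}{2}} \left\{ \Delta v_{j+\frac{1}{2}} - L\left( \Delta v_{j+\frac{3}{2}}, \Delta v_{j-\frac{1}{2}} \right) \right\}. \tag{20}$$

Set

$$r^{+} = \frac{\Delta v_{j+\frac{3}{2}}}{\Delta v_{j-\frac{1}{2}}}, \qquad r^{-} = \frac{\Delta v_{j-\frac{3}{2}}}{\Delta v_{j+\frac{1}{2}}},$$

and

$$L\left( \Delta v_{j+\frac{3}{2}}, \Delta v_{j-\frac{1}{2}} \right) = \phi(r^{+})\, \Delta v_{j-\frac{1}{2}}, \qquad L\left( \Delta v_{j-\frac{3}{2}}, \Delta v_{j+\frac{1}{2}} \right) = \phi(r^{-})\, \Delta v_{j+\frac{1}{2}}.$$

Then,

$$\Delta x \frac{d v_j}{dt} = \left\{ \alpha_{j+\frac{1}{2}} - \tfrac{1}{2} a_{j+\frac{1}{2}} + \alpha_{j-\frac{1}{2}} \phi(r^{-}) \right\} \Delta v_{j+\frac{1}{2}} - \left\{ \alpha_{j-\frac{1}{2}} + \tfrac{1}{2} a_{j-\frac{1}{2}} + \alpha_{j+\frac{1}{2}} \phi(r^{+}) \right\} \Delta v_{j-\frac{1}{2}}.$$

Thus the scheme satisfies the LED condition if $\alpha_{j+\frac{1}{2}} \ge \frac{1}{2} |a_{j+\frac{1}{2}}|$ for all $j$, and $\phi(r) \ge 0$, which is assured by property (P4) on $L$. At the same time it follows from property (P3) that the first order diffusive flux is canceled when $\Delta v$ is smoothly varying and of constant sign. Schemes constructed by this formulation will be referred to as symmetric limited positive (SLIP) schemes. This result may be summarized as

Theorem 2 (Positivity of the SLIP scheme)
Suppose that the discrete conservation law (11) contains a limited diffusive flux as defined by equation (20). Then the positivity condition $\alpha_{j+\frac{1}{2}} \ge \frac{1}{2} |a_{j+\frac{1}{2}}|$, together with the properties (P1-P4) for limited averages, are sufficient to ensure satisfaction of the LED principle that a local maximum cannot increase and a local minimum cannot decrease. $\Box$

A variety of limiters may be defined which meet the requirements of properties (P1-P4). Define

$$S(u, v) = \tfrac{1}{2} \left\{ \mathrm{sign}(u) + \mathrm{sign}(v) \right\}$$

which vanishes if $u$ and $v$ have opposite signs. Then two limiters which are appropriate are the following well-known schemes:

1. Minmod: $L(u, v) = S(u, v) \min(|u|, |v|)$
2. Van Leer: $L(u, v) = S(u, v) \dfrac{2 |u| |v|}{|u| + |v|}$

In order to produce a family of limiters which contains these as special cases it is convenient to set

$$L(u, v) = \tfrac{1}{2} D(u, v)\,(u + v),$$

where $D(u, v)$ is a factor which should deflate the arithmetic average, and become zero if $u$ and $v$ have opposite signs. Take

$$D(u, v) = 1 - R(u, v) = 1 - \left| \frac{u - v}{|u| + |v|} \right|^{q}, \tag{22}$$

where $R(u, v)$ is the same function that was introduced in the JST scheme, and $q$ is a positive integer. Then $D(u, v) = 0$ if $u$ and $v$ have opposite signs. Also if $q = 1$, $L(u, v)$ reduces to minmod, while if $q = 2$, $L(u, v)$ is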

equivalent to Van Leer's limiter. By increasing $q$ one may generate a sequence of limited averages which approach a limit defined by the arithmetic mean truncated to zero when $u$ and $v$ have opposite signs.

When the terms are regrouped, it can be seen that with this limiter the SLIP scheme is exactly equivalent to the JST scheme, with the switch defined as

$$Q_{j+\frac{1}{2}} = R\left( \Delta v_{j+\frac{3}{2}}, \Delta v_{j-\frac{1}{2}} \right), \qquad \epsilon^{(2)}_{j+\frac{1}{2}} = \alpha_{j+\frac{1}{2}} Q_{j+\frac{1}{2}}, \qquad \epsilon^{(4)}_{j+\frac{1}{2}} = \tfrac{1}{2} \alpha_{j+\frac{1}{2}} \left( 1 - Q_{j+\frac{1}{2}} \right).$$

This formulation thus unifies the JST and SLIP schemes.

4.4.5 Essentially Local Extremum Diminishing (ELED) Scheme with Soft Limiter

The limiters defined by the formula (22) have the disadvantage that they are active at a smooth extremum, reducing the local accuracy of the scheme to first order. In order to prevent this, the SLIP scheme can be relaxed to give an essentially local extremum diminishing (ELED) scheme which is second order accurate at smooth extrema by the introduction of a threshold in the limited average. Therefore redefine $D(u, v)$ as

$$D(u, v) = 1 - \left| \frac{u - v}{\max\left( |u| + |v|,\; \epsilon \Delta x^{r} \right)} \right|^{q}, \tag{23}$$

where $r = \frac{3}{2}$, $q \ge 2$. This reduces to the previous definition if $|u| + |v| > \epsilon \Delta x^{r}$.

In any region where the solution is smooth, $\Delta v_{j+\frac{3}{2}} - \Delta v_{j-\frac{1}{2}}$ is of order $\Delta x^2$. In fact if there is a smooth extremum in the neighborhood of $v_j$ or $v_{j+1}$, a Taylor series expansion indicates that $\Delta v_{j+\frac{3}{2}}$, $\Delta v_{j+\frac{1}{2}}$ and $\Delta v_{j-\frac{1}{2}}$ are each individually of order $\Delta x^2$, since $dv/dx = 0$ at the extremum. It may be verified that second order accuracy is preserved at a smooth extremum if $q \ge 2$. On the other hand the limiter acts in the usual way if $|\Delta v_{j+\frac{3}{2}}|$ or $|\Delta v_{j-\frac{1}{2}}| > \epsilon \Delta x^{r}$, and it may also be verified that in the limit $\Delta x \to 0$ local maxima are non-increasing and local minima are non-decreasing [79]. Thus the scheme is essentially local extremum diminishing (ELED).

The effect of the "soft limiter" is not only to improve the accuracy: the introduction of a threshold below which extrema of small amplitude are accepted also usually results in a faster rate of convergence to a steady state, and decreases the likelihood of limit cycles in which the limiter interacts unfavorably with the corrections produced by the updating scheme. In a scheme recently proposed by Venkatakrishnan a threshold is introduced precisely for this purpose [174].

4.4.6 Upstream Limited Positive (USLIP) Schemes

By adding the anti-diffusive correction purely from the upstream side one may derive a family of upstream limited positive (USLIP) schemes. Corresponding to the original SLIP scheme defined by equation (20), a USLIP scheme is obtained by setting

$$d_{j+\frac{1}{2}} = \alpha_{j+\frac{1}{2}} \left\{ \Delta v_{j+\frac{1}{2}} - L\left( \Delta v_{j+\frac{1}{2}}, \Delta v_{j-\frac{1}{2}} \right) \right\}$$

if $a_{j+\frac{1}{2}} > 0$, or

$$d_{j+\frac{1}{2}} = \alpha_{j+\frac{1}{2}} \left\{ \Delta v_{j+\frac{1}{2}} - L\left( \Delta v_{j+\frac{1}{2}}, \Delta v_{j+\frac{3}{2}} \right) \right\}$$

if $a_{j+\frac{1}{2}} < 0$. If $\alpha_{j+\frac{1}{2}} = \frac{1}{2} |a_{j+\frac{1}{2}}|$ one recovers a standard high resolution upwind scheme in semi-discrete form. Consider the case that $a_{j+\frac{1}{2}} > 0$ and $a_{j-\frac{1}{2}} > 0$. If one sets

$$r^{+} = \frac{\Delta v_{j+\frac{1}{2}}}{\Delta v_{j-\frac{1}{2}}}, \qquad r^{-} = \frac{\Delta v_{j-\frac{3}{2}}}{\Delta v_{j-\frac{1}{2}}},$$

the scheme reduces to

$$\Delta x \frac{d v_j}{dt} = -\tfrac{1}{2} \left\{ \phi(r^{+})\, a_{j+\frac{1}{2}} + \left( 2 - \phi(r^{-}) \right) a_{j-\frac{1}{2}} \right\} \Delta v_{j-\frac{1}{2}}.$$

To assure the correct sign to satisfy the LED criterion the flux limiter must now satisfy the additional constraint that $\phi(r) \le 2$.

The USLIP formulation is essentially equivalent to standard upwind schemes [130, 166]. Both the SLIP and USLIP constructions can be implemented on unstructured meshes [75, 79]. The anti-diffusive terms are then calculated by taking the scalar product of the vectors defining an edge with the gradient in the adjacent upstream and downstream cells.

4.4.7 Systems of Conservation Laws: Flux Splitting and Flux-Difference Splitting

Steger and Warming [162] first showed how to generalize the concept of upwinding to the system of conservation laws

$$\frac{\partial w}{\partial t} + \frac{\partial}{\partial x} f(w) = 0$$

by the concept of flux splitting. Suppose that the flux is split as $f = f^{+} + f^{-}$ where $\partial f^{+} / \partial w$ and $\partial f^{-} / \partial w$ have positive and negative eigenvalues. Then the first order upwind scheme is produced by taking the numerical flux to be

$$h_{j+\frac{1}{2}} = f^{+}_{j} + f^{-}_{j+1}.$$

This can be expressed in viscosity form as

$$h_{j+\frac{1}{2}} = \tfrac{1}{2} (f_{j+1} + f_j) - d_{j+\frac{1}{2}},$$

where the diffusive flux is

$$d_{j+\frac{1}{2}} = \tfrac{1}{2} \Delta \left( f^{+} - f^{-} \right)_{j+\frac{1}{2}}.$$

Roe derived the alternative formulation of flux difference splitting [146] by distributing the corrections due to the flux difference in each interval upwind and downwind to obtain

$$\Delta x \frac{d w_j}{dt} + (f_{j+1} - f_j)^{-} + (f_j - f_{j-1})^{+} = 0,$$

where now the flux difference $f_{j+1} - f_j$ is split. The corresponding diffusive flux is

$$d_{j+\frac{1}{2}} = \tfrac{1}{2} \left\{ (f_{j+1} - f_j)^{+} - (f_{j+1} - f_j)^{-} \right\}.$$

Following Roe's derivation, let $A_{j+\frac{1}{2}}$ be a mean value Jacobian matrix exactly satisfying the condition

$$f_{j+1} - f_j = A_{j+\frac{1}{2}} (w_{j+1} - w_j). \tag{26}$$
$A_{j+\frac{1}{2}}$ may be calculated by substituting the weighted averages

$$u = \frac{\sqrt{\rho_{j+1}}\, u_{j+1} + \sqrt{\rho_j}\, u_j}{\sqrt{\rho_{j+1}} + \sqrt{\rho_j}}, \qquad H = \frac{\sqrt{\rho_{j+1}}\, H_{j+1} + \sqrt{\rho_j}\, H_j}{\sqrt{\rho_{j+1}} + \sqrt{\rho_j}}$$

into the standard formulas for the Jacobian matrix $A = \partial f / \partial w$. A splitting according to characteristic fields is now obtained by decomposing $A_{j+\frac{1}{2}}$ as

$$A_{j+\frac{1}{2}} = T \Lambda T^{-1}, \tag{28}$$

where the columns of $T$ are the eigenvectors of $A_{j+\frac{1}{2}}$, and $\Lambda$ is a diagonal matrix of the eigenvalues. Now the corresponding diffusive flux is

$$d_{j+\frac{1}{2}} = \tfrac{1}{2} \left| A_{j+\frac{1}{2}} \right| (w_{j+1} - w_j),$$

where

$$\left| A_{j+\frac{1}{2}} \right| = T |\Lambda| T^{-1}$$

and $|\Lambda|$ is the diagonal matrix containing the absolute values of the eigenvalues.

4.4.8 Alternative Splittings

Characteristic splitting has the advantages that it introduces the minimum amount of diffusion to exclude the growth of local extrema of the characteristic variables, and that with the Roe linearization it allows a discrete shock structure with a single interior point. To reduce the computational complexity one may replace $|A|$ by $\alpha I$ where if $\alpha$ is at least equal to the spectral radius $\max |\lambda(A)|$, then the positivity conditions will still be satisfied. Then the first order scheme simply has the scalar diffusive flux

$$d_{j+\frac{1}{2}} = \tfrac{1}{2} \alpha_{j+\frac{1}{2}} \Delta w_{j+\frac{1}{2}}. \tag{29}$$

The JST scheme with scalar diffusive flux captures shock waves with about 3 interior points, and it has been widely used for transonic flow calculations because it is both robust and computationally inexpensive.

An intermediate class of schemes can be formulated by defining the first order diffusive flux as a combination of differences of the state and flux vectors

$$d_{j+\frac{1}{2}} = \tfrac{1}{2} \alpha^{*}_{j+\frac{1}{2}}\, c\, (w_{j+1} - w_j) + \tfrac{1}{2} \beta_{j+\frac{1}{2}} (f_{j+1} - f_j), \tag{30}$$

where the factor $c$ is included in the first term to make $\alpha^{*}_{j+\frac{1}{2}}$ and $\beta_{j+\frac{1}{2}}$ dimensionless. Schemes of this class are fully upwind in supersonic flow if one takes $\alpha^{*}_{j+\frac{1}{2}} = 0$ and $\beta_{j+\frac{1}{2}} = \mathrm{sign}(M)$ when the absolute value of the Mach number $M$ exceeds 1. The flux vector $f$ can be decomposed as

$$f = u w + f_p, \tag{31}$$

where

$$f_p = \begin{pmatrix} 0 \\ p \\ u p \end{pmatrix}. \tag{32}$$

Then

$$f_{j+1} - f_j = \bar{u}\, (w_{j+1} - w_j) + \bar{w}\, (u_{j+1} - u_j) + f_{p_{j+1}} - f_{p_j}, \tag{33}$$

where $\bar{u}$ and $\bar{w}$ are the arithmetic averages [...]

Thus these schemes are closely related to schemes which introduce separate splittings of the convective and pressure terms, such as the wave-particle scheme [141, 8], the advection upwind splitting method (AUSM) [106, 176], and the convective upwind and split pressure (CUSP) schemes [76].

In order to examine the shock capturing properties of these various schemes, consider the general case of a first order diffusive flux of the form

$$d_{j+\frac{1}{2}} = \tfrac{1}{2} \alpha_{j+\frac{1}{2}} B_{j+\frac{1}{2}} (w_{j+1} - w_j), \tag{34}$$

where the matrix $B_{j+\frac{1}{2}}$ determines the properties of the scheme and the scaling factor $\alpha_{j+\frac{1}{2}}$ is included for convenience. All the previous schemes can be obtained by representing $B_{j+\frac{1}{2}}$ as a polynomial in the matrix $A_{j+\frac{1}{2}}$ defined by equation (26). Schemes of this class were considered by Van Leer [99]. According to the Cayley-Hamilton theorem, a matrix satisfies its own characteristic equation. Therefore the third and higher powers of $A$ can be eliminated, and there is no loss of generality in limiting $B_{j+\frac{1}{2}}$ to a polynomial of degree 2,

$$B_{j+\frac{1}{2}} = \alpha_0 I + \alpha_1 A_{j+\frac{1}{2}} + \alpha_2 A^{2}_{j+\frac{1}{2}}. \tag{35}$$

The characteristic upwind scheme for which $B_{j+\frac{1}{2}} = |A_{j+\frac{1}{2}}|$ is obtained by substituting $A_{j+\frac{1}{2}} = T \Lambda T^{-1}$, $A^{2}_{j+\frac{1}{2}} = T \Lambda^{2} T^{-1}$. Then $\alpha_0$, $\alpha_1$, and $\alpha_2$ are determined from the three equations

$$\alpha_0 + \alpha_1 \lambda_k + \alpha_2 \lambda_k^{2} = |\lambda_k|, \qquad k = 1, 2, 3.$$

The same representation remains valid for three dimensional flow because $A_{j+\frac{1}{2}}$ still has only three distinct eigenvalues $u$, $u + c$, $u - c$.

4.4.9 Analysis of Stationary Discrete Shocks

Figure 4: Shock structure for single interior point.

The ideal model of a discrete shock is illustrated in figure (4). Suppose that $w_L$ and $w_R$ are left and right states which satisfy the jump conditions for a stationary shock, and that the corresponding fluxes are $f_L = f(w_L)$ and $f_R = f(w_R)$. Since the shock is stationary $f_L = f_R$. The ideal discrete shock has constant states $w_L$ to the left and $w_R$ to the right, and a single point with an intermediate value $w_A$. The intermediate value is needed to allow the discrete solution to correspond to a true solution in which the shock wave does not coincide with an interface between two mesh cells.
Schemescorrespondingto one, two or three terms in equa-
tion (35) are examined in [80]. The analysis of these three

cases shows that a discrete shock structure with a single interior point is supported by artificial diffusion that satisfies the two conditions that

1. it produces an upwind flux if the flow is determined to be supersonic through the interface
2. it satisfies a generalized eigenvalue problem for the exit from the shock of the form

$$(A_{AR} + \alpha_{AR} B_{AR})(w_R - w_A) = 0, \tag{36}$$

where $A_{AR}$ is the linearized Jacobian matrix and $B_{AR}$ is the matrix defining the diffusion for the interface $AR$.

This follows from the equilibrium condition $h_{RA} = h_{RR}$ for the cell $j + 1$ in figure 4. These two conditions are satisfied by both the characteristic scheme and also the CUSP scheme, provided that the coefficients of convective diffusion and pressure differences are correctly balanced. Scalar diffusion does not satisfy the first condition. In the case of the CUSP scheme (30) equation (36) reduces to

$$\left(A_{RA} + \frac{\alpha^* c}{1 + \beta}\, I\right)(w_R - w_A) = 0.$$

Thus $w_R - w_A$ is an eigenvector of the Roe matrix $A_{RA}$, and $-\frac{\alpha^* c}{1 + \beta}$ is the corresponding eigenvalue. Since the eigenvalues are $u$, $u + c$, and $u - c$, the only choice which leads to positive diffusion when $u > 0$ is $u - c$, yielding the relationship

$$\alpha^* c = (1 + \beta)(c - u), \qquad 0 < u < c.$$

Thus there is a one parameter family of schemes which support the ideal shock structure. The term $\beta(f_R - f_A)$ contributes to the diffusion of the convective terms. Allowing for the split (31), the total effective coefficient of convective diffusion is $\alpha c = \alpha^* c + \beta \bar{u}$. A CUSP scheme with low numerical diffusion is then obtained by taking $\alpha = |M|$, leading to the coefficients illustrated in figure 5.

Figure 5: Diffusion Coefficients.

4.4.10 CUSP and Characteristic Schemes Admitting Constant Total Enthalpy in Steady Flow

In steady flow the stagnation enthalpy $H$ is constant, corresponding to the fact that the energy and mass conservation equations are consistent when the constant factor $H$ is removed from the energy equation. Discrete and semi-discrete schemes do not necessarily satisfy this property. In the case of a semi-discrete scheme expressed in viscosity form, equations (11) and (12), a solution with constant $H$ is admitted if the viscosity for the energy equation reduces to the viscosity for the continuity equation with $\rho$ replaced by $\rho H$. When the standard characteristic decomposition (28) is used, the viscous fluxes for $\rho$ and $\rho H$ which result from composition of the fluxes for the characteristic variables do not have this property, and $H$ is not constant in the discrete solution. In practice there is an excursion of $H$ in the discrete shock structure which represents a local heat source. In very high speed flows the corresponding error in the temperature may lead to a wrong prediction of associated effects such as chemical reactions.

The source of the error in the stagnation enthalpy is the discrepancy between the convective terms

$$u \begin{pmatrix} \rho \\ \rho u \\ \rho H \end{pmatrix}$$

in the flux vector, which contain $\rho H$, and the state vector which contains $\rho E$. This may be remedied by introducing a modified state vector

$$w_h = \begin{pmatrix} \rho \\ \rho u \\ \rho H \end{pmatrix}.$$

Then one introduces the linearization

$$f_R - f_L = A_h\,(w_{h_R} - w_{h_L}).$$

Here $A_h$ may be calculated in the same way as the standard Roe linearization. Introduce the weighted averages defined by equation (27). Then

$$A_h = \begin{pmatrix} 0 & 1 & 0 \\ -\frac{\gamma+1}{\gamma}\frac{u^2}{2} & \frac{\gamma+1}{\gamma}\,u & \frac{\gamma-1}{\gamma} \\ -uH & H & u \end{pmatrix}.$$

The eigenvalues of $A_h$ are $u$, $\lambda^+$ and $\lambda^-$ where

$$\lambda^{\pm} = \frac{\gamma+1}{2\gamma}\,u \pm \sqrt{\left(\frac{\gamma+1}{2\gamma}\,u\right)^2 + \frac{c^2 - u^2}{\gamma}}.$$

Now both CUSP and characteristic schemes which preserve constant stagnation enthalpy in steady flow can be constructed from the modified Jacobian matrix $A_h$ [80]. These schemes also produce a discrete shock structure with one interior point in steady flow. Then one arrives at four variations with this property, which can conveniently be distinguished as the E- and H-CUSP schemes, and the E- and H-characteristic schemes.

4.5 Multidimensional Schemes

The simplest approach to the treatment of multi-dimensional problems on structured meshes is to apply the one-dimensional construction separately in each mesh direction. On triangulated meshes in two or three dimensions the SLIP and USLIP constructions may also be implemented along the mesh edges [79]. A substantial body of current research is directed toward the implementation of truly multi-dimensional upwind schemes in which the upwind biasing is determined by properties of the flow rather than the mesh. A thorough review is given by Pailliere and Deconinck in reference [132].

Residual distribution schemes are an attractive approach for triangulated meshes. In these the residual defined by the space derivatives is evaluated for each cell, and then distributed to the vertices with weights which depend on the direction of convection. For a scalar conservation law the weights can be chosen to maintain positivity with minimum cross diffusion in the direction normal to the flow. For the Euler equations the residual can be linearized by assuming that the parameter vector with components $\sqrt{\rho}$, $\sqrt{\rho}\,u_i$, and $\sqrt{\rho}\,H$ varies linearly over the cell. Then

$$\frac{\partial f_j(w)}{\partial x_j} = A_j \frac{\partial w}{\partial x_j},$$

where the Jacobian matrices $A_j = \partial f_j/\partial w$ are evaluated with Roe averaging of the values of $w$ at the vertices. Waves

in the direction $n$ can then be expressed in terms of the eigenvectors of $n_j A_j$, and a positive distribution scheme is used for waves in preferred directions. The best choice of these directions is the subject of ongoing research, but preliminary results indicate the possibility of achieving high resolution of shocks and contact discontinuities which are not aligned with mesh lines [132].

Hirsch and Van Ransbeeck adopt an alternative approach in which they directly construct directional diffusive terms on structured meshes, with anti-diffusion controlled by limiters based on comparisons of slopes in different directions [60]. They also show promising results in calculations of nozzles with multiply reflected oblique shocks.

4.5.1 High Order Godunov Schemes, and Kinetic Flux Splitting

A substantial body of current research is directed toward the implementation of truly multi-dimensional upwind schemes [59, 135, 101]. Reference [132] provides a thorough review of recent developments in this field. Some of the most impressive simulations of time dependent flows with strong shock waves have been achieved with higher order Godunov schemes [180]. In these schemes the average value in each cell is updated by applying the integral conservation law using interface fluxes predicted from the exact or approximate solution of a Riemann problem between adjacent cells. A higher order estimate of the solution is then reconstructed from the cell averages, and slope limiters are applied to the reconstruction. An example is the class of essentially non-oscillatory (ENO) schemes, which can attain a very high order of accuracy at the cost of a substantial increase in computational complexity [32, 153, 151, 152]. Methods based on reconstruction can also be implemented on unstructured meshes [13, 12]. Recently there has been an increasing interest in kinetic flux splitting schemes, which use solutions of the Boltzmann equation or the BGK equation to predict the interface fluxes [42, 36, 45, 136, 181].

The discretization of the viscous terms of the Navier Stokes equations requires an approximation to the velocity derivatives $\partial u_i/\partial x_j$ in order to calculate the tensor $\sigma_{ij}$, defined by equation (3). Then the viscous terms may be included in the flux balance (4). In order to evaluate the derivatives one may apply the Gauss formula to a control volume $V$ with the boundary $S$

$$\int_V \frac{\partial u_i}{\partial x_j}\, dV = \int_S u_i\, n_j\, dS,$$

where $n_j$ is the outward normal. For a tetrahedral or hexahedral cell this gives

$$\frac{\partial u_i}{\partial x_j} = \frac{1}{\mathrm{vol}} \sum_{\mathrm{faces}} \bar{u}_i\, n_j\, S, \tag{38}$$

where $\bar{u}_i$ is an estimate of the average of $u_i$ over the face. If $u$ varies linearly over a tetrahedral cell this is exact. Alternatively, assuming a local transformation to computational coordinates $\xi_j$, one may apply the chain rule

$$\left[\frac{\partial u}{\partial x}\right] = \left[\frac{\partial u}{\partial \xi}\right]\left[\frac{\partial \xi}{\partial x}\right] = \left[\frac{\partial u}{\partial \xi}\right]\left[\frac{\partial x}{\partial \xi}\right]^{-1}. \tag{39}$$

Here the transformation derivatives $\partial x_i/\partial \xi_j$ can be evaluated by the same finite difference formulas as the velocity derivatives $\partial u_i/\partial \xi_j$. In this case $\partial u/\partial \xi$ is exact if $u$ is a linearly varying function.

For a cell-centered discretization (figure 6a) $\partial u/\partial \xi$ is needed at each face. The simplest procedure is to evaluate $\partial u/\partial \xi$ in each cell, and to average $\partial u/\partial \xi$ between the two cells on either side of a face [87]. The resulting discretization does not have a compact stencil, and supports undamped oscillatory modes. In a one-dimensional calculation, for example, $\partial^2 u/\partial x^2$ would be discretized as $(u_{i+2} - 2u_i + u_{i-2})/(4\Delta x^2)$. In order to produce a compact stencil $\partial u/\partial x$ may be estimated from a control volume centered on each face, using formulas (38) or (39) [144]. This is computationally expensive because the number of faces is much larger than the number of cells. In a hexahedral mesh with a large number of vertices the number of faces approaches three times the number of cells.

This motivates the introduction of dual meshes for the evaluation of the velocity derivatives and the flux balance as sketched in figure 6. The figure shows both cell-centered and cell-vertex schemes. The dual mesh connects cell centers of the primary mesh. If there is a kink in the primary mesh, the dual cells should be formed by assembling contiguous fractions of the neighboring primary cells. On smooth meshes comparable results are obtained by either of these formulations [114, 115, 107]. If the mesh has a kink the cell-vertex scheme has the advantage that the derivatives $\partial u_i/\partial x_j$ are calculated in the interior of a regular cell, with no loss of accuracy.

Figure 6: Viscous discretizations for cell-centered and cell-vertex algorithms. 6a: Cell-centered scheme, $\sigma_{ij}$ evaluated at vertices of the primary mesh. 6b: Cell-vertex scheme, $\sigma_{ij}$ evaluated at cell centers of the primary mesh.

A desirable property is that a linearly varying velocity distribution, as in a Couette flow, should produce a constant stress and hence an exact stress balance. This property is not necessarily satisfied in general by finite difference or finite volume schemes on curvilinear meshes. The characterization k-exact has been proposed for schemes that are exact for polynomials of degree k. The cell-vertex finite volume scheme is linearly exact if the derivatives are evaluated by equation (39), since then $\partial u_i/\partial x_j$ is exactly evaluated as a constant, leading to constant viscous stresses $\sigma_{ij}$, and an exact viscous stress balance. This remains true when there is a kink in the mesh, because the summation of constant stresses over the faces of the kinked control volume sketched in figure 6 still yields a perfect balance. The use of equation (39) to evaluate $\partial u_i/\partial x_j$, however, requires the additional calculation or storage of the nine metric quantities $\partial \xi_i/\partial x_j$ in each cell, whereas equation (38) can be evaluated from the same face areas that are used for the flux balance.

In the case of an unstructured mesh, the weak form (6) leads to a natural discretization with linear elements, in

which the piecewise linear approximation yields a constant stress in each cell. This method yields a representation which is globally correct when averaged over the cells, a result that can be proved by energy estimates for elliptic problems [164]. It should be noted, however, that it yields formulas that are not necessarily locally consistent with the differential equations, if Taylor series expansions are substituted for the solution at the vertices appearing in the local stencil. Figure 7 illustrates the discretization of the Laplacian $u_{xx} + u_{yy}$ which is obtained with linear elements. It shows a particular triangulation such that the approximation is locally consistent with $u_{xx} + 3u_{yy}$. Thus the use of an irregular triangulation in the boundary layer may significantly degrade the accuracy.

Figure 7: Example of discretization of $u_{xx} + u_{yy}$ on a triangular mesh. The discretization is locally equivalent to an approximation of $u_{xx} + 3u_{yy}$.

4.7 Time Stepping Schemes

If the space discretization procedure is implemented separately, it leads to a set of coupled ordinary differential equations, which can be written in the form

$$\frac{dw}{dt} + R(w) = 0, \tag{40}$$

where $w$ is the vector of the flow variables at the mesh points, and $R(w)$ is the vector of the residuals, consisting of the flux balances defined by the space discretization scheme, together with the added dissipative terms. If the objective is simply to reach the steady state and details [...] derivatives are calculated from known values of the flow variables at the beginning of the time step, or an implicit scheme, in which the formulas for the space derivatives include as yet unknown values of the flow variables at the end of the time step, leading to the need to solve coupled equations for the new values. The permissible time step for an explicit scheme is limited by the Courant-Friedrichs-Lewy (CFL) condition, which states that a difference scheme cannot be a convergent and stable approximation unless its domain of dependence contains the domain of dependence of the corresponding differential equation. One can anticipate that implicit schemes will yield convergence in a smaller number of time steps because the time step is no longer constrained by the CFL condition. Implicit schemes will be efficient, however, only if the decrease in the number of time steps outweighs the increase in the computational effort per time step consequent upon the need to solve coupled equations. The prototype implicit scheme can be formulated by estimating $dw/dt$ at $t + \mu\Delta t$ as a linear combination of $R(w^n)$ and $R(w^{n+1})$. The resulting equation

$$w^{n+1} = w^n - \Delta t\left\{(1 - \mu)\,R(w^n) + \mu\,R(w^{n+1})\right\}$$

can be linearized as

$$\left(I + \mu\,\Delta t\,\frac{\partial R}{\partial w}\right)\delta w + \Delta t\,R(w^n) = 0.$$

If one sets $\mu = 1$ and lets $\Delta t \to \infty$ this reduces to the Newton iteration, which has been successfully used in two-dimensional calculations [173, 50]. In the three-dimensional case with, say, an $N \times N \times N$ mesh, the bandwidth of the matrix that must be inverted is of order $N^2$. Direct inversion requires a number of operations proportional to the number of unknowns multiplied by the square of the bandwidth, of the order of $N^7$. This is prohibitive, and forces recourse to either an approximate factorization method or an iterative solution method.

Alternating direction methods, which introduce factors corresponding to each coordinate, are widely used for structured meshes [17, 137]. They cannot be implemented on unstructured tetrahedral meshes that do not contain identifiable mesh directions, although other decompositions are possible [log]. If one chooses to adopt the iterative solution technique, the principal alternatives are variants of the Gauss-Seidel and Jacobi methods. A symmetric Gauss-Seidel method with one iteration per time step is essentially equivalent to an approximate lower-upper (LU) factorization of the implicit scheme [86, 125, 31, 184]. On the other hand, the Jacobi method with a fixed number of iterations per time step reduces to a multistage explicit scheme, belonging to the general class of Runge-Kutta schemes [33]. Schemes of this type have proved very effective for a wide variety of problems, and they have the advantage that they can be applied equally easily on both structured and unstructured meshes [84, 67, 69, 145].

If one reduces the linear model problem corresponding to (40) to an ordinary differential equation by substituting a Fourier mode $\hat{w} = e^{ipx_j}$, the resulting Fourier symbol has an imaginary part proportional to the wave speed, and a negative real part proportional to the diffusion. Thus the time stepping scheme should have a stability region which contains substantial intervals of both the negative real axis and the imaginary axis. To achieve this it pays to treat the convective and dissipative terms in a distinct fashion. Thus the residual is split as

$$R(w) = Q(w) + D(w),$$

where $Q(w)$ is the convective part and $D(w)$ the dissipative part. Denote the time level $n\Delta t$ by a superscript $n$. Then the multistage time stepping scheme is formulated as

$$w^{(n+1,0)} = w^n$$

[...]

chosen to increase the stability interval along the negative real axis.

These schemes do not fall within the framework of Runge-Kutta schemes, and they have much larger stability regions [69]. Two schemes which have been found to be particularly effective are tabulated below. The first is a four-stage scheme with two evaluations of dissipation. Its coefficients are

[...]

The second is a five-stage scheme with three evaluations of dissipation. Its coefficients are

[...]

4.8 Multigrid Methods

4.8.1 Acceleration of Steady Flow Calculations

Radical improvements in the rate of convergence to a steady state can be realized by the multigrid time-stepping technique. The concept of acceleration by the introduction of multiple grids was first proposed by Fedorenko [48]. There is by now a fairly well-developed theory of multigrid methods for elliptic equations based on the concept that the updating scheme acts as a smoothing operator on each grid [24, 53]. This theory does not hold for hyperbolic systems. Nevertheless, it seems that it ought to be possible to accelerate the evolution of a hyperbolic system to a steady state by using large time steps on coarse grids so that disturbances will be more rapidly expelled through the outer boundary. Various multigrid time-stepping schemes designed to take advantage of this effect have been proposed [124, 65, 55, 71, 29, 6, 57, 83, 93].

One can devise a multigrid scheme using a sequence of independently generated coarser meshes, by eliminating alternate points in each coordinate direction. In order to give a precise description of the multigrid scheme, subscripts may be used to indicate the grid. Several transfer operations need to be defined. First the solution vector on grid $k$ must be initialized as

$$w_k^{(0)} = T_{k,k-1}\,w_{k-1},$$

where $w_{k-1}$ is the current value on grid $k - 1$, and $T_{k,k-1}$ is a transfer operator. Next it is necessary to transfer a residual forcing function such that the solution on grid $k$ is driven by the residuals calculated on grid $k - 1$. This can be accomplished by setting

$$P_k = Q_{k,k-1}\,R_{k-1}(w_{k-1}) - R_k\left[w_k^{(0)}\right],$$

where $Q_{k,k-1}$ is another transfer operator. Then $R_k(w_k)$ is replaced by $R_k(w_k) + P_k$ in the time-stepping scheme. Thus, the multistage scheme is reformulated as

$$w_k^{(1)} = w_k^{(0)} - \alpha_1\,\Delta t_k\left[R_k^{(0)} + P_k\right]$$
$$\cdots$$
$$w_k^{(q+1)} = w_k^{(0)} - \alpha_{q+1}\,\Delta t_k\left[R_k^{(q)} + P_k\right].$$

The result $w_k^{(m)}$ then provides the initial data for grid $k + 1$. Finally, the accumulated correction on grid $k$ has to be transferred back to grid $k - 1$ with the aid of an interpolation operator $I_{k-1,k}$. With properly optimized coefficients multistage time stepping schemes can be very efficient drivers of the multigrid process. A W-cycle of the type illustrated in figure 8 proves to be a particularly effective strategy for managing the work split between the meshes. In a three-dimensional case the number of cells is reduced by a factor of eight on each coarser grid. On examination of the figure, it can therefore be seen that the work measured in units corresponding to a step on the fine grid is of the order of

$$1 + 2/8 + 4/64 + \ldots < 4/3,$$

and consequently the very large effective time step of the complete cycle costs only slightly more than a single time step in the fine grid.

Figure 8: Multigrid W-cycle for managing the grid calculation. E, evaluate the change in the flow for one step; T, transfer the data without updating the solution. (Panel 8c: 5 levels.)

4.8.2 Multigrid Implicit Schemes for Unsteady Flow

Time dependent calculations are needed for a number of important applications, such as flutter analysis, or the analysis of the flow past a helicopter rotor, in which the stability limit of an explicit scheme forces the use of much smaller time steps than would be needed for an accurate simulation. In this situation a multigrid explicit scheme can be used in an inner iteration to solve the equations of a fully implicit time stepping scheme [74].

Suppose that (40) is approximated as

$$D_t w^{n+1} + R(w^{n+1}) = 0.$$

Here $D_t$ is a $k$th order accurate backward difference operator of the form

$$D_t = \frac{1}{\Delta t} \sum_{q=1}^{k} \frac{1}{q}\,(\Delta^-)^q,$$

where

$$\Delta^- w^{n+1} = w^{n+1} - w^n.$$
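As an illustration of the backward difference operator, the following minimal sketch expands it into weights on the stored time levels. It assumes the standard form $D_t = \frac{1}{\Delta t}\sum_{q=1}^{k}\frac{1}{q}(\Delta^-)^q$ with $\Delta^- w^{n+1} = w^{n+1} - w^n$; the function name `backward_difference_coefficients` is hypothetical and is not taken from the paper.

```python
from fractions import Fraction
from math import comb


def backward_difference_coefficients(k):
    """Return weights c[0..k] such that
    D_t w^{n+1} = (1/dt) * sum_m c[m] * w^{n+1-m}
    for the assumed operator D_t = (1/dt) * sum_{q=1..k} (1/q) * (Delta^-)^q.
    """
    coeffs = [Fraction(0)] * (k + 1)
    for q in range(1, k + 1):
        # Binomial expansion of the q-th backward difference:
        # (Delta^-)^q w^{n+1} = sum_m (-1)^m * C(q, m) * w^{n+1-m}
        for m in range(q + 1):
            coeffs[m] += Fraction((-1) ** m * comb(q, m), q)
    return coeffs


# First order (backward Euler) and second order weights.
bdf1 = backward_difference_coefficients(1)
bdf2 = backward_difference_coefficients(2)
```

For $k = 2$ the weights are $3/2$, $-2$ and $1/2$ on $w^{n+1}$, $w^n$ and $w^{n-1}$, which matches the three-level second order formula commonly used in dual time stepping.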

Applied to the linear differential equation

$$\frac{dw}{dt} = \alpha w$$

the schemes with $k = 1, 2$ are stable for all $\alpha\Delta t$ in the left half plane (A-stable). Dahlquist has shown that A-stable linear multi-step schemes are at best second order accurate [38]. Gear, however, has shown that the schemes with $k \le 6$ are stiffly stable [49], and one of the higher order schemes may offer a better compromise between accuracy and stability, depending on the application.

Equation (40) is now treated as a modified steady state problem to be solved by a multigrid scheme using variable local time steps in a fictitious time $t^*$. For example, in the case $k = 2$ one solves

$$\frac{\partial w}{\partial t^*} + R^*(w) = 0,$$

where

$$R^*(w) = \frac{3}{2\Delta t}\,w + R(w) - \frac{2}{\Delta t}\,w^n + \frac{1}{2\Delta t}\,w^{n-1},$$

and the last two terms are treated as fixed source terms. The first term shifts the Fourier symbol of the equivalent model problem to the left in the complex plane. While this promotes stability, it may also require a limit to be imposed on the magnitude of the local time step $\Delta t^*$ relative to that of the implicit time step $\Delta t$. This may be relieved by a point-implicit modification of the multistage scheme [119]. In the case of problems with moving boundaries the equations must be modified to allow for movement and deformation of the mesh.

This method has proved effective for the calculation of unsteady flows that might be associated with wing flutter, and also in the calculation of unsteady incompressible flows [18]. It has the advantage that it can be added as an option to a computer program which uses an explicit multigrid scheme, allowing it to be used for the efficient calculation of both steady and unsteady flows.

4.9 Preconditioning

Another way to improve the rate of convergence to a steady state is to multiply the space derivatives in equation (1) by a preconditioning matrix $P$ which is designed to equalize the eigenvalues, so that all the waves can be advanced with optimal time steps. A symmetric preconditioner which equalizes the eigenvalues has been proposed by Van Leer [102]. When the equations are written in stream-aligned coordinates this has the form

$$P = \begin{bmatrix} \frac{\tau}{\beta^2}M^2 & -\frac{\tau}{\beta^2}M & 0 & 0 & 0 \\ -\frac{\tau}{\beta^2}M & \frac{\tau}{\beta^2} + 1 & 0 & 0 & 0 \\ 0 & 0 & \tau & 0 & 0 \\ 0 & 0 & 0 & \tau & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix},$$

where

$$\beta = \tau = \sqrt{1 - M^2} \quad \text{if } M < 1, \qquad \beta = \sqrt{M^2 - 1},\ \ \tau = \sqrt{1 - \frac{1}{M^2}} \quad \text{if } M \ge 1.$$

Turkel has proposed an asymmetric preconditioner which has also proved effective, particularly for flow at low Mach numbers [172]. The use of these preconditioners can lead to instability at stagnation points where there is a zero eigenvalue which cannot be equalized with the eigenvalues $\pm c$.

The preconditioners of Van Leer and Turkel do not take account of the effect of differences in the mesh intervals in the different coordinate directions. The need to resolve the boundary layer generally forces the introduction of mesh cells with very high aspect ratios near the boundary, and these can lead to a severe reduction in the rate of convergence to a steady state. Pierce has recently obtained impressive results using diagonal and block-Jacobi preconditioners which include the mesh intervals [133].

An alternative approach has recently been proposed by Ta'asan [168], in which the equations are written in a canonical form which separates the equations describing acoustic waves from those describing convection. In terms of the velocity components $u$, $v$ and the vorticity $\omega$, temperature $T$, entropy $s$ and total enthalpy $H$, the equations describing steady two-dimensional flow can be written as

[...]

where

[...]

$$D_3 = v\frac{\partial}{\partial x} - u\frac{\partial}{\partial y}, \qquad Q = u\frac{\partial}{\partial x} + v\frac{\partial}{\partial y}.$$

Here the first two equations describe an elliptic system if the flow is subsonic, while the remaining equations are convective. Now separately optimized multigrid procedures are used to solve the two sets of equations, which are essentially decoupled.

4.10 High Order Schemes and Mesh Refinement

The need both to improve the accuracy of computational simulations and to assure known levels of accuracy is the focus of ongoing research. The main routes to improving the accuracy are to increase the order of the discrete scheme and to reduce the mesh interval. High order difference methods are most easily implemented on Cartesian, or at least extremely smooth grids. The expansion of the stencil as the order is increased leads to the need for complex boundary conditions. Compact schemes keep the stencil as small as possible [140, 28]. On simple domains, spectral methods are particularly effective, especially in the case of periodic boundary conditions, and can be used to produce exponentially fast convergence of the error as the mesh interval is decreased [127, 27]. A compromise is to divide the field into subdomains and introduce high order elements. This approach is used in the spectral element method [92].

High order difference schemes and spectral methods have proven particularly useful in direct Navier-Stokes simulations of transient and turbulent flows. High order methods are also beneficial in computational aero-acoustics, where it is desired to track waves over long distances with minimum error. If the flow contains shock waves or contact discontinuities, the ENO method may be used to construct high order non-oscillatory schemes.

In multi-dimensional flow simulations, global reduction of the mesh interval can be prohibitively expensive, motivating the use of adaptive mesh refinement procedures which reduce the local mesh width $h$ if there is an indication that the error is too large [21, 39, 109, 61, 138, 103]. In such h-refinement methods, simple error indicators

such as local solution gradients may be used. Alternatively, the discretization error may be estimated by comparing quantities calculated with two mesh widths, say on the current mesh and a coarser mesh with double the mesh interval. Procedures of this kind may also be used to provide a posteriori estimates of the error once the calculation is completed.

This kind of local adaptive control can also be applied to the local order of a finite element method to produce a p-refinement method, where $p$ represents the order of the polynomial basis functions. Finally, both h- and p-refinement can be combined to produce an h-p method in which $h$ and $p$ are locally optimized to yield a solution with minimum error [126]. Such methods can achieve exponentially fast convergence, and are well established in computational solid mechanics.

5. CURRENT STATUS OF NUMERICAL SIMULATION

This section presents some representative numerical results which confirm the properties of the algorithms which have been reviewed in the last section. These have been drawn from the work of the author and his associates. They also illustrate the kind of calculation which can be performed in an industrial environment, where rapid turn around is important to allow the quick assessment of design changes, and computational costs must be limited.

5.1 One-dimensional shock

In order to verify the discrete structure of stationary shocks, calculations were performed for a one-dimensional problem with initial data containing left and right states compatible with the Rankine Hugoniot conditions. An intermediate state consisting of the arithmetic average of the left and right states was introduced at a single cell in the center of the domain. With this intermediate state the system is not in equilibrium, and the time dependent equations were solved to find an equilibrium solution with a stationary shock wave separating the left and right states. Table 1 shows the result for a shock wave at Mach 20. This calculation used the H-CUSP scheme, which allows a solution with constant stagnation enthalpy, with the limiter defined by equation (23), and $q = 3$. The formulation is described in detail in reference [80]. The table shows the values of $H$, $p$, $M$ and the entropy $S = \log(p/\rho^\gamma) - \log(p_L/\rho_L^\gamma)$. A perfect one point shock structure is displayed. The entropy is zero to 4 decimal places upstream of the shock, exhibits a slight excursion at the interior point, and is constant to 4 decimal places downstream of the shock. It may be noted that the mass, momentum and energy of the initial data are not compatible with the final equilibrium state. According to conservation arguments the total mass, momentum and energy must remain constant if the outflow flux $f_R$ remains equal to the inflow flux $f_L$. Therefore $f_R$ must be allowed to vary according to an appropriate outflow boundary condition to allow the total mass, momentum and energy to be adjusted to values compatible with equilibrium.

    H           p           M          S
    [illegible]                        0.0000
    283.5000    1.0000      20.0000    0.0000
    283.5000    1.0000      20.0000    0.0000
    283.4960    307.4467    0.7229     40.3353
    283.4960    466.4889    0.3804     37.6355
    283.4960    466.4889    0.3804     37.6355
    283.4960    466.4889    0.3804     37.6355

Table 1: Shock Wave at Mach 20

5.2 Euler calculations for Airfoils and Wings

The results of transonic flow calculations for two well known airfoils, the RAE 2822 and the NACA 0012, are presented in figures (22-25). The H-CUSP scheme was again used. The limiter defined by equation (23) was used with $q = 3$. The 5 stage time stepping scheme (42) was augmented by the multigrid scheme described in section 4.8 to accelerate convergence to a steady state. The equations were discretized on meshes with O-topology extending out to a radius of about 100 chords. In each case the calculations were performed on a sequence of successively finer meshes from 40x8 to 320x64 cells, while the multigrid cycles on each of these meshes descended to a coarsest mesh of 10x2 cells. Figure 22 shows the inner parts of the 160x32 meshes for the two airfoils. Figures 23-25 show the final results on 320x64 meshes for the RAE 2822 airfoil at Mach .75 and 3° angle of attack, and for the NACA 0012 airfoil at Mach .8 and 1.25° angle of attack, and also at Mach .85 and 1° angle of attack. In the pressure distributions the pressure coefficient $C_p = (p - p_\infty)/(\tfrac{1}{2}\rho_\infty q_\infty^2)$ is plotted with the negative (suction) pressures upward, so that the upper curve represents the flow over the upper side of a lifting airfoil. The convergence histories show the mean rate of change of the density, and also the total number of supersonic points in the flow field, which provides a useful measure of the global convergence of transonic flow calculations such as these. In each case the convergence history is shown for 100 cycles, while the pressure distribution is displayed after a sufficient number of cycles for its convergence. The pressure distribution of the RAE 2822 airfoil converged in only 25 cycles. Convergence was slower for the NACA 0012 airfoil. In the case of flow at Mach .8 and 1.25° angle of attack, additional cycles were needed to damp out a wave downstream of the weak shock wave on the lower surface.

As a further check on accuracy the drag coefficient should be zero in subsonic flow, or in shock free transonic flow. Table 2 shows the computed drag coefficient on a sequence of three meshes for three examples. The first two are subsonic flows over the RAE 2822 and NACA 0012 airfoils at Mach .5 and 3° angle of attack. The third is the flow over the shock free Korn airfoil at its design point of Mach .75 and 0° angle of attack. In all three cases the drag coefficient is calculated to be zero to four digits on a 160x32 mesh.

Table 2: Drag Coefficient on a sequence of meshes

[...] and 3.06° angle of attack. This again verifies the non-oscillatory character of the solution, and the sharp resolution of shock waves. In this case 50 cycles were sufficient for convergence of the pressure distributions.

Figure 9 shows a calculation of the Northrop YF23 by R.J. Busch, Jr., who used the author's FLO57 code to solve the Euler equations [26]. Although an inviscid model of the flow was used, it can be seen that the simulations are in good agreement with wind tunnel measurements both at Mach .6 with angles of attack of 0, 8 and 16 degrees, and at Mach 1.5 with angles of attack of 0, 4 and 8 degrees. At a high angle of attack the flow separates from the leading edge, and this example shows that in situations where the point of separation is fixed, an inviscid model may still produce a useful prediction. Thus valuable information for the aerodynamic design could be obtained with a relatively inexpensive computational model.

Table 3: AIRPLANE Parallel Performance on the SE,


MD-11 Model
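The downstream state in Table 1 can be cross-checked against the analytic Rankine Hugoniot relations. The following is a minimal sketch, not the H-CUSP calculation itself: it assumes a perfect gas with gamma = 1.4 and a left state with p = 1 and rho = 1 (the density is an assumption, chosen because it reproduces H = 283.5), and the function names are illustrative only. It evaluates the entropy measure both with and without logarithms; in this sketch the tabulated last column appears to match the difference p/rho^gamma - p_L/rho_L^gamma rather than the difference of logarithms, which is an observation from the numbers and not something stated in the text.

```python
import math

def normal_shock(mach_left, gamma=1.4):
    """Analytic Rankine Hugoniot ratios across a stationary normal shock."""
    m2 = mach_left ** 2
    p_ratio = 1.0 + 2.0 * gamma / (gamma + 1.0) * (m2 - 1.0)
    rho_ratio = (gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)
    mach_right = math.sqrt(((gamma - 1.0) * m2 + 2.0)
                           / (2.0 * gamma * m2 - (gamma - 1.0)))
    return p_ratio, rho_ratio, mach_right

def stagnation_enthalpy(p, rho, mach, gamma=1.4):
    """H = c^2/(gamma-1) + u^2/2 with c^2 = gamma*p/rho and u = M*c."""
    c2 = gamma * p / rho
    return c2 * (1.0 / (gamma - 1.0) + 0.5 * mach ** 2)

gamma = 1.4
p_left, rho_left, mach_left = 1.0, 1.0, 20.0   # rho_left = 1 is an assumption

p_ratio, rho_ratio, mach_right = normal_shock(mach_left, gamma)
p_right = p_left * p_ratio
rho_right = rho_left * rho_ratio

h_left = stagnation_enthalpy(p_left, rho_left, mach_left, gamma)
h_right = stagnation_enthalpy(p_right, rho_right, mach_right, gamma)

# Two candidate entropy measures relative to the left state.
entropy_log = (math.log(p_right / rho_right ** gamma)
               - math.log(p_left / rho_left ** gamma))
entropy_diff = p_right / rho_right ** gamma - p_left / rho_left ** gamma

print(f"p_right = {p_right:.4f}, M_right = {mach_right:.4f}")
print(f"H_left = {h_left:.4f}, H_right = {h_right:.4f}")
print(f"entropy (log form) = {entropy_log:.4f}, (difference form) = {entropy_diff:.4f}")
```

The analytic pressure and Mach number agree with the tabulated downstream values to about four significant figures; the small differences in H and p are consistent with a converged discrete solution rather than an exact one.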

Figure 9: Comparison of Experimental and Computed Drag Rise Curve for the YF-23 (Supplied by R. J. Busch Jr.)

Figure 10: Comparison of Experimental and Calculated Results for a HSCT Configuration

The next figures show the results of calculations using the AIRPLANE code developed by T.J. Baker and the author, to solve the Euler equations on an unstructured mesh. This provides the flexibility to treat arbitrarily complex configurations without the need to spend months developing an [...] through the engine nacelles, using 348407 mesh points of 2 1 d 6 6 tetrahedra. This calculation takes 4 hours on an IBM 590 workstation. A parallel version of the code has been developed in collaboration with W.S. Cheng, and the same calculation can be performed in 20 minutes using 16 processors of an IBM SP2. The parallel speed-up for the MD11 is shown in table 3.

Figure 11: Pressure Contours and Sonic Boom on a Representative HSCT Configuration

5.3 Viscous Flow Calculations

The next figures show viscous simulations based on the solution of the Reynolds averaged Navier Stokes equations with turbulence models. Figure 13 shows a two-dimensional calculation for the RAE 2822 airfoil by L. Martinelli. The vertical axis represents the negative pressure coefficient, and there is a shock wave halfway along the upper surface. This example confirms that in the absence of significant shock induced separation, simulations performed on a sufficiently fine mesh (with 512 x 64 cells) can produce excellent agreement with experimental data. Figure 21 shows a simulation of the McDonnell-Douglas F-18 performed by R.M. Cummings, Y.M. Rizk, L.B. Schiff and N.M. Chaderjian at NASA Ames [37]. They used a multiblock mesh with about 900000 mesh points. While this is probably not enough for an accurate quantitative prediction, the agreement with both the experimental data and the visualization is quite good.

Figure 14 shows an unsteady flow calculation for a pitching airfoil performed by J. Alonso using the code which he jointly developed with L. Martinelli and the author [4]. It uses the multigrid implicit scheme described in Section 3.7.2 which allows the number of time steps to be reduced from several thousand to 36 per pitching cycle. The agreement with experimental data is quite good.

Figure 12: Computed Pressure Field for a McDonnell Douglas MD11

5.4 Ship Wave Resistance Calculations

Figures 15-17 show the results of an application of the same multigrid finite volume techniques to the calculation of the flow past a naval frigate, using a code which was developed by J. Farmer, L. Martinelli and the author [47]. The mesh was adjusted during the course of the calculation to conform to the free surface in order to satisfy the exact non-linear boundary condition, while artificial compressibility was used to treat the incompressible flow equations.

Figure 13: Two-Dimensional Turbulent Viscous Calculation (by Luigi Martinelli)

6. AERODYNAMIC SHAPE OPTIMIZATION
6.1 Optimization and Design

Traditionally the process of selecting design variations has been carried out by trial and error, relying on the intuition and experience of the designer. With currently available equipment the turn around for numerical simulations is becoming so rapid that it is feasible to examine an extremely large number of variations. It is not at all likely that repeated trials in an interactive design and analysis procedure can lead to a truly optimum design. In order to take full advantage of the possibility of examining a large design space the numerical simulations need to be combined with automatic search and optimization procedures. This can lead to automatic design methods which will fully realize the potential improvements in aerodynamic efficiency.

The simplest approach to optimization is to define the geometry through a set of design parameters, which may, for example, be the weights $\alpha_i$ applied to a set of shape functions $b_i(x)$ so that the shape is represented as

$$f(x) = \sum_i \alpha_i b_i(x).$$

Then a cost function $I$ is selected which might, for example, be the drag coefficient or the lift to drag ratio, and $I$ is regarded as a function of the parameters $\alpha_i$. The sensitivities $\partial I/\partial\alpha_i$ may now be estimated by making a small variation $\delta\alpha_i$ in each design parameter in turn and recalculating the flow to obtain the change in $I$. Then

$$\frac{\partial I}{\partial\alpha_i} \approx \frac{I(\alpha_i + \delta\alpha_i) - I(\alpha_i)}{\delta\alpha_i}.$$

The gradient vector $\partial I/\partial\alpha$ may now be used to determine a direction of improvement. The simplest procedure is to make a step in the negative gradient direction by setting

$$\alpha^{n+1} = \alpha^n + \delta\alpha, \qquad \delta\alpha = -\lambda\frac{\partial I}{\partial\alpha},$$

so that to first order

$$I + \delta I = I + \frac{\partial I^T}{\partial\alpha}\delta\alpha = I - \lambda\frac{\partial I^T}{\partial\alpha}\frac{\partial I}{\partial\alpha}.$$

More sophisticated search procedures may be used such as quasi-Newton methods, which attempt to estimate the second derivative $\partial^2 I/\partial\alpha_i\partial\alpha_j$ of the cost function from changes in the gradient $\partial I/\partial\alpha$ in successive optimization steps. These methods also generally introduce line searches to find the minimum in the search direction which is defined at each step. The main disadvantage of this approach is the need for a number of flow calculations proportional to the number of design variables to estimate the gradient. The computational costs can thus become prohibitive as the number of design variables is increased.

An alternative approach is to cast the design problem as a search for the shape that will generate the desired pressure distribution. This approach recognizes that the designer usually has an idea of the kind of pressure distribution that will lead to the desired performance. Thus, it is useful to consider the inverse problem of calculating the shape that will lead to a given pressure distribution. The method has the advantage that only one flow solution is required to obtain the desired design. Unfortunately, a physically realizable shape may not necessarily exist, unless the pressure distribution satisfies certain constraints. Thus the problem must be very carefully formulated, otherwise it may be ill posed.

Figure 14: Mach Number Contours. Pitching Airfoil Case. $Re = 1.0 \times 10^6$, $M_\infty = 0.796$, $K_c = 0.202$.

Figure 16: Contours of Surface Wave Elevation Near the Transom Stern

The difficulty that the target pressure may be unattainable may be circumvented by treating the inverse problem as a special case of the optimization problem, with a cost function which measures the error in the solution of the inverse problem. For example, if $p_d$ is the desired surface pressure, one may take the cost function to be an integral over the body surface of the square of the pressure error,

$$I = \frac{1}{2}\int_B (p - p_d)^2 \, dB,$$

or possibly a more general Sobolev norm of the pressure error. This has the advantage of converting a possibly ill posed problem into a well posed one. It has the disadvantage that it incurs the computational costs associated with optimization procedures.
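The finite difference procedure of Section 6.1 can be sketched in a few lines. This is an illustration under stated assumptions, not the method used for the results in this paper: the sine shape functions, the target weights, the step length and the `Cost` class (a cheap stand-in for a flow solution followed by evaluation of an inverse-design cost) are all hypothetical. The sketch counts cost evaluations to show that each gradient requires one evaluation per design variable in addition to the baseline.

```python
import math

def make_basis(n):
    """Hypothetical shape functions b_i(x) on 0 <= x <= 1 (sine bumps)."""
    return [lambda x, k=k: math.sin((k + 1) * math.pi * x) for k in range(n)]

def shape(x, alpha, basis):
    """f(x) = sum_i alpha_i * b_i(x)."""
    return sum(a * b(x) for a, b in zip(alpha, basis))

class Cost:
    """Stand-in for an expensive flow solution plus cost evaluation."""
    def __init__(self, basis, target_alpha, n_samples=21):
        self.basis = basis
        self.xs = [k / (n_samples - 1) for k in range(n_samples)]
        self.target = [shape(x, target_alpha, basis) for x in self.xs]
        self.evaluations = 0

    def __call__(self, alpha):
        self.evaluations += 1
        errors = (shape(x, alpha, self.basis) - t
                  for x, t in zip(self.xs, self.target))
        return 0.5 * sum(e * e for e in errors) / len(self.xs)

def fd_gradient(cost, alpha, step=1e-6):
    """One-sided differences: 1 baseline + len(alpha) perturbed evaluations."""
    base = cost(alpha)
    grad = []
    for i in range(len(alpha)):
        perturbed = list(alpha)
        perturbed[i] += step
        grad.append((cost(perturbed) - base) / step)
    return base, grad

target_alpha = [0.10, -0.03, 0.02, 0.01]      # hypothetical target weights
cost = Cost(make_basis(4), target_alpha)
alpha = [0.0] * 4
lam = 2.0                                      # hypothetical step length
n_steps = 50
history = []
for _ in range(n_steps):
    value, grad = fd_gradient(cost, alpha)
    history.append(value)
    # Step in the negative gradient direction.
    alpha = [a - lam * g for a, g in zip(alpha, grad)]

evals_per_gradient = cost.evaluations // n_steps
print("cost evaluations per gradient:", evals_per_gradient)
print("initial cost:", history[0], "final cost:", history[-1])
```

With four design variables each gradient costs five evaluations; with several thousand variables, as in the wing designs discussed later, the same procedure would need several thousand flow solutions per design step.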

6.2 Application of Control Theory

In order to reduce the computational costs, it turns out that there are advantages in formulating both the inverse problem and more general aerodynamic problems within the framework of the mathematical theory for the control of systems governed by partial differential equations [105]. A wing, for example, is a device to produce lift by controlling the flow, and its design can be regarded as a problem in the optimal control of the flow equations by variation of the shape of the boundary. If the boundary shape is regarded as arbitrary within some requirements of smoothness, then the full generality of shapes cannot be defined with a finite number of parameters, and one must use the concept of the Frechet derivative of the cost with respect to a function. Clearly, such a derivative cannot be determined directly by finite differences of the design parameters because there are now an infinite number of these. Using techniques of control theory, however, the gradient can be determined indirectly by solving an adjoint equation which has coefficients defined by the solution of the flow equations. The cost of solving the adjoint equation is comparable to that of solving the flow equations. Thus the gradient can be determined with roughly the computational costs of two flow solutions, independently of the number of design variables, which may be infinite if the boundary is regarded as a free surface.

Figure 15: Contours of Surface Wave Elevation for a Combatant Ship

Figure 17: Pressure Contours in the Bow Region

For flow about an airfoil or wing, the aerodynamic properties which define the cost function are functions of the flow-field variables ($w$) and the physical location of the boundary, which may be represented by the function $\mathcal{F}$, say. Then

$$I = I(w, \mathcal{F}),$$

and a change in $\mathcal{F}$ results in a change

$$\delta I = \frac{\partial I^T}{\partial w}\delta w + \frac{\partial I^T}{\partial\mathcal{F}}\delta\mathcal{F}, \qquad (43)$$

in the cost function. Using control theory, the governing
equations of the flowfield are introduced as a constraint in such a way that the final expression for the gradient does not require reevaluation of the flowfield. In order to achieve this $\delta w$ must be eliminated from (43). Suppose that the governing equation $R$ which expresses the dependence of $w$ and $\mathcal{F}$ within the flowfield domain $D$ can be written as

$$R(w, \mathcal{F}) = 0. \qquad (44)$$

Then $\delta w$ is determined from the equation

$$\delta R = \left[\frac{\partial R}{\partial w}\right]\delta w + \left[\frac{\partial R}{\partial\mathcal{F}}\right]\delta\mathcal{F} = 0. \qquad (45)$$

Next, introducing a Lagrange Multiplier $\psi$, we have

$$\delta I = \frac{\partial I^T}{\partial w}\delta w + \frac{\partial I^T}{\partial\mathcal{F}}\delta\mathcal{F} - \psi^T\left(\left[\frac{\partial R}{\partial w}\right]\delta w + \left[\frac{\partial R}{\partial\mathcal{F}}\right]\delta\mathcal{F}\right)$$

$$= \left\{\frac{\partial I^T}{\partial w} - \psi^T\left[\frac{\partial R}{\partial w}\right]\right\}\delta w + \left\{\frac{\partial I^T}{\partial\mathcal{F}} - \psi^T\left[\frac{\partial R}{\partial\mathcal{F}}\right]\right\}\delta\mathcal{F}.$$

Choosing $\psi$ to satisfy the adjoint equation

$$\left[\frac{\partial R}{\partial w}\right]^T\psi = \frac{\partial I}{\partial w} \qquad (46)$$

the first term is eliminated, and we find that

$$\delta I = G\,\delta\mathcal{F}, \qquad (47)$$

where

$$G = \frac{\partial I^T}{\partial\mathcal{F}} - \psi^T\left[\frac{\partial R}{\partial\mathcal{F}}\right].$$

The advantage is that (47) is independent of $\delta w$, with the result that the gradient of $I$ with respect to an arbitrary number of design variables can be determined without the need for additional flow-field evaluations. In the case that (44) is a partial differential equation, the adjoint equation (46) is also a partial differential equation and appropriate boundary conditions must be determined.

After making a step in the negative gradient direction, the gradient can be recalculated and the process repeated to follow a path of steepest descent until a minimum is reached. In order to avoid violating constraints, such as a minimum acceptable wing thickness, the gradient may be projected into the allowable subspace within which the constraints are satisfied. In this way one can devise procedures which must necessarily converge at least to a local minimum, and which can be accelerated by the use of more sophisticated descent methods such as conjugate gradient or quasi-Newton algorithms. There is the possibility of more than one local minimum, but in any case the method will lead to an improvement over the original design. Furthermore, unlike the traditional inverse algorithms, any measure of performance can be used as the cost function.

In reference [72] the author derived the adjoint equations for transonic flows modelled by both the potential flow equation and the Euler equations. The theory was developed in terms of partial differential equations, leading to an adjoint partial differential equation. In order to obtain numerical solutions both the flow and the adjoint equations must be discretized. The control theory might be applied directly to the discrete flow equations which result from the numerical approximation of the flow equations by finite element, finite volume or finite difference procedures. This leads directly to a set of discrete adjoint equations with a matrix which is the transpose of the Jacobian matrix of the full set of discrete nonlinear flow equations. On a three-dimensional mesh with indices $i, j, k$ the individual adjoint equations may be derived by collecting together all the terms multiplied by the variation $\delta w_{i,j,k}$ of the discrete flow variable $w_{i,j,k}$. The resulting discrete adjoint equations represent a possible discretization of the adjoint partial differential equation. If these equations are solved exactly they can provide an exact gradient of the inexact cost function which results from the discretization of the flow equations. On the other hand any consistent discretization of the adjoint partial differential equation will yield the exact gradient in the limit as the mesh is refined. The trade-off between the complexity of the adjoint discretization, the accuracy of the resulting estimate of the gradient, and its impact on the computational cost to approach an optimum solution is a subject of ongoing research.

The true optimum shape belongs to an infinitely dimensional space of design parameters. One motivation for developing the theory for the partial differential equations of the flow is to provide an indication in principle of how such a solution could be approached if sufficient computational resources were available. Another motivation is that it highlights the possibility of generating ill posed formulations of the problem. For example, if one attempts to calculate the sensitivity of the pressure at a particular location to changes in the boundary shape, there is the possibility that a shape modification could cause a shock wave to pass over that location. Then the sensitivity could become unbounded. The movement of the shock, however, is continuous as the shape changes. Therefore a quantity such as the drag coefficient, which is determined by integrating the pressure over the surface, also depends continuously on the shape. The adjoint equation allows the sensitivity of the drag coefficient to be determined without the explicit evaluation of pressure sensitivities which would be ill posed.

The discrete adjoint equations, whether they are derived directly or by discretization of the adjoint partial differential equation, are linear. Therefore they could be solved by direct numerical inversion. The cost of direct inversion can become prohibitive, however, as the mesh is refined, and it becomes more efficient to use iterative solution methods. Moreover, because of the similarity of the adjoint equations to the flow equations, the same iterative methods which have been proved to be efficient for the solution of the flow equations are efficient for the solution of the adjoint equations.

The control theory formulation for optimal aerodynamic design has proved effective in a variety of applications [73, 57, 144]. The adjoint equations have also been used by Ta'asan, Kuruvila and Salas [167], who have implemented a one shot approach in which the constraint represented by the flow equations is only required to be satisfied by the final converged solution, and computational costs are also reduced by applying multigrid techniques to the geometry modifications as well as the solution of the flow and adjoint equations. Pironneau has studied the use of control theory for optimal shape design of systems governed by elliptic equations [1 6], and more recently the Navier-Stokes equations, and also wave reflection problems. Adjoint methods have also been used by Baysal and Eleshaky [16].

6.3 Three-Dimensional Design using the Euler Equations

In order to illustrate the application of control theory to aerodynamic design problems, this section treats the case of three-dimensional wing design using the inviscid Euler equations as the mathematical model for compressible flow. A transformation to a body-fitted coordinate system will be introduced, so that variations in the wing shape induce corresponding variations in the computational mesh. Thus the flow is determined by the solution of the transformed equation (5). Let

$$K_{ij} = \left[\frac{\partial x_i}{\partial\xi_j}\right], \qquad J = \det(K), \qquad K^{-1}_{ij} = \left[\frac{\partial\xi_i}{\partial x_j}\right],$$

and

$$Q = JK^{-1}.$$
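The adjoint relations (44)-(47) of Section 6.2 can be exercised on a small linear model problem. The sketch below is illustrative only and is not the discretization used in the paper: it assumes a toy "flow equation" R(w, F) = A w - C F - b = 0 with hypothetical matrices A, C and vector b, and a hypothetical cost I = 1/2 |w - w_target|^2. Under these assumptions dR/dw = A, dR/dF = -C and dI/dF = 0, so the gradient is G = psi^T C, obtained from one flow solve and one adjoint solve regardless of the number of design variables. A finite difference gradient is computed for comparison.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x

def transpose(m):
    return [list(col) for col in zip(*m)]

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

# Hypothetical model data: 3 flow unknowns, 4 design variables.
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
C = [[1.0, 0.0, 2.0, 0.5], [0.0, 1.0, -1.0, 0.0], [0.5, 0.0, 0.0, 1.0]]
b = [1.0, 2.0, 3.0]
w_target = [0.5, 1.0, 0.5]

def flow_solve(F):
    """Solve R(w, F) = A w - C F - b = 0 for w."""
    return solve(A, [bi + ci for bi, ci in zip(b, matvec(C, F))])

def cost(w):
    return 0.5 * sum((wi - ti) ** 2 for wi, ti in zip(w, w_target))

F = [0.1, -0.2, 0.3, 0.0]
w = flow_solve(F)

# Adjoint equation (46): [dR/dw]^T psi = dI/dw.
psi = solve(transpose(A), [wi - ti for wi, ti in zip(w, w_target)])

# Gradient (47): G = dI/dF^T - psi^T [dR/dF] = psi^T C for this model.
G_adjoint = matvec(transpose(C), psi)

# Finite difference check: one extra flow solve per design variable.
eps = 1e-6
G_fd = []
for j in range(len(F)):
    F_pert = list(F)
    F_pert[j] += eps
    G_fd.append((cost(flow_solve(F_pert)) - cost(w)) / eps)

print("adjoint gradient:          ", G_adjoint)
print("finite difference gradient:", G_fd)
```

The two gradients agree to within the truncation error of the one-sided differences, while the adjoint route needs only two linear solves in total.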

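The elements of $Q$ are the cofactors of $K$, and in a finite volume discretization they are just the face areas of the computational cells projected in the $x_1$, $x_2$, and $x_3$ directions. Also introduce scaled contravariant velocity components

$$U_i = Q_{ij} u_j.$$

The transformed equations can now be written as

$$\frac{\partial W}{\partial t} + \frac{\partial F_i}{\partial\xi_i} = 0, \qquad (48)$$

where

$$W = Jw$$

and

$$F_i = Q_{ij} f_j = \left[\rho U_i,\; \rho U_i u_1 + Q_{i1}p,\; \rho U_i u_2 + Q_{i2}p,\; \rho U_i u_3 + Q_{i3}p,\; \rho U_i H\right]^T.$$

Assume now that the new computational coordinate system conforms to the wing in such a way that the wing surface $B_W$ is represented by $\xi_2 = 0$. Then the flow is determined as the steady state solution of equation (48) subject to the flow tangency condition

$$U_2 = 0 \quad \text{on } B_W. \qquad (49)$$

At the far field boundary $B_F$, conditions are specified for incoming waves, as in the two-dimensional case, while outgoing waves are determined by the solution.

The weak form of the Euler equations for steady flow can be written as

$$\int_D \frac{\partial\phi^T}{\partial\xi_i} F_i \, dD = \int_B n_i \phi^T F_i \, dB, \qquad (50)$$

where the test vector $\phi$ is an arbitrary differentiable function and $n_i$ is the outward normal at the boundary. If a differentiable solution $w$ is obtained to this equation, it can be integrated by parts to give

$$\int_D \phi^T \frac{\partial F_i}{\partial\xi_i} \, dD = 0,$$

and since this is true for any $\phi$, the differential form can be recovered. If the solution is discontinuous, equation (50) may be integrated by parts separately on either side of the discontinuity to recover the shock jump conditions.

Suppose now that it is desired to control the surface pressure by varying the wing shape. It is convenient to retain a fixed computational domain. Variations in the shape then result in corresponding variations in the mapping derivatives defined by $K$. Introduce the cost function

$$I = \frac{1}{2}\iint_{B_W} (p - p_d)^2 \, d\xi_1 d\xi_3,$$

where $p_d$ is the desired pressure. The design problem is now treated as a control problem where the control function is the wing shape, which is to be chosen to minimize $I$ subject to the constraints defined by the flow equations (48-50). A variation in the shape will cause a variation $\delta p$ in the pressure and consequently a variation in the cost function

$$\delta I = \iint_{B_W} (p - p_d)\,\delta p \, d\xi_1 d\xi_3.$$

Since $p$ depends on $w$ through the equation of state (2), the variation $\delta p$ can be determined from the variation $\delta w$. Define the Jacobian matrices

$$A_i = \frac{\partial f_i}{\partial w}, \qquad C_i = Q_{ij} A_j.$$

The weak form of the equation for $\delta w$ in the steady state becomes

$$\int_D \frac{\partial\phi^T}{\partial\xi_i}\,\delta F_i \, dD = \int_B n_i \phi^T \delta F_i \, dB,$$

where

$$\delta F_i = C_i\,\delta w + \delta Q_{ij} f_j,$$

which should hold for any differentiable test function $\phi$. This equation may be added to the variation in the cost function, which may now be written as

$$\delta I = \iint_{B_W} (p - p_d)\,\delta p \, d\xi_1 d\xi_3 - \int_D \frac{\partial\phi^T}{\partial\xi_i}\,\delta F_i \, dD + \int_B n_i \phi^T \delta F_i \, dB. \qquad (53)$$

On the wing surface $B_W$, $n_1 = n_3 = 0$ and it follows from equation (49) that

$$\delta F_2 = \left[0,\; Q_{21}\delta p,\; Q_{22}\delta p,\; Q_{23}\delta p,\; 0\right]^T + p\left[0,\; \delta Q_{21},\; \delta Q_{22},\; \delta Q_{23},\; 0\right]^T. \qquad (54)$$

Since the weak equation for $\delta w$ should hold for an arbitrary choice of the test vector $\phi$, we are free to choose $\phi$ to simplify the resulting expressions. Therefore we set $\phi = \psi$, where the costate vector $\psi$ is the solution of the adjoint equation

$$\frac{\partial\psi}{\partial t} - C_i^T \frac{\partial\psi}{\partial\xi_i} = 0 \quad \text{in } D. \qquad (55)$$

At the outer boundary incoming characteristics for $\psi$ correspond to outgoing characteristics for $\delta w$. Consequently one can choose boundary conditions for $\psi$ such that

$$n_i \psi^T C_i\,\delta w = 0.$$

Then if the coordinate transformation is such that $\delta Q$ is negligible in the far field, the only remaining boundary term is

$$-\iint_{B_W} \psi^T \delta F_2 \, d\xi_1 d\xi_3.$$

Thus by letting $\psi$ satisfy the boundary condition,

$$Q_{21}\psi_2 + Q_{22}\psi_3 + Q_{23}\psi_4 = (p - p_d) \quad \text{on } B_W, \qquad (56)$$

we find finally that

$$\delta I = -\int_D \frac{\partial\psi^T}{\partial\xi_i}\,\delta Q_{ij} f_j \, dD - \iint_{B_W} \left(\delta Q_{21}\psi_2 + \delta Q_{22}\psi_3 + \delta Q_{23}\psi_4\right) p \, d\xi_1 d\xi_3. \qquad (57)$$

A convenient way to treat a wing is to introduce sheared parabolic coordinates as shown in figure 18 through the transformation

$$x = x_0(\zeta) + \frac{1}{2}a(\zeta)\left\{\xi^2 - (\eta + S(\xi,\zeta))^2\right\}$$
$$y = y_0(\zeta) + a(\zeta)\,\xi\,(\eta + S(\xi,\zeta))$$
$$z = \zeta.$$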
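The sheared parabolic transformation can be checked numerically: the x and y formulas are the real and imaginary parts of a complex square, x + iy = x0 + i y0 + (1/2) a (xi + i(eta + S))^2. The sketch below verifies this identity on a few sample points. The particular functions x0, y0, a and S are hypothetical placeholders chosen only to make the example runnable; they are not the geometry used in the paper.

```python
def x0(zeta):
    """Hypothetical x-coordinate of the swept singular line."""
    return 0.3 * zeta

def y0(zeta):
    """Hypothetical y-coordinate of the singular line."""
    return 0.0

def a(zeta):
    """Hypothetical scale factor for spanwise chord variation."""
    return 1.0 - 0.05 * zeta

def S(xi, zeta):
    """Hypothetical bump defining the wing surface at eta = 0."""
    return 0.08 / (1.0 + xi * xi)

def sheared_parabolic(xi, eta, zeta):
    """Map computational (xi, eta, zeta) to Cartesian (x, y, z)."""
    s = eta + S(xi, zeta)
    x = x0(zeta) + 0.5 * a(zeta) * (xi * xi - s * s)
    y = y0(zeta) + a(zeta) * xi * s
    return x, y, zeta

def complex_form(xi, eta, zeta):
    """Same mapping written as x + iy = x0 + i*y0 + a/2 * (xi + i(eta+S))^2."""
    s = eta + S(xi, zeta)
    return complex(x0(zeta), y0(zeta)) + 0.5 * a(zeta) * complex(xi, s) ** 2

zeta = 0.5
samples = [(-1.5 + 0.25 * i, 0.2 * j) for i in range(13) for j in range(6)]
max_diff = 0.0
for xi, eta in samples:
    x, y, _ = sheared_parabolic(xi, eta, zeta)
    max_diff = max(max_diff, abs(complex(x, y) - complex_form(xi, eta, zeta)))

print("largest difference between the two forms:", max_diff)
```

In this form the surface eta = 0 is shaped entirely by S, which is why S can be treated as the control function.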

18a: x,y-Plane. 18b: ξ,η-Plane.

Figure 18: Sheared Parabolic Mapping.

Here $x = x_1$, $y = x_2$, $z = x_3$ are the Cartesian coordinates, and $\xi$ and $\eta + S$ correspond to parabolic coordinates generated by the mapping

$$x + iy = x_0 + iy_0 + \frac{1}{2}a(\zeta)\left\{\xi + i(\eta + S)\right\}^2$$

at a fixed span station $\zeta$. $x_0(\zeta)$ and $y_0(\zeta)$ are the coordinates of a singular line which is swept to lie just inside the leading edge of a swept wing, while $a(\zeta)$ is a scale factor to allow for spanwise chord variations.

We now treat $S(\xi,\zeta)$ as the control. Substitution of these formulas yields the variation in the form

$$\delta I = \iint G(\xi,\zeta)\,\delta S(\xi,\zeta)\,d\xi\,d\zeta,$$

where the gradient $G(\xi,\zeta)$ is obtained by evaluating the integrals in equation (57). Thus to reduce $I$ we can choose

$$\delta S = -\lambda G,$$

where $\lambda$ is sufficiently small and non-negative. In order to impose a thickness constraint we can define a baseline surface $S_0(\xi,\zeta)$ below which $S(\xi,\zeta)$ is not allowed to fall. Now we take $\lambda = \lambda(\xi,\zeta)$ as a non-negative function such that

$$S(\xi,\zeta) + \delta S(\xi,\zeta) \ge S_0(\xi,\zeta). \qquad (58)$$

Then the constraint is satisfied, while

$$\delta I = -\iint \lambda G^2 \, d\xi\,d\zeta \le 0.$$

The costate solution $\psi$ is a legitimate test function for the weak form of the flow equations only if it is differentiable. Smoothness should also be preserved in the redesigned shape. It is therefore crucially important to introduce appropriate smoothing procedures. In order to avoid discontinuities in the adjoint boundary condition which would be caused by the appearance of shock waves, the cost function for the target pressure may be modified [...]. Then [...] and the smooth quantity $Z$ replaces $p - p_d$ in the adjoint boundary condition.

Independent movement of the boundary mesh points could produce discontinuities in the designed shape. In order to prevent this the gradient may be also smoothed. Both explicit and implicit smoothing procedures are useful. Suppose that the movement of the surface mesh points were defined by local B-splines. In the case of a uniform one-dimensional mesh, a B-spline with a displacement $d$ centered at the mesh point $i$ would produce displacements $d/4$ at $i+1$ and $i-1$ and zero elsewhere, while preserving continuity of the first and second derivatives. Thus we can suppose that the discrete surface displacement has the form

$$\delta S = Bd,$$

where $B$ is a matrix with coefficients defined by the B-splines, and $d_i$ is the displacement associated with the B-spline centered at $i$. Then, using the discrete formulas, to first order the change in the cost is

$$\delta I = G^T\delta S = G^T Bd.$$

Thus the gradient with respect to the B-spline coefficients is obtained by multiplying $G$ by $B^T$, and a descent step is defined by setting

$$d = -\lambda B^T G, \qquad \delta S = Bd = -\lambda BB^T G,$$

where $\lambda$ is sufficiently small and positive. The coefficients of $B$ can be renormalized to produce unit row sums. With a uniform mesh spacing in the computational domain this formula is equivalent to the use of a gradient modified by two passes of the explicit smoothing procedure

$$\bar{G}_{i,k} = \frac{1}{6}G_{i+1,k} + \frac{2}{3}G_{i,k} + \frac{1}{6}G_{i-1,k},$$

with a similar smoothing procedure in the $k$ direction.

Implicit smoothing may also be used. The smoothing equation

$$-\epsilon\,\bar{G}_{i+1,k} + (1 + 2\epsilon)\,\bar{G}_{i,k} - \epsilon\,\bar{G}_{i-1,k} = G_{i,k}$$

approximates the differential equation

$$\bar{G} - \frac{\partial}{\partial\xi}\epsilon\frac{\partial\bar{G}}{\partial\xi} = G.$$

If one sets $\delta S = -\lambda\bar{G}$, then to first order the change in the cost is

$$\delta I = -\lambda\iint G\,\bar{G}\,d\xi\,d\zeta = -\lambda\iint\left(\bar{G}^2 + \epsilon\left(\frac{\partial\bar{G}}{\partial\xi}\right)^2\right)d\xi\,d\zeta < 0,$$

assuring an improvement if $\lambda$ is sufficiently small and positive, unless the process has already reached a stationary point at which $G = 0$.

6.4 Design of Swept Wings for Very Low Shock Drag

The method has been used to carry out a study of swept wing designs which might be appropriate for long range transport aircraft. Since three dimensional calculations require substantial computational resources, it is extremely important for the practical implementation of the method to use fast solution algorithms for the flow and the
adjoint equations. In this case the author's FLO87 computer program has been used as the basis of the design method. FLO87 solves the three dimensional Euler equations with a cell-centered finite volume scheme, and uses residual averaging and multigrid acceleration to obtain very rapid steady state solutions, usually in 25 to 50 multigrid cycles [66, 70]. Upwind biasing is used to produce non-oscillatory solutions, and assure the clean capture of shock waves. This is introduced through the addition of carefully controlled numerical diffusion terms, with a magnitude of order $\Delta x^3$ in smooth parts of the flow. The adjoint equations are treated in the same way as the flow equations. The fluxes are first estimated by central differences, and then modified by downwind biasing through numerical diffusive terms which are supplied by the same subroutines that were used for the flow equations.

The study has been focussed on wings designed for cruising at Mach .85, with lift coefficients in the range of .5 to .55. In every case, the wing planform was fixed while the sections were free to be changed arbitrarily by the design method, with a restriction on the minimum thickness. The [...] 0.6. This section, which has a thickness to chord ratio of 9.5 percent, was used at the tip. Similar sections with an increased thickness were used inboard. The variation of thickness was non-linear with a more rapid increase near the root, where the thickness to chord ratio of the basic section was multiplied by a factor of 1.47. The inboard sections were rotated upwards to give the initial wing 3.5 degrees twist from root to tip. The two-dimensional pressure distribution of the basic wing section at its design point was introduced as a target pressure distribution uniformly across the span. This target is presumably not realizable, but serves to favor the establishment of a relatively benign pressure distribution. The total inviscid drag coefficient, due to the combination of vortex and shock wave drag, was also included in the cost function. Since the main objective of the study was to minimize the drag, the target pressure distribution was reset after every fourth design cycle to a distribution derived by smoothing the existing pressure distribution. This allows the scheme more freedom to make changes which reduce drag. The calculations were performed with the lift coefficient forced to approach a fixed value by adjusting the angle of attack every fifth iteration of the flow solution. It was found that the computational costs can be reduced by using only 15 multigrid cycles in each flow solution, and in each adjoint solution. Although this is not enough for full convergence, it proves sufficient to provide a shape modification which leads to an improvement.

Figures 27 and 28 show a wing which was designed for a lift coefficient of .50 at Mach .85. In order to prevent the final wing from becoming too thin the threshold $S_0(\xi,\zeta)$ was set at three quarters of the height of the bump $S(\xi,\zeta)$ defining the initial wing. This calculation was performed on a mesh with 192 intervals in the $\xi$ direction wrapping around the wing, 32 intervals in the normal $\eta$ direction and 48 intervals in the spanwise $\zeta$ direction, giving a total of 294912 cells. The wing was specified by 33 sections, each with 128 points, giving a total of 4224 design variables. The plots show the initial wing geometry and pressure distribution, and the modified geometry and pressure distribution after 40 design cycles. The total inviscid drag coefficient was reduced from 0.0210 to 0.0112. The initial design exhibits a very strong shock wave in the inboard region. It can be seen that this is completely eliminated, leaving a very weak shock wave in the outboard region. To verify the solution, the final geometry was analyzed with another method, using the computer program FLO67. This program uses a cell-vertex formulation, and has recently been modified to incorporate a local extremum diminishing algorithm with a very low level of numerical diffusion [74]. When run to full convergence it was found that a better estimate of the drag coefficient of the redesigned wing is 0.0094 at Mach 0.85 with a lift coefficient of 0.5, giving a lift to drag ratio of 53. The results from FLO67 for the initial and final wings are illustrated in Figures 29 and 30. A calculation at Mach 0.500 shows a drag coefficient of 0.0087 for a lift coefficient of 0.5. Since in this case the flow is entirely subsonic, this provides an estimate of the vortex drag for this planform and lift distribution, which is just what one obtains from the standard formula for induced drag, $C_D = C_L^2/(\epsilon\pi AR)$, with an aspect ratio $AR = 9$, and an efficiency factor $\epsilon = 0.97$. Thus the design method has reduced the shock wave drag coefficient to about 0.0007 at a lift coefficient of 0.5. Figure 31 shows the result of an analysis for an off design point with the Mach number increased to .86 with the same lift coefficient of .5. This results in a flat-topped pressure distribution terminating with a weak shock of nearly uniform strength across the whole span. The drag coefficient is .0097. The penalty of .0003 is so small that this might be a preferred cruising condition.

A second wing was designed in exactly the same manner as the first, starting from the same initial geometry and with the same constraints, to give a lift coefficient of .55 at Mach .85. This produces stronger shock waves and is therefore a more severe test of the method. In this case the total inviscid drag coefficient was reduced from 0.0243 to 0.0134 in 40 design cycles. Again the performance of the final design was verified by a calculation with FLO67, and when the result was fully converged the drag coefficient was found to be 0.0115. A subsonic calculation at Mach .500 shows a drag coefficient of 0.0107 for a lift coefficient of 0.55. Thus in this case the shock wave drag coefficient is about 0.0008. For a representative transport aircraft the parasite drag coefficient of the wing due to skin friction is about 0.0045. Also the fuselage drag coefficient is about 0.0050, the nacelle drag coefficient is about 0.0015, the empennage drag coefficient is about 0.0020, and excrescence drag coefficient is about 0.0010. This would give a total drag coefficient $C_D = 0.0255$ for a lift coefficient of 0.55, corresponding to a lift to drag ratio $L/D = 21.6$. This would be a substantial improvement over the values obtained by currently flying transport aircraft.

6.5 Optimization of Complex Configurations

In order to treat more complex configurations one can use a numerical grid generation procedure to produce a body fitted mesh for the initial geometry, and then modify the mesh in subsequent design cycles by an analytic perturbation formula. In the two-dimensional case, for example, with computational coordinates $\xi$, $\eta$, let the boundary displacement at $\eta = 0$ be $\delta x(\xi)$, $\delta y(\xi)$. Then the mesh points along the radial coordinate lines $\xi$ = constant can be replaced by

[...]

yielding

[...]

Such a procedure has been implemented by J. Reuther for the three-dimensional Euler equations, and applied to the optimization of wing-body configurations [14!].

It is also possible to show that in the continuous limit the field integral in equation (57) can be eliminated. Let
the change in the coordinates $x_i$ at fixed $\xi$ be $\delta x_i(\xi)$. Then, using the fact that the fluxes $f_i(w)$ satisfy the flow equation (48), it is possible to show by a direct calculation

where

A detailed derivation is given in reference [78]. Thus the perturbation equation can be written as

where $\delta w$ is the variation in the solution at fixed $x$ caused by the change in the boundary, while $\delta w'$ is the change in the original solution $w(\xi)$ corresponding to the mesh movement $\delta x(\xi)$

Now

and if $\psi$ satisfies the adjoint equation the entire field integral is eliminated, leaving only the boundary integral in equation (57).

In an actual discretization the field terms are not zero, but this result suggests that they should be small if a fine enough mesh is used, and might be dropped. This allows a drastic simplification of the treatment of complex configurations. Preliminary numerical experiments with airfoil and wing calculations indicate roughly the same convergence with and without the field terms in the gradient.

7. OUTLOOK AND CONCLUSIONS

Better algorithms and better computer hardware have contributed about equally to the progress of computational science in the last two decades. In 1970 the Control Data 6600 represented the state of the art in computer hardware with a speed of about $10^6$ operations per second (one megaflop), while in 1990 the 8 processor Cray YMP offered a performance of about $10^9$ operations per second (one gigaflop). Correspondingly, steady-state Euler calculations which required 5,000-10,000 steps prior to 1980 could be performed in 10-50 steps in 1990 using multigrid acceleration. With the advent of massively parallel computers it appears that the progress of computer hardware may even accelerate. Teraflop machines offering further improvement by a factor of 1,000 are likely to be available within a few years. Parallel architectures will force a reappraisal of existing algorithms, and their effective utilization will require the extensive development of new parallel software.

In parallel with the transition to more sophisticated algorithms, the present challenge is to extend the effective use of CFD to more complex applications. A key problem is the treatment of multiple space and time scales. These arise not only in turbulent flows, but also in many other situations such as chemically reacting flows, combustion, flame fronts and plasma dynamics. Another challenge is presented by problems with moving boundaries. Examples include helicopter rotors, and rotor-stator interaction in turbomachinery. Algorithms for these problems can be significantly improved by innovative concepts, such as the idea of time inclining. It can be anticipated that interdisciplinary applications in which CFD is coupled with the computational analysis of other properties of the design will play an increasingly important role. These applications may include structural, thermal and electromagnetic analysis. Aeroelastic problems and integrated control system and aerodynamic design are likely target areas. The development of improved algorithms continues to be important in providing the basic building blocks for numerical simulation. In particular, better error estimation procedures must be developed and incorporated in the simulation software to provide error control. The basic simulation software is only one of the needed ingredients, however. The flow solver must be embedded in a user-friendly system for geometry modeling, output analysis, and data management that will provide a complete numerical design environment. These are the ingredients which are needed for the full realization of the concept of a numerical wind tunnel. Figures 19 and 20 illustrate the way in which a numerical wind tunnel might evolve from current techniques, which involve massive data handling tasks, to a fully integrated design environment.

Figure 19: Concept for a numerical wind tunnel.

Figure 20: Advanced numerical wind tunnel.

In the long run, computational simulation should become the principal tool of the aerodynamic design process because of the flexibility it provides for the rapid and comparatively inexpensive evaluation of alternative designs, and because it can be integrated in a numerical design environment providing for both multi-disciplinary analysis and multi-disciplinary optimization.

REFERENCES

[1] R. Abid, C.G. Speziale, and S. Thangam. Application of a new $k$-$\tau$ model to near wall turbulent flows. AIAA Paper 91-0614, AIAA 29th Aerospace Sciences Meeting, Reno, NV, January 1991.
[2] H. Aiso. Admissibility of difference approximations for scalar conservation laws. Hiroshima Math. Journal, 23:15-61, 1993.
[3] J. J. Alonso and A. Jameson. Fully-implicit time-marching aeroelastic solutions. AIAA paper 94-0056, AIAA 32nd Aerospace Sciences Meeting, Reno, Nevada, January 1994.
[4] J. J. Alonso, L. Martinelli, and A. Jameson. Multigrid unsteady Navier-Stokes calculations with aeroelastic applications. AIAA paper 95-0048, AIAA 33rd Aerospace Sciences Meeting, Reno, Nevada, January 1995.
[5] B.K. Anderson, J.L. Thomas, and B. Van Leer. A comparison of flux vector splittings for the Euler equations. AIAA Paper 85-0122, Reno, NV, January 1985.
[6] W.K. Anderson, J.L. Thomas, and D.L. Whitfield. Multigrid acceleration of the flux split Euler equations. AIAA Paper 86-0274, AIAA 24th Aerospace Sciences Meeting, Reno, January 1986.
[7] T.J. Baker. Mesh generation by a sequence of transformations. Appl. Num. Math., 2:515-528, 1986.
[8] N. Balakrishnan and S. M. Deshpande. New upwind schemes with wave-particle splitting for inviscid compressible flows. Report 91 FM 12, Indian Institute of Science, 1991.
[9] B. Baldwin and H. Lomax. Thin layer approximation and algebraic model for separated turbulent flow. AIAA Paper 78-257, 1978.
[10] B.S. Baldwin and T.J. Barth. A one-equation turbulence transport model for high Reynolds number wall-bounded flows. AIAA Paper 91-0610, AIAA 29th Aerospace Sciences Meeting, Reno, NV, January 1991.
[11] T. J. Barth. Aspects of unstructured grids and finite volume solvers for the Euler and Navier Stokes equations. In von Karman Institute for Fluid Dynamics Lecture Series Notes 1994-05, 1994.
[12] T.J. Barth and P.O. Frederickson. Higher order solution of the Euler equations on unstructured grids using quadratic reconstruction. AIAA paper 90-0013, January 1990.
[13] T.J. Barth and D.C. Jespersen. The design and application of upwind schemes on unstructured meshes. AIAA paper 89-0366, AIAA 27th Aerospace Sciences Meeting, Reno, Nevada, January 1989.
[14] J.T. Batina. Implicit flux-split Euler schemes for unsteady aerodynamic analysis involving unstructured dynamic meshes. AIAA paper 90-0836, April 1990.
[15] F. Bauer, P. Garabedian, D. Korn, and A. Jameson. Supercritical Wing Sections II. Springer Verlag, New York, 1975.
[16] O. Baysal and M. E. Eleshaky. Aerodynamic design optimization using sensitivity analysis and computational fluid dynamics. AIAA paper 91-0471, 29th Aerospace Sciences Meeting, Reno, Nevada, January 1991.
[17] R.W. Beam and R.F. Warming. An implicit finite difference algorithm for hyperbolic systems in conservation form. J. Comp. Phys., 23:87-110, 1976.
[18] ... Reno, Nevada, January 1995.
[19] J.A. Benek, P.G. Buning, and J.L. Steger. A 3-D Chimera grid embedding technique. In Proceedings AIAA 7th Computational Fluid Dynamics Conference, pages 507-512, Cincinnati, OH, 1985. AIAA Paper 85-1523.
[20] J.A. Benek, T.L. Donegan, and N.E. Suhs. Extended Chimera grid embedding scheme with applications to viscous flows. AIAA Paper 87-1126, AIAA 8th Computational Fluid Dynamics Conference, Honolulu, HI, 1987.
[21] M. Berger and A. Jameson. Automatic adaptive grid refinement for the Euler equations. AIAA Journal, 23:561-568, 1985.
[22] M. Berger and R.J. LeVeque. An adaptive Cartesian mesh algorithm for the Euler equations in arbitrary geometries. AIAA Paper 89-1830, 1989.
[23] J.P. Boris and D.L. Book. Flux corrected transport, 1 SHASTA, a fluid transport algorithm that works. J. Comp. Phys., 11:38-69, 1975.
[24] A. Brandt. Multi-level adaptive solutions to boundary value problems. Math. Comp., 31:333-390, 1977.
[25] M.O. Bristeau, R. Glowinski, J. Periaux, P. Perrier, O. Pironneau, and G. Poirier. On the numerical solution of nonlinear problems in fluid dynamics by least squares and finite element methods (II), application to transonic flow simulations. Comp. Meth. Appl. Mech. and Eng., 51:363-394, 1985.
[26] R.J. Busch, Jr. Computational fluid dynamics in the design of the Northrop/McDonnell Douglas YF-23 ATF prototype. AIAA paper 91-1629, AIAA 21st Fluid Dynamics, Plasmadynamics & Lasers Conference, Honolulu, Hawaii, 1991.
[27] C. Canuto, M.Y. Hussaini, A. Quarteroni, and D.A. Zang. Spectral Methods in Fluid Dynamics. Springer Verlag, 1987.
[28] M.H. Carpenter, D. Gottlieb, and S. Abarbanel. Time-stable boundary conditions for finite-difference schemes solving hyperbolic systems: Methodology and application to high order compact schemes. ICASE Report 93-9, Hampton, VA, March 1993.
[29] D.A. Caughey. A diagonal implicit multigrid algorithm for the Euler equations. AIAA Paper 87-453, 25th Aerospace Sciences Meeting, Reno, January 1987.
[30] T. Cebeci and A.M.O. Smith. Analysis of Turbulent Boundary Layers. Academic Press, 1974.
[31] S.R. Chakravarthy. Relaxation methods for unfactored implicit upwind schemes. AIAA Paper 84-0165, AIAA 22nd Aerospace Sciences Meeting, Reno, January 1984.
[32] S.R. Chakravarthy, A. Harten, and S. Osher. Essentially non-oscillatory shock capturing schemes of uniformly very high accuracy. AIAA Paper 86-0339, AIAA 24th Aerospace Sciences Meeting, Reno, January 1986.
[33] R. Chipman and A. Jameson. Fully conservative numerical solutions for unsteady irrotational transonic flow about airfoils. AIAA Paper 79-1555, AIAA 12th Fluid and Plasma Dynamics Conference, Williamsburg, VA, July 1979.
[34] S.E. Cliff and S.D. Thomas. Euler/experimental correlations of sonic boom pressure signatures. AIAA Paper 91-3276, AIAA 8th Applied Aerodynamics Conference, Baltimore, September 1991.
[35] T.J. Coakley. Numerical simulation of viscous transonic airfoil flows. AIAA Paper 87-0416, AIAA 25th Aerospace Sciences Meeting, Reno, January 1987.
[36] J.P. Croisille and P. Villedieu. Kinetic flux splitting schemes for hypersonic flows. In M. Napolitano and F. Solbetta, editors, Proc. 13th International Congress on Numerical Methods in Fluid Dynamics, pages 310-313, Rome, July 1992. Springer
[37] R.M. Cummings, Y.M. Rizk, L.B. Schiff, and N.M. Chaderjian. Navier-Stokes predictions for the F-18 wing and fuselage at large incidence. J. of Aircraft, 29:565-574, 1992.
[38] G. Dahlquist. A special stability problem for linear multistep methods. BIT, 3:27-43, 1963.
[39] J.F. Dannenhoffer and J.R. Baron. Robust grid adaptation for complex transonic flows. AIAA Paper 86-0495, AIAA 24th Aerospace Sciences Meeting, Reno, January 1986.
[40] D. Degani and L. Schiff. Computation of turbulent supersonic flows around pointed bodies having crossflow separation. J. Comp. Phys., 66:173-196, 1986.
[41] B. Delaunay. Sur la sphère vide. Bull. Acad. Science USSR VII: Class Scil, Mat. Nat., pages 793-800, 1934.
[42] S.M. Deshpande. On the Maxwellian distribution, symmetric form and entropy conservation for the Euler equations. NASA TP 2583, 1986.
[43] A. Eberle. A finite volume method for calculating transonic potential flow around wings from the minimum pressure integral. NASA TM 75324, 1978. Translated from MBB UFE 1407(0).
[44] P.R. Eiseman. A multi-surface method of coordinate generation. J. Comp. Phys., 33:118-150, 1979.
[46] L.E. Eriksson. Generation of boundary-conforming grids around wing-body configurations using transfinite interpolation. AIAA Journal, 20:1313-1320, 1982.
[47] ... Dynamics Conference, San Diego, CA, June 1995.
[48] R.P. Fedorenko. The speed of convergence of one iterative process. USSR Comp. Math. and Math. Phys., 4:227-235, 1964.
[49] C.W. Gear. The numerical integration of stiff ordinary differential equations. Report 221, University of Illinois Department of Computer Science, 1967.
[50] M. Giles, M. Drela, and W.T. Thompkins. Newton solution of direct and inverse transonic Euler equations. AIAA Paper 85-1530, Cincinnati, 1985.
[51] S.K. Godunov. A difference method for the numerical calculation of discontinuous solutions of hydrodynamic equations. Mat. Sbornik, 47:271-306, 1959. Translated as JPRS 7225 by U.S. Dept. of Commerce, 1960.
[52] M.H. Ha. The impact of turbulence modelling on the numerical prediction of flows. In M. Napolitano and F. Solbetta, editors, Proc. of the 13th International Conference on Numerical Methods in Fluid Dynamics, pages 27-46, Rome, Italy, July 1992. Springer Verlag, 1993.
[53] W. Hackbusch. On the multi-grid method applied to difference equations. Computing, 20:291-306, 1978.
[54] M. Hafez, J.C. South, and E.M. Murman. Artificial compressibility method for numerical solutions of the transonic full potential equation. AIAA Journal, 17:838-844, 1979.
[55] M.G. Hall. Cell vertex multigrid schemes for solution of the Euler equations. In Proc. IMA Conference on Numerical Methods for Fluid Dynamics, Reading, April 1985.
[56] A. Harten. High resolution schemes for hyperbolic conservation laws. J. Comp. Phys., 49:357-393, 1983.
[57] P.W. Hemker and S.P. Spekreijse. Multigrid solution of the steady Euler equations. In Proc. Oberwolfach Meeting on Multigrid Methods, December 1984.
[58] J.L. Hess and A.M.O. Smith. Calculation of non-lifting potential flow about arbitrary three dimensional bodies. Douglas Aircraft Report ES 40622, 1962.
[59] C. Hirsch, C. Lacor, and H. Deconinck. Convection algorithms based on a diagonalization procedure for the multi-dimensional Euler equations. In Proc. AIAA 8th Computational Fluid Dynamics Conference, pages 667-676, Hawaii, June 1987. AIAA Paper 87-1163.
[60] C. Hirsch and P. Van Ransbeeck. Multi-dimensional upwinding and artificial dissipation. Technical report. Published in Frontiers of Computational Fluid Dynamics 1994, D.A. Caughey and M. M. Hafez, editors, Wiley, pp. 597-626.
[61] D.G. Holmes and S.H. Lamson. Adaptive triangular meshes for compressible flow solutions. In Proceedings First International Conference on Numerical Grid Generation in Computational Fluid Dynamics, pages 413-424, Landshut, FRG, July 1986.
[62] T.J.R. Hughes, L.P. Franca, and M. Mallet. A new finite element formulation for computational fluid dynamics, I, Symmetric forms of the compressible Euler and Navier-Stokes equations and the second law of thermodynamics. Comp. Meth. Appl. Mech. and Eng., 59:223-231, 1986.
[63] A. Jameson. Iterative solution of transonic flows over airfoils and wings, including flows at Mach 1. Comm. on Pure and Appl. Math., 27:283-309, 1974.
[64] A. Jameson. Transonic potential flow calculations in conservation form. In Proc. AIAA 2nd Computational Fluid Dynamics Conference, pages 148-161, Hartford, 1975.
[65] A. Jameson. Solution of the Euler equations by a multigrid method. Appl. Math. and Comp., 13:327-356, 1983.
[66] A. Jameson. Solution of the Euler equations for two dimensional transonic flow by a multigrid method. Appl. Math. Comp., 13:327-356, 1983.
[67] A. Jameson. Multigrid algorithms for compressible flow calculations. In Second European Conference on Multigrid Methods, Cologne, October 1985. Princeton University Report MAE 1743.
[68] A. Jameson. Non-oscillatory shock capturing scheme using flux limited dissipation. In B.E. Engquist, S. Osher, and R.C.J. Sommerville, editors, Lectures in Applied Mathematics, Vol. 22, Part 1, Large Scale Computations in Fluid Mechanics, pages 345-370. AMS, 1985.
[69] A. Jameson. Transonic flow calculations for aircraft. In F. Brezzi, editor, Lecture Notes in Mathematics, Numerical Methods in Fluid Dynamics, pages 156-242. Springer Verlag, 1985.
[70] A. Jameson. Multigrid algorithms for compressible flow calculations. In W. Hackbusch and U. Trottenberg, editors, Lecture Notes in Mathematics, Vol. 1228, pages 166-201. Proceedings of the 2nd European Conference on Multigrid Methods, Cologne, 1985, Springer-Verlag, 1986.
[71] A. Jameson. A vertex based multigrid algorithm for three-dimensional compressible flow calculations. In T.E. Tezduar and T.J.R. Hughes, editors, Numerical Methods for Compressible Flow - Finite Difference, Element And Volume Techniques, 1986. ASME Publication AMD 78.
[72] A. Jameson. Aerodynamic design via control theory. J. Sci. Comp., 3:233-260, 1988.
[73] A. Jameson. Automatic design of transonic airfoils to reduce the shock induced pressure drag. In Proceedings of the 31st Israel Annual Conference on Aviation and Aeronautics, Tel Aviv, pages 5-17, February 1990.
[74] A. Jameson. Time dependent calculations using multigrid, with applications to unsteady flows past airfoils and wings. AIAA paper 91-1596, AIAA 10th Computational Fluid Dynamics Conference, Honolulu, Hawaii, June 1991.
[75] A. Jameson. Artificial diffusion, upwind biasing, limiters and their effect on accuracy and multigrid convergence in transonic and hypersonic flows. AIAA Paper 93-3359, AIAA 11th Computational Fluid Dynamics Conference, Orlando, FL, July 1993.
[76] A. Jameson. Artificial diffusion, upwind biasing, limiters and their effect on accuracy and multigrid convergence in transonic and hypersonic flow. AIAA paper 93-3359, AIAA 11th Computational Fluid Dynamics Conference, Orlando, Florida, July 1993.
[77] A. Jameson. Optimum aerodynamic design via boundary control. Technical report, AGARD FDP/Von Karman Institute Special Course on Optimum Design Methods in Aerodynamics, Brussels, April 1994.
[78] A. Jameson. MAE Technical Report 2050, Princeton University, Princeton, New Jersey, October 1995.
[79] A. Jameson. Analysis and design of numerical schemes for gas dynamics 1, artificial diffusion, upwind biasing, limiters and their effect on multigrid convergence. Int. J. of Comp. Fluid Dyn., 4:171-218, 1995.
[80] A. Jameson. Analysis and design of numerical schemes for gas dynamics 2, artificial diffusion and discrete shock structure. Int. J. of Comp. Fluid Dyn., To Appear.
[81] A. Jameson and T.J. Baker. Improvements to the aircraft Euler method. AIAA Paper 87-0452, AIAA 25th Aerospace Sciences Meeting, Reno, January 1987.
[82] A. Jameson, T.J. Baker, and N.P. Weatherill. Calculation of inviscid transonic flow over a complete aircraft. AIAA Paper 86-0103, AIAA 24th Aerospace Sciences Meeting, Reno, January 1986.
[83] A. Jameson and D.J. Mavriplis. Multigrid solution of the Euler equations on unstructured and adaptive grids. In S. McCormick, editor, Multigrid Methods, Theory, Applications and Supercomputing. Lecture Notes in Pure and Applied Mathematics, volume 110, pages 413-430, April 1987.
[84] A. Jameson, W. Schmidt, and E. Turkel. Numerical solution of the Euler equations by finite volume methods using Runge-Kutta time stepping schemes. AIAA Paper 81-1259, 1981.
[85] A. Jameson, W. Schmidt, and E. Turkel. Numerical solutions of the Euler equations by finite volume methods with Runge-Kutta time stepping schemes. AIAA paper 81-1259, January 1981.
[86] A. Jameson and E. Turkel. Implicit schemes and LU decompositions. Math. Comp., 37:385-397, 1981.
[87] M. Jayaram and A. Jameson. Multigrid solution of the Navier-Stokes equations for flow over wings. AIAA paper 88-0705, AIAA 26th Aerospace Sciences Meeting, Reno, Nevada, January 1988.
[88] D. Johnson and L. King. A mathematically simple turbulence closure model for attached and separated turbulent boundary layers. AIAA Journal, 23:1684-1692, 1985.
[89] W.P. Jones and B.E. Launder. The calculation of low-Reynolds-number phenomena with a two-equation model of turbulence. Int. J. of Heat Tran., 16:1119-1130, 1973.
[90] W.H. Jou. Boeing Memorandum AERO-B113B-L92-018, September 1992. To Joseph Shang.
[91] T.J. Kao, T.Y. Su, and N.J. Yu. Navier-Stokes calculations for transport wing-body configurations with nacelles and struts. AIAA Paper 93-2945, AIAA 24th Fluid Dynamics Conference, Orlando, July 1993.
[92] G.E. Karniadakis and S.A. Orszag. Nodes, modes and flow codes. Physics Today, pages 34-42, March 1993.
[93] M.H. Lallemand and A. Dervieux. A multigrid finite-element method for solving the two-dimensional Euler equations. In S.F. McCormick, editor, Proceedings of the Third Copper Mountain Conference on Multigrid Methods, Lecture Notes in Pure and Applied Mathematics, pages 337-363, Copper Mountain, April 1987.
[94] A.M. Landsberg, J.P. Boris, W. Sandberg, and T.R. Young. Naval ship superstructure design: Complex three-dimensional flows using an efficient, parallel method. High Performance Computing 1993: Grand Challenges in Computer Simulation, 1993.
[95] P. D. Lax. Hyperbolic systems of conservation laws. SIAM Regional Series on Appl. Math., 11, 1973.
[97] P.D. Lax and B. Wendroff. Systems of conservation laws. Comm. Pure. Appl. Math., 13:217-237, 1960.
[98] B. Van Leer. Towards the ultimate conservative difference scheme. II. Monotonicity and conservation combined in a second order scheme. J. Comp. Phys., 14:361-370, 1974.
[99] B. Van Leer. Towards the ultimate conservative difference scheme. III upstream-centered finite-difference schemes for ideal compressible flow. J. Comp. Phys., 23:263-275, 1975.
[101] B. Van Leer. Progress in multi-dimensional upwind differencing. In M. Napolitano and F. Solbetta, editors, Proc. 13th International Conference on Numerical Methods in Fluid Dynamics, pages 1-26, Rome, July 1992. Springer Verlag, 1993.
[102] B. Van Leer, W. T. Lee, and P. L. Roe. Characteristic time stepping or local preconditioning of the Euler equations. AIAA paper 91-1552, AIAA 10th Computational Fluid Dynamics Conference, Honolulu, Hawaii, June 1991.
[103] D. Lefebre, J. Peraire, and K. Morgan. Finite element least squares solutions of the Euler equations using linear and quadratic approximations. Int. J. Comp. Fluid Dynamics, 1:1-23, 1993.
[104] S.K. Lele. Compact finite difference schemes with spectral-like resolution. CTR Manuscript 107, 1990.
[105] J.L. Lions.
[106] M-S. Liou and C.J. Steffen. A new flux splitting scheme. J. Comp. Phys., 107:23-39, 1993.
[107] F. Liu and A. Jameson. Multigrid Navier-Stokes calculations for three-dimensional cascades. AIAA paper 92-0190, AIAA 30th Aerospace Sciences Meeting, Reno, Nevada, January 1992.
[108] R. Lohner and D. Martin. An implicit linelet-based solver for incompressible flows. AIAA paper 92-0668, AIAA 30th Aerospace Sciences Meeting, Reno, NV, January 1992.
[109] R. Lohner, K. Morgan, and J. Peraire. Improved adaptive refinement strategies for the finite element aerodynamic configurations. AIAA Paper 86-0499, AIAA 24th Aerospace Sciences Meeting, Reno, January 1986.
[110] R. Lohner, K. Morgan, J. Peraire, and O.C. Zienkiewicz. Finite element methods for high speed flows. In Proc. AIAA 7th Computational Fluid Dynamics Conference, Cincinnati, OH, 1985. AIAA Paper 85-1531.
[111] R. Löhner and P. Parikh. Generation of three-dimensional unstructured grids by the advancing front method. AIAA Paper 88-0515, Reno, NV, January 1988.
[112] R.W. MacCormack and A.J. Paullay. Computational efficiency achieved by time splitting of finite difference operators. AIAA Paper 72-154, 1972.
[113] A. Majda and S. Osher. Numerical viscosity and the entropy condition. Comm. on Pure Appl. Math., 32:797-838, 1979.
[114] L. Martinelli and A. Jameson. Validation of a multigrid method for the Reynolds averaged equations. AIAA paper 88-0414, 1988.
[115] L. Martinelli, A. Jameson, and E. Malfa. Numerical simulation of three-dimensional vortex flows over delta wing configurations. In M. Napolitano and F. Solbetta, editors, Proc. 13th International Conference on Numerical Methods in Fluid Dynamics, pages 534-538, Rome, Italy, July 1992. Springer Verlag, 1993.
[116] L. Martinelli and V. Yakhot. RNG-based turbulence transport approximations with applications to transonic flows. AIAA Paper 89-1950, AIAA 9th Computational Fluid Dynamics Conference, Buffalo, NY, June 1989.
[117] D.J. Mavriplis and A. Jameson. Multigrid solution of the Navier-Stokes equations on triangular meshes. AIAA Journal, 28(8):1415-1425, August 1990.
[118] D.J. Mavriplis and L. Martinelli. Multigrid solution of compressible turbulent flow on unstructured meshes using a two-equation model. AIAA Paper 91-0237, January 1991.
[119] N. D. Melson, M. D. Sanetrik, and H. L. Atkins. Time-accurate Navier-Stokes calculations with multigrid acceleration. In Proceedings of the Sixth Copper Mountain Conference on Multigrid Methods, Copper Mountain, April 1993.
[121] F. Menter. Zonal two-equation $k$-$\omega$ turbulence models for aerodynamic flows. AIAA Paper 93-2906, AIAA 24th Fluid Dynamics Meeting, Orlando, July 1993.
[122] E.M. Murman. Analysis of embedded shock waves calculated by relaxation methods. AIAA Journal, 12:626-633, 1974.
[123] E.M. Murman and J.D. Cole. Calculation of plane steady transonic flows. AIAA Journal, 9:114-121, 1971.
[124] R.H. Ni. A multiple grid scheme for solving the Euler equations. AIAA Journal, 20:1565-1571, 1982.
[125] S. Obayashi and K. Kuwakara. LU factorization of an implicit scheme for the compressible Navier-Stokes equations. AIAA Paper 84-1670, AIAA 17th Fluid Dynamics and Plasma Dynamics Conference, Snowmass, June 1984.
[126] J.T. Oden, L. Demkowicz, T. Liszka, and W. Rachowicz. h-p adaptive finite element methods for compressible and incompressible flows. In S. L. Venneri A. K. Noor, editor, Proceedings of the Symposium on Computational Technology on Flight Vehicles, pages 523-534, Washington, D.C., November 1990. Pergamon.
[127] S. Orszag and D. Gottlieb. Numerical analysis of spectral methods. SIAM Regional Series on Appl. Math., 26, 1977.
[128] S. Osher. Riemann solvers, the entropy condition, and difference approximations. SIAM J. Num. Anal., 21:217-235, 1984.
[129] S. Osher and S. Chakravarthy. High resolution schemes and the entropy condition. SIAM J. Num. Anal., 21:955-984, 1984.
[130] S. Osher and F. Solomon. Upwind difference schemes for hyperbolic systems of conservation laws. Math. Comp., 38:336-374, 1982.
[131] S. Osher and E. Tadmor. On the convergence of difference approximations to scalar conservation laws. Math. Comp., 50:19-51, 1988.
[132] H. Paillère and H. Deconinck. A review of multi-dimensional upwind residual distribution schemes for the Euler equations. To appear in CFD Review, 1995.
[133] N. A. Pierce and M. B. Giles. Preconditioning on stretched meshes. Report 95/10 1995, Oxford University Computing Laboratory, Oxford, December 1995.
[134] O. Pironneau. Optimal Shape Design for Elliptic Systems. Springer-Verlag, New York, 1984.
[135] K.G. Powell and B. van Leer. A genuinely multidimensional upwind cell-vertex scheme for the Euler equations. AIAA Paper 89-0095, AIAA 27th Aerospace Sciences Meeting, Reno, January 1989.
[136] K. Prendergast and K. Xu. Numerical hydrodynamics from gas kinetic theory. J. Comp. Phys., 109:53-66, November 1993.
[137] T.H. Pulliam and J.L. Steger. Implicit finite difference simulations of three-dimensional compressible flow. AIAA Journal, 18:159-167, 1980.
[138] J.J. Quirk. An alternative to unstructured grids for computing gas dynamics flows about arbitrarily complex two-dimensional bodies. ICASE Report 92-7, Hampton, VA, February 1992.
[139] R. Radespiel, C. Rossow, and R.C. Swanson. An efficient cell-vertex multigrid scheme for the three-dimensional Navier-Stokes equations. In Proc. AIAA 9th Computational Fluid Dynamics Conference, pages 249-260, Buffalo, NY, June 1989. AIAA Paper 89-1953-CP.
[140] M.M. Rai and P. Moin. Direct numerical simulation of transition and turbulence in a spatially evolving boundary layer. AIAA Paper 91-1607 CP, AIAA 10th Computational Fluid Dynamics Conference, Honolulu, HI, June 1991.
[141] S.V. Rao and S.M. Deshpande. A class of efficient kinetic upwind methods for compressible flows. Report 91 FM 11, Indian Institute of Science, 1991.
[142] J. Reuther and A. Jameson. Aerodynamic shape optimization of wing and wing-body configurations using control theory. AIAA paper 95-0123, AIAA 33rd Aerospace Sciences Meeting and Exhibit, Reno, Nevada, January 1995.
[143] J. Reuther and A. Jameson. Aerodynamic shape optimization of wing and wing-body configurations using control theory. AIAA paper 95-0123, AIAA 33rd Aerospace Sciences Meeting, Reno, Nevada, January 1995.
[144] H. Rieger and A. Jameson. Solution of steady three-dimensional compressible Euler and Navier-Stokes equations by an implicit LU scheme. AIAA paper 88-0619, AIAA 26th Aerospace Sciences Meeting, Reno, Nevada, January 1988.
[145] A. Rizzi and L.E. Eriksson. Computation of flow around wings based on the Euler equations. J. Fluid Mech., 148:45-71, 1984.
[146] P.L. Roe. Approximate Riemann solvers, parameter vectors, and difference schemes. J. Comp. Phys., 43:357-372, 1981.
[147] P.E. Rubbert and G.R. Saaris. A general three-dimensional potential flow method applied to V/STOL aerodynamics. SAE Paper 680304, 1968.
[148] C.L. Rumsey and V.N. Vatsa. A comparison of the predictive capabilities of several turbulence models using upwind and centered-difference computer codes. AIAA Paper 93-0192, AIAA 31st Aerospace Sciences Meeting, Reno, January 1993.
[149] S.S. Samant, J.E. Bussoletti, F.T. Johnson, R.H. Burkhart, B.L. Everson, R.G. Melvin, D.P. Young, L.L. Erickson, and M.D. Madson. TRANAIR: A computer code for transonic analyses of arbitrary configurations. AIAA Paper 87-0834, 1987.
[150] K. Sawada and S. Takanashi. A numerical investigation on wing/nacelle interferences of USB configuration. In Proceedings AIAA 25th Aerospace Sciences Meeting, Reno, NV, 1987. AIAA paper 87-0455.
[151] C.W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes. J. Comp. Phys., 77:439-471, 1988.
[152] C.W. Shu and S. Osher. Efficient implementation of essentially non-oscillatory shock-capturing schemes II. J. Comp. Phys., 83:32-78, 1988.
[153] C.W. Shu, T.A. Zang, G. Erlebacher, D. Whitaker, and S. Osher. High-order ENO schemes applied to two- and three-dimensional compressible flow. Appl. Num. Math., 9:45-71, 1992.
[154] B. R. Smith. A near wall model for the $k$-$l$ two equation turbulence model. AIAA paper 94-2386, 25th AIAA Fluid Dynamics Conference, Colorado Springs, CO, June 1994.
[155] L.M. Smith and W.C. Reynolds. On the Yakhot-Orszag renormalization group for deriving turbulence statistics and models. Phys. Fluids A, 4:364-390, 1992.
[156] R.E. Smith. Three-dimensional algebraic mesh generation. In Proc. AIAA 6th Computational Fluid Dynamics Conference, Danvers, MA, 1983. AIAA Paper 83-1904.
[157] R.L. Sorenson. Elliptic generation of compressible three-dimensional grids about realistic aircraft. In J. Hauser and C. Taylor, editors, International Conference on Numerical Grid Generation in Computational Fluid Dynamics, Landshut, F. R. G., 1986.
[158] R.L. Sorenson. Three-dimensional elliptic grid generation for an F-16. In J.L. Steger and J.F. Thompson, editors, Three-Dimensional Grid Generation for Complex Configurations: Recent Progress, 1988. AGARDograph.
[159] P. Spalart and S. Allmaras. A one-equation turbulent model for aerodynamic flows. AIAA Paper 92-0439, AIAA 30th Aerospace Sciences Meeting, Reno, NV, January 1992.
[160] C.G. Speziale, E.C. Anderson, and R. Abid. A critical evaluation of two-equation models for near wall turbulence. AIAA Paper 90-1481, June 1990. ICASE Report 90-46.
[161] J.L. Steger and D.S. Chaussee. Generation of body-fitted coordinates using hyperbolic partial differential equations. SIAM J. Sci. and Stat. Comp., 1:431-437, 1980.
[162] J.L. Steger and R.F. Warming. Flux vector splitting of the inviscid gas dynamic equations with applications to finite difference methods. J. Comp. Phys., 40:263-293, 1981.
[163] B. Stoufflet, J. Periaux, F. Fezoui, and A. Dervieux. Numerical simulation of 3-D hypersonic Euler flows around space vehicles using adapted finite elements. AIAA paper 87-0560, January 1987.
[164] G. Strang and G. Fix. Analysis of the Finite Element Method. Prentice Hall, 1973.
[165] R.C. Swanson and E. Turkel. On central-difference and upwind schemes. J. Comp. Phys., 101:297-306, 1992.
[166] P.K. Sweby. High resolution schemes using flux limiters for hyperbolic conservation laws. SIAM J. Num. Anal., 21:995-1011, 1984.
[167] S. Ta'asan, G. Kuruvila, and M. D. Salas. Aerodynamic design and optimization in one shot. AIAA paper 92-0025, 30th Aerospace Sciences Meeting and Exhibit, Reno, Nevada, January 1992.
[168] Shlomo Ta'asan. Canonical forms of multidimensional, steady inviscid flows. ICASE 93-34, Institute for Computer Applications in Science and Engineering, Hampton, VA, 1993.
[169] E. Tadmor. Numerical viscosity and the entropy condition for conservative difference schemes. Math. Comp., 32:369-382, 1984.
[170] J.F. Thompson, F.C. Thames, and C.W. Mastin. Automatic numerical generation of body-fitted curvilinear coordinate system for field containing any number of arbitrary two-dimensional bodies. J. Comp. Phys., 15:299-319, 1974.
[171] J.F. Thompson, Z.U.A. Warsi, and C.W. Mastin. Boundary-fitted coordinate systems for numerical solution of partial differential equations: A review. J. Comp. Phys., 47:1-108, 1989.
[172] E. Turkel. Preconditioned methods for solving the incompressible and low speed equations. J. Comp. Phys., 72:277-298, 1987.
[173] V. Venkatakrishnan. Newton solution of inviscid and viscous problems. AIAA paper 88-0413, January 1988.
[174] V. Venkatakrishnan. Convergence to steady state solutions of the Euler equations on unstructured grids with limiters. AIAA paper 93-0880, AIAA 31st Aerospace Sciences Meeting, Reno, Nevada, January 1993.
[175] G. Voronoi. Nouvelles applications des paramètres continus à la théorie des formes quadratiques. Deuxième mémoire: Recherches sur les parallélloèdres primitifs. J. Reine Angew. Math., 134:198-287, 1908.
[176] Y. Wada and M-S. Liou. A flux splitting scheme with high resolution and robustness for discontinuities. AIAA paper 94-0083, AIAA 32nd Aerospace Sciences Meeting, Reno, Nevada, January 1994.
[177] N.P. Weatherill and C.A. Forsey. Grid generation and flow calculations for aircraft geometries. J. Aircraft, 22:855-860, 1985.
[178] D.C. Wilcox. A half a century historical review of the $k$-$\omega$ model. AIAA Paper 91-0615, AIAA 29th Aerospace Sciences Meeting, Reno, NV, January 1991.
[179] F. Woodward. An improved method for the aerodynamic analysis of wing-body-tail configurations in subsonic and supersonic flow, Part 1 - theory and application. NASA CR 2228 Pt.1, May 1973.
[182] V. Yakhot and S.A. Orszag. Renormalization group analysis of turbulence. I. Basic theory. J. Sci. Comp., 1:3-51, 1986.
[183] H.C. Yee. On symmetric and upwind TVD schemes. In Proc. 6th GAMM Conference on Numerical Methods in Fluid Mechanics, Gottingen, September 1985.
[184] S. Yoon and A. Jameson. Lower-Upper Symmetric-Gauss-Seidel method for the Euler and Navier-Stokes equations. AIAA Paper 87-0600, AIAA 25th Aerospace Sciences Meeting, Reno, January 1987.
[180] P. Woodward and P. Colella. The numerical sim-
ulation of two-dimensional flutd flow with strong
shocks. J. Comp. Phys., 54115-173,1984.
[181] K. Xu.L. Martinelli, and A. Jameson. Gas-kinetic
finite volume methods, flux-vector splitting and ar-
tiftcial diffusion. J. Comp. Phys., 12048-65,1995.
Figure 21: Navier-Stokes Predictions for the F-18 Wing-Fuselage at Large Incidence.

22a: RAE-2822 Airfoil. 22b: NACA-0012 Airfoil.
Figure 22: O-Topology Meshes, 160x32.

Figure 23: RAE-2822 Airfoil at Mach 0.750 and α=3.0°. H-CUSP Scheme.

24a: Cp after 35 Cycles. Cl=0.3654, Cd=0.0232. 24b: Convergence.
Figure 24: NACA-0012 Airfoil at Mach 0.800 and α=[illegible]. H-CUSP Scheme.

25a: Cp after 35 Cycles. Cl=0.3861, Cd=0.0582. 25b: Convergence.
Figure 25: NACA-0012 Airfoil at Mach 0.850 and α=1.0°. H-CUSP Scheme.

26a: 12.50% Span. Cl=0.2933, Cd=0.0274. 26b: 31.25% Span. Cl=0.3139, Cd=0.0159.
26c: 50.00% Span. Cl=0.3262, Cd=0.0089. 26d: 68.75% Span. Cl=0.3195, Cd=0.0026.
Figure 26: Onera M6 Wing. Mach 0.840, Angle of Attack 3.06°, 192x32x48 Mesh. CL=0.3041, CD=0.0131. H-CUSP scheme.

27a: Initial Wing. CL=0.5001, CD=0.0210, α=-1.672°. 27b: 40 Design Iterations. CL=0.5000, CD=0.0112, α=-0.283°.
Figure 27: Swept Wing Design Case (1), M=0.85, Fixed Lift Mode. Drag Reduction at CL=0.50.

28a: Initial Wing (upper surface pressure). CL=0.5001, CD=0.0210, α=-1.672°. 28b: 40 Design Iterations (upper surface pressure). CL=0.5000, CD=0.0112, α=-0.283°.
Figure 28: Swept Wing Design Case (1), M=0.85, Fixed Lift Mode. Drag Reduction at CL=0.50.

29a: span station z=0.00. 29b: span station z=0.312. 29c: span station z=0.625. 29d: span station z=0.937.
Figure 29: FLO67 solution for initial wing. M=0.85, CL=0.4997, CD=0.0207, α=-1.970°.

30a: span station z=0.00. 30b: span station z=0.312. 30c: span station z=0.625. 30d: span station z=0.937.
Figure 30: FLO67 check on redesigned wing. M=0.85, CL=0.4992, CD=0.0094, α=-0.300°.

31a: span station z=0.00. 31b: span station z=0.312. 31c: span station z=0.625. 31d: span station z=0.937.
Figure 31: FLO67 check on redesigned wing at a higher Mach number. M=0.86, CL=0.4988, CD=0.0097, α=-0.440°.

CFD RESEARCH IN THE CHANGING U.S. AERONAUTICAL INDUSTRY

Paul E. Rubbert
The Boeing Company
Boeing Commercial Airplane Group
P.O. Box 3707, M/S 67-UC
Seattle, WA 98124-2207
U.S.A.

SUMMARY

Changes are taking place in the world of CFD that extend beyond the technical. They include change to the "research engine," the system infrastructure that powers CFD research, as it seeks to adapt to the new industrial paradigm that is sweeping the aeronautics industry, and the world. The "research engine" involves government, academia, and industry. Because it is a system, all parts of it must participate in change. None of the parts can exist in isolation.

This paper analyzes the workings of the research engine and finds that it is encountering considerable strain. Resources for all elements of research are below historic levels. "Money givers" are faced with a lack of metrics and infrastructure for telling them how to invest their resources except in high level terms. Leaders of research are having to redefine their jobs. Researchers are hunkered down to wait it out. And value systems are in disarray and conflict. The adaptation of the research engine to the changing world is far from complete. It is in transition.

The paper goes on to describe what the author believes to be the principal characteristics and attributes of a well-functioning research engine, together with a few personal experiences that shed some light on how those attributes can be achieved. He concludes that further adaptation of the research engine will be paced by two key factors. One is the need to change the types of communication that take place between the research community and the engineering community in industry. The other is the need to unshackle the minds of researchers from the imprisonment of an overly narrow value system, a task which must be led by the money givers who inhabit the research engine.

INTRODUCTION

I find it interesting to contemplate those topics which are likely to be the pacing items and new challenges in CFD. Traditionally, such an endeavor would focus on the technology issues associated with CFD; things like algorithmic developments, hardware architectures, and so forth. Concerning the latter, Pradeep Raj has recently presented an up-to-date review (ref. 1) of the issues and pacing items in CFD technology. It addresses both the functional characteristics and the operational requirements that tomorrow's CFD codes must have in order to be effective, and it speaks for the U.S. aeronautical industry. Also, Professor Antony Jameson, the first keynote speaker, will share with us his vision of the technical challenges and future developments in CFD. There is little that I could add to their remarks. Therefore, I will focus my remarks on challenges and pacing items that extend beyond the technical.

We live at an interesting time. Our world is immersed in a period of large and rapid change. It is moving away from a dogmatic belief and reliance upon technology innovation as being the most significant element of competitiveness. That is being replaced by a new paradigm, one that is centered about customer satisfaction, quality and value as key goals.

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
We in industry are pursuing those goals by focusing on processes (ref. 2). We now understand that the key to developing better airplanes is to analyze, understand, and improve the processes by which airplanes are created. Similarly, the key to developing better CFD is to analyze, understand, and improve the processes by which CFD is created. We also now understand that the leading principle of good processes is customer focus and customer satisfaction. That principle applies equally to the processes that produce airplanes and the processes that produce CFD.

And so it seems to me that the most significant pacing item in the world of CFD is the need to analyze, understand and improve the process by which CFD capabilities are created. I call that process the research engine. There is more leverage in fixing up the research engine and adapting it to the changes in the world than in anything else I can think of. And so that is what I am going to talk about.

The Research Engine and How It Worked

The research engine as we know it today involves industry, academia, and government. Those three components interact with each other as a system. And like most systems, one component cannot be changed without affecting the others. It doesn't work for industry to change and the others not to change. We are all in this together.

The need to change pervades the entire research infrastructure. It involves information systems and the methods by which we communicate, including the holy grails of technical societies, publications, and technical conferences. It involves the changing of value systems, which is almost a cultural characteristic. And reward systems. Changing is not easy.

I would like to begin by examining how the research engine functioned in the era that we are leaving behind. Figure 1 presents a description of the fundamental factors and forces that powered the research engine. This description appears to be universal. It looks the same, no matter whether you reside in industry, academia, or in a government laboratory. It works the same way. Only the names of the players may differ.

Key players are the money givers. Their role is to divide up money into various large buckets, each directed at a particular category of research, and to distribute it. We all know who those people are. They are the ones to whom we write research proposals. Money givers can be found in NASA, in the National Science Foundation, in the Department of Defense, and in similar institutions in Europe. They also are present in industry.

Most money givers are not close to the real details of airplane design, or to the detailed processes that use CFD as a tool. They operate at a higher, more strategic level. But they still need criteria by which to decide how to divide up the money. It is instructive to take a look at what some of those criteria were.

One such criterion was to divide up the money based on historical precedent. That was, and still is, practiced far and wide. It is a symptom of zero accountability and zero ability to discern what is important.

Money givers are also susceptible to being influenced by the visionary utterances of the people who inhabit the lower left box of figure 1, the research leaders. Research leaders are in the business of creating and marketing visions of how to make the world better. Many of them have become very good at creating visions for research that will be looked favorably upon by the money givers. They treat the money giver as the customer. One result of that, of course, is that the research funding decisions that get made can be quite unrelated to the true needs of the people who design airplanes for a living.

Money givers also are desirous of evaluating the caliber of the researchers to whom they will give money. It is rarely possible to
point to a feature on an airplane and say "this research contributed to _____." So, it was necessary to establish other measures in order to create a value system which could be applied to individuals.

Figure 1. The Engine That Powered CFD Research
Diagram; text recovered from the boxes and arrows, by position:
Upper left: Divide funding into broad categories. Perturbations on historical levels. Weak accountability. Susceptible to "gee whiz" visions. Hope that good things will happen.
Upper right: Prestige among peers. Number of refereed papers. Is it novel? Number of codes deposited on airplane companies. Whatever is perceived to influence the money givers. Amount of research monies captured. Size of your empire.
Lower left: Dream up "gee whiz" visions that capture funding. Make decisions on the detailed content of the annual research plan.
Lower right: Turn ideas into primitive code. No standards. A "gee whiz" demo. Nursed to semi-convergence. Write, present and publish a paper.
Arrow labels: Dollars. Research plan. A publication. Back to the well for next year's funding.

One popular measure was to look at the prestige bestowed upon a researcher, not by his customers, but by his peers, the other researchers. Can you imagine what automobiles would be like if the criterion for designing them was to please the other designers, rather than the people who want to use cars to drive about in?

Another popular measure has been to count the number of refereed papers that are produced by a researcher. One consequence of this is that our journals and conferences have become littered with papers whose real contribution is low or nonexistent. The journals have evolved into being primarily a scorekeeping system. Scientific information today travels largely by other means. Another consequence of the numbers game is that it encourages researchers to attack problems that they know how to solve rather than the problems that need to be solved. And so our entire research infrastructure was caught up in a value system that was largely unrelated to what was important to the engineers who design and build airplanes for a living. What counted was paying homage to a value system that controlled access to the annual pot of money necessary to support the research leader and his/her staff.

A standard part of the job of being a research group leader was also to make all of the important decisions concerning the detailed content of the annual research plan. After all, since research leaders are normally exposed to new and emerging technology that a design engineer is not, it was quite obvious to research leaders that they, and not design engineers, should be in charge of
defining the annual research plan. And so the design engineering community was excluded from participating.

The researchers themselves focused their work on paying homage to the value system, because that is what entitled them to go back to the well for next year's funding, and to become eminent in the eyes of their peers.

So there it is. A stable, self-sustaining research engine that was capable of functioning quite smoothly, all by itself in its own little world. It did so for many years. Its weakness, of course, is that it had been almost disconnected from the community of people who we now understand to be the customers of CFD research, namely the practicing engineers who design airplanes for a living.

Figure 2 exhibits the interfaces, such as they were, that existed between the research engine and the aeronautical industry. One such interface involved the money givers, who were visited periodically by clouds of collective wisdom passing overhead. Those clouds appeared in the form of high level advisory committees, wishes of the U.S. Congress, or of industry executives, depending upon where the money giver happened to reside. It is not entirely coincidental that these clouds are shown to be comprised of the condensation of hot air rising from airplane companies. In any event, the resulting fallout from these clouds caused the money givers to occasionally re-balance their research portfolios.

The other interface lay between the researchers and the practicing engineers who reside in airplane companies. This interface is characterized by the fence in figure 2. Interestingly, the site of the fence was not always in front of the door of the airplane company. It frequently could be found inside the airplane company, standing between the internal company research department and the practicing engineers who designed the airplanes. In those cases, the company research departments paid most allegiance to the research engine and acted as an integral part of it, particularly if they were dependent upon outside contract funding as a source of research money.

Communication over the fence was mostly one way. It consisted primarily of attempts by researchers to interest the engineering community in the products of their research. The system coined a name for this, calling it "technology transfer."

The favored means of lofting the results of CFD research over the fence was to send it across on the wings of a scientific publication. The publication was the messenger that told of its charms and attributes. And to make sure that at least some folks in the airplane company would see it, the researcher empowered his delivery system to honk, to attract attention. Such honking is frequently heard at technical conferences and symposia. In fact, that seems to have become the prime motivation for conference attendance. Overlooked was the fact that airplane design engineers rarely attended those conferences.

The boards of the fence have names inscribed upon them, entitled "conferences," "journals," "perceptions," "value systems," "reward systems," etc. Those pillars of tradition and conventionality are turning out to be among the factors that impede our ability to create a research engine that is more properly connected to the customer.

It was a very eye-opening experience to us in the United States when NASA instituted some dramatic changes in communication. They changed the format of some of their conferences from one wherein the researchers did all of the honking to one in which industry did most of the talking and researchers did most of the listening. Lo and behold, it was discovered that the research community was not in fact immune to learning about what was important. We found that they could even learn from people who didn't have PhDs and a lengthy record of refereed publications. The power of two-way communication began to be unlocked!

So, somewhere along this journey of change we must abandon or at least supplement our old, one-way habits of communication as
institutionalized by the conventional scientific establishment. We must replace them with forms of communication that do the job that needs to be done. We are faced with the challenge of tearing down a fence whose pillars appear to be set in solid concrete.

Figure 2. Interfaces With the Airplane Industry

The process of changing has begun. Various people and organizations in all parts of the research engine are experimenting with new and different ways of operating. We are searching for a more effective research engine, but we have not found it yet.

When I look around, I see an increased level of tension throughout the infrastructure. Many researchers feel pressured to become more "applied." NASA is being battered from many sides, with some voices calling for them to get back to basic research while others are calling for them to increase their relevance to industry's needs. Academia is struggling to play a part, while finding a role for the individual graduate student and the educational mission. There are many conflicting forces at work. It is a difficult problem to even think about, much less resolve.

But we have changed. Figure 3 presents my view of the current state of affairs in the United States. In the right hand part of the figure one can observe a new player appearing in industry. At Boeing we call these people "process owners," but that is not a universally used name. What is universal is the realization within airplane companies that processes are really important and that somebody must therefore be in charge of them.

And so, these "process owners" represent a new connection to the research engine. They have created a gap in the fence through which their voices are being increasingly heard. Money givers, research leaders, and researchers alike in government, academia,
and industry are being exposed increasingly to their input. They are postured to evolve eventually into a strong component of the overall research engine.

Figure 3. The Research Engine in Transition
Diagram; recovered text: Money Givers: infrastructure for information, feedback, metrics not established. Research Leaders: "What has my job become?" Also legible: "don't make waves."

The life of a money giver has become more challenging. They now all subscribe to the new vision of investing in things that reduce cycle time, cost, and so forth. But they mostly lack an infrastructure of established methods and metrics to guide them. They are inventing and innovating as they go.

The life of a research leader is also changing. The more progressive ones view their new role to be to define and manage the process of developing the annual research plan, rather than personally making the planning decisions. The new style of operating that I most frequently encounter is for the research leader and researchers to simply ask the process owners what they want and need, and then to set about implementing it. That is not leading to many of the attributes that we desire in the research engine.

The individual researchers are impacted by this evolving research engine and its changing power structure. They feel buffeted from several directions, not the least of which is a value system that is crumbling and in disarray. Their attitude is "don't make waves and do what people in power say they want done." They are not particularly happy.

And so we have not yet arrived at a properly functioning research engine. I don't have a complete vision of what that research engine would look like, but I do know many of the attributes that it should have. Those attributes include:

• a lead role in supporting the strategic direction set by the industrial enterprise.
• a proper balance between basic and applied research across the R&D food chain.
• a recognition of the importance of vision building within the research process.
• ability to draw upon all of our intellectual resources, both in developing the research plan and in executing it.
• nimbleness in translating the output of research into products and processes.
• a value system that causes more of the "right" things to occur.
• and perhaps most important of all, a value system that supplies high levels of human motivation and sense of worth, one that leads people from within to do more of the right things, and to make it fun once again to be a researcher.

Even though my vision of how to accomplish all of that is yet incomplete, I find within myself a growing conviction about some of the things that tomorrow's research engine must contain. One of those things is a better understanding of the proper distribution of roles, responsibilities, and core competencies that should prevail across the R&D food chain comprising industry, academia, and government. What that distribution should be can be derived by testing it against the axioms that accompany the new industrial paradigm, an exercise that certain segments of the research establishment find to be somewhat threatening. One outcome of that testing is the finding that a best and proper role for academia, and for much of NASA, is to concentrate on the foundational, overarching, enabling technology research which comprises the head of the R&D food chain.

Another of my convictions is that we must find a much better way of connecting the top and the bottom of the R&D food chain. This is something that we as a country have not yet learned to do well at all. And yet the issues involved are central to achieving a research engine that contains the attributes that we desire.

Connecting the two ends of the food chain is an issue in communication. We have to develop an understanding of what needs to be communicated, and when. And then we have to institute mechanisms to make it happen.

I have been fortunate enough to have enjoyed the privilege of running a research operation that encompassed the entire span of a research food chain, from foundational, enabling algorithm technology, and fluid mechanics, to production software, and customer support. In that position, I was able to experiment, so I learned about some things that don't work and other things that do work in properly connecting the two ends of the R&D food chain.

One thing that doesn't work well at all is to have research leaders at the head of the food chain simply ask the folks at the bottom of the food chain what they want or need, and then to blindly carry out their wishes. That leads mostly to short-term, evolutionary improvements of limited vision. It leads to tactical research rather than the strategic research which belongs at the head of the food chain, and it places the researchers in the position of "the boiler room" staff. They have much more to offer than that. Many people have yet to learn the true meaning of the words "customer focus" that have entered our language.

What does work, not only well but incredibly well in connecting the two ends of the R&D food chain, is to do the following four things:

1. Eliminate the constraints imposed in a researcher's mind by the value system under which he/she was educated. Make it O.K. to do things that are outside of the limits imposed by an overly narrow value system. Create a mind-set and a curiosity within the researcher to wander freely up and down the R&D food chain and even into manufacturing.

This is easier said than done. But it can be done. I've done it! It has to be done, because, more than anything else, it is the key that unlocks the power and the potential of the highly educated, highly
paid research people whose minds have been refined and proven by our rigorously competitive academic system up to the PhD and post doctoral levels. The organization or the nation that does this earliest and best will have a very significant competitive advantage.

2. Expose and educate the researchers who inhabit the head of the food chain in the high level strategic thinking that supports the enterprise in which their customers are engaged. We don't do that very well today, and yet this is the element that enables researchers to identify and prioritize head-of-the-food-chain research topics in accordance with their strategic leverage. It must be realized that strategically relevant research should be the primary responsibility of the head of the research food chain. The lower levels of the chain, where most process owners reside, are focused on tactical implementations.

3. Expose the researcher to the real world of the aerospace engineer. This only works well when carried out at the engineering site. Let the researcher "look over the shoulder" of the engineering community or the process owner as they encounter their daily challenges. Let the researcher build personal relationships with real engineers. What works even better is to expose a team of researchers encompassing a complementary set of differing skills and strengths, because you will then be deploying a more complete set of intellectual assets. The imperative is to "enable the researcher to look beyond what the customer says he/she needs, and to formulate a vision of what he/she could provide that would really be useful to the customer and his/her environment." This is not accomplished by talking mostly with the management, which has been our past practice. It is the direct exposure to the daily issues faced by the engineering design process that really turns on the creation juices. It is what enables the parable which says "necessity is the mother of invention" to operate!

These three steps, tearing down the imprisoning walls of the value system, exposing the researcher to the strategic thinking, and exposing him/her to the engineering world so as to enable the researcher to look beyond what the customer says he needs, are the one means that I have found to be consistently successful in creating ideas for research that have high relevancy and which are supercharged by bringing to bear the latest and greatest in enabling technology while bathed in the light of high level strategic thinking. This is what we must strive for in our research engine.

The reason that one must proceed to a fourth step is that, at this stage, the customer will generally not agree with or approve of the ideas and plans that the researchers have formed as a result of the first three steps. Not yet!

The reasons are several. One is that the differing educational background of an engineer frequently makes it impossible for him to understand the approach being proposed by the researcher, or to assess the risk involved in turning ideas into reality. And engineers who are immersed in hot projects are much more focused on the answer they need tomorrow than in the strategic directions of interest at a higher corporate level. They tend to be tactical thinkers. But support and enablement of high level strategic direction is what head-of-the-food-chain research is all about!

And so a means must be found to allow the researchers to proceed with development of their ideas in the face of customer
opposition. This requires an act of courage on the part of the research leader and the money givers. But in my experience it rarely fails to produce handsome dividends. The only problem, if it should even be called a problem, is that at this stage nobody yet knows in exactly what form that dividend will be experienced. That appears only in step four.

Step 4 is what I call "vision building." This is the key activity that converts push to pull in the R&D food chain. The primary cause of failed research - and I define failed to mean research that doesn't get picked up and used by anybody - is that the vision from the head of the food chain that propelled the research, and the vision from the bottom of the food chain about what those folks think is useful, have no common intersection. If those two visions, originating from opposite ends of the food chain, cannot be made to intersect, the research will not be accepted. It will be ignored by the people who call the shots in determining what CFD gets used in the design of airplanes.

And so, a key element in the successful operation of an R&D food chain is the process that I call "vision building," a process for bringing together the separate visions that originate at the two ends of the R&D food chain. What does it take? Throwing publications or codes over the fence, which is the traditional approach to vision building, doesn't work well at all. Presenting "gee whiz" papers at conferences doesn't work. Arguing back and forth doesn't work well either. Neither does voting. I've tried them all.

What works is for the research community to produce something that an engineer can "touch and feel," usually a CFD code capable of performing a small number of computations that illustrate what can be done. This is not the time or the place for well-documented code, user friendly input formats, or polished and orderly software. Rather, the researcher at this point is engaged in a race to discovery and understanding before his fragile support system runs out of patience. Shortcuts are acceptable and encouraged, with one exception. That exception is execution efficiency. This is one of the key measures that will be "touched and felt" and usually should not be compromised.

Some people (managers and software specialists in particular) will be troubled with the idea of producing code that is undocumented, which probably does not adhere to standards, and which contains shortcuts. That is because they interpret the code to be the product. They fail to realize that the primary product of research at this stage is vision, not code!

The best way I found to build vision was for the researchers to again return to the customer site. They would identify real design problems being faced by the design engineers and they would set up and run demonstrations of their new CFD technology on those problems. This led to side-by-side comparisons of new versus old ways of doing things. It frequently did not contribute much at that point to the engineering project's near term design goals because the code was still developmental, fragile, hard to use, perhaps containing a few bugs, and not yet trustworthy.

What it did do, and do well, was to build vision within the minds of design engineers. A typical reaction to a set of these calculations would be "So that is what you can do! Well, if you add this and that, I can use it for _____." That is vision building! At that point the engineer becomes an advocate of the research. This is when "push" changes to "pull" in the R&D food chain.

The other thing that must happen is that the researcher must be able to now let go of his original vision, the one that led him to produce the CFD technology that is being demonstrated. He must allow himself to be influenced by the engineer-now-becoming-the-customer. He must adopt a new and better vision.

Vision building must be a two way street. It is a coming together, in the middle, of what were originally different visions at opposite ends of the food chain. It is not for one end of the food chain to convince the other end that its vision is best. It demands two-way communication. It is intense. It requires
2-10

face-to-face interactions over a period of time. It demands that a new paradigm of communication be built into the research engine!

This is the type of vision building that generates the high levels of motivation and feelings of personal worth that must be present in a good research engine. It results in engineers and process owners anxiously awaiting the results of your research, calling you to find out how things are going, offering to help you, and telling your money giver to give you more money. It creates passion. It also causes researchers to drive themselves from within to work 16 hours a day, 7 days a week. In that kind of environment, it is a lot of fun to be a researcher.

Can we really be bold enough to think in terms of government or academic researchers really interacting with industry in those ways? Well, this past summer, I and the Director of ICASE (Institute for Computer Applications in Science and Engineering), Dr. M. Yousuff Hussaini, conducted an experiment in communication. He sent one of his research staff, Dr. Michael Lewis, to Boeing for seven weeks. One of those weeks was spent being tutored in the teachings of competitiveness and strategic direction. The other six were spent in learning and observing first hand what the practice of business acquisition, engineering, design, manufacturing, and customer support was all about. The thing that he was allowed to do during these seven weeks was to engage in research.

At the end, I interviewed Michael. I found that he had learned enough to be able to "look beyond what industry says it needs and to gain an understanding of what he could contribute in terms of research that could really help industry but that we were probably unaware of." That is what we must strive to achieve in the minds of all research leaders who profess to be working at the head of the research food chain in areas that are related to aeronautics.

I cannot envision the entire population of university faculty and government laboratories descending upon industry sites for seven weeks each. But I can envision a selected subset of strategically placed research leaders perhaps doing it. And if we experiment with different formats and exposure times, we can probably reduce the exposure time significantly. We simply must develop a new paradigm for communication! Another interesting experiment would be to provide that type of exposure to the money givers who inhabit the research infrastructure.

I don't yet know what research Michael Lewis will choose to work on. That will be his decision. In any event, I am now contemplating a second experiment of inviting him back for a try at vision building whenever his research has progressed to the proper state. It will provide him with the opportunity to expose Boeing people to "touch and feel." I will attempt to measure his impact on the change in vision that he is able to create within Boeing people, and I will attempt to ascertain how his own vision has been caused to change. I will look for an intersection of those two visions as a measure of the effectiveness of his research.

In my view, the two purposes of communication that I am testing with the Michael Lewis experiments are the key communications that we must build into the research engine of tomorrow. One is to communicate strategic alignment and a broad understanding of the customer and his environment. The other is to provide a means for vision building, the process of achieving an intersection of the vision from the head of the research food chain with the vision from the engineering trenches. That is the process that converts push to pull and opens the door to industrial exploitation of research.

The other component of the research engine that will be particularly influential in leading change is the value system. We simply must find a way to tear down the walls that are imprisoning the minds of many of our most brilliant people.

Value systems cannot be created or even modified very much by proclamation. It doesn't work to simply proclaim that we will now adhere to a new and different set of
values. In the long run, it is the money
givers within the research engine who have
the only real power over the value systems.
Value is ultimately associated with those
endeavors that bring in money. That is true
in industry, in academia, and in government.
It will be up to the money givers to do the
right things.

REFERENCES
1. Raj, P., "Requirements for Effective
Use of CFD in Aerospace Design,"
NASA CP 3921, pp 15-28, May 1995
2. Rubbert, P. E., "AIAA Wright
Brothers Lecture: CFD and the
Changing World of Airplane Design,"
ICAS-94-0.2, September 1994

Parallel Computing in Computational Fluid Dynamics


Doyle D. Knight
Department of Mechanical and Aerospace Engineering
Rutgers University - The State University of New Jersey
PO Box 909 - Piscataway, N J 08855-0909
[email protected]

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

Abstract

The paper presents an overview of parallel computing in computational fluid dynamics. A taxonomy of parallel computing architectures and programming paradigms is described. Issues in parallel computing are discussed including domain decomposition and load balancing, performance, scalability, benchmarks and portability. Examples of experience with parallel computing in the aerospace industry are described.

1 Overview

This paper is intended for researchers in Computational Fluid Dynamics (CFD) who do not have experience in parallel computing. It provides a description of parallel computing hardware architecture, software paradigms, the principal issues in utilizing parallel computing for CFD, and examples of use of parallel computing in the aerospace industry.

Parallel computing, particularly in computational fluid dynamics, is a broad field of research and development. The software and hardware technology is developing at an extraordinary pace. The reader is directed to the numerous journals on parallel computing (e.g., Inter. Journal of High Speed Computing, The Journal of Supercomputing, Inter. Journal of Parallel Programming, Inter. Journal of Supercomputer Applications), as well as recent conferences and workshops (e.g., Parallel CFD '95), for further information. Additionally, extensive information is available on the World Wide Web, e.g., http://www.cnb.compunet.de/para/para.html, http://www.netlib.org/nhse/.

2 What is Parallel Computing?

This section presents an anecdotal discussion of the earliest reference to parallel computing, describes Flynn's and Bell's classifications of parallel computer architectures, and briefly discusses the message passing and data parallel programming paradigms.

2.1 Introduction

Parallel computing is the simultaneous operation of multiple computational tasks on a computer system. Parallel computing has been an integral part of computing systems from their beginning. The earliest reference to parallel computing appears to be the description by L. Menabrea of Charles Babbage's computer. Among the principal virtues of an earlier (but evidently not final) design, Menabrea describes the capability (and importance) of parallel computing [1]:

". . . Secondly, the economy of time: to convince ourselves of this, we need only consider that the multiplication of two numbers, consisting each of twenty figures, requires at the very utmost three minutes. Likewise, when a long series of identical computations is to be performed, such as those required for the formation of numerical tables, the machine can be brought into play so as to give several results at the same time, which will greatly abridge the whole amount of the processes."

Also, the first general purpose electronic digital computer ENIAC, built to compute projectile and firing tables for the US Army in World
War II, was a parallel computer with 25 independent computing units (20 accumulators, 1 multiplier, 1 divider/square rooter, and 3 table look-up units) performing different tasks for the solution of the specific problem. Moreover, the ENIAC used decimal arithmetic internally (as opposed to the binary arithmetic used on modern computers) and operated on all ten decimal digits of a number in parallel. The ENIAC was programmed in hardware, i.e., using a plugboard to wire connections between the units. However, the parallel computing capability of the ENIAC was never fully realized in practice. After two years of operation, it was reconfigured as a serial centralized computer [2].

There are four distinct levels of parallelism [1]. The highest level is job, where the computer system operates simultaneously on unrelated tasks (e.g., a CFD simulation for an F-18 and a CEM simulation for a B-2). The second level is program, where the computer system operates simultaneously on different parts of the same program (e.g., the parallelization of a DO loop across multiple processors). The third level is instruction, where the different instructions are performed in parallel (i.e., fetching one instruction from memory while performing an arithmetic operation). The fourth level is arithmetic and bit, where parallelism is achieved within an individual arithmetic or bit instruction. This paper focuses on the second level (program) of parallelism in computational fluid dynamics. We consider the issues of parallelism in the context of a single program (e.g., the simulation of a combustion chamber) operating on a parallel computer.

2.2 Classification of Parallel Computer Architectures

Flynn [3] originated a classification of parallel architectures which has become widely accepted (Table 1). Four distinct categories are defined based on the data stream which is the sequence of instructions and/or data executed or operated on by a processor. Single Instruction Stream/Single Data Stream (SISD) is the conventional serial architecture employing a single stream of data and a single processor. This is also known as the von Neumann computer (or architecture) or a serial computer. Modern single-processor workstations or micro-computers are examples of this category. Single Instruction Stream/Multiple Data Stream (SIMD) computers have several computational units which can perform the same operation (e.g., adding two numbers) simultaneously on different parts of the data stream. An example is the Cray C-90. Multiple Instruction Stream/Single Data Stream (MISD) implies simultaneous different operations by separate computational units on the same data stream. Examples of this type are rare. Multiple Instruction Stream/Multiple Data Stream (MIMD) indicates multiple computational units operating simultaneously on multiple data streams. Examples are the Thinking Machines Corporation CM-5, the Cray T3D and, indeed, the ENIAC.

Table 1: Flynn's Taxonomy

Acronym   Definition
SISD      Single Instruction Stream - Single Data Stream
SIMD      Single Instruction Stream - Multiple Data Stream
MISD      Multiple Instruction Stream - Single Data Stream
MIMD      Multiple Instruction Stream - Multiple Data Stream

Figure 1: Bell's taxonomy of MIMD architectures (with examples)
  Multiprocessors (single address space, shared memory)
    Distributed memory multiprocessors: KSR, BBN
    Central memory multiprocessors: Cray, Fujitsu, Hitachi, IBM, DEC, SGI, Sun
  Multicomputers (multiple address space, message passing)
    Distributed multicomputers: Intel Paragon, CM5, NCUBE, network of workstations
    Central multicomputers

Flynn's classification, although useful for broadly categorizing parallel computers and widely cited, is nonetheless incomplete, and various other classifications have been introduced. Bell [4] subdivides the MIMD category into two subcategories as indicated in Fig. 1. Multiprocessors are parallel computers with a single address memory (shared memory), i.e., the central memory (RAM) is organized into a single logical address domain which is accessible to all of the processors (Fig. 2). Processors P1, ..., Pn can access the same data in memory (i.e., the same address location), albeit not simultaneously. Examples are the Cray C-90 and SGI Power Challenge XL. Multicomputers are parallel computers with multiple distributed memory address spaces (Fig. 2). Processors P1, ..., Pn have dedicated, independent memories M1, ..., Mn which are not directly accessible by each other. Examples are the Intel Paragon, IBM SP2 and networks of individual workstations. If processor P1 needs to access data in the memory assigned to processor Pn, it sends a message to Pn requesting the data, and Pn complies. The transfer of data from the memory of one processor to the memory of another is denoted message passing, and is a principal characteristic of multi-computers. All communications between processors occur through a communications network C in Fig. 2. Many different types of communications network topologies have been developed (see Fig. 3 of Bell [4]).

Figure 2: Multiprocessor and Multicomputer (a: Multiprocessor, b: Multicomputer)

The relative advantages and disadvantages of multi-processors vs. multi-computers have been widely studied, and numerous research (and production) machines of both types have been constructed [4]. Although greatly oversimplified, the main issues are as follows. For a multi-processor, the shared memory eliminates the computational cost and program complexity of message passing. However, a multi-processor with a single shared memory is not scalable, i.e., the architecture cannot simply be scaled to an arbitrary number of processors and arbitrary memory size. This arises from the limitation on data transfer rate (bandwidth) between memory and processors. This has led to a subdivision of multiprocessors into two categories, the central memory multi-processors as described previously, and the distributed memory multi-processors (Fig. 1) where the independence of the distributed memories is hidden from the user by means of an automatic data transfer mechanism (caching). For a multi-computer, the distributed memory eliminates the scalability problem associated with a single memory of limited bandwidth. However, multi-computers incur the computational cost and program complexity of message passing.

Other classifications of parallel computers have been developed, e.g., Shore [5], and Hockney and Jesshope [1].

2.3 Parallel Programming

There are two basic types of parallel programming paradigms (or environments). As the name suggests, message passing involves the explicit use of send and receive functions by the applications programmer. These functions communicate information between the memory assigned to individual processors. Many manufacturers of distributed memory parallel computers have developed specialized message passing libraries (e.g., nCUBE, Intel), although standards are emerging (see §3.5.2 and 3.5.3). The data parallel paradigm involves a single program which controls the distribution of data across all processors, and the operations on the data. Typically, the data parallel language supports array operations and permits entire arrays to be used in expressions. Manufacturers of shared memory parallel computers have developed specialized compiler directives for data parallel programming (e.g., Cray C-90 and SGI). An emerging standard for a data parallel language is High Performance Fortran (§3.5.1).

2.4 Examples of Parallel Computers

Table 2 lists a number of current parallel computers. It should be emphasized that the information shown does not fully describe the capabilities (and limitations) of a parallel computer. Other relevant factors include memory bandwidth, cache memory, I/O bandwidth, compiler technology, debugging software, etc. Furthermore, the performance specifications change frequently due to product upgrades.

Table 2: Examples of Current and Future Parallel Computers

Name                                Class  Max No. of   MFlops/     Max Memory  Type of
                                           Processors   Processor   (GByte)     Memory
Convex SPP1200/XA                   MIMD   128          240         32          Shared
Cray J-90                           SIMD   32           200         8           Shared
Cray C-90                           SIMD   16           1000        2           Shared
Cray T-90                           SIMD   32           2000        8           Shared
Cray T3D                            MIMD   2048         150         128         Distributed
Cray T3E (2Q96)                     MIMD   2048         600         1024        Distributed
DEC 8400 5/300                      SIMD   12           600         14          Shared
Fujitsu VPP300                      SIMD   16           2200        32          Distributed
IBM SP-2                            MIMD   128          266         256         Distributed
Intel Paragon XP/S 35               MIMD   512          150         16          Distributed
NCUBE-2                             MIMD   4000         4           250         Distributed
NCUBE-3 (Dec 95)                    MIMD   12000        100         3000        Distributed
SGI Power Challenge XL              SIMD   18           360         16          Shared
Thinking Machines CM-5              MIMD   512          160         64          Distributed
Thinking Machines CM-500 (Fall 95)  MIMD   2048         160         256         Distributed

LEGEND
GByte   Gigabyte (10^9 byte)
MFlops  Millions of floating point operations per second (theoretical maximum)

NOTES

1. Maximum Number of Processors may refer to processing elements on some systems.

2. Memory does not include secondary memory storage (e.g., Solid-state Storage Device (SSD) on the Cray C-90/T-90).

3. Dates in parentheses indicate manufacturer's published date for availability.
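
The message passing paradigm of Section 2.3 can be illustrated with a small sketch. The Python example below is not taken from the paper. It imitates two multicomputer nodes with threads, each holding a private list of cell values, which exchange one boundary value through explicit send and receive calls before averaging neighbouring cells. The names run_exchange, send, recv and node, and the use of queue.Queue as a stand-in for the communications network, are illustrative assumptions and do not correspond to the API of any vendor library, PVM or MPI.

```python
import queue
import threading


def run_exchange(left, right):
    """Two 'nodes' with private data exchange one boundary value each.

    Hypothetical illustration: node 0 owns `left`, node 1 owns `right`.
    """
    # One mailbox per node stands in for the communications network.
    mailboxes = [queue.Queue(), queue.Queue()]
    results = {}

    def send(dest, value):
        # Explicit send: deliver a message to the destination node.
        mailboxes[dest].put(value)

    def recv(me):
        # Explicit receive: block until a message arrives for this node.
        return mailboxes[me].get()

    def node(me, other, local):
        # Node 0 sends its last cell, node 1 sends its first cell.
        boundary = local[-1] if me == 0 else local[0]
        send(other, boundary)
        ghost = recv(me)  # copy of a value held in the neighbour's memory
        padded = local + [ghost] if me == 0 else [ghost] + local
        # Average each pair of adjacent cells, including the received one.
        results[me] = [(padded[i] + padded[i + 1]) / 2.0
                       for i in range(len(padded) - 1)]

    threads = [threading.Thread(target=node, args=(0, 1, left)),
               threading.Thread(target=node, args=(1, 0, right))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each node sends before it receives, so neither blocks waiting for the other. In a data parallel language the same averaging would be written as a single array expression, with the data movement left to the compiler.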

3 Issues in Parallel Computing

Effective utilization of parallel computing in computational fluid dynamics involves numerous issues which must be adequately addressed. In this section, we focus on several key questions, in the context of development of new codes for parallel computing.

3.1 Domain Decomposition and Load Balancing

The partitioning of data and computational tasks among multiple processors is denoted domain decomposition. An example is shown in Fig. 3. A two-dimensional structured grid for a jet engine nozzle is partitioned into subdomains, and each subdomain assigned to an individual processor. This approach typifies the domain decomposition for a multi-computer. The domain decomposition may occur prior to or during the execution of the flow code.

Figure 3: Multi-block grid (from [46])

The principal objective of domain decomposition is to maintain uniform computational activity on all processors. This is known as load balancing. For a fixed numerical algorithm (e.g., the Euler equations) on a fixed grid, load balancing is straightforward, i.e., each processor is assigned approximately the same number of cells. However, several factors can complicate load balancing. First, the nature of the governing equations can change during the computation. An example is combustion, where the chemical reaction source terms are computed only when the local static temperature exceeds a preset value [6, 7]. Second, the number of governing equations in a given subdomain can change. An example is particle tracking where particles can accumulate in a subregion (e.g., recirculation zone). Third, the grid can change during the computation due to adaptation. Thus, in many cases it is necessary to incorporate dynamic load balancing, wherein the load on each processor is monitored and the overall task load redistributed to achieve an approximately uniform load.

An example of a simple dynamic load balancing method is presented in Borrelli [6] for hypersonic reacting flow. The chemical reactions are important only when the local static temperature exceeds 2000 deg K, and the ratio of computational work for reacting vs. non-reacting flow is approximately ten. The dynamic load balancing algorithm decomposes the domain by assigning a weighting function of either 1 or 10 to each cell, corresponding to non-reacting and reacting, respectively, and subdividing the domain to achieve an approximately uniform average weighting function for each subdomain.

3.2 Performance

A key issue is the performance of a CFD code on a parallel computer. Many different measures of performance have been proposed, and there is an active debate regarding the most appropriate. However, in solving a given problem (e.g., viscous flow past an F-18), the true measure of performance is simply the wall clock time to completion. In other words, given the opportunity to choose among different computational resources, the individual typically selects the resource which yields the answer in the shortest elapsed time, subject to existing constraints (e.g., budget, system load, etc).

Of course, it is impossible to model this selection process in a universal manner, and thus the development of performance measures has focused principally on more ideal cases. One performance measure is megaflop (i.e., millions of floating point operations per second) vs. number of processors. An example is presented in Fig. 4 from Simon et al [8] for two different codes: a 2-D unstructured Euler code [9], and a 3-D particle simulation code for rarefied gas flows [10]. Both codes were executed on an Intel iPSC/860 multicomputer for $2^n$ processors where n = 1, ..., 7. The unstructured Euler code achieves a substantially higher megaflop performance than the particle code.

Another performance measure is efficiency,
i.e., the fraction of the peak performance (relative to a single processor) achieved on a machine by a specific code. It is defined as

\[ \eta = \frac{\text{CPU time for one processor}}{n \times \text{CPU time for } n \text{ processors}} \tag{1} \]

Typically, the efficiency $\eta$ is plotted against the number of processors $n$. A related quantity is the speedup $S$ defined as

\[ S = n\eta \tag{2} \]

Efficiency can depend strongly on the algorithm. An example is presented in Fig. 5 from Simon et al [8] for the same codes as in Fig. 4. Here the trend is opposite to the megaflop performance measure, i.e., the 3-D particle code retains 88% efficiency at n = 128, while the 2-D unstructured code drops to 52% at n = 128.

Figure 4: Megaflops of Two Codes on the Intel iPSC/860 (unstructured grid code and particle simulation code vs. No. of Processors)

Figure 5: Efficiency of Two Codes on the Intel iPSC/860 (particle simulation code and unstructured grid code vs. No. of Processors)

3.3 Scalability

The impact of scaling a given parallel computer architecture to an increasingly large number of processors is a key concern. Although this problem may be viewed from several perspectives, it is instructive to examine it in the following context. Consider the solution of a given computational fluid dynamics problem, e.g., a Reynolds-averaged Navier-Stokes simulation of an entire aircraft configuration using a fixed number of grid points. How does the efficiency of the computation depend on the number of processors?

This question may be treated (albeit simplistically) by a straightforward analysis proposed by Amdahl [11]. Denote the execution time of the program on a single processor by $t_1$. Assume that an analysis of the program and algorithm structure indicated that a portion of the code could be reprogrammed for parallel execution (e.g., the product of a matrix and a vector, which is a common operation in iterative methods for solution of linear systems). Let $t_p$ denote the CPU time on the single processor for this potentially parallelizable section. Let $t_s$ denote the CPU time for the remaining unparallelizable (i.e., scalar) code. Neglecting the cost of scheduling processors, communications between processors (if any) and synchronization time (i.e., the time required to allow all processors to reach a common point following execution of the parallel section of code), the efficiency of a parallel computation with $n$ processors is

\[ \eta = \frac{t_1}{n\,(t_s + t_p/n)} = \frac{t_s + t_p}{n\,t_s + t_p} \tag{3} \]

and defining the parallelizable fraction $f \equiv t_p/(t_s + t_p)$,

\[ \eta = \frac{1}{f + n\,(1 - f)} \tag{4} \]

This is known as Amdahl's Law and is displayed in Fig. 6. The precipitous drop in efficiency for all but the highest possible parallelizable fractions is strikingly clear. Even for f = 0.99, the efficiency $\eta$ is 0.5 at n = 101.

In some cases, the communications cost may yield even lower efficiencies than predicted by Amdahl's Law. Consider a fixed domain $\mathcal{D}$ of $N^3$ cells on a multi-computer with $n$ processors. Assume an equi-distribution $\mathcal{D}_k$, $k = 1, \ldots, n$ of $N^3/n$ cells to each processor. Typically, a halo
of fictitious cells is added to each processor, which represents the additional information necessary to integrate the flow variables within cells assigned to the processor by a single time step. The number of halo cells is proportional to the number of cells in $\mathcal{D}_k$ which share one or more faces with other subdomains $\mathcal{D}_l$, $l \neq k$, and is therefore $O\left((N^3/n)^{2/3}\right)$. The ratio of communications time to flowfield integration time, denoted by $C$, is therefore

\[ C \propto \frac{O\left((N^3/n)^{2/3}\right)}{N^3/n} = \left(\frac{1}{N^3/n}\right)^{1/3} \tag{5} \]

Thus, for a fixed number of cells, the relative cost of communications can increase as the number of processors increases (footnote 1).

Figure 6: Amdahl's Law (efficiency vs. No. of Processors for f = 0.10, 0.30, 0.50, 0.70, 0.90 and 0.99)

3.4 Benchmarks

Numerous benchmarks have been developed for parallel computers (footnote 2). All benchmarks have limitations, of course, and the overemphasis on (and misuse of) benchmarks has naturally led to a somewhat skeptical attitude towards them. This is perhaps best epitomized by Bailey's "Twelve Ways to Obfuscate the Performance of a Parallel Machine" [12].

Nevertheless, benchmarks provide insight into the relative performance of different parallel computers. One of the most widely cited is the NAS Parallel Benchmarks [13, 14] which includes five kernels (two dimensional statistics from a Gaussian pseudo-random number generator, multigrid 3-D Poisson equation, conjugate gradient methods computation of the smallest eigenvalue of a large sparse symmetric positive definite matrix, 3-D Fast Fourier Transform, and integer sort) and three simulated CFD applications (SSOR algorithm for block 5 x 5 system, scalar pentadiagonal system, and block tridiagonal system). The NAS Parallel Benchmarks are described algorithmically, rather than in a specific programming language (footnote 3). They have been executed on numerous machines including Convex Exemplar SPP1000, Cray C90/T90/J90/T3D, DEC Alpha Server 8400, Fujitsu VPP500, IBM SP2 (Thin and Wide Node) and SGI Power Challenge XL. Saini and Bailey [16] make several observations. These include 1) the performance per unit cost (e.g., MFlops per dollar) of the Cray C-90 was the lowest of all systems tested (footnote 4), and 2) all vendors employed their own specialized parallelization directives to achieve maximum performance. Future enhancements to the NAS Parallel Benchmarks include the development of a version using High Performance Fortran and Message Passing Interface (see below).

3.5 Portability

In recent years, significant effort has been devoted to the development of standardized environments for development of parallel codes. Three specific areas of activity are discussed here, namely, development of a standard Fortran for parallel computing (HPF), a standard for heterogeneous, network-based parallel computing environments (PVM), and the more recently developed standard message-passing interface MPI. There are many other similar research efforts in progress; however, space does not permit their discussion here.

Footnote 1: An alternate definition of efficiency (denoted as scaled efficiency, and its counterpart, scaled speedup) has been proposed whereby the ratio of communications cost to computational cost remains fixed as n is increased. This is achieved by increasing the problem size (i.e., N) with the number of processors. From the above analysis, this implies that $N^3 \propto n$.

Footnote 2: A compendium of benchmark reports is available at http://performance.netlib.org/performance/html/PDSreports.html.

Footnote 3: In contrast, for example, to the LINPACK benchmark [15] for the matrix of order 100 which is written in FORTRAN and which may not be modified, including the comment statements.

Footnote 4: The system cost is assumed to be the list price.
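
Returning to the scalability estimates of Section 3.3, a short numerical sketch may help make the trends concrete. The Python functions below are not from the paper. They evaluate Amdahl's Law in the form efficiency = 1/(f + n(1 - f)), which reproduces the value of 0.5 quoted in the text for f = 0.99 and n = 101, the corresponding speedup (n times the efficiency), and the halo-to-interior cell ratio for N^3 cells on n processors. The function names are hypothetical, and the constant of proportionality in the communications ratio is taken as one, which is an assumption made only for illustration.

```python
def amdahl_efficiency(f, n):
    """Efficiency on n processors when a fraction f of the
    single-processor run time is parallelizable (Amdahl's Law)."""
    return 1.0 / (f + n * (1.0 - f))


def amdahl_speedup(f, n):
    """Speedup S = n * efficiency."""
    return n * amdahl_efficiency(f, n)


def comm_ratio(N, n):
    """Ratio of halo cells to interior cells per processor for N**3
    cells split evenly over n processors.

    The proportionality constant is assumed to be 1 (illustration only).
    """
    cells_per_proc = N ** 3 / n
    return cells_per_proc ** (2.0 / 3.0) / cells_per_proc
```

For example, amdahl_efficiency(0.99, 101) evaluates to 0.5 within rounding, and comm_ratio(64, 64) is twice comm_ratio(64, 8), showing how the relative communications cost grows with the number of processors on a fixed grid.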

3.5.1 High Performance Fortran puters are fairly similiar. Below, we provide a
brief description of PVM. Description of other
High Performance Fortran [17, 18, 191 is systems are available (e.g., [24, 25, 26]), and a
a data parallel language which extends For- reasonably comprehensive listing has been com-
tran 90 to provide additional support for the plied by Turcotte [27]. Comparisons of the rel-
data parallel programming style while maintaining compatibility⁵ with Fortran 90. Development of HPF was initiated in 1991 through the establishment of the High Performance Fortran Forum, and the language specification was published in May 1993. At present, twelve vendors have announced support of HPF. Additional information may be obtained at http://www.erc.msstate.edu/hpff/home.html. A good introduction to HPF is provided by Foster [21].

⁵ For a description of Fortran 90, see Metcalf and Reid [20].

HPF extends Fortran 90 to include specific compiler directives to control the alignment and distribution of data on parallel machines, and introduces new parallel features and additional intrinsic library functions. For example, the PROCESSORS directive specifies the shape and size of an array of (abstract) processors, and the ALIGN directive aligns elements of different arrays with each other, thereby indicating that they should be distributed across processors in the same manner. New intrinsic functions introduced by HPF include NUMBER_OF_PROCESSORS and PROCESSORS_SHAPE, which allow a program to obtain information on the number of processors on which it executes and the connection topology.

Examples of applications written in HPF are presented in Hawick and Fox [22] and Mueller and Ruehl [23]. A more extensive list is available on http://www.npac.syr.edu/hpfa/bibl.html.

3.5.2 Parallel Virtual Machine (PVM)

A recent major advancement is the development of heterogeneous, network-based parallel computing environments. Unlike fixed parallel computer architectures (e.g., Cray C-90, Intel Paragon, etc.), these network-based parallel computers are created as a virtual machine using software tools such as PVM, Linda, P4 or Express. Typically, any number of different networked computers may be connected to form a parallel machine, although usually the com-

ative merits of different systems have also been published (e.g., [28]).

PVM (Parallel Virtual Machine), created by the Heterogeneous Network Project (Oak Ridge National Laboratory, the University of Tennessee and Emory University) initiated in 1989, consists of two software packages [29, 30, 31, 32, 33]. The first is a daemon pvmd3 which executes on all of the computers which comprise the virtual parallel machine. PVM is designed to enable any user with a valid login to install and initiate pvmd3. The user specifies a list of computers which comprise the virtual parallel machine, and starts pvmd3 on each one. The PVM application can then be initiated from any of the computers. The second is a library of PVM routines libpvm3.a which contains the user callable routines for message passing, spawning processes, coordinating tasks and modifying the virtual machine.

PVM has been successfully implemented on numerous computer architectures [33]. These include heterogeneous and homogeneous networks of computers, and also “individual” massively parallel computers (e.g., Intel Paragon and Cray T3D). PVM is widely utilized in academia, industry and government laboratories. It is estimated that more than 10,000 individuals or installations have obtained the PVM software and approximately 20% to 25% are actively using it [34]. An index of PVM software may be obtained by sending the message send index from pvm3 to [email protected].

An example of a PVM application is the Korringa, Kohn and Rostoker coherent potential approximation (KKR-CPA) method for computing the electronic properties, energetics and other ground state properties of substitutionally disordered alloys [33]. An approximate three month effort converted the 20K line KKR-CPA code for PVM. The code achieved approximately 200 MFlops using a network of ten IBM RS/6000 (6 model 530's and 4 model 320's) workstations, which is estimated to be approximately 82% of the maximum floating point capability of this virtual system. Also, the PVM KKR-

CPA code achieved over 9 GFlops performance using a network of twenty seven Cray C-90 and Cray Y-MP processors scattered across several sites. Furthermore, the PVM KKR-CPA code was successfully demonstrated for a virtual machine consisting of two Intel Paragons, a CM-5, an Intel i860 and IBM workstations, which were geographically distributed at several sites.

Load balancing, latency and bandwidth are clearly important issues for implementation of a virtual machine with PVM or other similar tools. In a heterogeneous environment, due consideration of the relative performance of individual hosts is obviously needed in domain decomposition. Latency (i.e., the time required to initiate a message) can be a critical issue, depending on the ratio of communications to computation. Network bandwidth may be restricted due to existing traffic. Recent enhancements to PVM [34] provide for improved performance. For example, the message passing performance of PVM on the Intel Paragon⁶ is only 5% to 8% slower than the native functions [34].

⁶ Using the functions pvm_psend() and pvm_precv() introduced in PVM Version 3.3.

3.5.3 Message Passing Interface (MPI)

MPI (Message Passing Interface) is a message passing standard for homogeneous and heterogeneous parallel and distributed computing systems. The development of the MPI standard is a multinational effort which was initiated in 1992 and is supported by ARPA, NSF and the Commission of the European Community. The MPI standard was published in 1994 and is described in [35, 36, 37]. A good introduction to MPI is provided by Foster [21], and a brief description is presented in [38].

An MPI program includes one or more processes which communicate with each other through calls to MPI library routines. There are two types of communications, namely, point-to-point communication between pairs of processes, and collective communication between groups of processes. Several variants of “send” functions are provided to enable users to achieve peak performance. Two basic types of communications topologies are provided: a Cartesian grid and an arbitrary process graph [38].

Due to its recent introduction, there are a relatively small number of applications to date using MPI. A recent review by Skjellum, Lusk and Gropp [39] describes recent applications including unsteady incompressible viscous flows, groundwater modeling, volume visualization and traffic simulation. Native MPI implementations are currently under development by several parallel computer vendors [40].

4 Parallel Computing in Aerospace Research

Despite the extensive research on parallel computing, only a small fraction of numerical simulations of aerospace research problems employ parallel computing. A survey of the citations for parallel and other computers for three journals is presented in Table 3. The period July 1993 through July 1995 was surveyed for all articles presenting research involving significant numerical simulation. Approximately 44% of these articles indicated that a serial or vector machine (single processor) was employed, while only 3.4% specifically noted that a parallel computer was used. Approximately 52% did not indicate the machine used. If the statistics for the first two categories are assumed statistically representative of the last group, then an overall estimate (upper bound) for the parallel applications is 7%.

Why are so few research simulations performed on parallel computers? Certainly, research on parallel computing has shown the capability for solving a wide range of fluid dynamics problems. At the Parallel CFD ’95 Conference, applications of parallel computing were presented for reacting flows, Euler and Navier-Stokes solvers, spectral methods, multigrid methods, and adaptive schemes. Numerous other applications have been developed (e.g., [41, 42, 43, 44]).

I posed this question to a number of experts in parallel CFD. The answers tended to be fairly similar, and not at all surprising. All focused on the issue of calendar time required to solve a particular problem. As one person stated, “The machine which you use to solve a problem is irrelevant. The only thing that matters is how quickly you can get the problem done.” At the

Table 3: Citations of Parallel and Other Computers (July 93 - July 95)

Journal                       Parallel   Serial/Vector   Not Stated    Total
AIAA Journal                       7          104            110         221
Journal of Aircraft                0           48             47          95
Journal of Fluid Mechanics         8           44             76         128
Total                             15          196            233         444
Percent                         3.4%        44.1%          52.5%      100.0%
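
The percentages in Table 3 and the 7% upper bound quoted in the text can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; only the counts are taken from Table 3, while the names `counts` and `survey_summary` are illustrative assumptions. The upper bound follows the assumption stated in the text, namely that articles which do not name a machine split between parallel and serial/vector in the same ratio as those which do.

```python
# Counts transcribed from Table 3: (parallel, serial/vector, not stated).
counts = {
    "AIAA Journal": (7, 104, 110),
    "Journal of Aircraft": (0, 48, 47),
    "Journal of Fluid Mechanics": (8, 44, 76),
}


def survey_summary(counts):
    """Return (total articles, % stating parallel, estimated overall % parallel)."""
    parallel = sum(c[0] for c in counts.values())
    serial = sum(c[1] for c in counts.values())
    not_stated = sum(c[2] for c in counts.values())
    total = parallel + serial + not_stated
    # Share of all surveyed articles that explicitly cite a parallel computer.
    stated_share = 100.0 * parallel / total
    # Assumption from the text: the "not stated" group follows the same
    # parallel-to-serial ratio as the articles that do name a machine.
    upper_bound = 100.0 * parallel / (parallel + serial)
    return total, stated_share, upper_bound


total, stated_share, upper_bound = survey_summary(counts)
print(f"{total} articles, {stated_share:.1f}% stated parallel, "
      f"about {upper_bound:.1f}% estimated overall")
```

With the tabulated counts this gives 15 of 444 articles (3.4%) explicitly parallel, and 15 of the 211 articles that name a machine (about 7.1%), which is the source of the 7% estimate.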

present time, many CFD researchers who are not using parallel computing view parallel CFD as 1) lacking a decisive performance advantage (e.g., MFlops) over conventional serial (and vector) computers in many instances, 2) difficult to program efficiently, and 3) lacking in portability.

All of these factors are likely to diminish in the near future, and thus the use of parallel computing in aerospace research should increase. Microprocessor CPU performance continues to improve by a factor of 1.5 to 2.0 per year⁷ [45], and consequently parallel machines are now comparable to or faster than traditional vector supercomputers. For example, the Cray T3D (512 processors) is on average 41% faster⁸ than the Cray C-90 (16 processors) for the three simulated CFD applications in the NAS Parallel Benchmarks. The Cray T3D (1024 processors) is 128% faster. The IBM SP2-WN (160 processors) was also significantly faster than the Cray C-90 (16 processors) [16]. Also, the emergence of standards in parallel programming languages (e.g., HPF) and message passing functions (e.g., PVM, MPI) simplifies the development of parallel code and significantly enhances its portability.

⁷ The rate of improvement of microprocessor performance is much faster than for the specialized processors developed for traditional vector machines (e.g., Cray C-90).

⁸ I.e., the ratio of the execution time on the Cray C-90 to the Cray T3D was 1.41.

5 Parallel Computing in Aerospace Industry

Parallel computing has a major presence in the aerospace industry. Within the past few years, several major aerospace corporations have developed extensive Networks of Workstations (NOWs) for production analysis and design. Two examples are Pratt & Whitney (East Hartford, CT, and Palm Beach, FL) and McDonnell Douglas Aerospace (St. Louis, MO).

Pratt & Whitney (P&W) initiated their Network of Workstations concept [46] in mid-1989. The decision was motivated by two factors. First, P&W had an installed base of workstations which had been acquired principally for design/drafting work, but which were effectively unused in the evenings and on weekends. Thus, there was a surplus of compute cycles which could be employed for analysis and design, provided that the computational tasks could be decomposed and parallelized. Second, their existing Cray X-MP, purchased in 1986, was both severely overloaded and limited in capability (e.g., memory). Hence, there was a significant incentive to invest resources in development of a new paradigm for CFD analysis and design.

The P&W approach consists of several parts. The flow solver is NASTAR, a 3-D structured grid multi-block Navier-Stokes code. Domain decomposition is straightforward, i.e., each block is assigned to an individual processor (workstation). An example is shown in Fig. 3. The momentum, energy and turbulence scalar equations are solved using Successive Line Under Relaxation (SLUR). The SLUR iterations are performed independently within each block, with periodic updating of the boundary conditions to transmit information between blocks. The optimal updating strategy is found by numerical experiments. The pressure correction equation is solved to satisfy the continuity equation, and employs a parallelized Preconditioned Conjugate Residual (PCR) algorithm. The majority of the computational effort is expended in the pressure correction equation, and thus considerable effort was focused on efficient paral-

lelization of the PCR algorithm. Management of the individual block computations is performed by Prowess (Parallel Running of Workstations Employing Sockets), developed by P&W, which provides communications, parallel job process control, accounting, reliability and workstation user protection. Communications between individual workstations is performed directly using sockets which emulate a file I/O paradigm. Checkpointing is employed to achieve high reliability. Workstation user protection is the implementation of the P&W policy that the interactive user has the first priority on a workstation. Thus, for example, Prowess suspends (or terminates) any remote process executing on a workstation as soon as any activity is detected on the workstation's keyboard or mouse. Idle workstations capable of executing NASTAR are identified using the Load Sharing Facility (LSF) software from Platform Computing Corporation.

The P&W workstation network employed for parallel computing is substantial. Approximately 400 to 600 workstations are employed daily for parallel CFD jobs at P&W's East Hartford, CT facility, and another 300 to 400 workstations at Palm Beach, FL. The growth in usage of the workstation network for parallel CFD application is displayed in Fig. 7.

[Figure 7 plot: Cray X-MP equivalents, Jan. 1992 - Aug. 1994]

Figure 7: Daily parallel CFD throughput on Pratt & Whitney's East Hartford, CT workstation network (from [46])

A critical element in the Network of Workstations approach is the network configuration. Adequate communications bandwidth is essential for effective distributed parallel computing. The P&W East Hartford, CT network architecture is shown in Fig. 8. It includes multiple Fiber Distributed Data Interface (FDDI) 100 Mbps backbone networks connected by Digital Equipment FDDI Gigaswitches. There are approximately 200 ethernet segments.

Figure 8: Pratt & Whitney network backbone in East Hartford, CT (from [46])

Pratt & Whitney has concluded that their Network of Workstations paradigm has been successful. Fischberg et al. [46] cite a reduction in design time of 50% to 67% for a high pressure compressor and fan design, respectively.

McDonnell Douglas Aerospace initiated their Network of Workstations concept [47] in late 1992. The decision was motivated by factors similar to Pratt & Whitney. First, McDonnell Douglas had a substantial number of workstations (mostly Hewlett-Packard 7xx, plus a small number of IBM RS/6000 and Silicon Graphics) which had been acquired principally for CAD. These workstations were typically utilized during the daytime, and largely unused in evenings and on weekends. Second, their existing Cray X-MP/18 was both heavily loaded and limited in capability (e.g., memory), and the corporate financial position precluded a multi-million dollar new supercomputer acquisition. Third, McDonnell Douglas wanted to gain experience with parallel computing technology.

The McDonnell Douglas Aerospace approach consists of several parts. The flow solver is NASTD, a proprietary 3-D structured grid multi-block compressible Euler/Navier-Stokes code. The code is heavily utilized at McDonnell Douglas, with typically fifty active users. A straightforward domain decomposition is employed, whereby a grid block (subdomain) is assigned to an individual processor (workstation). The code is operated in a master/slave relationship using PVM [32] for process control and explicit message passing between processors.

Parallel computations using NASTD are performed in the evenings and on weekends us-

ing up to 400 workstations in clusters of 15 to 20 workstations per job. The reliability (i.e., the percentage of submitted jobs which complete successfully) exceeds 95%. Numerous difficulties were resolved in achieving this performance, many of which were management issues, e.g., negotiating scheduled hardware, software and network maintenance (which had oftentimes occurred at random intervals at nights and on weekends), and changing the perception that the individual user “owned” the workstation and could therefore reboot it whenever desired (thus terminating any slave process in operation and crashing the entire parallel computation).

6 Conclusions

Several main conclusions can be drawn regarding parallel computing in CFD:

• There is a large number of vendors of parallel computers whose systems offer a wide range of performance.

• Modern parallel computers can equal or exceed the performance of the largest multiprocessor Cray supercomputers.

• The aerospace industry has taken a leading role in the application of parallel computing to practical analysis and design.

• The aerospace research community (e.g., academia and research laboratories) has taken a leading role in research on parallel computing, but has not significantly employed parallel computing in solving aerospace research problems.

• The development of message passing standards (e.g., PVM and MPI) and data parallel programming language standards (e.g., HPF) will expand use of parallel computing in CFD.

7 Acknowledgments

I am indebted to R. Agarwal, T. Barth, R. Cosner, C. Fischberg, J. Lewis, R. Pelz, J. Shang, H. Simon, and V. Venkatakrishnan for their comments and suggestions, and to H. Lau for the journal survey.

References

[1] R. Hockney and C. Jesshope, Parallel Computers 2. Bristol, England: Adam Hilger, 1988.

[2] A. W. Burks and A. R. Burks, “The ENIAC: the First General-purpose Electronic Computer,” Ann. Hist. Comput., vol. 3, pp. 310-399, 1981.

[3] M. Flynn, “Some Computer Organizations and Their Effectiveness,” IEEE Transactions on Computers, vol. C-21, pp. 948-960, 1972.

[4] G. Bell, “Ultracomputers: A Teraflop Before Its Time,” Communications of the ACM, vol. 35, pp. 27-47, August 1992.

[5] J. Shore, “Second Thoughts on Parallel Processing,” Comput. Elect. Engr., vol. 1, pp. 95-109, 1973.

[6] S. Borrelli, A. Matrone, and P. Schiano, “A Multiblock Hypersonic Flow Solver for Massively Parallel Computer,” in Parallel Computational Fluid Dynamics ’92 (R. Pelz, A. Ecer, and J. Hauser, eds.), (New York), pp. 25-37, Elsevier Science B. V., 1993.

[7] P. Schiano and A. Matrone, “Parallel CFD Applications: Experiences on Scalable Distributed Multicomputers,” in Parallel Computational Fluid Dynamics: New Trends and Advances (A. Ecer, J. Hauser, P. Leca, and J. Periaux, eds.), (New York), pp. 3-12, Elsevier Science B. V., 1995.

[8] H. Simon, W. V. Dalsem, and L. Dagum, “Parallel CFD: Status and Future Requirements,” in Parallel Computational Fluid Dynamics: Implementations and Results (H. Simon, ed.), pp. 1-29, MIT Press, 1992.

[9] V. Venkatakrishnan, H. Simon, and T. Barth, “A MIMD Implementation of a Parallel Euler Solver for Unstructured Grids.” Technical Report RNR-91-24, NASA Ames Research Center, 1991.

[10] J. McDonald, “Particle Simulations in a Multiprocessor Environment.” Technical Report RNR-91-02, NASA Ames Research Center, 1991.

[11] G. Amdahl, “The Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities,” in AFIPS Conference Proceedings Spring Joint Computing Conference, vol. 30, pp. 483-485, 1967.

[12] D. Bailey, “Twelve Ways to Fool the Masses When Giving Performance Results on Parallel Computers,” Supercomputing Review, pp. 54-55, August 1991.

[13] D. B. et al, “The NAS Parallel Benchmarks,” International Journal of Supercomputer Applications, vol. 5, no. 3, pp. 63-73, 1991.

[14] D. Bailey, J. Barton, T. Lasinski, and H. Simon, “The NAS Parallel Benchmarks,” Tech. Rep. NASA TM 103863, NASA Ames Research Center, July 1993.

[15] J. Dongarra, “Performance of Various Computers Using Standard Linear Equations Software.” Report CS-89-95, Computer Science Department, University of Tennessee, and Oak Ridge National Laboratory, 1995.

[16] S. Saini and D. Bailey, “NAS Parallel Benchmarks Results 3-95,” Tech. Rep. NAS 95-011, NASA Ames Research Center, April 1995.

[17] H. P. F. Forum, “High Performance Fortran Language Specification,” Scientific Programming, vol. 2, pp. 1-170, 1993.

[18] C. Koelbel, D. Loveman, R. Schreiber, G. Steele, and M. Zosel, The High Performance Fortran Handbook. Cambridge, MA: The MIT Press, 1994.

[19] H. Performance Fortran Forum, “High Performance Fortran Language Specification,” tech. rep., Center for Research on Parallel Computing, Rice University, November 1994.

[20] M. Metcalf and J. Reid, Fortran 90 Explained. New York: Oxford Science Publications, 1990.

[21] I. Foster, “Designing and Building Parallel Programs.” http://www.mcs.anl.gov/dbpp/.

[22] K. Hawick and G. Fox, “Exploiting High Performance Fortran for Computational Fluid Dynamics,” Tech. Rep. SCCS-661, Northeast Parallel Architecture Center, November 1994.

[23] A. Mueller and R. Ruehl, “Extending High Performance Fortran for the Support of Unstructured Computations,” in Proc. of 9th ACM Inter. Conf. on Supercomputing, 1995.

[24] N. Carriero and D. Gelernter, “Linda in Context,” Communications of the ACM, vol. 32, pp. 444-458, April 1989.

[25] R. Butler and E. Lusk, “User’s Guide to the P4 Programming System.” Argonne National Laboratory, Technical Report ANL-92/17, 1992.

[26] A. Kolawa, “The Express Programming Environment.” Workshop on Heterogeneous Network-Based Concurrent Computing, Tallahassee, October 1992.

[27] L. Turcotte, “A Survey of Software Environments for Exploiting Networked Computing Resources.” Engineering Research Center for Computational Field Simulations, Mississippi State University, January 1993.

[28] T. Mattson, “Programming Environments for Parallel and Distributed Computing: A Comparison of p4, PVM, Linda and TCGMSG,” The International Journal of Supercomputing, vol. 9, pp. 138-161, September 1995.

[29] V. Sunderam, G. Geist, J. Dongarra, and R. Manchek, “PVM: A Framework for Parallel Distributed Computing,” Journal of Concurrency: Practice and Experience, vol. 2, pp. 315-339, December 1990.

[30] G. Geist and V. Sunderam, “Network Based Concurrent Computing on the PVM System,” Journal of Concurrency: Practice and Experience, vol. 4, pp. 293-311, June 1992.

[31] J. Dongarra, G. Geist, R. Manchek, and V. Sunderam, “Integrated PVM

Computing,” Computers in Physics, vol. 7, no. 2, pp. 166-175, 1993.

[32] A. G. et al, “PVM3 User’s Guide and Reference Manual.” Oak Ridge National Laboratory, 1994.

[33] V. Sunderam, G. Geist, J. Dongarra, and R. Manchek, “The PVM Concurrent Computing System: Evolution, Experiences and Trends,” Journal of Parallel Computing, vol. 20, no. 4, pp. 531-547, 1994.

[34] A. Beguelin, J. Dongarra, A. Geist, R. Manchek, and V. Sunderam, “Recent Enhancements to PVM,” The International Journal of Supercomputing, vol. 9, pp. 108-127, September 1995.

[35] M. P. I. Forum, “MPI: A Message-Passing Interface Standard,” Journal of Supercomputer Applications, vol. 8, 1994.

[36] M. P. I. Forum, “MPI: A Message-Passing Interface Standard,” tech. rep., University of Tennessee, 1994.

[37] W. Gropp and E. Lusk, “Message Passing Interface.” http://www.mcs.anl.gov/mpi, 1994.

[38] L. Clarke, I. Glendinning, and R. Hempel, “The MPI Message Passing Interface Standard,” March 1994. ftp://par.soton.ac.uk/pub/mpi/paper.ps.

[39] A. Skjellum, E. Lusk, and W. Gropp, “Early Applications in the Message-Passing Interface (MPI),” The International Journal of Supercomputer Applications, vol. 9, no. 2, pp. 79-94, 1995.

[40] W. Gropp and E. Lusk, “Implementing MPI: the 1994 MPI Implementors’ Workshop.” http://www.mcs.anl.gov/mpi-impl/paper.ps, October 1994.

[41] H. Simon, ed., Parallel Computational Fluid Dynamics, MIT Press, 1992.

[42] R. Pelz, A. Ecer, and J. Hauser, eds., Parallel Computational Fluid Dynamics ’92, Elsevier N.H., 1993.

[43] A. Ecer, J. Hauser, P. Leca, and J. Periaux, eds., Parallel Computational Fluid Dynamics. (New York), Elsevier N.H., 1995.

[44] D. B. et al, ed., Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, SIAM, 1995.

[45] J. Hennessy and N. Jouppi, “Computer Technology and Architecture: an Evolving Interaction,” IEEE Computer, vol. 24, no. 9, pp. 18-29, 1991.

[46] C. Fischberg, C. Rhie, R. Zacharias, P. Bradley, and T. DesSureault, “Using Hundreds of Workstations for Production Running of Parallel CFD Applications,” in Parallel Computing 1995, 1995.

[47] R. Cosner, “Experiences at McDonnell Douglas in Converting CFD Production to Parallel Processing,” in Parallel Computing 1995, 1995.

Portable Parallelization of a 3D Euler/Navier-Stokes Solver for Complex Flows

B. Eisfeld, H. Ritzdorf*, H. Bleecke, N. Kroll
Institute of Design Aerodynamics
DLR, D-38108 Braunschweig
Germany
*Institute SCAI
GMD St. Augustin, Germany

SUMMARY
This paper describes the portable parallelization of the FLOWer code, a large, block structured CFD solver for industrial use. Basic requirements for the parallelization are identified, and the strategies applied for its parallelization are explained. Special emphasis is put on the parallel heart of the program, the communications library CLIC-3D. Results obtained on several platforms demonstrate the success of the method chosen and allow an assessment of today's capabilities of parallel computers in CFD applications. Parallel computations of aircraft configurations of varying complexity prove that parallel computers have become operational in aircraft development.

LIST OF SYMBOLS
c_p   specific heat at constant pressure
D     vector of artificial dissipative fluxes
E     total energy
F     flux tensor
H     total enthalpy
k     heat transfer coefficient
N_B   number of blocks
n     outward pointing unit normal vector
Pr    Prandtl number
p     pressure
q     velocity vector
R     residual vector
S     speed-up
T     temperature
t     execution time
u     velocity in x-direction
V     volume
v     velocity in y-direction
W     vector of conservative variables
w     velocity in z-direction
γ     ratio of specific heats
μ     viscosity
ρ     density
σ     normal stress components
τ     shear stress components
Φ     components of the energy dissipation function

Indices
a k   algorithmic ideal
ijk   discrete point
l     laminar
t     turbulent
x     in x-direction
y     in y-direction
z     in z-direction
∞     at infinity

1. INTRODUCTION
When looking at the progress made in CFD during the last decade, one observes that improvements are made in two directions: The algorithms became more flexible and faster, e.g. by multigrid techniques, and the hardware platforms increased in main memory and CPU performance. As far as the progress in computer power is concerned, experts predict that only parallel architectures will allow further improvements leading to peak performances of about 1 TFLOP/s [1, 2].

Therefore, since this type of super computers might require a new type of application software, the development of parallel flow solvers is mandatory, if one wants to exploit their abilities in the future. This could be treated as an isolated subject, when dealing with questions of basic research interest, but when concerning large codes in industrial use, several constraints are limiting the development.

First of all the effort spent for parallelization must be justified by the gain in compute power or the reduction of computing costs, respectively. Secondly, large CFD

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

solvers usually have been developed throughout a long period involving a number of different scientists, and they are applied by numerous users which both must be respected by a parallelization. Last but not least, there is not just one parallel architecture available at the moment, but the platforms differ in the design of the CPUs (vector versus RISC processors), the memory organization (shared versus distributed memory) and the communication systems (hardware and software). Therefore, if one wants to be able to follow any hardware development in the future, one must keep the parallelization as flexible as possible.

The paper presented here describes the portable parallelization of the FLOWer code which is currently carried out within the project POPINDA (Portable Parallelization of INDustrial Aerodynamical applications) funded by the German Ministry of Research (BMBF). The FLOWer code is a block structured CFD solver for complex flows in configuration aerodynamics. It has directly evolved from the DLR-CEVCATS code [3] and is developed in close cooperation of the DLR with the German national research center for computer science (GMD) and the German aeronautical industry (DASA) as a multi purpose flow solver.

After a description of the numerical algorithm of this large CFD code in the next section, the strategy chosen for its parallelization will be explained outlining the ideas of how to meet the requirements for large application programs in industrial use. The communications library CLIC-3D which solves the portable parallelization problem of such codes is then reviewed.

Benchmark results obtained on various platforms demonstrate the portability of the FLOWer code and allow an assessment of today's parallel platforms. Furthermore, computations of different aircraft configurations show that such architectures have become operational for CFD applications and what effects on the obtainable performance occur. Finally, the computation of a 6 million grid point test case on 129 processors indicates the future potential of parallel processing in CFD.

2. NUMERICAL METHOD OF THE FLOWer CODE

2.1 Governing Equations
The FLOWer code is solving the Euler- or Navier-Stokes equations in conservative form [3, 4] written as

[equation illegible in source]

where W denotes the vector of conservative variables

\[ \vec{W} = (\rho,\ \rho u,\ \rho v,\ \rho w,\ \rho E)^T \]

and F is the flux tensor being defined by

[equation illegible in source]

with the abbreviations

\[ \Phi_x = u\sigma_x + v\tau_{xy} + w\tau_{xz} - k\,\frac{\partial T}{\partial x} \]
\[ \Phi_y = u\tau_{xy} + v\sigma_y + w\tau_{yz} - k\,\frac{\partial T}{\partial y} \qquad (4) \]
\[ \Phi_z = u\tau_{xz} + v\tau_{yz} + w\sigma_z - k\,\frac{\partial T}{\partial z} \]

The elements of the viscous stress tensor are determined by Newton's law of skin friction, i.e.

\[ \sigma_x = -2\mu\,\frac{\partial u}{\partial x} + \frac{2}{3}\,\mu\,\nabla\cdot\vec{q} \]
\[ \sigma_y = -2\mu\,\frac{\partial v}{\partial y} + \frac{2}{3}\,\mu\,\nabla\cdot\vec{q} \]

[remaining stress components illegible in source]

Further simplification is obtained by applying a thin shear layer approximation accounting only for gradients normal to surfaces.

For the non-dimensional pressure and temperature the following relations hold

\[ p = \rho\,(\gamma - 1)\left(E - \frac{u^2 + v^2 + w^2}{2}\right) \]

and the system is closed by the relations for the transport coefficients

\[ \mu = \mu_l + \mu_t \]

\[ k = c_p\left(\frac{\mu_l}{Pr_l} + \frac{\mu_t}{Pr_t}\right) \qquad (7) \]

where the laminar viscosity μ_l is given by Sutherland's formula

[equation illegible in source]

In turbulent flows the eddy viscosity μ_t is computed from the algebraic Baldwin-Lomax model [5].

2.2 Discretization and Time Integration
The governing equations are discretized by the method of lines separating the space and time coordinates. After the space discretization, a system of ordinary differential equations in time results involving each finite volume. For any hexahedron of the structured grid one obtains the equation

[equation illegible in source]

The space discretization is central, so that an artificial dissipation term due to Jameson et al. [6] is added damping high frequency oscillations and allowing a sufficiently sharp resolution of shock waves in the flow field. The resulting system of equations then reads

[equation illegible in source]

with R_ijk and D_ijk being the vector of the residuals and the artificial dissipative fluxes respectively.

The time integration is carried out by an explicit, hybrid multi stage Runge-Kutta scheme which is accelerated by the techniques of local time stepping, enthalpy damping (Euler) and implicit residual smoothing [7].

This procedure is embedded into a powerful multigrid algorithm [3, 8] which allows standard single grid computations as well as a successive grid refinement and simple or full multigrid, respectively. As is illustrated in [3], where a more detailed description can be found, high convergence rates can be obtained, using this technique.

2.3 Block Structure
Since structured grids around complex geometries cannot be generated as one logically rectangular domain, the FLOWer code is block structured. That means that the flow field is split into regions for each of which the generation of a structured grid is possible. Figure 1 is showing schematically such a grid topology around a transport aircraft. As one can see, the flow field is subdivided into four areas of similar size around the wing body. Three subdomains are covered by one block each (blocks 1 to 3), whereas the fourth region is further subdivided, due to the presence of an engine there (blocks 4 to 9). The engine is surrounded by a polar grid (blocks 8 and 9) which is adapted by blocks 6 and 7 to the general O-H topology (blocks 3 to 5).

The program then treats the blocks more or less independently of each other which can only be done properly by exchanging data of the current solution at block interfaces before each time step.

Therefore, the blocks are surrounded by layers of dummy cells, which at block intersections correspond with the physical cells of the neighboring domain. The FLOWer code allows an overlap width of two cells resulting in second order accuracy of the scheme at those boundaries. This is necessary, in order to treat the artificial dissipation terms correctly which otherwise could spoil the solution as shown in [9].

Currently, the FLOWer code allows different exchange strategies for the data at block intersections varying in effort and accuracy [10].

Fig. 1 Schematic multiblock decomposition of the flow field around a generic transport aircraft. Decomposition into 9 blocks due to the adaption of an engine fitted polar mesh to a global O-H topology.

3. PARALLELIZATION OF THE FLOWer CODE

3.1 Requirements
When parallelizing a large CFD solver as the FLOWer code, the parallelization cannot be treated in isolation, but must be integrated into the general development
4-4

procedure [91. Therefore, certain objectives must be and how to achieve portability. When parallelizing the
met, the most important of which are specified in the FLOWer code, general considerations led to the follow-
to1lowing . ing guidelines allowing to meet the requirements stated
above:
Portability
The FLOWer code is developed by a number of scien- Grid partitioning as parallelization strategy
tists working at different locations on a variety of com- The idea is to map the different blocks to different pro-
puters. Furthermore, it is applied by several users run- cesses where they are solved separately. Between thc it-
ning the program on other platforms than the eration steps the boundary data are exchanged via the
devclopers. Finally, the life time of the program will network.
certainly exceed that one of most of today's computers, This technique is not only said to be efficient when solv-
so that portability is a major requirement: ing partial differential equations 11 1 , 121, but moreover
Thc FLOWer code must run on any platform, it may be guarantees the conservation of the sequential develop-
scqucntial or parallel ! ment history, because it is directly based on the sequen-
tially well established multi block method.
Conservution of the development history
Whcn developing the parallel FLOWer code, its algo- Separation of computation and communication
rithm had already reached a high degree of maturity es- A strict application of this rule allows an algorithmic de-
tablished by various scientists during a long period velopment which remains independent from the paral-
within the DLR-CEVCATS code. Moreover, the users lelization or other hardware aspects. Additionally, the
had bccome experienced with its handling and in inter- code structure can be kept modular more easily which is
preting its results. Therefore, the parallelization had to highly desired from software engineering reasons. Fi-
respcct that development history: nally, the portability problem becomes much easier to
Thc FLOWer code must not be completely re-written handle, when concentrating the communication parts
due to its parallelization ! within separate units.

Low parallelization effort Communication by message passing


Parallelization is only one means of high performance Besides efficiency arguments, the decision for the mes-
computing and should not be done just for its own sake. sage passing communication model results mainly from
The effort spent for parallelization must therefore be the portability demand. Using a parallelizing language
justified by the corresponding gain in performance or re- would have caused a complete re-implementation of the
duction in computational costs, respectively: FLOWer code which was clearly unacceptable, and par-
allelizing compilers are only available on some plat-
The parallelization of the FLOWer code must achieve
forms restricting the portability of the code.
the highest performance possible at lowest costs !
In the contrary, it should be noted that the message pass-
3.2 Parallelization Strategies ing approach does not exclude a parallelization by auto-
Parallelization of a CFD solver means mapping of in- tasking supported by compiler directives [9].
herent parallelism incorporated in the program to a par-
Use of u portable communications library
allel architecture using a communication model. As far
as structured codes are concerned, there is parallelism Combining the requirements for the parallelization of
on statement level (multiply / add), in the data (loops the FLOWer code with the above guidelines leads to the
over all points of a block) and in the geometry (the dif- demand for a portable communications library
fercnt blocks) which can be expressed by parallelizing Such a high level library should perform all typical op-
languages, e. g. HPF or C++, parallelizing compilers erations necessary in parallel mode involving communi-
(directives / autotasking) or by message passing, i. e. by cation between the different processes. Its usage should
explicitly sending and receiving data to and from differ- therefore guarantee parallel portability, keep the sequen-
ent processes. Moreover, the parallel hardware design tial code almost unchanged and reduce the paralleliza-
varies with respect to the arrangement of CPUs, mem- tion effort drastically. Moreover, it should lead to a
ory and the interconnecting network (shared / distrib- highly reliable parallelization.
uted memory, hybrid constructions) [91. This library has been developed by the GMD as CLIC-
Therefore, one has to decide which type of parallelism 3D and will be described in the following section.
should be exploited using which communication model,
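The guidelines "grid partitioning" and "separation of computation and communication" can be illustrated with a small sketch. It is not taken from FLOWer or CLIC-3D: all function names are hypothetical, a 1D Jacobi relaxation stands in for the flow solver, one ghost cell per side is used instead of the two-cell overlap mentioned for FLOWer, and the "messages" are plain copies inside a single Python process rather than real message passing.

```python
# Sketch only: the blocks live in one process and "messages" are plain copies.

def split_into_blocks(values, n_blocks):
    """Cut a 1D field into equal blocks, each padded with one ghost cell per side."""
    size = len(values) // n_blocks
    return [[0.0] + values[b * size:(b + 1) * size] + [0.0] for b in range(n_blocks)]

def exchange_boundaries(blocks, left_bc, right_bc):
    """Communication step: fill ghost cells from neighbouring blocks or boundary values."""
    for b, block in enumerate(blocks):
        block[0] = blocks[b - 1][-2] if b > 0 else left_bc
        block[-1] = blocks[b + 1][1] if b < len(blocks) - 1 else right_bc

def relax_block(block):
    """Computation step: one Jacobi sweep on the interior cells of a single block."""
    new = block[:]
    for i in range(1, len(block) - 1):
        new[i] = 0.5 * (block[i - 1] + block[i + 1])
    return new

def iterate(values, n_blocks, steps, left_bc=1.0, right_bc=0.0):
    blocks = split_into_blocks(values, n_blocks)
    for _ in range(steps):
        exchange_boundaries(blocks, left_bc, right_bc)   # communication only
        blocks = [relax_block(b) for b in blocks]         # computation only
    return [v for block in blocks for v in block[1:-1]]

field = [0.0] * 12
single = iterate(field, 1, 50)   # one block, i.e. the "sequential" case
multi = iterate(field, 4, 50)    # same field split into four blocks
same = all(abs(a - b) < 1e-12 for a, b in zip(single, multi))
print(same)
```

Because `relax_block` never touches another block, the numerical kernel is the same in the one-block and the four-block run; only `exchange_boundaries` would have to be replaced by real send and receive calls on a parallel machine.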

4. THE CLIC-3D COMMUNICATIONS LIBRARY

4.1 Background

The communications library CLIC-3D (Communications Library for Industrial Codes in 3 Dimensions) is currently developed by the GMD within the German research project POPINDA. It is based on the former GMD Comlib library and supports general block structured PDE solvers, particularly involving multigrid algorithms. Its development was based on the observation that for this class of programs the communication patterns are generally quite similar, although the numerical algorithms might differ considerably.

The major aim of the CLIC development is to make programming for complex geometries as easy as for simple single block domains by providing high level routines for all communication and mapping tasks. The CLIC user interface, therefore, provides the application program with all necessary data on the problem to be solved.

Currently, the CLIC library supports cell vertex and cell centered discretizations.

The portability of the CLIC library is achieved using PARMACS as portable message passing interface [13]. This system was chosen because it is a commercial product and not public domain as PVM [14], and MPI [15] was not yet available at the time when the POPINDA project started. The corresponding software layers of the parallelized FLOWer code are illustrated in figure 2.

Fig. 2 Software layers of the parallel FLOWer code

4.2 General Code Structure

Since the CLIC library is based on the PARMACS message passing system, it is designed for a host-node (master-slave) programming model. The host process starts the distributed application on several nodes, performs the input and output and transfers data to and from the node processes. The host itself does not participate in the solution process which is exclusively carried out by the node processes. Consequently the user application program is separated into a host and a node program as shown in figure 3.

Fig. 3 Host-node structure of the parallel FLOWer code

The host program reads in the same input parameters as the sequential user program. Then, CLIC routines read in the description of the block structured grid, create the node processes and map the blocks onto the allocated node processors respecting load balance aspects as far as possible. Then, the input parameters are distributed to the node processes. Finally, another routine reads in the grid coordinates and sends them to the corresponding node processes only. After reading and distributing all data to the nodes, the host process waits for output generated by the node processes and writes it to the desired units.

Each node process executes an identical node program which may contain the complete sequential code. In case of the FLOWer code, the only differences in parallel mode are:
- The input data is not read in but received from the host process
- Global operations involving all blocks are passed to the CLIC library for execution
- The data exchange at block boundaries is carried out fully automatically by the CLIC library
- Write statements are replaced by parallel output routines of the CLIC library

A schematic flow chart of the parallel FLOWer code is given in figure 4.

Further activities of the CLIC library consist in the analysis of the given block structure, in order to allow a special treatment of grid singularities. For each segment edge and point the adjoining blocks and the number of adjoining cells is determined, leading to a topological classification. If the segment is part of the physical boundary, the boundary conditions of all adjoining blocks are determined additionally. Finally, geometrical singularities are detected, so that the user can inquire all data for a special treatment of irregular grid points.

Fig. 4 Schematic flow chart of the parallel FLOWer code based on the CLIC library

The same data is needed by the CLIC library for optimization of the data exchange at block boundaries. The aim is to send the minimum number of messages necessary for a correct update of the boundaries. This is important especially on coarse grids of multigrid algorithms where the communication may become significantly time consuming. The basic idea is the introduction of a global orientation for larger portions of the block structure, leading to a fast exchange procedure. Only in topologically more complicated situations additional messages must be sent.

Another specialty of the CLIC library is the possibility of parallel output, i. e. output files can be directly written by the node processes.

4.3 Examples of High Level CLIC Operations

Exchange of boundary data
As already mentioned, the grid partitioning strategy requires an exchange of boundary data at the interfaces of the blocks. Therefore, the information on the topology of the block structure is stored in terms of block surface segments in a file that is read in by the CLIC library. During the initialization phase, this information is analyzed with respect to the necessary send and receive operations within a data exchange procedure.

When the corresponding exchange routine is called on each process, as sketched in figure 5, all interface data of the blocks on a process is stored segmentwise in a respective buffer which is sent (asynchronously blocking) to the corresponding neighboring block on another process. Afterwards, messages of the other processes are received in the order they come in, and the buffers are unpacked. If necessary, the procedure is repeated for segment edges and corner points, so that finally all block interfaces are updated correctly.

Fig. 5 Schematic flow chart for a parallel data exchange.

Global operation
Global operations involving all blocks of a given block structure are necessary, e. g. for the computation of a global residual. They are carried out within another special CLIC routine using an embedded binary tree for the process topology. As illustrated by figure 6, each parent process receives data from its child processes, performs a local operation on this data and communicates it to its own parent process. Afterwards, the process waits for a message from its parent process containing the correct global value which is obtained at the end of the chain. After its reception, this value is further communicated to the corresponding child processes.

[Fig. 6: flow chart of a global operation; each process receives from its child processes, sends to its parent process, receives from its parent process and sends to its child processes.]
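The global operation on an embedded binary tree can be sketched as follows. This is a sequential simulation under stated assumptions, not the CLIC routine: ranks 0 to P-1 are assumed to form a binary tree with the children of rank r at 2r+1 and 2r+2, the function names are hypothetical, the residual values are made up, and a "message" is only an entry in a log list.

```python
def children(rank, n_procs):
    """Child ranks of `rank` in a binary tree embedded in ranks 0..n_procs-1."""
    return [c for c in (2 * rank + 1, 2 * rank + 2) if c < n_procs]

def reduce_up(rank, local, op, n_procs, log):
    """Upward pass: combine the local value with the partial results of all children."""
    partial = local[rank]
    for child in children(rank, n_procs):
        partial = op(partial, reduce_up(child, local, op, n_procs, log))
        log.append(("send", child, rank))      # child -> parent message
    return partial

def broadcast_down(rank, value, n_procs, result, log):
    """Downward pass: hand the global value from each parent to its children."""
    result[rank] = value
    for child in children(rank, n_procs):
        log.append(("send", rank, child))      # parent -> child message
        broadcast_down(child, value, n_procs, result, log)

def global_operation(local, op):
    """Return the global value as seen by every process, plus the message log."""
    n_procs = len(local)
    log, result = [], [None] * n_procs
    total = reduce_up(0, local, op, n_procs, log)
    broadcast_down(0, total, n_procs, result, log)
    return result, log

local_residuals = [0.3, 0.1, 0.7, 0.2, 0.5, 0.4, 0.6]   # made-up values, one per process
result, log = global_operation(local_residuals, max)
print(result[0], len(log))
```

In this simulation every process ends up with the same global value, and the log holds 2(P-1) messages in total, one up and one down along each tree edge.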

5. RESULTS

As a first result it should be noted that the parallel FLOWer code using the CLIC library meets all of the requirements stated above:
- The FLOWer code is fully portable, in sequential as well as in parallel mode.
- The effort spent for the development of the sequential FLOWer code and its predecessor CEVCATS was fully conserved.
- The effort needed for the parallelization was extremely low.
The results obtained with this code are given in the following.

5.1 Performance Measurements

Since parallelization is a means of increasing the compute power for CFD applications, performance measurements were carried out on several platforms. With this not only the portability of the FLOWer code is demonstrated, but an assessment of different architectures is possible.

As test case the flow around a non-swept wing consisting of NACA 0012 airfoils was computed at M = 0.6 and α = 0° (figure 7). Two different grids with 40000 and 320000 cells were used, respectively, that were subdivided into 1, 4 and 8 blocks in the small case and into 1, 4, 8, 16 and 32 blocks in the large case. Each block was of equal size and was mapped to one CPU on the parallel machines leading to an ideal load balance.

Fig. 7 NACA 0012 wing test case for performance measurements.

Figures 8 and 9 show the obtained computing times on various parallel and vector machines with respect to the time needed on a Cray C90 single processor. As can be seen, the single processor performance of the NEC SX-3 is hard to beat, even by parallel vector computers using up to 8 CPUs. On the other hand, the results show that parallel RISC processor architectures, as the IBM SP2 or the NEC Cenju-3, are able to compete with or even to outperform the Cray C90 single processor using a moderate number of 32 CPUs. The CM-5 and the Intel Paragon showed to have weaker single processors, so that they need many more CPUs in order to reach the performance of the other machines.

Fig. 8 Relative execution times on single processor vector computers (Cray J90, Cray Y-MP, Cray C90, NEC SX-3)

Fig. 9 Relative execution times on parallel computers (Intel Paragon XP/S, IBM SP2, NEC Cenju-3, Cray J916, Cray C916)

5.2 Speed-up for Aircraft Configurations

For evaluating the potential of the parallelization of the FLOWer code, speed-up measurements were carried out for a more realistic configuration. The inviscid flow around the generic DLR-F4 wing-body combination shown in figure 10 was computed on a grid consisting of approximately 410000 cells which was subdivided into 1, 4 and 8 equally sized blocks, respectively. For the conditions of Mach number M = 0.75 and incidence α = 0°, 35 W cycles involving 4 multigrid levels were performed on an IBM SP1 computer.

The results obtained for different communication systems available there are plotted as speed-up versus processor number in figure 11. As can be seen, PVM using an Ethernet connection restricts the processor number to be employed to only four, indicating that workstation clusters based on the Ethernet are not suitable for parallel computations with the FLOWer code. The result can be markedly improved when replacing the Ethernet by the IBM high performance switch, but still the fastest runs were obtained using the IBM MPL/p communications system.

Fig. 10 Iso-Mach contours and block structure of the DLR-F4 wing-body combination (M = 0.75, α = 0°)

Fig. 11 Speed-up versus processor number for the DLR-F4 wing-body combination on IBM SP1

What can be observed is that even with the most powerful communications systems on the IBM SP1, the acceleration obtained is considerably deviating from the linear speed-up. This effect is caused by an increase of operations due to the multiple computations of points at block interfaces.

Therefore, there is an upper limit for the maximum acceleration below the linear speed-up which can be obtained from single processor computations of the multi block cases. This value is called algorithmic ideal speed-up and is defined as the ratio of computing times for the one block case and the multi block case multiplied with the number of processors that could be employed, i. e. the number of blocks:

\[ S_{\mathrm{ideal}} = N_{\mathrm{blocks}} \, \frac{t_{\text{1 block}}}{t_{N\ \text{blocks}}} \]

The algorithmic ideal speed-up is also plotted in figure 11, and as can be seen, is reached to a degree of approximately 95% using the MPL/p system.

Of course, the decrease of the maximum obtainable speed-up reported above is not satisfactory, but on the other hand it is questionable whether its value is meaningful for complex CFD problems at all. First of all, for speed-up measurements one would need problems that are small enough to be computed on a single CPU, in order to get a reference value. Secondly, the decrease in the maximum speed-up is only felt because it was possible to compute a single block solution for the DLR-F4 wing-body combination. On the other hand the multi block cases were ideally load balanced, because all blocks were of equal size.

When dealing with more complex configurations, this will certainly not be the case. In such situations there will be several blocks for grid generation reasons which cannot be guaranteed to have all the same number of points. Therefore, more complicated test cases must be studied.

5.3 Parallel Computation of a Generic Aircraft

As a more realistic configuration, the DLR-ALVAST generic aircraft model carrying a high bypass engine [17] was computed at a Mach number of M = 0.75 and an incidence of α = 1.0°. The grid for this test case consists of about 575000 cells and is subdivided into 11 blocks the size of which is varying between 4096 and 87552 cells. This is a typical situation where neither load balancing nor single block computations are possible, both due to grid generation reasons. The configuration is shown in figure 12.

Fig. 12 ALVAST generic aircraft configuration. Iso-Mach lines at M = 0.75 and α = 0°.

In order to study the corresponding effects on the parallel performance, 50 W-cycles were performed mapping the 11 blocks to 1, 7, 8 and 10 processors of an IBM SP2 respectively. The single processor result was obtained on a slightly more powerful wide node, whereas the parallel runs were obtained on weaker thin nodes.

As can be seen from figure 13, on 8 processors a speed-up of 6.6 can be gained, but a further increase does not lead to an improvement any more. This behavior is exactly what must be expected looking at the block structure and the mapping strategy of the CLIC library.

The work load per processor is determined by the number of grid points to be solved, and the largest number of points on any processor determines the total execution time of the parallel run. When mapping the 11 blocks to less than 11 processors, there will always be more than one block per CPU. Therefore, the CLIC library applies a mapping strategy that tries to distribute the blocks according to their size, so that the work load on the nodes is as equal as possible.

Up to 8 processors one is able to continuously reduce the maximum grid size per CPU by simply mapping the largest block of the heaviest loaded node to an additional processor. But when employing 8 nodes, the maximum work load is determined by the absolutely largest block which of course cannot be reduced any further by mapping the block structure to more CPUs. Therefore, the minimum computing time or maximum speed-up, respectively, is obtained on 8 nodes and remains constant afterwards, as illustrated by figure 13.

Any further increase of the speed-up would require an additional blocking of the largest block which is planned to be automatically supported by the CLIC library in the future.

Fig. 13 Speed-up versus processor number for ALVAST generic aircraft configuration on IBM SP2.

5.4 Feasibility Study for Large Problems

Since parallelization is believed to be the appropriate method of tackling the future grand challenge problems in design aerodynamics, attempts must be made in order to demonstrate the feasibility of this approach. Therefore, the viscous flow field around the DLR-F4 wing-body combination (compare figure 10) was computed on a grid generated by the Deutsche Airbus company consisting of 6.6 million grid points subdivided into 128 blocks of equal size. 800 multigrid cycles were performed on a 129 processor IBM SP2 (1 host + 128 nodes) which took less than three hours of response time (13 seconds per cycle). The convergence of the computation is given in figure 14 in terms of the logarithmic density residual versus the number of multigrid cycles.

Fig. 14 Density residual versus number of multigrid cycles. DLR-F4 wing-body combination (6.6 million cells) on 129 processors of an IBM SP2.

A grid convergence study was carried out by repeating the computation on four grids each differing in the number of total points by a factor of 8. The result is given in figure 15 in terms of the total lift coefficient versus the scaled grid size. As one can see from an extrapolation of the development of the lift between the levels three and one, the large grid size of 6.6 million cells is necessary in order to get the lift within an accuracy of one percent. Since the quality of the solution was spoiled by regions of highly distorted cells, a repetition of the study is planned with an improved grid.

Nevertheless, what is proven is that such large scale problems can be treated with the parallel FLOWer code and that today's parallel hardware is already allowing such computations. Therefore, this study is a promising demonstration of the potential of parallel processing in CFD heading towards the solution of the aerodynamic grand challenges expected in the future.
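The load balance argument of section 5.3 can be sketched in a few lines. This is an illustration under assumptions, not the mapping routine of the CLIC library: a greedy "largest block first onto the least loaded processor" rule is assumed, communication and interface overhead are ignored, and the individual block sizes are invented. Only the block count (11), the smallest and largest block (4096 and 87552 cells) and the approximate total (575000 cells) come from the text. With those quoted numbers the limit set by the largest block is 575000 / 87552 ≈ 6.6, which is consistent with the speed-up reported on 8 processors.

```python
import heapq

def map_blocks(block_sizes, n_procs):
    """Greedy mapping: largest block first, always onto the least loaded processor."""
    loads = [(0, p) for p in range(n_procs)]
    heapq.heapify(loads)
    for size in sorted(block_sizes, reverse=True):
        load, p = heapq.heappop(loads)
        heapq.heappush(loads, (load + size, p))
    return sorted(load for load, _ in loads)

def load_limited_speedup(block_sizes, n_procs):
    """Total work divided by the heaviest processor load (communication ignored)."""
    return sum(block_sizes) / max(map_blocks(block_sizes, n_procs))

# Hypothetical block sizes; only count, minimum, maximum and total follow the text.
blocks = [87552, 80000, 75000, 70000, 65000, 60000, 50000, 40000, 30000, 13352, 4096]

for n in (1, 7, 8, 10):
    print(n, round(load_limited_speedup(blocks, n), 2))
```

For these invented sizes the heaviest load stops decreasing once the largest block sits alone on a processor, so adding processors beyond that point does not raise the estimate, in line with the plateau described for figure 13.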

Fig. 15 Total lift coefficient versus scaled number of grid points. DLR-F4 wing body combination computed on 129 processors of an IBM SP2.

6. CONCLUSIONS

This paper shows how the computational power of parallel architectures is exploited by the three-dimensional structured solver for complex flows FLOWer under the restricting demands for portability, conservation of the former development history and minimization of the parallelization effort.

The basic considerations to use the grid partitioning approach as parallelization strategy and to strictly separate communication and computation lead to the implementation of the message passing based portable CLIC-3D communications library supporting any high level operation occurring in typical partial differential equation solvers on structured meshes. With this library the parallelization meets all general requirements for the development of large codes in industrial use.

Performance measurements on a large variety of computers of different architecture demonstrate the comprehensive portability of the CLIC based FLOWer code and allow an assessment of today's hardware capability in CFD. Still the NEC SX-3 vector computer appeared to be the most powerful machine solving a standard benchmark problem, but with a moderate number of 32 RISC processors the IBM SP2 already outperforms a Cray C90 single processor, and a 32 processor NEC Cenju-3 is at least competitive.

Speed-up studies for a typical wing-body combination show that the communication system has a decisive influence on the achievable overall acceleration. It turns out that Ethernet based workstation clusters communicating via PVM are not suitable to replace true parallel computers as far as performance is concerned.

Additionally, the maximum speed-up to be obtained is algorithmically limited by the grid partitioning strategy, because points at block interfaces are multiply computed, increasing the total number of operations. But this drawback is only felt for simple problems, where a speed-up can still be measured and which are, therefore, far away from being a grand challenge.

Parallel computations of a generic aircraft consisting of a wing-body combination carrying a pylon with an engine demonstrate that the complexity of today's problems in configuration aerodynamics can be tackled on a parallel computer. Speed-up measurements with respect to a multiblock single processor computation give satisfactory results, but also reveal the necessity for an automatic load balancing tool that allows to map an initial block structure to a higher number of processors than given blocks.

Finally, a Navier-Stokes computation of the flow field around a wing-body combination on a grid consisting of 6.6 million points on a 129 processor IBM SP2 outlines the potential of parallel processing in CFD for the future. It proves that high numbers of processors can be successfully handled in numerical aerodynamics and that parallelization, indeed, is a promising means for solving the grand challenge problems.

7. ACKNOWLEDGEMENTS

The work reported on here has been funded by the German Ministry of Research within the parallelization project POPINDA involving the following organizations:
Daimler Benz Aerospace Airbus Bremen, Daimler Benz Aerospace DASA-LM Manching/Ottobrunn, Deutsche Forschungsanstalt für Luft- und Raumfahrt Braunschweig, Gesellschaft für Mathematik und Datenverarbeitung St. Augustin, IBM Wissenschaftliches Zentrum Heidelberg.
The authors want to thank all contributors of these institutions.

LITERATURE

1. Holst, T. L., Salas, M. D., Claus, R. W., "The NASA Computational Aerosciences Program - Toward Teraflops Computing", AIAA-92-0558, 1992

2. Agarwal, R. K., "Parallel Computers and Large Problems in Industry" in "Computational Methods in Applied Sciences", Elsevier Science Publishers, 1992

3. Kroll, N., Radespiel, R., Rossow, C.-C., "Accurate and Efficient Flow Solvers for 3D Applications on Structured Meshes", AGARD FDP/VKI Special Course on Parallel Computing in CFD, 1995

4. Rossow, C.-C., "Berechnung von Strömungsfeldern durch Lösung der Euler-Gleichungen mit einer erweiterten Finite-Volumen Diskretisierungsmethode", DLR-FB 89-38, 1989

5. Baldwin, B. S., Lomax, H., "Thin Layer Approximation and Algebraic Model for Separated Turbulent Flows", AIAA-78-257, 1978

6. Jameson, A., Schmidt, W., Turkel, E., "Numerical simulation of the Euler equations by finite volume methods using Runge-Kutta time stepping schemes", AIAA-81-1259, 1981

7. Kroll, N., Jain, R. K., "Solution of the Two-Dimensional Euler Equations - Experience with a Finite Volume Code", DFVLR-FB 87-41, 1987

8. Atkins, H., "A Multiple-Block Multigrid Method for the Solution of the Three-Dimensional Euler and Navier-Stokes Equations", DLR-FB 90-45, 1990

9. Eisfeld, B., Bleecke, H.-M., Kroll, N., Ritzdorf, H., "Parallelization of Block Structured Flow Solvers", AGARD-FDP/VKI Special Course on Parallel Computing in CFD, 1995

10. Rossow, C.-C., "Efficient Computation of Inviscid Flow Fields Around Complex Configurations Using a Multiblock Multigrid Method", Communications in Applied Numerical Methods 8, pp 737-747, 1992

11. Keyes, D. E., "Domain Decomposition: A Bridge Between Nature and Parallel Computers", ICASE Report No. 92-44, 1992

12. Lonsdale, G., Schüller, A., "Multigrid efficiency for complex flow simulations on distributed memory machines", Parallel Computing 19, pp 23-32, 1993

13. Hempel, R., Hoppe, H.-C., Supalov, A., "PARMACS 6.0 Library Interface Specification", GMD St. Augustin, 1992

14. Sunderam, V. S., "PVM, a framework for parallel distributed computing" in "Concurrency, Practice and Experience", Vol. 2(4), pp. 315-339, 1990

15. Message Passing Interface Forum, "MPI: A Message-Passing Interface Standard", University of Tennessee, 1994

16. Rossow, C.-C., Ronzheimer, A., "Investigation of Interference Phenomena of Modern Wing-Mounted High-Bypass-Ratio Engines by the Solution of the Euler-Equations", AGARD-FDP Symposium on "Aerodynamic Engine/Airframe Integration", 7-10 October 1991



A Parallel Spectral Multi-Domain Solver Suitable for DNS and LES


Numerical Simulation of Incompressible Flows
A. Pinelli and A. Vacca
School of Aeronautics, Polytechnic University of Madrid
Plaza Cardenal Cisneros 3
28040 Madrid, Spain

Abstract 2 NAVIER-STOKES EQUATIONS AND


TIME SPLITTING SCHEME

In the present paper we introduce and discuss an efficient par- When the incompressible Navier-Stokes equations
allel algorithm for the spectral multi-domain solution of the au 1 1
incompressible Navier-Stokes equations. Firts, the algorithm -+-(U.VU+V.(UU)) = -Vp+-AU (1)
at 2 Re
is given in its basic form for the 2-dimensional case and, later
on, a possible extension to 3-dimensional flows exhibiting a V.U = 0 (2)
homogeneous (periodic) direction is proposed. The algorithm
are solved by means of a projection method [2], with the diffu-
is validated both for its parallel performances, and its accu-
sive terms treated in an implicit fashion [3],the time stepping
racy.
procedure consists in a cascade of scalar elliptic kernels, to be
solved at each time step. Namely two (for the two-dimensional
equations) Helmohltz problems for the inversion of the diffu-
sive part, and a Poisson problem for the pressure need to be
1 INTRODUCTION solved at each time step. It is then clear that, in order to
achieve a globally efficient algorithm, it is of fundamental im-
portance to tackle effectively the mentioned scalar problems.
In the last years domain decomposition methods have gained For the sake of completeness in the following the adopted frac-
much attention in the CFD comunity. One of the most rel- tional step scheme (i.e. Van Kan's pressure correction method
evant features of such methods is concerned with the possi- [4])is given
bility of tuning the accuracy of the numerical discretization
according to the expected behaviour of the solution in each 0 - U" A ( O + U " ) = - V p " - - L 3( C J " ) + - L1( U " - L )
subdomain. Consequently, subregions of the flow field containing sharp boundary layers can be enclosed within subdomains with high resolution, while low resolution can be assigned to subregions where smooth solutions are expected.

These advantages can be fully exploited when the equations are discretized with spectral methods, which guarantee a fast decay of the error with the number of nodes, termed "spectral accuracy".

On the other hand, domain decomposition methods might provide a natural stabilization strategy for the spectral discretization, which is "central" in nature. In fact, the cell Peclet number can be diminished locally by reducing the mesh spacing within the critical subdomain, without introducing any particular stabilization procedure.

From the computational point of view, domain decomposition techniques are well suited for parallel computing, even if in practical cases several difficulties arise whenever good performance has to be reached [1].

In the first part of the present paper, a parallel algorithm for the solution of the two-dimensional incompressible Navier-Stokes equations is presented. After a brief introduction of the time splitting scheme used for the time discretization of the unsteady incompressible Navier-Stokes equations, the attention will be mainly focused on the spectral multidomain approach and on its parallel features. Performance results concerning the parallel implementation on two different MIMD parallel architectures will be presented. The second part of the paper is concerned with the application of the algorithm to three-dimensional unsteady problems.

[equation (3), the velocity prediction step, is not recoverable from the source]

$$\hat{u}|_{\partial\Omega} = U((n+1)\Delta t) \qquad (4)$$

$$\frac{u^{n+1} - \hat{u}}{\Delta t} + \frac{1}{2}\nabla(p^{n+1} - p^{n}) = 0 \qquad (5)$$

$$\nabla \cdot u^{n+1} = 0 \qquad (6)$$

where $L(u)$ represents the advective term $\frac{1}{2}(u \cdot \nabla u + \nabla \cdot (uu))$.

In the first step, a non-physical intermediate velocity field $\hat{u}$ is computed. In fact, $\hat{u}$ does not satisfy the incompressibility condition. Then, in the second step, $\hat{u}$ is projected onto the divergence-free space to get an adequate approximation of the velocity $u^{n+1}$.

The scheme with the given boundary conditions is nothing else than a second order Crank-Nicolson Adams-Bashforth scheme with an $O(\Delta t^2)$ deviation in the tangent direction of the boundary. By applying the divergence operator to (6), it turns out that the latter is equivalent to

[equation (7) is missing in the source]

In the next section the attention will be focused on the way each scalar elliptic problem has been tackled in the framework of a spectral multidomain discretization.
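As a worked step, taking the divergence of the projection step (5) and using the constraint (6) gives a scalar Poisson problem for the pressure increment. The lines below are a reconstruction offered as a sketch, not the authors' printed equation; the symbol $\delta p$ is introduced here only for illustration.

```latex
% Divergence of (5), with \nabla \cdot u^{n+1} = 0 from (6):
\frac{\nabla \cdot u^{n+1} - \nabla \cdot \hat{u}}{\Delta t}
  + \frac{1}{2}\nabla^2 (p^{n+1} - p^{n}) = 0
\quad \Longrightarrow \quad
\nabla^2 \delta p = \frac{2}{\Delta t}\, \nabla \cdot \hat{u},
\qquad \delta p \equiv p^{n+1} - p^{n}.
```

Under this reading, each time step requires one scalar elliptic solve per velocity component for the prediction and one for the pressure increment, which is why the elliptic solver is the focus of the next section.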

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

3 SPACE DISCRETIZATION

In the present work, a Legendre spectral collocation technique coupled with a domain decomposition method has been used for the space discretization of the differential equations. Additional references can be found in ([6], [5]) for the projection decomposition method, and in ([7]) for the spectral approximation method.

3.1 Elliptic terms

The following problem, representative of one of the elliptic scalar problems mentioned in the previous section, is considered hereafter:

$$-\nabla^2 u + \alpha u = f \quad \text{in } \Omega \qquad (10)$$

$$u = 0 \quad \text{on } \partial\Omega \qquad (11)$$

where $\alpha$ is a real constant $\geq 0$, and where $\Omega$ is an open connected set $\Omega \subset R^2$; in particular, $\Omega = \bigcup_{i=1}^{N} \Omega_i$, where $\Omega_i$ is a closed rectangle having either a common side or a common vertex with each neighbour; $\alpha \geq 0$ is either identically equal to zero (i.e., for the Poisson problem related with the pressure) or is equal to $2Re/\Delta t$ (i.e., for one of the momentum equations), and the equivalent weak formulation of (10), (11) is:

find $u \in H_0^1(\Omega)$ such that

$$l(u, v) = (f, v)_{L^2(\Omega)} \quad \forall v \in H_0^1(\Omega) \qquad (12)$$

where $H_0^1(\Omega)$ is the real Hilbert space defined as follows:

$$H_0^1(\Omega) = \{u \in L^2(\Omega) : \partial u/\partial x_i \in L^2(\Omega),\ i = 1, 2,\ u = 0 \text{ on } \partial\Omega\} \qquad (13)$$

equipped with the scalar product:

$$l(u, v) = \int_\Omega (\nabla u \cdot \nabla v + \alpha u v)\, d\Omega \quad \forall u, v \in H_0^1(\Omega) \qquad (15)$$

Following the classical domain decomposition technique, problem (12) is decoupled into a set of problems within each subdomain plus an additional problem at the interfaces $\Gamma$:

$$\Gamma = (\Omega \setminus \Omega_0) \setminus \partial\Omega \quad \text{with} \quad \Omega_0 = \bigcup_{i=1}^{N} \Omega_i \qquad (16)$$

Let $H_{00}^{1/2}(\Gamma)$ be the completion of the normed vector space $S$ defined as:

$$S = \{z \in C^0(\Gamma) : \exists \phi \in C_0^\infty(\Omega) \text{ such that } z = \phi_\Gamma\} \qquad (17)$$

[equation (18), the norm on $S$, is not recoverable from the source]

where $\phi_\Gamma$ is the restriction of $\phi$ on $\Gamma$.

The linearity and continuity of the operator $\phi \mapsto \phi_\Gamma$ into $H_{00}^{1/2}(\Gamma)$ and the fact that $C_0^\infty(\Omega)$ is dense in $H_0^1(\Omega)$ lead to the existence and uniqueness of a linear and continuous operator $\gamma$ (trace operator) from $H_0^1(\Omega)$ onto $H_{00}^{1/2}(\Gamma)$ defined as

$$\gamma\phi = \phi_\Gamma \quad \forall \phi \in H_0^1(\Omega) \qquad (19)$$

The $\gamma$ operator allows the identification of two closed, mutually orthogonal subspaces:

$$K = \ker(\gamma) = \{u_0 \in H_0^1(\Omega) : \gamma u_0 = 0\} \qquad (20)$$

where $\ker(\gamma)$ is the kernel of operator $\gamma$, and its orthogonal complement $K^\perp$ is defined as:

$$K^\perp = \{\tilde{u} \in H_0^1(\Omega) : l(\tilde{u}, u_0) = 0 \quad \forall u_0 \in K\} \qquad (21)$$

Therefore, the solution $u \in H_0^1(\Omega)$ of problem (12) can be uniquely decomposed as

$$u = u_0 + \tilde{u}, \quad u_0 \in K \text{ and } \tilde{u} \in K^\perp \qquad (22)$$

Since the restriction $\gamma_0$ of the operator $\gamma$ to $K^\perp$ is an isometric isomorphism between $K^\perp$ and $H_{00}^{1/2}(\Gamma)$, it follows that

$$\forall \tilde{u} \in K^\perp \ \exists!\, \psi \in H_{00}^{1/2}(\Gamma) : \tilde{u} = \gamma_0^{-1}\psi \qquad (23)$$

Identity (22) can be reformulated as:

$$u = u_0 + \gamma_0^{-1}\psi \quad \text{with } u_0 \in K \text{ and } \psi \in H_{00}^{1/2}(\Gamma) \qquad (24)$$

Thus, problem (12) can be easily proven to be equivalent to the set of the two following ones:

Problem (P1): find $u_0 \in K$ such that:

$$l(u_0, v_0) = (f, v_0)_{L^2(\Omega)} \quad \forall v_0 \in K \qquad (25)$$

Problem (P2): find $\psi \in H_{00}^{1/2}(\Gamma)$ such that:

$$l(\gamma_0^{-1}\psi, \gamma_0^{-1}z) = (f, \gamma_0^{-1}z)_{L^2(\Omega)} \quad \forall z \in H_{00}^{1/2}(\Gamma) \qquad (26)$$

Problem P1 is nothing else than the solution of $N$ decoupled elliptic problems with homogeneous Dirichlet boundary conditions on both $\partial\Omega$ and $\Gamma$. To build its discrete counterpart, a standard Legendre collocation method has been used ([7]). To this end, the unknowns are decomposed into a series of Legendre polynomials:

$$u(x, y) = \sum_{k=1}^{N_x} \sum_{l=1}^{N_y} \hat{u}_{k,l}\, L_k(x) L_l(y) \qquad (27)$$

where $L_k$ is the $k$-th Legendre polynomial. Likewise, the function $v$ is decomposed into a series of Lagrange polynomials constructed on the Gauss-Lobatto nodes:

$$v(x, y) = \sum_{k=1}^{N_x} \sum_{l=1}^{N_y} v_{k,l}\, La_k(x) La_l(y) \qquad (28)$$

where $La_k$ is the $k$-th Lagrange polynomial, for which $La_k(x_j) = \delta_{kj}$. By taking into account the expressions of $u$ and $v$ and by replacing the scalar product $l(\cdot, \cdot)$ by its discrete counterpart, the differential problem reads:

find $u_{k,l}$, $1 \leq k \leq N_x$, $1 \leq l \leq N_y$ such that

$$\sum_{k} \sum_{l} \left[-(\nabla^2 u)_{k,l} + \alpha u_{k,l} - f_{k,l}\right] La_i(x_k) La_j(y_l)\, w_k w_l = 0 \quad \forall i, j \qquad (29)$$

where $w_k$ are the Gauss-Lobatto weights for the quadrature. Using the definition of the Lagrange polynomials ($La_i(x_j) = \delta_{ij}$), the discretized equations become:

$$-(\nabla^2 u)_{i,j} + \alpha u_{i,j} = f_{i,j} \qquad (30)$$

An efficient procedure to solve the given algebraic problem will be given in the next section.

As concerns problem P2, if $\{\xi_i\}$, $i = 1, \dots, M$ is a set of linearly independent functions which constitute a basis for $H_{00}^{1/2}(\Gamma)$, then the discrete version of problem P2 reads as:

$$\sum_{i=1}^{M} a_i\, l(\gamma_0^{-1}\xi_i, \gamma_0^{-1}\xi_j) = (f, \gamma_0^{-1}\xi_j)_{L^2(\Omega)} \quad \forall j = 1, \dots, M \qquad (31)$$
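The collapse of the collocation sums in Problem P1 rests on two ingredients: the cardinal property of the Lagrange polynomials and the Gauss-Lobatto quadrature weights. The following Python sketch illustrates both in one dimension. It assumes a 5-point Legendre Gauss-Lobatto rule on [-1, 1] with hard-coded nodes and weights; the function names (`lagrange`, `quadrature`, `collocation_sum`) are hypothetical and not taken from the authors' Fortran code.

```python
import math

# Legendre Gauss-Lobatto nodes and weights for polynomial degree N = 4
# (5 points on [-1, 1]); hard-coded values, assumed for illustration.
_A = math.sqrt(3.0 / 7.0)
NODES = [-1.0, -_A, 0.0, _A, 1.0]
WEIGHTS = [1.0 / 10.0, 49.0 / 90.0, 32.0 / 45.0, 49.0 / 90.0, 1.0 / 10.0]


def lagrange(k, x, nodes=NODES):
    """Value at x of the k-th Lagrange polynomial built on `nodes`."""
    value = 1.0
    for j, xj in enumerate(nodes):
        if j != k:
            value *= (x - xj) / (nodes[k] - xj)
    return value


def quadrature(f, nodes=NODES, weights=WEIGHTS):
    """Gauss-Lobatto approximation of the integral of f over [-1, 1]."""
    return sum(w * f(x) for x, w in zip(nodes, weights))


def collocation_sum(residual, i):
    """Discrete inner product of nodal residual values with La_i.

    Because La_i(x_k) is 1 for k == i and 0 otherwise, the sum reduces
    to residual[i] * WEIGHTS[i]: one equation per collocation node.
    """
    return sum(
        residual[k] * lagrange(i, NODES[k]) * WEIGHTS[k]
        for k in range(len(NODES))
    )
```

With N = 4 the rule integrates polynomials up to degree 2N - 1 = 7 exactly, so the discrete scalar product agrees with the continuous one for low-order products of basis functions.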

Typically $M$ corresponds exactly to the number of points on the interface. To set up an algebraic equivalent of (31), the operator $\gamma_0^{-1}$ should be explicitly formulated. In practice, the operator $\gamma_0^{-1}$ is never required if an iterative procedure is introduced to solve problem P2. To illustrate this point, it must be remarked that $\tilde{u}^k \in K^\perp$ must satisfy the orthogonality condition:

$$l(\tilde{u}^k, v_0) = 0 \quad \forall v_0 \in K \qquad (32)$$

which corresponds to the solution of $N$ elliptic problems (25) with Dirichlet boundary conditions: homogeneous on $\partial\Omega$ and to be iteratively determined on $\Gamma$.

To provide at each iteration $k$ the condition on $\Gamma$ for problem (32), Green's formula is applied to (31):

[equation (33) is not recoverable from the source]

where $\tilde{u}^k = \gamma_0^{-1} \sum_{i=1}^{M} a_i^k \xi_i$ is the solution at iteration $k$ of problem (32), and where the bracketed term represents the jump of the normal derivatives on $\Gamma$. $R^k$ is the residual at iteration $k$, from which the updating of the boundary value $\tilde{u}^{k+1}|_\Gamma$ can be obtained within the chosen iterative procedure.

The convergence rate of the iterative procedure strongly depends on the choice of the basis $\{\xi_i\}$ [8]. For the present work the basis functions proposed by Ovtchinnikov [8] have been used. These constitute a nearly optimal basis, in the sense that the condition number of system (31) is bounded by a constant independent of $M$, where $M$ is the dimension of the subspace of $H_{00}^{1/2}(\Gamma)$ generated by span$\{\xi_i\}$, $i = 1, \dots, M$.

In view of the character of the algebraic problem (symmetric positive definite), the conjugate gradient method has been employed to solve problem (31).

3.2 Solution procedures for multiple problems

When multiple solutions of an elliptic problem of the form (10, 11) are required (i.e., within a fractional step time advancement), it turns out to be much more efficient to invert, once and for all (in a pre-processing stage), the abstract operator $S$ handling the interface unknowns.

To introduce the method let us reconsider problem (10, 11). With reference to the previous section, we reconsider the same differential problems: problem P1 (25) and the differential problem leading to the solution on the interface (P2), here given in the following abstract form:

$$S a_k = b_k \qquad (34)$$

where the $a_k$'s refer to the Galerkin coefficients of the solution on the interface, and the $b_k$'s are the Galerkin coefficients of the jump of the normal derivatives produced by the solution of the $N$ problems P1.

Let us now consider the $M$ problems (see 31):

$$S a^{i} = \delta^{i}, \quad (\delta^{i})_j = \delta_{ij} \qquad (35)$$

meaning problems with a jump of the normal derivatives leading to a unitary Galerkin coefficient $i$ and zero values for all the other coefficients $j$ ($j \neq i$). Successive inversions, through the iterative procedure outlined in the previous section, allow the operator $S^{-1}$ to be constructed by columns. The latter might then be considered as a capacitance Galerkin matrix that, applied to the Galerkin coefficients of the computed normal derivative jumps (problem P1), releases the coefficients of the solution on the interface that guarantee zero weak normal derivative jumps between subdomains. It is also remarked that matrix $S^{-1}$ is symmetric because it is obtained from the discretization of a self-adjoint problem (33). Of course this is a nice property leading to an evident storage reduction.

A final remark concerns the importance of achieving an efficient technique to invert the decoupled Dirichlet problems (P1). To this end, we make use of a modified matrix diagonalization approach [9]. The Legendre collocation approximation to one of the mentioned subproblems might be re-written as:

$$U D + D^T U - \alpha U = F \qquad (36)$$

where $D$ is the collocated Lagrange second derivative matrix acting on the subdomain internal nodes, $U$ is the unknown matrix ordered by rows, and $F$ is a modified right hand side matrix taking into account the effects of the boundary values. First, we determine the eigenvalues of $D$, its left and right eigenvector systems (ordered by columns) and the respective inverses:

$$E_r^{-1} D E_r = \Lambda \qquad (37)$$

$$E_l^{-1} D^T E_l = \Lambda \qquad (38)$$

Matrices $E_r$, $E_l$, $E_r^{-1}$, $E_l^{-1}$ and the diagonal eigenvalue matrix $\Lambda$ are computed and stored in a pre-processing stage. Indicating with $\hat{U} = E_l^{-1} U E_r$ and with $\hat{F} = E_l^{-1} F E_r$, we invert the diagonalised problem:

$$\hat{U}\Lambda + \Lambda\hat{U} - \alpha\hat{U} = \hat{F} \qquad (39)$$

and recover the final solution as:

$$U = E_l \hat{U} E_r^{-1} \qquad (40)$$

Having solved the eigenvalue problem in a pre-processing stage, the recursive solution cost turns out to be of order $n^3$ operations, $n$ being the number of nodes used to discretize each direction within a single subdomain.

3.3 Projection step treatment

If each single differential problem is tackled with the algorithm described in the previous section, at the end of each time step the solution is equivalent to the one hypothetically achieved by solving the whole domain at once.

The last statement requires some comments. When finite dimensional approximations of the space where the solution is sought are considered, numerical problems might arise within the fractional step algorithm in the regions neighbouring the interfaces. In particular, when the projection step (5) is considered, a straightforward use of the results obtained with the present multidomain method leads to a discontinuous value of the divergence-free velocity field along the interfaces.

From the numerical point of view, these discontinuities, even if limited to a set of measure zero ($\Gamma$), might introduce an artificial "numerical boundary layer" that the whole time integration procedure cannot damp out and that might lead to catastrophic instabilities. To avoid such a drawback two solutions are possible. The first one relies upon increasing the dimension of the approximation subspaces to reduce the jumps at the interface. The second one consists in replacing the gradient of the pressure in equation (5) with an equivalent function in $L^2(\Omega)$, which differs from the original one only along sets of measure zero. In particular, the gradient $Q = \nabla(\phi^{n+1} - \phi^{n})$ of the solution, achieved by solving (7) with the previously outlined multi-domain spectral method, is substituted with a vector function defined, for every component $i, j = 1, 2$, as:

[equation (41) is not recoverable from the source]

where $\Gamma_{rs} = \Omega_r \cap \Omega_s$, $w^r$ ($w^s$) is the Gauss-Legendre quadrature weight along $\Gamma_{rs}$ (either $j$ or $i$) corresponding to the node $(x, y)$ in the subdomain $\Omega_r$ ($\Omega_s$), and $Q^r$ ($Q^s$) is the restriction of $Q$ in $\Omega_r$ ($\Omega_s$) evaluated in $(x, y)$.
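The pre-processing idea of Section 3.2, building the inverse of the interface operator column by column with the conjugate gradient method and then reusing it for every right-hand side, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: a small explicit symmetric positive definite matrix `S` stands in for the interface operator (in the method described above its action is obtained through subdomain solves rather than from a stored matrix), and the function names are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def conjugate_gradient(A, b, tol=1e-12, max_iter=200):
    """Solve A x = b for a symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # first search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


def capacitance_inverse(S):
    """Build S^{-1} by columns: one solve per unit right-hand side."""
    m = len(S)
    columns = []
    for i in range(m):
        unit = [1.0 if j == i else 0.0 for j in range(m)]
        columns.append(conjugate_gradient(S, unit))
    # columns[j] is the j-th column of S^{-1}; return it stored by rows.
    return [[columns[j][i] for j in range(m)] for i in range(m)]


# Stand-in interface operator (assumed values) and one right-hand side.
S = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
S_inv = capacitance_inverse(S)   # pre-processing stage, done once
b = [1.0, 2.0, 3.0]
a = matvec(S_inv, b)             # per time step: a single matrix-vector product
```

The design trade-off is the one stated in the text: M iterative solves are paid once, after which each interface problem costs a single matrix-vector product, and the symmetry of the stored inverse would allow only half of it to be kept.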

3.4 Accuracy tests

To test the accuracy of the proposed spectral multi-domain algorithm we have considered the classical Taylor-Green analytical test case for the 2-dimensional incompressible Navier-Stokes equations:

$$u(x, y, t) = -\cos(\pi x)\sin(\pi y)\, e^{-2\pi^2 t/Re} \qquad (42)$$

$$v(x, y, t) = \sin(\pi x)\cos(\pi y)\, e^{-2\pi^2 t/Re} \qquad (43)$$

$$p(x, y, t) = -\frac{1}{4}\left(\cos(2\pi x) + \cos(2\pi y)\right) e^{-4\pi^2 t/Re} \qquad (44)$$

on the domain $\Omega = [0, 2] \times [0, 2]$. The following set of boundary conditions has been applied:

- on the edges $x = 0$ and $x = 2$, homogeneous Dirichlet conditions for $v$ and homogeneous Neumann conditions for $u$;
- on the edges $y = 0$ and $y = 2$, homogeneous Dirichlet conditions for $u$ and homogeneous Neumann conditions for $v$.

The tests have concerned both time and space accuracy. The latter has been measured by imposing an extremely small value for the time step. Different configurations have been considered and the error has always been measured in the $L^2(\Omega)$ norm. The following table, showing the results of different tests with different domain partitioning configurations, summarizes the accuracy measurements both for one of the velocity components and for the pressure.

[table not recoverable from the source]

From the given results, the accuracy of the solution is quite evident. It is remarked that the convergence for the pressure is slower than for the velocity, but nevertheless still spectral.

In order to measure the time accuracy of the present scheme, we considered the same test case with a prescribed discretization in space (4 subdomains, 14 x 14 nodes each) sufficient to deliver optimal spatial accuracy. In the following table we present the relative $L^2(\Omega)$ norm of the velocity error achieved after 1 time unit.

time step size | relative L2 error, x-component of velocity
0.1            | 0.4 x 10^(-?)
0.01           | 0.5 x 10^(-?)
0.001          | 0.3 x 10^(-?)
0.0001         | 0.5 x 10^(-?)

(the exponents are illegible in the source)

4 PARALLEL IMPLEMENTATION

As concerns the parallel implementation of the given algorithm, we have used a slightly modified version of the master-slave computational model. In particular, the major difference with respect to the classical model is that our master actively cooperates with the slaves during the calculation phase, while in the standard version the master is only required to distribute the initial data and to gather the results. In our implementation, the activities are shared between master and slaves as follows. At the beginning of the computation, the master process calculates the guess values for the Dirichlet problems. These values are then transmitted to the slave processes: each of the slaves solves the Dirichlet problems for the assigned domains; it should be noted that, in this case, the domain decomposition (which allows the slaves to operate in parallel) derives directly from the multi-domain approach. After this first phase, the slaves transmit the calculated values at the domain interfaces to the master, which calculates the new values by applying a Conjugate Gradient algorithm, and communicates these values to the slaves for the next iteration.

The main causes of inefficiency in using parallel architectures are uneven load balancing and communication overheads. In general, the multidomain technique can generate load balancing problems because the size and/or computation of blocks can differ widely; however, in our case each domain has the same number of points. Thus, if the number of domains is a multiple of the number of processors, we obtain an optimal load balancing. The communication overhead is mainly related to the Conjugate Gradient algorithm: at each time iteration, data need to be exchanged between the processors containing adjacent domain interfaces and the master processor. Because of the sequential flow of these activities, it is not possible to overlap computation and communication, so the time spent for these communications can represent a non-negligible part of the overall computing time.

The parallel version of the code has been developed for message passing environments. In particular, the code has been written in Fortran 77 plus PVM 3.3 communication primitives. In order to meet the goal of overlapping computation and communication, non-blocking communication primitives have been used. Note that the parallelism is exploited only among slaves: the master and the slaves cannot operate in parallel. Anyway, as the greater part of the computation is delegated to the slaves, the performance obtained on various homogeneous parallel systems is quite good.

4.1 Performance evaluations

For the tests, we have used two different parallel machines. The first is a CONVEX C210-MPPO with a vector processor and 4 scalar HP 730 processors connected via FDDI. The second machine is a MEIKO CS2 with 18 super-Sparc processors connected through a switching network. Both machines are distributed memory MIMD parallel computers. The tests have been performed by using a number of domains that is a multiple of the number of processors used, so that load balancing is guaranteed. Hence, the causes of the loss of efficiency are the time $t_c$ spent for the communication and the idle time $t_i$ of the slaves waiting for the master results. Note that, while the time $t_i$ is independent of the number $N$ of processors, the time $t_c$ increases with $N$: so, for a given problem, a linear decrease of the efficiency is expected.

In figs. (1-4) the results obtained on the Meiko machine are shown. Note that the values of efficiency are quite good, especially when the number of points for each domain increases. Moreover, when the number of processors grows the efficiency decreases linearly, as expected.

Figures (5, 6) show a comparison of the results obtained for both the Meiko CS2 and the Convex MPPO machines. It should be noted that the Convex machine performs better than the Meiko when two processors are used; on the contrary, as the number of processors increases the performance of the Meiko is better. This behaviour is essentially related to the different characteristics of the interconnection networks; the FDDI network of the Convex allows very fast communication between two processors at a time (the optical fiber is a common shared resource). On the other hand, the CS2 switching network allows different communications to be executed simultaneously, so reducing the overall communication time (as a matter of fact, also the presence of properly designed communication
processors which handle the communication on behalf of the processor has to be taken into account).

[Fig. 1: 12 domains with 15 x 15 nodes; speed-up versus number of processors (Nproc)]
[Fig. 2: 12 domains with 15 x 15 nodes; efficiency versus Nproc]
[Fig. 3: 16 domains; speed-up versus Nproc]
[Fig. 4: 16 domains; efficiency versus Nproc; legend: 9 x 9 points, 15 x 15 points]

To further reduce the computing time, we have also used heterogeneous systems. In fact, whenever the execution of the different tasks constituting the same program is strictly sequential, heterogeneous processing can help in enhancing performance by placing each task on the most suitable machine for that task. To this end, tests have been performed by placing the master process on a vector computer for a more efficient calculation, and the slave processes on a parallel homogeneous system with scalar processors.

However, in our case the time spent by the master is a negligible part of the total computing time; so, the tests performed by using a heterogeneous environment have shown no appreciable improvements.

5 3-DIMENSIONAL EXTENSION

In this section, we present an extension of the method which allows for the simulation of three-dimensional flows with one periodic direction. For this class of flows it is possible to take advantage of the classical Fourier decomposition of the flow variables in the periodic direction. This choice reduces all the three-dimensional scalar differential problems in the physical space (momentum equations and pressure correction equation) to a sequence of two-dimensional scalar differential problems in terms of the transformed variables. Once the two-dimensional problems are set up, it is possible to take advantage of the given multi-domain solution method to solve them efficiently.

In particular, let

$$u_i^n(x, y, z) = \sum_{k=-N/2}^{N/2-1} \hat{u}_{i,k}^n(x, y)\, e^{Ikz}, \quad i = 1, 2, 3 \qquad (45)$$

together with the analogous expansions (46), (47) and (48) for the other variables of the scheme [not recoverable from the source], with $I = \sqrt{-1}$. Applying the same methodology as for the 2-dimensional case, the 3-D algorithm can be reformulated as:

For every $n = 0, 1, \dots$ ($n$ being the time counter)

1 For $i = 1, 2, 3$, solve the momentum equations for the modes of the predicted velocity field, for $k = -N/2, \dots, N/2 - 1$:

[equation (49) is missing in the source]
[Fig. 5: 3 domains with 11 x 11 nodes; speed-up versus Nproc; legend: Convex, Meiko]
[Fig. 6: 3 domains with 11 x 11 nodes; efficiency versus Nproc; legend: Convex, Meiko]

2 Solve for the pressure correction, for $k = -N/2, \dots, N/2 - 1$:

[equation (50) is not recoverable from the source]

3 For $i = 1, 2, 3$, update the velocity field, for $k = -N/2, \dots, N/2 - 1$:

[equation missing in the source]

The subscript $\perp$ has been introduced to stress the fact that the collocated derivatives are computed in the two non-periodic directions only. The term $rhs_{i,k}$ represents the $k$-th mode of the transform of the right-hand side of the $i$-th momentum equation. The treatment of the boundary conditions is straightforward and does not introduce any supplementary difficulty. Despite its apparent complexity, this algorithm presents the advantage that all the computations of the elliptic terms take place in the transformed space (for the periodic direction), leading to the full exploitation of the 2-dimensional algorithm.

5.1 A 3-D test case

To validate the proposed 3-D algorithm we have considered a direct numerical simulation (DNS) of a low Reynolds number fully turbulent channel flow. This flow ([10]) might be considered periodic in both the streamwise and spanwise directions if the dimensions of the computational box are made large enough. In the present case we took as Fourier direction the streamwise one, while, to impose periodicity spanwise, we imposed the edges of the subdomains to be neighbours one with the other. All the lengths have been made non-dimensional with the channel half-height, and the velocity has been non-dimensionalized with the center-line velocity. With this selection the Reynolds number $Re = U h/\nu$ (with $U$ the center-line velocity and $h$ the channel half-height) has been fixed to the value of 6000, and the computational box had dimensions 2, 2, 0.8 in the streamwise, wall-normal and spanwise directions respectively. The grid configuration in a section normal to the mean flow is displayed in figure (7).

[Fig. 7: Grid configuration in the normal plane]

Five subdomains, the first and the last selected to embed the wall sublayer, are used. Each subdomain contains 20 x 20 nodes, while in the Fourier direction 24 modes are employed. The present case has been run on an IBM RS6000 360H workstation with about 100 Mflops peak performance. The CPU time required for each full time iteration is about 4.5 seconds when the Galerkin capacitance matrix is computed and stored in a pre-processing stage.

[Fig. 8: Mean streamwise velocity near the wall]

After having reached a statistically steady state we measured some typical turbulent values to assess the quality of the obtained results. In figure (8) we compare the obtained velocity [...]

[Figure: comparison plot, caption illegible; legend entries: present, Wei and Willmarth, Jimenez and Moin]

[Fig. 10: Instantaneous normal plane velocity field]

6 CONCLUSION

The present work has been concerned with the solution of the unsteady incompressible Navier-Stokes equations, using a high order collocated spectral multi-domain method. The rationale behind the choice and development of the method is given both by the possibility of coupling the potential high accuracy of spectral methods with the flexible framework offered by multi-domain methods, and by the natural way in which a parallel implementation of the present algorithm can be achieved.

In particular, we have shown how the developed algorithm allows for the solution of completely independent and balanced subproblems, leading to full exploitation of MIMD parallel computers.

Moreover, the data from the channel DNS simulation seem to confirm the viability of the present algorithm to deal with complex turbulent flow configurations. At the same time it should be stressed that the capability of selecting the accuracy in determined flow regions might prove to be a powerful tool for resolved Large Eddy Simulations in complex configurations (i.e., when approximate wall conditions are not available).

7 ACKNOWLEDGEMENTS

The present authors are grateful for the support and the computer time provided by IRSIP (Istituto Ricerche e Sistemi Informatici Paralleli CNR, Napoli It.). We are also indebted to Dr. Di Pietro for his help and assistance when setting up the parallel version of the code. The first author likes to mention the constructive discussions and support of Prof. J. Jimenez.

REFERENCES

[1] G. De Pietro, A. Pinelli and A. Vacca, A Parallel Implementation of Spectral Multi-Domain Solver for Incompressible Navier-Stokes Equations. In Proceedings of Parallel CFD Conference '95, CalTech, Pasadena, CA (1995). Elsevier, Amsterdam.
[2] A. Chorin and J. Marsden, A Mathematical Introduction to Fluid Mechanics, Springer-Verlag, New York, 1979.
[3] A. Pinelli and A. Vacca, Chebyshev Collocation Method and Multidomain Decomposition for the Incompressible Navier-Stokes Equations, Int. J. Num. Meth. in Fluids, 18, (1994), 781.
[4] J. Van Kan, A Second Order Accurate Pressure-Correction Scheme for Viscous Incompressible Flow. J. Sci. Stat. Comp., 7, (1986), 870.
[5] V. Agoshkov and E. Ovtchinnikov, Projection Decomposition Method, CRS4 Tech. Rep., Cagliari, Italy, 1993.
[6] A. Quarteroni, Mathematical Aspects of Domain Decomposition Methods, European Congress of Mathematics, F. Mignot (eds.), Birkhauser Boston, 1994.
[7] C. Canuto, M.Y. Hussaini, A. Quarteroni and T.A. Zang, Spectral Methods in Fluid Dynamics, Springer-Verlag, New York, 1988.
[8] E. Ovtchinnikov, On the Construction of a Well Conditioned Basis for the Projection Decomposition Method, CRS4 Tech. Rep., Cagliari, Italy, 1993.
[9] G. Golub and C. Van Loan, Matrix Computations. The Johns Hopkins University Press, Baltimore, London, 1989 (II edition).
[10] J. Jimenez and P. Moin, The Minimal Flow Unit in Near Wall Turbulence. Jou. Fluid Mech. 225, (1991), 240.
[11] T. Wei and W. Willmarth, Reynolds Number Effects on the Structure of a Turbulent Channel Flow. Jou. Fluid Mech. 204, (1989), 57.
On Improving Parallelism in the Transonic Unsteady Rotor
Navier Stokes (TURNS) Code

Andrew M. Wissink†
Aerospace Engineering and Mechanics, University of Minnesota
107 Akerman Hall, 110 Union St. SE, Minneapolis, MN 55455
Anastasios S. Lyrintzis‡
School of Aeronautics and Astronautics, Purdue University
1282 Grissom Hall, West Lafayette, IN 47907
Roger C. Strawn§
US Army Aeroflightdynamics Directorate, Mail Stop 258-1
NASA Ames Research Center, Moffett Field, CA 94035

ABSTRACT

A parallel implementation of the three-dimensional Navier-Stokes rotorcraft flow solver TURNS is studied. We investigate two modifications of the LU-SGS operator to improve parallel performance. The first is the Data-Parallel LU Relaxation (DP-LUR) technique. This operator uses a Jacobi sweeping procedure in place of the Gauss-Seidel sweeps in LU-SGS. The resulting algorithm is very amenable to parallel processing but requires significantly more computational work. The second approach is a Hybrid technique which maintains the nearest neighbor communication patterns of DP-LUR but uses the more efficient Gauss-Seidel sweeps of LU-SGS for the on-processor computations. The TURNS code, with the DP-LUR and Hybrid operators, is implemented on the massively parallel Thinking Machines CM-5 using a MIMD (i.e. requiring message passing) approach. The convergence qualities and the CPU time of the two implicit operators are studied for an example calculation, computing the quasi-steady three-dimensional flowfield around a helicopter blade with subsonic and transonic tip Mach numbers. Both the DP-LUR and Hybrid modifications of LU-SGS show very good parallelism, and maintain the convergence rate of LU-SGS. However, the Hybrid method uses less overall CPU time than DP-LUR.

1. INTRODUCTION

In recent years helicopters have proven to be economical and convenient vehicles with their ability to land, take off and maneuver in areas inaccessible to fixed-wing aircraft. The ability to predict the flow around helicopter rotors is vital for the control of high-speed losses, vibration and noise.

Transonic flow is normally encountered on rotors in high-speed forward flight. Various transonic flow models have been used for the modeling of the transonic aerodynamics around the rotor. The transonic small disturbance potential formulation is the simplest approximation used. A more accurate formulation is the full potential formulation. Two examples of these full potential formulations are the FPR [1] (Full Potential Rotor) code and the RFS2 [2] code. The main advantage of the full potential rotor codes is that they can provide a good solution at a low cost (CPU time). These codes, however, require an approximate wake model to calculate the induced downwash. The wake models are based on simple linear aerodynamics and, consequently, have a narrow range of applicability.

A more accurate CFD method is the Transonic Unsteady Rotor Navier Stokes (TURNS) code, recently developed at NASA Ames by Srinivasan and co-workers [3-5]. TURNS is capable of computing the tip vortices and the entire vortical wake as a part of the overall flowfield solution. The code has been demonstrated to calculate accurately the three-dimensional flow around the tip of a helicopter rotor in both hover and forward flight at subsonic and transonic flow conditions [3-11].

Recently, TURNS has been applied in a multidisciplinary setting, computing a near-field CFD solution that is then used as input for a Kirchhoff method that predicts the far field noise [12]. The code is currently used by NASA, the Army, various universities, and the major US helicopter companies.

*Presented at the 77th AGARD F.D.P. Meeting, Seville, Spain; Oct. 2-5 1995
†NASA Fellow, Graduate Assistant
‡Associate Professor
§Research Scientist
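The trade-off stated in the abstract, Jacobi sweeps that parallelize easily but need more work versus Gauss-Seidel sweeps that converge faster but are sequential, can be illustrated on a toy problem. The Python sketch below is an assumption-laden stand-in: it relaxes a small tridiagonal system (-1, 2, -1) rather than the implicit operator of TURNS, and it is not an implementation of LU-SGS, DP-LUR or the Hybrid operator; all function names are hypothetical.

```python
def residual_norm(x, b):
    """Max-norm residual of the system -x[i-1] + 2 x[i] - x[i+1] = b[i]."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        worst = max(worst, abs(b[i] - (2.0 * x[i] - left - right)))
    return worst


def sweep(x, b, gauss_seidel):
    """One relaxation sweep, updating x in place.

    Jacobi reads only values from the previous sweep, so every point could
    be updated independently (data parallel). Gauss-Seidel reuses values
    already updated in the current sweep, which makes it sequential.
    """
    n = len(x)
    source = x if gauss_seidel else list(x)
    for i in range(n):
        left = source[i - 1] if i > 0 else 0.0
        right = source[i + 1] if i < n - 1 else 0.0
        x[i] = 0.5 * (b[i] + left + right)
    return x


def sweeps_to_converge(b, gauss_seidel, tol=1e-8, max_sweeps=100000):
    """Count the sweeps needed to drive the residual below tol."""
    x = [0.0] * len(b)
    count = 0
    while residual_norm(x, b) > tol and count < max_sweeps:
        sweep(x, b, gauss_seidel)
        count += 1
    return count, x


rhs = [1.0] * 10
jacobi_sweeps, x_jacobi = sweeps_to_converge(rhs, gauss_seidel=False)
gs_sweeps, x_gs = sweeps_to_converge(rhs, gauss_seidel=True)
```

On this model problem Gauss-Seidel needs roughly half as many sweeps as Jacobi, which mirrors the motivation for keeping Gauss-Seidel sweeps for the on-processor work while using Jacobi-style updates across processor boundaries.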


However, one drawback of TURNS is the amount of computation time it requires. An acceptable calculation with TURNS requires a Cray-class supercomputer. A typical quasi-steady coarse-grid Euler computation by TURNS requires about 30 minutes of CPU time on a Cray C-90, while an unsteady computation requires 3-4 hours. Fine-grid viscous computations require considerably more time.

Parallel computers, which include massively parallel supercomputers as well as workstation clusters, are beginning to replace traditional vector supercomputers for large scale computations due to their lower cost and high peak execution rates. At present, TURNS is inefficient on parallel machines. The main bottleneck preventing better parallel efficiency is the LU-SGS algorithm [16] used for the implicit time step. The objective of our work is to study techniques that will improve its efficiency. Thus, the majority of this paper will focus on the LU-SGS algorithm and some modifications thereof which improve its parallel efficiency. Initial results of this effort were presented in reference [13].

Although the TURNS code is primarily used for rotor CFD calculations, the solution algorithm is the same as in many other CFD methods. Consequently, the parallelization procedures proposed here could readily be used for other codes that use the LU-SGS implicit operator.

2. CODE DESCRIPTION

The governing equations for the TURNS code are the unsteady, compressible, three-dimensional thin layer Navier-Stokes equations. These equations are applied in conservation form in a generalized body-conforming curvilinear coordinate system

$$\partial_\tau Q + \partial_\xi E + \partial_\eta F + \partial_\zeta G = \frac{\epsilon}{Re}\,\partial_\zeta S + R \qquad (1)$$

where $\tau = t$, $\xi = \xi(x, y, z, t)$, $\eta = \eta(x, y, z, t)$, and $\zeta = \zeta(x, y, z, t)$. The coordinate system $(x, y, z, t)$ is attached to the blade. The vector of conserved quantities is $Q$, and the inviscid flux vectors $E$, $F$, and $G$ are

[equation (2) is not recoverable from the source]

where $H = (e + p)$ and $U$, $V$, and $W$ are the contravariant velocity components (e.g. $U = \xi_t + \xi_x u + \xi_y v + \xi_z w$). The Cartesian velocity components $u$, $v$, and $w$ are defined in the $x$, $y$, and $z$ directions, respectively. The quantities $\xi_t$, $\xi_x$, $\xi_y$, and $\xi_z$ are the coordinate transformation metrics and $J$ is the Jacobian of the transformation. The pressure $p$ is related to the conserved quantities through the perfect gas equation of state

$$p = (\gamma - 1)\left\{e - \frac{\rho}{2}(u^2 + v^2 + w^2)\right\} \qquad (3)$$

The viscous flux vector $S$ is incorporated in the code but the calculations given in this paper are all inviscid (i.e. $\epsilon = 0$ in Eq. 1), so the viscous terms are not described here. Details can be found in [4].

The governing equations are applied to an inertial reference system that moves with the blade. Because the blade is rotating, the system is continuously unsteady. In order to get a quasi-steady starting solution, the blade must be held in a fixed position. This is done, in effect, by adding source terms to the right hand side

[equation (4), the source term vector $R$, is not recoverable from the source]

where $\Omega$ is the angular velocity of the rotor. The $R$ vector is used only for the quasi-steady case to get a starting solution. It is not used for unsteady calculations.

The inviscid fluxes are evaluated using Roe's upwind differencing [14] in all three directions. The use of upwinding obviates the need for user-specified artificial dissipation and improves the shock capturing in transonic flowfields. Third order accuracy is obtained using van Leer's MUSCL approach [15], and flux limiters are applied so the scheme is Total Variation Diminishing (TVD).

The final Euler discretized form of Eq. 1 in unfactored implicit delta form is

$$\left[I + h(\delta_\xi A^n + \delta_\eta B^n + \delta_\zeta C^n)\right]\Delta Q^n = -h\, RHS^n \qquad (5)$$

where

$$RHS^n = \delta_\xi E^n + \delta_\eta F^n + \delta_\zeta G^n - R^n \qquad (6)$$

$I$ is the identity matrix, $h$ is the time step (the formulation is described more completely in [4]), and $\Delta Q^n = Q^{n+1} - Q^n$. The 5 x 5 matrices $A$, $B$ and $C$ are the Jacobians of the flux vectors with respect to the conserved quantities (e.g. $A = \partial E/\partial Q$).

3. IMPLICIT OPERATOR

The TURNS code uses the two-factor LU-SGS (Lower-Upper Symmetric Gauss Seidel) algorithm of Yoon and Jameson [16] for the implicit time step. The LU-SGS algorithm has been used in a number of well-known CFD codes (e.g. INS3D [17], OVERFLOW [18]) primarily for its stability properties
with larger timesteps. Classic implicit methods such as Beam-Warming approximate factorization have a large factorization error (of order $\Delta t^3$) which further restricts the size of the time step. The two-factor LU-SGS method has enhanced stability along with a reduction in factorization error (order $\Delta t^2$) that make it an attractive alternative. Unfortunately, the LU-SGS method is difficult to parallelize.

The LU-SGS scheme resembles a typical LU factorization scheme with diagonal preconditioning to increase robustness. The scalar diagonal terms are obtained by use of approximate Jacobians, avoiding costly matrix inversions. The Jacobian terms $A$, $B$, and $C$ in Eq. 5 are split into "+" and "-" parts, with positive parts constituting only the positive eigenvalues and negative parts constituting only the negative eigenvalues. The positive matrix is backward differenced and the negative matrix is forward differenced, as follows

$$ \delta_\xi A = \delta_\xi^{-} A^{+} + \delta_\xi^{+} A^{-} \tag{7} $$

This splitting ensures diagonal dominance. Approximate Jacobians are constructed using a spectral approximation

[Equation (8): definition of $A^{\pm}$ in terms of $s_\xi$, $A$, $\rho_A I$ and $\epsilon$; not legible in the source.]

where $\rho_A$ is the spectral radius of $A$ (in the $\xi$ direction),

$$ \rho_A = |U| + a\,|\nabla \xi| \tag{9} $$

$\epsilon$ is some small value (e.g. 0.001), and $s_\xi$ is defined as

[Equation (10): definition of $s_\xi$; not legible in the source.]

The same procedure is used in the $\eta$ and $\zeta$ directions to form the $B$ and $C$ terms.

Substituting this development into Eq. 5, we arrive at a system of the form

$$ L D^{-1} U\,\Delta Q^n = -h\,RHS^n \tag{11} $$

where

$$ \begin{aligned} D &= \left[ I + h\left( \rho_A + \rho_B + \rho_C \right) \right]_{j,k,l} \\ L &= D - h\left( A^{+}_{j-1} + B^{+}_{k-1} + C^{+}_{l-1} \right) \\ U &= D + h\left( A^{-}_{j+1} + B^{-}_{k+1} + C^{-}_{l+1} \right) \end{aligned} \tag{12} $$

$D$ is a diagonal matrix, and the two step LU decomposition can be performed by

$$ \begin{aligned} L\,\Delta Q^{*} &= -h\,RHS^n \\ U\,\Delta Q^{n} &= D\,\Delta Q^{*} \end{aligned} \tag{13} $$

which can also be written

$$ \begin{aligned} \delta Q^{*}_{j,k,l} &= D^{-1}\left[ -h\,RHS^n + h\left( A^{+}_{j-1}\,\delta Q^{*}_{j-1,k,l} + B^{+}_{k-1}\,\delta Q^{*}_{j,k-1,l} + C^{+}_{l-1}\,\delta Q^{*}_{j,k,l-1} \right) \right] \\ \delta Q^{n}_{j,k,l} &= \delta Q^{*}_{j,k,l} - D^{-1} h\left( A^{-}_{j+1}\,\delta Q^{n}_{j+1,k,l} + B^{-}_{k+1}\,\delta Q^{n}_{j,k+1,l} + C^{-}_{l+1}\,\delta Q^{n}_{j,k,l+1} \right) \end{aligned} \tag{14} $$

In the first step of (14), sweeps updating $\delta Q^{*}$ are performed in the positive direction (that is, from 1 to $j_{max}$, $k_{max}$, $l_{max}$) through the solution domain. The second step then computes $\delta Q^{n}$ by sweeping back through the domain in the opposite direction. This algorithm can be vectorized using a hyperplane approach, as outlined in [19]. Vectorization is done across hyperplanes in which $j + k + l = const$. This is outlined in Fig. 1.

Figure 1: Domain sweeping strategy used by LU-SGS algorithm. Can vectorize on hyperplanes where j + k + l = const.

While the hyperplane approach leads to good vector execution rates, it is difficult to parallelize for two reasons: 1) the size of the hyperplanes varies throughout the grid, leading to load balancing problems, and 2) there is a recursion between the planes, leading to a large amount of communication.

Parallelization of the LU-SGS algorithm in (14) has been addressed by other researchers. Barszcz et al. [19] implemented the LU-SSOR algorithm, which is similar to LU-SGS, on a massively parallel machine by restructuring the data layout using a skew-hyperplane approach. Although they were able to extract reasonable parallelism with this approach, the data layout is complex and considerable effort was required to implement the domain partitioning in an efficient manner when using a MIMD (Multiple Instruction Multiple Data) implementation. Also, the restructuring of data on the left hand side in turn causes the right hand side layout to be skewed and extra communication is required. Overall, the LU-SGS algorithm (14) is not conducive to efficient parallel execution.

Several researchers have proposed modifications of the LU-SGS algorithm to make it more parallelizable. Candler et al. [21, 22] have investigated a modification called Data-Parallel LU Relaxation (DP-LUR), which has shown excellent results in a data-parallel environment. It is used in this study and is discussed more thoroughly in section 3.1. Wong et al. [20] have investigated a domain decomposition implementation of LU-SGS. For two-dimensional

steady state reacting flow problems, they found that, while the convergence rate of the operator is reduced with the domain breakup, the effect is relatively weak (e.g. with 64 subdomains, the number of iterations increases by less than 20%). Thus, the domain decomposition strategy appears promising, and is used as a basis for the Hybrid algorithm, discussed in section 3.2.

3.1 DP-LUR Method
A modification of LU-SGS, referred to as Data-Parallel LU Relaxation, has been introduced by Candler et al. [21, 22] for solving hypersonic flow problems. Essentially, the modification involves transferring the nondiagonal terms to the right hand side and using values from the previous iteration for these terms. The modified operator then becomes Jacobi-like and requires only nearest neighbor communication. This operator has been found to be very efficient in a data-parallel environment (e.g. [22, 23]). The DP-LUR modification of the LU-SGS algorithm is given in (15).

$$ \delta Q^{(0)}_{j,k,l} = -D^{-1}\,h\,RHS^n $$

For $i = 1, \ldots, i_{max}$ Do

$$ \delta Q^{(i)}_{j,k,l} = D^{-1}\Big\{ -h\,RHS^n + h\Big[ A^{+}_{j-1}\,\delta Q^{(i-1)}_{j-1,k,l} - A^{-}_{j+1}\,\delta Q^{(i-1)}_{j+1,k,l} + B^{+}_{k-1}\,\delta Q^{(i-1)}_{j,k-1,l} - B^{-}_{k+1}\,\delta Q^{(i-1)}_{j,k+1,l} + C^{+}_{l-1}\,\delta Q^{(i-1)}_{j,k,l-1} - C^{-}_{l+1}\,\delta Q^{(i-1)}_{j,k,l+1} \Big] \Big\} \tag{15} $$

End Do

$$ \delta Q^{n}_{j,k,l} = \delta Q^{(i_{max})}_{j,k,l} $$

The main difference between the LU-SGS and DP-LUR algorithms is that a Jacobi sweeping strategy is used in DP-LUR while Gauss-Seidel sweeps are used in LU-SGS. The advantage of using Jacobi sweeps is that there is no recursion of data and only nearest neighbor communication is required at each node. Thus, it can be completely load balanced with communications only at the borders of each partition (Fig. 2).

Figure 2: Jacobi sweeping strategy of DP-LUR algorithm. Load balanced parallelism with nearest neighbor communication.

Although DP-LUR is more amenable to parallel processing than LU-SGS, the use of Jacobi sweeps leads to a larger amount of computational work. It is well known that a Jacobi method has a theoretically slower convergence rate than Gauss-Seidel. Multiple sweeps (e.g. 4-6) are therefore required inside Eq. (15) to maintain a comparable convergence rate to LU-SGS. Although DP-LUR can be executed efficiently on a parallel machine, the added computational cost is a significant penalty, the specifics of which are discussed in section 5.1. The question is whether the computational penalty of DP-LUR is the best that we can do.

3.2 Hybrid Method
The motivation behind development of the Hybrid approach is to replace a source of inefficiency in DP-LUR. The DP-LUR algorithm was developed primarily for data-parallel computations. Its convergence is independent of the number of processors used because the same Jacobi sweeping strategy that allows nearest neighbor communications between the processors is also used for the computations on each processor. Doing the on-processor computations with Jacobi sweeps is a source of inefficiency, since the computational work can be performed more efficiently with the Gauss-Seidel sweeps of LU-SGS. The strategy behind the Hybrid approach is to use the communications structures of the DP-LUR algorithm, to maintain load-balanced parallelism with nearest neighbor communications, along with the more efficient LU-SGS algorithm for the on-processor computations. The algorithm is referred to as the Hybrid approach because it retains features of both the LU-SGS and DP-LUR algorithms.

$$ \delta Q^{(0)}_{j,k,l} = -D^{-1}\,h\,RHS^n $$

For $i = 1, \ldots, i_{max}$ Do

$$ \begin{aligned} \delta Q^{*(i)}_{j,k,l} &= \delta Q^{(i-1)}_{j,k,l} \\ \delta Q^{*(i)}_{j,k,l} &= D^{-1}\left[ -h\,RHS^n + h\left( A^{+}_{j-1}\,\delta Q^{*(i)}_{j-1,k,l} + B^{+}_{k-1}\,\delta Q^{*(i)}_{j,k-1,l} + C^{+}_{l-1}\,\delta Q^{*(i)}_{j,k,l-1} \right) \right] \\ \delta Q^{(i)}_{j,k,l} &= \delta Q^{*(i)}_{j,k,l} - D^{-1} h\left( A^{-}_{j+1}\,\delta Q^{(i)}_{j+1,k,l} + B^{-}_{k+1}\,\delta Q^{(i)}_{j,k+1,l} + C^{-}_{l+1}\,\delta Q^{(i)}_{j,k,l+1} \right) \end{aligned} \tag{16} $$

End Do

$$ \delta Q^{n}_{j,k,l} = \delta Q^{(i_{max})}_{j,k,l} $$

The equations used inside the sweeps of (16) are the same as those used by the LU-SGS algorithm (14). Thus, with 1 sweep (i.e. $i_{max} = 1$), the Hybrid algorithm is very similar to a domain decomposition implementation of LU-SGS, the only difference being the initial condition on the first line of (16). The use of multiple sweeps improves the convergence rate,

making up for the loss of connection in the domain decomposition.
On 1 processor (with 1 sweep), the method is iden-
tical to the original LU-SGS algorithm. On many
processors, (i.e. in the limiting condition where the
number of processors approaches the number of grid-
points) the Hybrid method is identical to the DP-
LUR algorithm. The computational workload of the
Hybrid algorithm, therefore, is dependent upon the
number of processors used. The algorithm should be
most efficient with few processors, and should always
require less computational work than DP-LUR.
Parallel implementation of the Hybrid algorithm is
done in essentially the same way as DP-LUR. Border
data is communicated to nearest neighbors at the beginning
of each sweep and each processor performs
the standard LU-SGS algorithm on its domain. Because
the size of the domains corresponds with the
number of processors used, the convergence will be
different with different processor partitions. How-
ever, like DP-LUR, the Hybrid algorithm maintains
load balanced parallelism with only nearest neighbor
communications.
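
The relationship between the three operators described in Sections 3 to 3.2 can be illustrated on a small model problem. The sketch below is not the TURNS implementation: it assumes a scalar, one-dimensional system d*x[j] - lo*x[j-1] - up*x[j+1] = b[j] with constant coefficients, where `lo` stands in for h*A+ and `up` for -h*A-, and the function names `lu_sgs`, `dp_lur` and `hybrid` are hypothetical. It also assumes that the border values used by each block in the Hybrid sweep are those of the previous sweep, as the text describes.

```python
import numpy as np

def lu_sgs(b, d, lo, up):
    """One LU-SGS step: forward (lower) sweep followed by backward (upper) sweep."""
    n = len(b)
    star = np.zeros(n)
    for j in range(n):                                   # forward sweep
        left = star[j - 1] if j > 0 else 0.0
        star[j] = (b[j] + lo * left) / d
    x = np.zeros(n)
    for j in range(n - 1, -1, -1):                       # backward sweep
        right = x[j + 1] if j < n - 1 else 0.0
        x[j] = star[j] + up * right / d
    return x

def dp_lur(b, d, lo, up, sweeps):
    """Jacobi-like relaxation: neighbours always come from the previous sweep."""
    x = b / d                                            # diagonal-only start
    for _ in range(sweeps):
        left = np.concatenate(([0.0], x[:-1]))
        right = np.concatenate((x[1:], [0.0]))
        x = (b + lo * left + up * right) / d
    return x

def hybrid(b, d, lo, up, sweeps, nblocks):
    """LU-SGS sweeps inside each block, previous-sweep data across block borders."""
    n = len(b)
    bounds = [k * n // nblocks for k in range(nblocks + 1)]
    x = b / d                                            # same start as dp_lur
    for _ in range(sweeps):
        old = x.copy()                                   # "communicated" border data
        new = np.zeros(n)
        star = np.zeros(n)
        for s, e in zip(bounds[:-1], bounds[1:]):
            for j in range(s, e):                        # forward sweep in block
                if j == s:
                    left = old[j - 1] if j > 0 else 0.0
                else:
                    left = star[j - 1]
                star[j] = (b[j] + lo * left) / d
            for j in range(e - 1, s - 1, -1):            # backward sweep in block
                if j == e - 1:
                    right = old[j + 1] if j < n - 1 else 0.0
                else:
                    right = new[j + 1]
                new[j] = star[j] + up * right / d
        x = new
    return x
```

In this toy setting the two limiting cases stated above can be checked directly: with one block and one sweep `hybrid` reproduces `lu_sgs`, and with one grid point per block it reproduces `dp_lur` for the same number of sweeps.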
4. PARALLEL IMPLEMENTATION
A MIMD approach (i.e. requiring message passing) is used for parallel implementation. There are two reasons for choosing the MIMD approach over a SIMD (Single Instruction Multiple Data) or data-parallel approach: 1) Code portability; message passing codes are more portable to different parallel architectures (e.g. from massively parallel supercomputers to workstation clusters), and 2) Ease of implementation; since the original code is over 6000 lines, it is much easier to add message passing directives to the existing code than to rewrite the entire code in a High Performance Fortran type language (e.g. CMFortran). To ensure easy portability of the code, a set of generic message passing subroutines was used. With this protocol, the specific message passing commands can be altered in one line of the code rather than throughout, making conversion to different message passing languages, such as PVM (Parallel Virtual Machine) and MPI (Message Passing Interface), a relatively short procedure.

Fig. 3 shows the breakup of the three-dimensional solution domain. The flowfield domain is laid out on a two-dimensional array of processors. The flowfield is split in the wraparound (J) and spanwise (K) directions. The normal direction (L) is left intact so that the implementation of surface boundary conditions is unchanged from the existing serial code. A single layer of ghost cells is placed on the border of each processor, providing a location where the communicated data can be stored.

Figure 3: Partitioning the three-dimensional domain on a two-dimensional array of processors.

The communications between neighboring processors is done once during each of the inner sweeps of the DP-LUR and Hybrid algorithms, totaling 4 x i_max communication steps. One communication step is required to pass information to form the RHS, since third order accuracy requires data at the j, k, l ± 2 points. This communication step could be eliminated if a layer of two ghost cells were used but this increases memory and communications. The boundary conditions at the flowfield borders and on the rotor blade can be imposed locally on each processor, but communication is required in the wake region where L = 1 to enforce the boundary condition where the C-H grid collapses and data is averaged across this wake plane. Only the processors holding this data perform communications and only one communication step is needed.

5. RESULTS AND DISCUSSION
The TURNS code with the DP-LUR and Hybrid implicit operators has been implemented on the massively parallel Thinking Machines CM-5 at the Army High Performance Computing Research Center (AHPCRC) in Minneapolis, MN. The CM-5 has a total of 896 processors, configurable in processor partitions of 64, 256, and 512 processors. The implementation is performed by adding message passing calls to the existing Fortran 77 code. The message passing calls are taken from the CMMD library, which is supported by Thinking Machines, Inc.

Each processor on the CM-5 has a peak performance of 5 Mflops/processor. Vector Units (VU's) exist on each processor that increase the performance substantially (e.g. from 5 Mflops/processor to 128 Mflops/processor). Unfortunately, the only way to utilize the VU's at this time is to rewrite the code in CMFortran, a High-Performance-Fortran type language. Since TURNS is over 6K lines, rewriting the code would require considerable effort and was one of the main reasons we chose the MIMD implementation in the first place. In addition, rewriting the code in CMFortran would eliminate code portability. Consequently, the results presented here are determined without utilizing the VU's. Although this degrades the performance on the CM-5, it is not a big drawback overall, because our future plans are to run the code on parallel systems such as the IBM SP-2 and workstation clusters, which do not have vector units.

The code is run for a test problem that computes the quasi-steady flowfield around a symmetric OLS blade. The OLS blade has a sectional airfoil thickness to chord ratio of 9.71% and is a 1/7 scale model of the main rotor for the Army's AH-1 helicopter. A 135 x 50 x 35 C-H type grid is used, with the domain extending eight chords in all directions. The upper half of the grid is shown in Fig. 4.

Figure 4: Upper half of the 135 x 50 x 35 C-H type grid used for OLS airfoil calculations on the CM-5.

We chose to use this particular grid and airfoil because they were used for calculations in the aeroacoustic study in [12]. Unfortunately, the unusual mesh dimensions cannot be partitioned in a way that exactly matches the processor partitions on the CM-5 (64, 256, and 512 processors). For example, the J dimension has only odd factors so it is impossible to partition it on an even number of processors. We did break up the mesh in a way that used most of the processors in the partition. For the 64 node partition, the mesh was broken in 19 points in the J direction, and 3 points in the K direction, giving a total of 57 processors. For the 256 node partition, the mesh was broken in 19 points and 12 points in the J and K directions, respectively, giving 228 processors. Finally, for the 512 node partition, the mesh was broken in 19 points and 24 points in the J and K directions, giving 456 processors. When executing the code, the remaining processors in the partition sit idle. Generally, most newer machines (e.g. IBM SP-2) allow the user to choose the exact number of processors they want for their partition, so this will most likely not be an issue on more modern machines.

The three-dimensional quasi-steady starting solution is computed around the rotating blade in subsonic conditions, with M_tip = 0.664, and a more transonic condition, with M_tip = 0.80. In both cases, the freestream Mach number is M_inf = 0.17 and the blade position is fixed at zero degrees azimuth angle (Fig. 5). It should be noted that the first case, M_tip = 0.664, is a realistic test case for rotor calculations. The M_tip = 0.800 case, however, is far too transonic to be used in a practical helicopter application. It was added as an extreme test case to investigate the behavior of the implicit solvers with more nonlinear transonic flows.

Figure 5: Quasi-steady starting solution. Blade fixed at zero degrees azimuth angle.

It should also be noted that results are presented for a quasi-steady fixed blade case instead of an unsteady case because the convergence behavior of the implicit solvers can be quantified most easily with this quasi-steady case. It is difficult to investigate convergence behavior with an unsteady case without also verifying time-accuracy for the implicit solver. This does not indicate that the method is unable to perform unsteady runs. The same algorithm is used for time-accurate unsteady cases so these cases can be run without further modifications to the algorithm.

5.1 DP-LUR Results
The results of timings of TURNS with the DP-LUR algorithm on 57, 228, and 456 processors are given in Tables 1 and 2, for the M_tip = 0.664 and M_tip = 0.800 cases, respectively. The method is stopped when the density residual drops by two orders of magnitude below its maximum value. Plots of the L2 norm density residual vs. number of iterations are shown for the two cases in Figs. 6 and 7. The convergence of the original LU-SGS method, run on a single processor, is also shown in the plots for comparison purposes.
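
The processor layouts quoted in Section 5 (19 x 3, 19 x 12 and 19 x 24 processors on CM-5 partitions of 64, 256 and 512 nodes) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the paper does not say how the 135 and 50 grid points were distributed when they do not divide evenly, so the near-even split in the hypothetical helper `block_sizes` is an assumption, as are the names `block_sizes` and `layout`.

```python
def block_sizes(n_points, n_procs):
    """Split n_points into n_procs contiguous blocks whose sizes differ by at most one.

    Assumed distribution rule; the actual rule used in TURNS is not given.
    """
    base, extra = divmod(n_points, n_procs)
    return [base + 1 if p < extra else base for p in range(n_procs)]

def layout(nj, nk, pj, pk, partition):
    """Summarise a pj x pk processor array placed on a partition of given size."""
    used = pj * pk
    return {
        "used": used,                      # processors doing work
        "idle": partition - used,          # processors left idle in the partition
        "j_blocks": block_sizes(nj, pj),   # points per processor in J
        "k_blocks": block_sizes(nk, pk),   # points per processor in K
    }

if __name__ == "__main__":
    # Grid is 135 x 50 x 35; the L direction (35 points) is not split.
    for pk, partition in ((3, 64), (12, 256), (24, 512)):
        info = layout(135, 50, 19, pk, partition)
        print(partition, info["used"], info["idle"])
```

Under these assumptions the three layouts use 57, 228 and 456 processors and leave 7, 28 and 56 processors idle, which matches the processor counts reported in the text.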

Table 1 - Timing Results on the CM-5 for TURNS with DP-LUR for subsonic test case. 135 x 50 x 35 mesh, M_tip = 0.664, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations   % Comm.   Tot. Time
5        57      436          10.4 %    9330 sec
5        228     440          15.3 %    2508 sec
5        456     438          21.0 %    1445 sec
6        57      351           9.2 %    8505 sec
6        228     350          15.1 %    2233 sec
6        456     353          19.9 %    (illegible)
7        57      304           9.6 %    8229 sec
7        228     304          16.6 %    2110 sec
7        456     306          20.6 %    (illegible)

Table 2 - Timing Results on the CM-5 for TURNS with DP-LUR for transonic test case. 135 x 50 x 35 mesh, M_tip = 0.800, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations   % Comm.   Tot. Time
5        57      464          10.0 %    9902 sec
5        228     457          14.8 %    2628 sec
5        456     465          20.8 %    1511 sec
6        57      379          10.0 %    9210 sec
6        228     380          15.1 %    2424 sec
6        456     383          19.9 %    1402 sec
7        57      335           9.6 %    9068 sec
7        228     335          16.6 %    2383 sec
7        456     345          20.6 %    1380 sec

Figure 6: Convergence of TURNS with DP-LUR method. M_tip = 0.664

Figure 7: Convergence of TURNS with DP-LUR method. M_tip = 0.80

Figure 8: Parallel speedups of the time per iteration using the DP-LUR operator
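
The speedup of the time per iteration plotted in Fig. 8 can be approximated from the tabulated totals. The sketch below is a rough check rather than the authors' procedure: it takes the 5-sweep rows of Table 2 (iterations and total seconds), assumes that time per iteration is simply total time divided by iteration count, and measures speedup relative to the 57-processor run. The names `TABLE2_5_SWEEPS`, `relative_speedup` and `efficiency` are hypothetical.

```python
# (iterations, total time in seconds) per processor count, Table 2, 5 sweeps
TABLE2_5_SWEEPS = {57: (464, 9902.0), 228: (457, 2628.0), 456: (465, 1511.0)}

def time_per_iteration(table):
    """Average wall time of one iteration for each processor count."""
    return {procs: total / iters for procs, (iters, total) in table.items()}

def relative_speedup(table, base=57):
    """Speedup of the time per iteration relative to the base processor count."""
    tpi = time_per_iteration(table)
    return {procs: tpi[base] / tpi[procs] for procs in tpi}

def efficiency(table, base=57):
    """Speedup divided by the ideal speedup (ratio of processor counts)."""
    speedup = relative_speedup(table, base)
    return {procs: speedup[procs] / (procs / base) for procs in speedup}

if __name__ == "__main__":
    for procs, s in relative_speedup(TABLE2_5_SWEEPS).items():
        print(procs, round(s, 2), round(efficiency(TABLE2_5_SWEEPS)[procs], 2))
```

With these numbers the speedup from 57 to 228 processors is about 3.7 (ideal 4) and from 57 to 456 about 6.6 (ideal 8), in line with the near-linear behavior and the falloff at 456 processors noted in the discussion of Fig. 8.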

Table 3 - Timing Results on the CM-5 for TURNS with the Hybrid method for subsonic test case. 135 x 50 x 35 mesh, M_tip = 0.664, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations   % Comm.   Tot. Time
1        57      461          10.3 %    4937 sec
1        228     470          15.1 %    1434 sec
1        456     502          18.8 %     863 sec
2        57      394          10.1 %    5410 sec
2        228     398          14.8 %    1524 sec
2        456     404          20.6 %     889 sec
3        57      386          10.0 %    6423 sec
3        228     385          14.8 %    1771 sec
3        456     385          19.7 %    1012 sec

Table 4 - Timing Results on the CM-5 for TURNS with the Hybrid method for transonic test case. 135 x 50 x 35 mesh, M_tip = 0.800, density residual converged to 5 x 10^-7.

Sweeps   Procs   Iterations    % Comm.   Tot. Time
1        57      (illegible)   10.6 %    5719 sec
1        228     558           15.3 %    1707 sec
1        456     580           18.8 %     998 sec
2        57      (illegible)   10.1 %    6568 sec
2        228     (illegible)   14.8 %    1858 sec
2        456     492           20.4 %    1082 sec
3        57      467            9.9 %    7748 sec
3        228     466           14.4 %    2143 sec
3        456     470           19.2 %    1226 sec

Figure 9: Convergence of TURNS with Hybrid method. M_tip = 0.664

Figure 10: Convergence of TURNS with Hybrid method. M_tip = 0.80

Figure 11: Parallel speedups of the time per iteration of TURNS using the Hybrid operator
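
The relative cost of the two operators can be read off the tables with a short calculation. The sketch below compares the total times of the 1-sweep Hybrid runs in Table 3 with the 5-sweep DP-LUR runs in Table 1 at equal processor counts. Which sweep counts the authors paired when quoting overall CPU-time ratios is not stated, so this pairing is an assumption, and the names `DP_LUR_5_SWEEPS`, `HYBRID_1_SWEEP` and `cpu_ratio` are hypothetical.

```python
# Total CPU time in seconds, subsonic case (M_tip = 0.664), keyed by processor count
DP_LUR_5_SWEEPS = {57: 9330.0, 228: 2508.0, 456: 1445.0}   # Table 1
HYBRID_1_SWEEP = {57: 4937.0, 228: 1434.0, 456: 863.0}     # Table 3

def cpu_ratio(hybrid, dp_lur):
    """Hybrid time as a fraction of DP-LUR time for each processor count."""
    return {procs: hybrid[procs] / dp_lur[procs] for procs in hybrid}

if __name__ == "__main__":
    for procs, ratio in cpu_ratio(HYBRID_1_SWEEP, DP_LUR_5_SWEEPS).items():
        print(f"{procs:4d} processors: Hybrid / DP-LUR = {ratio:.2f}")
```

For this particular pairing the ratios come out at roughly 0.53, 0.57 and 0.60 on 57, 228 and 456 processors; other pairings of sweep counts give somewhat different values.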

The convergence plots show that a minimum of 5 inner sweeps (i.e. i_max = 5) of DP-LUR are required to converge the solution. In both the subsonic and transonic cases, 4 sweeps began to diverge. For the M_tip = 0.664 case, 5 sweeps gives slightly worse convergence than single processor LU-SGS while 6 sweeps gives slightly better. For the M_tip = 0.800 case, 5 sweeps of DP-LUR gives about the same convergence as single processor LU-SGS, and 6 sweeps is better. This seems to indicate that DP-LUR maintains a good level of robustness for transonic cases, since it requires fewer inner sweeps to maintain the convergence rate of LU-SGS. The single processor LU-SGS method requires the work of approximately 1.8 sweeps of DP-LUR. Consequently, these results show that, in order to maintain the same convergence rate, the DP-LUR implicit operator requires about 3 times the computational work of single processor LU-SGS.

Timings of the DP-LUR method indicate that more sweeps seems to be the better choice. The overall CPU time with 7 sweeps is fastest, but the difference between 6 and 7 sweeps is small (less than 2%). Each additional sweep increases the CPU time per iteration by 10-15%. Communication represents a relatively small percentage of the total CPU time. The communication percentage increases with increasing number of processors. Also, the percentages tend to fluctuate for different cases, which is probably due to the fact that these runs were done on a loaded rather than dedicated machine.

It should be noted that, in theory, the solution using DP-LUR is the same regardless of the number of processors used, so the number of iterations should be the same for all processor partitions. However, Tables 1 and 2 show that the implementation did show some slight discrepancies in the number of iterations. Generally, the differences are small (less than 4%) and we attribute them to numerical roundoff in the machine. Differences in the overall solution are indistinguishable for the different partition sizes.

A plot of the parallel speedups of the time per iteration of TURNS with DP-LUR is shown in Fig. 8. The speedup from 57 to 228 processors is nearly linear, but some falloff is noted for 456 processors. This is believed to be due to the relatively small problem size of 236,250 gridpoints. It is expected that the speedup will be more linear with larger problems. The parallel speedup increases slightly for a larger number of sweeps, since the amount of computational work goes up. However, the difference is not significant.

5.2 Hybrid Results
Results of timings with the Hybrid algorithm are given in Tables 3 and 4, for the M_tip = 0.664 and M_tip = 0.800 cases, respectively. Plots of the density residual vs. CPU time are given in Figs. 9 and 10.

The efficiency of the Hybrid method is apparent in the number of inner sweeps required for convergence. While DP-LUR required a minimum of 5 sweeps, the Hybrid method converges at a comparable rate to single processor LU-SGS with only 1 sweep. This is due to the more efficient Gauss-Seidel procedure used for the on-processor computations. With 2 sweeps, the convergence of the Hybrid method is almost identical to single processor LU-SGS. With one sweep, there is significant spread between the convergence curves for the different numbers of processors, but with 2 sweeps, the spread is reduced considerably so that all processor partitions follow essentially the same convergence path as LU-SGS. Although it is not shown in the figures, the convergence with 3 sweeps is only slightly better than with 2, and it is therefore not plotted, to keep the graphs from becoming too crowded.

The Hybrid method is considerably faster than DP-LUR. The CPU times of the Hybrid method are only 55-60% those of DP-LUR. This is due to the larger amount of computational work in DP-LUR, because a larger number of sweeps are required for convergence.

It should be pointed out that each sweep with DP-LUR involves only a single sweep through the domain on each processor, whereas the Hybrid method performs the two-step LU-SGS algorithm on each processor, performing two sweeps through the domain. Thus, each sweep of the Hybrid method is approximately equivalent to the work of two sweeps in DP-LUR. This is indicated in the CPU times; the CPU time using 6 sweeps of DP-LUR is approximately equal to that of 3 sweeps using the Hybrid method.

Using 1 sweep in the Hybrid method gives the best CPU time, but requires 17-18% more iterations than single processor LU-SGS. The CPU time with 2 sweeps is worse than that of 1 sweep by about 8%, but the convergence rate is much closer to that of single processor LU-SGS. When 3 sweeps are used, the convergence is only slightly better (a reduction in iterations of less than 5%) than with 2 sweeps, while the CPU time is about 11-15% more. Thus, 3 sweeps or more appears to be unnecessary.

A plot of the parallel speedups of the time per iteration is shown in Fig. 11. The parallel speedups are essentially the same as with DP-LUR.

6. SUMMARY AND CONCLUSIONS
A strategy is presented for implementing the three-dimensional Navier-Stokes Rotorcraft CFD code TURNS on massively parallel computer architectures. The main portion of the code that is difficult to parallelize is the implicit timestep using the LU-SGS operator. We study two modifications of this operator that make it more amenable to parallel implementation. The first is the Data-Parallel LU Relaxation (DP-LUR) technique, which essentially replaces the Gauss-Seidel sweeps in LU-SGS with Jacobi sweeps, and uses multiple sweeps of the domain to maintain the same convergence rate. The second is a new approach that couples the Jacobi communication strategy of DP-LUR with Gauss-Seidel sweeps of LU-SGS for the on-processor computations. It also uses multiple inner sweeps to maintain the convergence rate of LU-SGS. Because this second approach retains features of both the DP-LUR and LU-SGS algorithms, we call it a Hybrid method.

The TURNS code is tested on the Thinking Machines CM-5, using a MIMD approach for parallel implementation. It is run for an Euler quasi-steady calculation with 236,250 gridpoints, computing the flow around the tip of a helicopter blade rotating with subsonic and transonic tip Mach numbers. Results from various processor partitions show that both the DP-LUR and Hybrid modifications of LU-SGS are very parallelizable, showing good parallel speedups. Both methods are also able to maintain the convergence qualities of original LU-SGS for all test cases. The Hybrid method, however, requires less CPU time due to lower computational work requirements. The DP-LUR modification of LU-SGS causes the amount of computational work in the implicit solver to increase threefold, to maintain the same convergence rate. The Hybrid modification, however, can match to within 25% the convergence rate of single processor LU-SGS with no increase in the computational work. It can exactly match the convergence rate with twice as much work in the implicit solver, yielding CPU times that are only 8% higher than the single sweep cases. Overall, the CPU times for the Hybrid method are only 55-60% those of DP-LUR.

The computational work required of the Hybrid approach on a parallel machine will always be less than that of DP-LUR. On a few processors, the amount of computational work will be about the same as LU-SGS. The Hybrid approach is therefore ideally suited for machines that have smaller numbers of more powerful, non-vectorized processors. One example of a machine that fits this category is the 150 processor IBM SP-2. We are currently implementing the code on the IBM SP-2 at NASA Ames, and expect better CPU times than those obtained on the CM-5.

Finally, although the TURNS code is used primarily for rotorcraft CFD applications, the parallelization strategy is not unique to this application. The parallelization procedures proposed here could be readily used for other CFD codes that use the LU-SGS algorithm.

ACKNOWLEDGMENTS
The first author was supported by a NASA Graduate Student Fellowship. This work was supported by allocation grants from the Minnesota Supercomputer Institute (MSI) and Cray Research, Inc. The work is also supported in part by grant NSF CCR-9496327 and by the Army Research Office contract number DAAL03-89-C-0038 with the University of Minnesota Army High Performance Computing Research Center (AHPCRC) and the DOD Shared Resource Center at the AHPCRC.

References

[1] Strawn, R. C., and Caradonna, F. X., "Conservative Full Potential Model for Unsteady Transonic Rotor Flows," AIAA Journal, Vol. 25, No. 2, Feb. 1987, pp. 193-198.

[2] Bridgeman, J. O., Steger, J. L., and Caradonna, F. X., "A Conservative Finite-Difference Algorithm for the Unsteady Transonic Potential Equation in Generalized Coordinates," AIAA Paper 82-1388, 9th Atmospheric Flight Mechanics Conference, San Diego, CA, Aug. 1982.

[3] Srinivasan, G. R., "A Free-Wake Euler and Navier-Stokes CFD Method and its Application to Helicopter Rotors Including Dynamic Stall," JAI Associates, Inc., Technical Report 93-01, November 1993.

[4] Srinivasan, G. R., Baeder, J. D., Obayashi, S., and McCroskey, W. J., "Flowfield of a Lifting Rotor in Hover: A Navier Stokes Simulation," AIAA Journal, Vol. 30, No. 10, Oct. 1992, pp. 2371-2378.

[5] Srinivasan, G. R., and Baeder, J. D., "TURNS: A Free-Wake Euler/Navier-Stokes Numerical Method for Helicopter Rotors," AIAA Journal, Vol. 31, No. 5, May 1993, pp. 959-962.

[6] Srinivasan, G. R., Raghavan, V., Duque, E. P. N., and McCroskey, W. J., "Flowfield of a Lifting Rotor in Hover - A Navier Stokes Simulation," AIAA Journal, Vol. 30, No. 10, October 1992.

[7] Srinivasan, G. R., and Ahmad, J. U., "Navier Stokes Simulation of Rotor-Body Flowfields in Hover Using Overset Grids," Proceedings of the Nineteenth European Rotorcraft Forum, Paper No. C15, September 1993, Cernobbio, Italy.

[8] Duque, E. P. N., and Srinivasan, G. R., "Numerical Simulation of a Hovering Rotor Using Embedded Grids," Proceedings of the 48th Annual Forum of the American Helicopter Society, Washington DC, June 1992.

[9] Duque, E. P. N., "A Structured/Unstructured Embedded Grid Solver for Helicopter Rotor Flows," Proceedings of the 50th Annual Forum of the American Helicopter Society, Vol. 11, Washington DC, May 1994, pp. 1249-1257.

[10] Baeder, J. D., Gallman, J. M., and Yu, Y. H., "A Computational Study of the Aeroacoustics of Rotors in Hover," Proceedings of the 49th Annual Forum of the American Helicopter Society, St. Louis, Missouri, May 1993, pp. 55-71.

[11] Baeder, J. D., and Srinivasan, G. R., "Computational Aeroacoustic Study of Isolated Blade Vortex Interaction Noise," AHS Specialists' Aeromechanics Conference, San Francisco, CA, Jan. 1994.

[12] Strawn, R. C., Biswas, R., and Lyrintzis, A. S., "Helicopter Noise Predictions using Kirchhoff Methods," presented at the 51st Annual Forum of the American Helicopter Society, May 9-11, 1995, Fort Worth, Texas; also, to be published in the Journal of Computational Acoustics.

[13] Wissink, A. M., Lyrintzis, A. S., and Strawn, R. C., "On the Parallelization of the Transonic Unsteady Rotor Navier Stokes Code," presented at the Computational Aerosciences Conference, March 11-13, 1995, Santa Clara, CA.

[14] Roe, P. L., "Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes," Journal of Computational Physics, Vol. 43, No. 3, 1981, pp. 357-372.

[15] Anderson, W. K., Thomas, J. L., and van Leer, B., "A Comparison of Finite Volume Flux Vector Splittings for the Euler Equations," AIAA Paper 85-0122, Jan. 1985.

[16] Yoon, S., and Jameson, A., "A Lower-Upper Symmetric Gauss Seidel Method for the Euler and Navier Stokes Equations," AIAA Journal, Vol. 26, 1988, pp. 1025-1026.

[17] Yoon, S., and Kwak, D., "Three-Dimensional Incompressible Navier-Stokes Solver Using Lower-Upper Symmetric-Gauss-Seidel Algorithm," AIAA Journal, Vol. 29, No. 6, June 1991, pp. 874-875.

[18] Kandula, M., and Buning, P. G., "Implementation of LU-SGS Algorithm and Roe Upwinding Scheme in OVERFLOW Thin-Layer Navier-Stokes Code," AIAA Paper 94-2357, June 1994.

[19] Barszcz, E., Fatoohi, R., Venkatakrishnan, V., and Weeratunga, S., "Solution of Regular, Sparse Triangular Linear Systems on Vector and Distributed-Memory Multiprocessors," NASA Report RNR-93-007, April 1993.

[21] Candler, G. V., and Olynick, D. R., "Hypersonic Flow Simulations Using a Diagonal Implicit Method," presented at the 10th International Conference on Computing Methods in Applied Sciences and Engineering, Paris, France, Feb. 1992.

[22] Candler, G. V., Wright, M., and McDonald, J. D., "A Data Parallel LU-SGS Method for Reacting Flows," AIAA Journal, Vol. 32, No. 12, Dec. 1994, pp. 2380-2386.

[23] Wright, M. J., Candler, G. V., and Prampolini, M., "A Data Parallel LU Relaxation Method for the Navier Stokes Equations," AIAA Paper 95-1750, 1995.

[20] Wong, C.C., Blottner, F.G., and Payne, J.L.,


“A Domain Decomposition Study of Massively
Parallel Computing in Compressible Gas Dy-
namics,” AIAA Paper 95-0572, Jan. 1995.

DEVELOPMENT OF A PARALLEL IMPLICIT ALGORITHM FOR CFD CALCULATIONS
F. Dias d’Almeida, F.A. Castro, J.M.L.M. Palma and P. Vasconcelos
Faculdade de Engenharia da Universidade do Porto
Rua dos Bragas, 4099 Porto CODEX, Portugal

Abstract

The present article reports on further developments of an implicit coupled algorithm for fluid flow equations. Mass and momentum conservation equations are solved as part of one large system of equations in one single step. Iterations are needed because of nonlinearities only. The algorithm requires no under-relaxation factors and can reach convergence in a reduced number of iterations, compared to decoupled approaches. This article describes improvements leading to reduction of both memory and computing time. The algorithm exceeds the memory requirements of the SIMPLE algorithm of Patankar and Spalding by a factor of $K^2$, where $K$ is the number of independent variables. Computing time reduction was achieved by using GMRES and a preconditioner based on incomplete LU factorization. The algorithm compares favourably with conventional decoupled approaches. To overcome the high memory requirements and enable the simulation of large physical problems two different approaches for parallelization were also tested, at the expense of increased computing time.

1 INTRODUCTION

The SIMPLE [1, 2] algorithm is amongst the most widely used algorithms for solving the fluid flow equations. The difficulties of convergence of SIMPLE when dealing with large problems, either in terms of physical complexity or grid size, are well known and have been discussed in the open literature (e.g.: [3], [4], [5]). SIMPLE is relatively easy to implement and accommodate for increased number of transport equations, but its sensitivity to numerical aspects as for instance, under-relaxation factors [6] has led to many research efforts and even new algorithms (e.g.: [7]), sometimes closely related to SIMPLE (e.g.: [8], [9]).

The algorithm discussed in this article (designated DIRECTO [10]) solves the fluid flow equations as a complete coupled system. The cell face velocities are predicted using a momentum equation, which once replaced into the continuity equation leads to a pressure equation fully coupled to the velocity field. No simplification is made at this stage, the equation is exact, opposed to pressure (or pressure-correction) equation derived from segregated [11] algorithms.

The code development was made using the two classical geometries of a two-dimensional cavity with a sliding lid and a backward-facing step. Results [12] show that for instance, in case of the square cavity with sliding lid at Re=1000 the DIRECTO algorithm with LU factorization converged in 8 iterations, independently of grid size, for a residual error of 1 x
On the other hand the SIMPLE algorithm although requiring 186 iterations for a grid of 64x64, used 10x less CPU time. This order of magnitude ratio was reduced by using Block Band LU factorization [13] and GMRES [14], since at each iteration there is the solution of a large, sparse, unsymmetric, block-band (block tridiagonal) linear system.

The need for finer grids, mainly on complex geometries, leads to very large systems of equations requiring the use of secondary storage and large CPU times. Parallel architectures with distributed memory may be one answer to those problems. The main drawbacks are the communication between processors and additional computations. In this work a cluster of 4 DEC AlphaStation AXP, models 500S and 600S, connected by FDDI (Fiber Distributed Data Interface) and Gigaswitch using PVM (Parallel Virtual Machine) [15, 16] was used as a parallel environment. PVM is a software package that allows the concurrent use of heterogeneous processing elements.

The article is made up of 3 major Sections. In Section 2 the algorithm is described. Section 3 discusses the results with respect to accuracy, memory requirements and computing time, including a discussion on the linear solvers and parallelization. Section 4 concludes the article.

2 MATHEMATICAL MODEL

The governing equations for two-dimensional incompressible Newtonian and isothermal flows are, in tensorial notation,

$$\frac{\partial U_i}{\partial x_i} = 0, \tag{1}$$
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
and,

(2)

where $U_i$ is the velocity component along the $x_i$ direction and $P$, $\rho$ and $\mu$ are the static pressure, density and dynamic viscosity, respectively.

Figure 1: Control volume for nodal point P (upper and lower case denote nodal and face values, respectively).

When the first-order derivatives of equations (1) and (2) are integrated in the control volume P (Fig. 1) of a non-staggered grid, new equations will arise depending on the velocities at the faces of the control volume.

Face velocities

The relationship between the nodal and face values is found by discretization (in a control volume centred at the face of control volume P) of a simplified version of equation (2), obtained assuming mass conservation and constant viscosity

(3)

Two different discretizations of the convective term of equation (3) were tested: (D1), with a first-order upwind scheme for the two derivatives; and (D2), with a second-order central difference scheme for the derivative in the direction perpendicular to the unknown velocity only. For instance, for velocity U, we have in case of D1 discretization, if $U^{k-1} > 0$ and $V^{k-1} > 0$,

(4)

and in case of D2,

(5)

and, if $U^{k-1} > 0$. The superscripts $k-1$ and $k$ denote previous and current iteration.

The second member of equation (3) is discretized using second-order finite differences,

(6)

$$\mu \frac{\partial^2 U}{\partial x^2} = 4\mu \, \frac{U_E^k - 2U_e^k + U_P^k}{\Delta x^2}, \tag{7}$$

(8)

Replacing (4) - (8) into equation (3),

(9)

where the $A_{nb}$ are the coefficients for the nodal values of velocity and pressure surrounding face east.

Mass conservation

When equation (1) is integrated in the control volume of Fig. 1,

(10)

The equations for face velocities (9) are now replaced into (10) leading to the following algebraic equation,

(11)

The velocity coefficients represent links to a total of 18 nodal velocities, i.e. a 9-node star for each of the velocities U and V surrounding P, whereas the pressure coefficients include connections to 5 nodal pressures ($P_P$, $P_E$, $P_W$, $P_N$ and $P_S$).

Momentum conservation

The integration of equation (2) along i = 1 direction is,

(12)
which, after replacing the equations for the velocities at the faces, yields,

$$\sum_{nb} A_{nb}^{UU} U_{nb}^{k} + \sum_{nb} A_{nb}^{VU} V_{nb}^{k} + \sum_{nb} A_{nb}^{PU} P_{nb}^{k} = \ldots \tag{13}$$

The coefficients $A_{nb}^{UU}$ represent links to the 9 nodal U velocities surrounding P, and $A_{nb}^{VU}$ represent links to 4 nodal velocities ($V_{NE}$, $V_{NW}$, $V_{SE}$ and $V_{SW}$). The $A_{nb}^{PU}$ coefficients include the contributions from 7 nodal pressures ($P_P$, $P_E$, $P_W$, $P_{NE}$, $P_{NW}$, $P_{SE}$ and $P_{SW}$).

The momentum equation in the i = 2 direction

$$\sum_{nb} A_{nb}^{VV} V_{nb}^{k} + \sum_{nb} A_{nb}^{UV} U_{nb}^{k} + \sum_{nb} A_{nb}^{PV} P_{nb}^{k} = \ldots \tag{14}$$

may be obtained by an identical procedure.

Equations (11), (13) and (14) are all assembled in a single system of equations and solved simultaneously. The system of equations is of the form,

$$A x = b \tag{15}$$

x is a vector with sequence of blocks with variables U, V and P. The order of matrix A is (NI - 2) x (NJ - 2) x K, where NI x NJ is the problem size, and K stands for the number of variables (i.e., 3 in case of a two-dimensional laminar flow). This is a sparse, unsymmetric, block-band (block tridiagonal) linear system. Because this is the most time consuming part of the algorithm, special attention was given to this subproblem (in Section 3.3.1).

After solution of the linear system (15) one global iteration is completed. Because of the non-linearity of the differential governing equations, several global iterations are needed to obtain convergence and new coefficients are calculated using the new velocity and pressure fields, repeating the process until convergence. The nomenclature “global iteration” is used here to distinguish from the number of iterations associated with the solver.

Because all the conservation equations are solved as part of a single set, with no decoupling (or segregation, according to the nomenclature in ref. [11]), the algorithm can converge in a small number of iterations, and for this reason it has been designated DIRECTO [10] (direct, in English).

3 DISCUSSION OF RESULTS

The code development was made using the two classical geometries of a two-dimensional cavity with a sliding lid and a sudden expansion. In this paper results will be presented for the two-dimensional square cavity only.

This Section discusses 3 major aspects of the algorithm: the accuracy, memory requirements and computing time, in subsections 3.1, 3.2 and 3.3, respectively.

3.1 Accuracy

To obtain the accuracy of DIRECTO we performed simulations of the flow in a two-dimensional square cavity with sliding lid for 2 Reynolds numbers (400 and 1000), and 3 grid sizes (64x64, 96x96 and 128x128). The Reynolds number definition was $Re = \rho U_{lid} H / \mu$, where $U_{lid}$ is the lid velocity and $H$ is the size of the square cavity.

The velocities were set constant at every boundary, and zero normal gradient for the pressure was used. This condition was implemented in an implicit fashion to preserve the implicit feature of the method. The calculations were stopped for residuals lower than 1 x
The residuals are the sum of the absolute errors of the algebraic equations divided by reference quantities $\rho U_{lid}^2 H$ and $\rho U_{lid} H$ for momentum and continuity equations, respectively. Calculations were all performed in single precision.

Method        Grid          U_min      V_min      V_max
DIRECTO D1    64            -0.31999   -0.43943   0.29404
              96            -0.32443   -0.44721   0.29897
              128           -0.32614   -0.44996   0.30090
              Exact value   -0.32878   -0.45356   0.30399
              Accuracy       1.73       1.97      1.69
DIRECTO D2    64            -0.31956   -0.43968   0.29383
              96            -0.32430   -0.44729   0.29893
              128           -0.32608   -0.45000   0.30087
              Exact value   -0.32877   -0.45360   0.30380
              Accuracy       1.78       1.95      1.77
CPI           64            -0.32368   -0.44862   0.29925
              96            -0.32653   -0.45163   0.30183
              128           -0.32751   -0.45274   0.30271
              Exact value   -0.32873   -0.45431   0.30379
              Accuracy       2.05       1.85      ...
SIMPLE        128           -0.32614   -0.45119   0.30143

Table 1: Square cavity results for Re = 400 (CPI results from Deng et al., 1994).

The estimated exact values and order of accuracy of the results were estimated following the generalization of the Richardson extrapolation method. The exact value can be approximated in terms of results on finite grids plus the leading term of the truncation error as,

$$\phi_1 = \phi_{ex} + h_1^n X_n + \ldots, \tag{16}$$

where $h$ is the grid spacing in both directions and $X_n$ is a grid function, assumed the same for every grid spacing ($h_1$, $h_2$ and $h_3$). Provided that $h$ is small enough for the leading term to be dominant, the order of the numerical scheme is estimated as [17],

with,

$$A = \frac{\phi_3 - \phi_1}{\phi_2 - \phi_1} \tag{20}$$

Tables 1 and 2 show non-dimensioned values of maximum and minimum V velocity ($V_{max}$ and $V_{min}$) in the horizontal centreline and minimum value of U in the vertical centreline. The tables include values predicted by 3 different grids and also the order of accuracy and estimated exact value, obtained by the Richardson extrapolation. Results are also shown for SIMPLE algorithm of Patankar [1, 2], CPI (Consistent Physical Interpolation) and CSG (Centred Staggered Grid) methods of Deng et al. [17]. D1 and D2 are versions of DIRECTO algorithm with the convective term of equation (3) discretized using equations (4) and (5), respectively.

Method        Grid          U_min      V_min      V_max
DIRECTO D1    64            -0.36722   -0.49426   0.35565
              96            -0.37763   -0.51104   0.36628
              128           -0.38198   -0.51747   0.37037
              Exact value   -0.38510   -0.52146   0.37292
              Accuracy       1.26       1.38      1.38
DIRECTO D2    64            -0.36544   -0.49382   0.35414
              96            -0.37704   -0.51088   0.36577
              128           -0.38177   -0.51747   0.37022
              Exact value   -0.39003   -0.52772   0.37701
              Accuracy       1.57       1.73      1.75
CPI           64            -0.37436   -0.51015   0.36364
              96            -0.38233   -0.51947   0.37109
              128           -0.38511   -0.52280   0.37369
              Exact value   -0.38867   -0.52724   0.37702
              Accuracy       2.01       1.94      2.01
CSG           64            -0.35726   -0.48858   0.34556
              96            -0.37441   -0.50982   0.36271
              128           -0.38050   -0.51727   0.36884
              Exact value   -0.38855   -0.52690   0.37705
              Accuracy       1.96       1.94      1.99
SIMPLE        128           -0.37233   -0.51014   0.36234

Table 2: Square cavity results for Re = 1000 (CPI and CSG results from Deng et al., 1994).

The SIMPLE algorithm [1, 2] is implemented here in a non-staggered grid [18] [19] and uses the hybrid finite difference scheme, switching from central to upwind differencing for mesh Reynolds numbers higher than 2.

The CPI method of Deng et al. [17] is similar to the DIRECTO method. CPI uses a governing differential equation for momentum and a relationship between face and nodal values identical to our equations (3) and (4), designated D1. In CPI and CSG, the governing differential equations of momentum are discretized at the centre of the control volumes before integration, while in the DIRECTO method the equations are first integrated in the control volume and then discretized. This leads to different velocity coefficients in discretized momentum equation of CPI and DIRECTO for control volumes near the boundaries. Differences could also occur in the implementation of the pressure boundary conditions; Deng et al. [17] do not state explicitly what kind of boundaries were used.

In the CSG method of Deng et al. [17] the convection terms are discretized using central finite differences for all mesh Reynolds numbers ($\rho U_i \Delta x_i / \mu$) and a staggered grid is used, to avoid velocity interpolation when discretizing the mass conservation equation.

As can be seen in Table 1, the estimated exact values from D1, D2 and CPI methods agree well with each other. The maximum percentual error is 0.17%, and is found in the $V_{min}$ for the D1 method. The estimated accuracy of D1 and D2 is almost second order, although lower than CPI method (if based on $U_{min}$ or $V_{max}$).

In case of Re=1000, Table 2, the decrease of accuracy of the D1 method is obvious. This is due to the first-order upwind scheme used to discretize the convective term of equation (3), as can be seen by comparison with the D2 method, which uses a second-order accurate finite difference discretization. Nevertheless the estimated accuracy of CPI is closer to second order compared with D2. The largest difference between the estimated exact values produced by D2 and CPI is 0.35% (based on $U_{min}$).

Compared to DIRECTO, SIMPLE algorithm with the hybrid scheme requires finer grids to achieve identical level of accuracy, as can be seen in Fig. 2 and Table 2. At Re=1000, the results of D1 and D2 methods for a grid of 96x96 are much closer to the estimated exact value of the CPI method than the results of SIMPLE with a grid of 128x128. This lack of accuracy is due to the hybrid scheme, which is first-order accurate for Peclet numbers greater than 2.

Figure 2: Evolution of $U_{min}$ with grid resolution for Re=400.

Fig. 3 shows the streamlines for D2 and SIMPLE methods for a Re=1000 and a grid of 96x96. SIMPLE is unable to predict the streamline distribution in the centre. The dashed line (SIMPLE) at the centre of the flow represents the value -0.11, whereas the solid line (D2) represents -0.115.
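
The three-grid Richardson estimate reported in Tables 1 and 2 can be illustrated with a short numerical sketch. The Python below is only a minimal example under stated assumptions, not the authors' procedure: it assumes the error model of equation (16) with the same coefficient on all three grids, takes the grid spacing as h = 1/N for an N x N grid (the paper does not state how h was defined), and finds the order n by bisection on the ratio of equation (20). The function name `richardson_three_grids` is hypothetical.

```python
def richardson_three_grids(h, phi, n_lo=0.1, n_hi=10.0, steps=200):
    """Observed order n and extrapolated value phi_ex from three grids.

    Assumes phi_i = phi_ex + C * h_i**n (leading truncation term only),
    so that (phi_3 - phi_1) / (phi_2 - phi_1) depends on n alone.
    """
    h1, h2, h3 = h
    p1, p2, p3 = phi
    ratio = (p3 - p1) / (p2 - p1)

    def mismatch(n):
        return (h3**n - h1**n) / (h2**n - h1**n) - ratio

    f_lo, f_hi = mismatch(n_lo), mismatch(n_hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("order not bracketed; data may not be asymptotic")
    for _ in range(steps):                  # plain bisection on n
        n_mid = 0.5 * (n_lo + n_hi)
        f_mid = mismatch(n_mid)
        if f_lo * f_mid <= 0.0:
            n_hi = n_mid
        else:
            n_lo, f_lo = n_mid, f_mid
    n = 0.5 * (n_lo + n_hi)
    c_h1n = (p2 - p1) / ((h2 / h1)**n - 1.0)    # equals C * h1**n
    return n, p1 - c_h1n


# U_min of the D1 discretization at Re = 400 (Table 1), grids 64, 96 and 128.
h = (1.0 / 64, 1.0 / 96, 1.0 / 128)         # assumed definition of h
u_min = (-0.31999, -0.32443, -0.32614)
order, u_exact = richardson_three_grids(h, u_min)
print(f"order = {order:.2f}, extrapolated U_min = {u_exact:.5f}")
```

With these assumptions the sketch gives an order and an extrapolated value close to the tabulated 1.73 and -0.32878 for the D1 entries. Bisection is used only because it needs no derivative and the mismatch function is monotonic in n for refined grids; any scalar root finder would serve.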

Figure 3: Stream function for D1 method (-) and SIMPLE method (- -) at Re=1000 (axes: x/L and y/L).

Given the similarities between DIRECTO and the algorithms CPI and CSG of Deng et al. [17] one would expect higher accuracy of the DIRECTO algorithm; this is an aspect requiring further attention.

3.2 Memory Requirements

Fig. 4 shows the memory requirements for different implementations of DIRECTO, compared to SIMPLE using hybrid differencing.

The coefficient matrix, derived from a 9-node star, has a dimension of (NI - 2) x (NJ - 2) x K by (NI - 2) x (NJ - 2) x K, where NI x NJ is the problem size, and K stands for the number of variables (i.e., 3 in case of a two-dimensional laminar flow). This is shown by line a) in Fig. 4.

Because of the block-tridiagonal structure it can be stored as a [(NI - 2) x (NJ - 2)] x K by 2 x [(NI - 2) x 3 + 5] + 1 matrix (b) in Fig. 4).

For finer grids the block band structure becomes sparser. This was exploited by using a sparse matrix structure, storing only the non-zero values on a vector, the column indices on an integer vector and using pointers to the beginning of each row. This structure reduced the memory requirements to [(NI - 2) x (NJ - 2) x K] x K x 9 (line c) in Fig. 4).

On the other hand SIMPLE only requires 5 matrices of dimension (NI - 2) by (NJ - 2) to store the coefficients (d) in Fig. 4).

Figure 4: Memory requirements of DIRECTO (a), b) and c)) compared to SIMPLE (d)) using hybrid finite difference discretization scheme.

3.3 Computing Time

The following computer tests were run on a DEC AlphaStation AXP 3000, model 600S, for sequential algorithms and a cluster of 4 DEC AlphaStations AXP 3000, models 500S and 600S, connected by FDDI and Gigaswitch using PVM, for parallel versions.

3.3.1 Linear solvers

The Gaussian elimination method was used initially [10] during the FORTRAN implementation of the algorithm. The first idea was to optimize the Gaussian elimination method by adapting it to the block band structure and using BLAS kernels and LAPACK library [20], on a vector processor VAX 6520-2VP. This reduced the computing time, but it was still far from the SIMPLE+TDMA method, and required a large amount of storage [13].

The next stage was the use of an iterative method so that the sparse structure could be taken into account. An iterative method has the additional advantage of controlling the degree of accuracy for solving the linear system of equations. Because the solver is an inner step of a global iteration required because of nonlinearities, solving the equations to a high degree of accuracy may prove useless.

Several methods were tested and GMRES (Generalized Minimum Residual) [14] was retained for its robustness. GMRES is a Galerkin type method based on an orthonormal basis of a Krylov subspace. The solution of $Ax = b$ is sought in the form

$$x_k = x_0 + z_k, \tag{21}$$

where $x_0$ is an initial solution with residual

$$r_0 = b - A x_0. \tag{22}$$

$z_k$ is computed such that its residual projected onto the Krylov subspace generated by $r_0$ is minimized.

Iterative methods of this type require the use of preconditioners in order to improve the convergence
rate. Several preconditioners were tested and the best proved to be the Incomplete LU factorization of degree zero ILU(0) and ILUT [14, 21]. The diagonal preconditioner, although very simple to implement, did not give as good results as the others [12].
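
As an illustration of equations (21) and (22) together with an ILU(0) preconditioner, the following Python sketch may help. It is a minimal example under several assumptions and is not the FORTRAN implementation described in the paper: matrices are stored densely for brevity, the preconditioner is applied from the left (the paper does not say which side was used), GMRES is run without restart, and the test matrix is a small hypothetical convection-diffusion stencil rather than the DIRECTO coefficient matrix. All function names (`ilu0`, `ilu_apply`, `gmres`, `model_matrix`) are illustrative.

```python
import math

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ilu0(A):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of A.
    L (unit diagonal, strict lower part) and U are packed in one array."""
    n = len(A)
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if A[i][k] == 0.0:
                continue                       # outside the pattern: no fill-in
            LU[i][k] /= LU[k][k]
            for j in range(k + 1, n):
                if A[i][j] != 0.0:
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU

def ilu_apply(LU, r):
    """Return M^{-1} r with M = L U (forward, then backward substitution)."""
    n = len(r)
    y = r[:]
    for i in range(n):
        y[i] -= sum(LU[i][j] * y[j] for j in range(i))
    for i in reversed(range(n)):
        y[i] = (y[i] - sum(LU[i][j] * y[j] for j in range(i + 1, n))) / LU[i][i]
    return y

def gmres(A, b, x0, precond=None, tol=1e-10, max_iter=50):
    """Left-preconditioned GMRES without restart; returns (x, iterations)."""
    minv = precond if precond is not None else (lambda r: r[:])
    r0 = minv([bi - ai for bi, ai in zip(b, matvec(A, x0))])  # cf. equation (22)
    beta = math.sqrt(dot(r0, r0))
    if beta == 0.0:
        return x0[:], 0
    V = [[ri / beta for ri in r0]]             # orthonormal Krylov basis
    R, cs, sn, g = [], [], [], [beta]          # rotated Hessenberg columns, rhs
    for k in range(min(max_iter, len(b))):
        w = minv(matvec(A, V[k]))
        h = []
        for v in V:                            # modified Gram-Schmidt
            h.append(dot(w, v))
            w = [wi - h[-1] * vi for wi, vi in zip(w, v)]
        h_next = math.sqrt(dot(w, w))
        for i in range(k):                     # apply earlier Givens rotations
            h[i], h[i + 1] = (cs[i] * h[i] + sn[i] * h[i + 1],
                              -sn[i] * h[i] + cs[i] * h[i + 1])
        rho = math.hypot(h[k], h_next)         # new rotation zeroes h_next
        cs.append(h[k] / rho)
        sn.append(h_next / rho)
        h[k] = rho
        g.append(-sn[k] * g[k])
        g[k] *= cs[k]
        R.append(h)
        if abs(g[k + 1]) < tol:                # preconditioned residual norm
            break
        V.append([wi / h_next for wi in w])
    m = len(R)
    y = [0.0] * m
    for i in reversed(range(m)):               # back substitution R y = g
        y[i] = (g[i] - sum(R[j][i] * y[j] for j in range(i + 1, m))) / R[i][i]
    x = x0[:]
    for j in range(m):                         # x_k = x_0 + z_k, equation (21)
        x = [xi + y[j] * vj for xi, vj in zip(x, V[j])]
    return x, m

def model_matrix(n, c=0.5):
    """Assumed unsymmetric 5-point test matrix (upwind convection, strength c)."""
    N = n * n
    A = [[0.0] * N for _ in range(N)]
    for p in range(N):
        i, j = p % n, p // n
        A[p][p] = 4.0 + 2.0 * c
        if i > 0:
            A[p][p - 1] = -1.0 - c
        if i < n - 1:
            A[p][p + 1] = -1.0
        if j > 0:
            A[p][p - n] = -1.0 - c
        if j < n - 1:
            A[p][p + n] = -1.0
    return A

A = model_matrix(4)
b = matvec(A, [1.0] * len(A))                  # exact solution is all ones
LU = ilu0(A)
x_plain, it_plain = gmres(A, b, [0.0] * len(A))
x_ilu, it_ilu = gmres(A, b, [0.0] * len(A), precond=lambda r: ilu_apply(LU, r))
print("iterations without / with ILU(0):", it_plain, it_ilu)
```

The sketch keeps the factorization on the sparsity pattern of A, so it needs no storage beyond that of A itself, which is the appeal of ILU(0) when memory is the limiting factor. A production code would use a compressed row storage such as the one described in Section 3.2 instead of dense lists.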
Table 3 shows the total CPU times, and corresponding number of outer iterations needed to achieve a residual of $1 \times 10^{-5}$ on a DEC AlphaStation AXP 3000 model 600S, for several grid sizes, using the DIRECTO+GMRES methods and for the SIMPLE+TDMA method [22]. The SIMPLE algorithm was used with 4 sweeps of TDMA for computation of the velocities and 8 for computation of the pressure. It can be seen in Table 3 that the CPU times are competitive. ILU(0) is a good choice for small grid sizes and ILUT is recommended for finer grids because it keeps the number of outer iterations low.

Grid   SIMPLE+TDMA      DIRECTO+GMRES, ILU(0)   DIRECTO+GMRES, ILUT
32     13.1s (124)      8.5s (19)               11.7s (16)
64     169.9s (381)     91.4s (20)              81.2s (16)
96     764.2s (735)     608.4s (39)             285.2s (23)
128    2292.0s (1212)   10689.1s (91)           1667.2s (65)

Table 3: CPU time and number of iterations for a residual of $1 \times 10^{-5}$.

3.3.2 Parallelization

For this type of problems the parallelisation by domain decomposition was selected. A non-overlapping domain decomposition strategy was used, where the domain was decomposed into disjoint subdomains separated by interfaces. The grid nodes were numbered first inside each subdomain and then on the interfaces, leading to a bordered block diagonal matrix shown in Figs. 5 and 6 [23].

Figure 5: One-way dissection ordering.

Figure 6: Reordered (one-way dissection) matrix.

The algorithm was parallelised in two versions: the first using a master-slave approach where the master performs the computations corresponding to the interface, and the second using a SPMD (Single Program Multiple Data) approach where each processor deals with a subdomain and one interface. Each processor had an independent preconditioner.

Table 4 reports the CPU and elapsed times on a cluster of workstations for a 64x64 grid and 3, 4 and 5 processes. There was a reduction in (elapsed) time, when passing from 3 to 4 processes; this is a 22% reduction corresponding to a relative speed-up of 1.27. For 5 processes, there is a degradation of CPU and elapsed times because the farm is composed of only 4 machines, and more than 1 process will have to share the same processing element.

Processes   CPU time, Master   CPU time, Slave (max.)   Elapsed time
3           192.7s             2734.3s                  3090.4s
4           627.7s             1780.3s                  2434.5s
5           1167.7s            1217.4s                  3401.5s

Table 4: CPU and elapsed time for a 64x64 grid and Master-Slave approach.

To be able to use finer coarse-grain parallelism it is necessary to reduce the CPU time spent by the master to accompany the decreasing of the total time induced by the reduction of the CPU time in the slaves. Based on this need, another parallel version of the code was created, based on a SPMD strategy. Table 5 shows the CPU and elapsed times of the Master-Slave and SPMD approaches for 3 subdomains on a 64x64 grid. The SPMD approach is faster because it is more adequate to the sparse nature of the problem. Furthermore, for identical number of subdomains, the SPMD approach uses one process less than the Master-Slave. However the implementation of the SPMD approach is more complex, and given the reduced number of workstations available running PVM, we cannot conclude yet which of these approaches is the most appropriate.

               CPU time   Elapsed time
Master-Slave   1780.3s    2434.5s
SPMD           1568.5s    1642.3s

Table 5: CPU and elapsed times for a 64x64 grid for Master-Slave and SPMD approaches.

4 CONCLUSIONS

The present article reported on further developments of an implicit coupled algorithm for fluid flow equations. The main conclusions of this study are the following.

- Computing time reduction was achieved by moving from a direct to an iterative solver based on GMRES.

- It was shown that DIRECTO + GMRES with ILU(0) and ILUT preconditioners is always faster than SIMPLE+TDMA (4 and 8 sweeps for velocities and pressure, respectively). ILU(0) is a good preconditioner for coarse grids and ILUT is better for finer grids, because it keeps the number of outer iterations small.

- Reduction of memory storage was also achieved by taking advantage of the sparse nature of the coefficient matrix. However, memory requirements are still large compared to SIMPLE and this is an aspect calling for further investigation. Efforts were made to overcome this disadvantage by reverting to domain decomposition techniques, at the expense of increased computing time.

Acknowledgments

This work was developed as part of project STRIDE N. STRDA/C/CEG/712/92. F.A. Castro is a PhD student, recipient of grants from Programmes CIENCIA and PRAXIS XXI. The work on preconditioners was carried out during the visits of two of us to CERFACS in July 1994. F.D.A. and P.V. thank Prof. Ian Duff for his hospitality.

The authors are grateful to the Faculty of Engineering Computer Centre (CICA).

References

[1] S.V. Patankar and D.B. Spalding. A calculation procedure for heat, mass and momentum transfer in three-dimensional parabolic flows. Int. J. Heat Mass Transfer, 15:1787-1806, 1972.

[2] S.V. Patankar. Numerical Heat Transfer and Fluid Flow. Hemisphere Publishing Corporation, 1980.

[3] G.D. Raithby and G.E. Schneider. Numerical solution of problems in incompressible fluid flow: treatment of the velocity pressure coupling. Numerical Heat Transfer, 2:417-440, 1979.

[4] A. Wanik and U. Schnell. Some remarks on the PISO and SIMPLE algorithms for steady turbulent flow problems. Computers and Fluids, 17(4):555-570, 1989.

[5] D.S. Jang, R. Jetli, and S. Acharya. Comparison of the PISO, SIMPLER and SIMPLEC algorithms for the treatment of the pressure-velocity coupling in steady flow problems. Numerical Heat Transfer, 10:209-228, 1986.

[6] J.J. McGuirk and J.M.L.M. Palma. The efficiency of alternative pressure-correction formulations for incompressible turbulent flow problems. Computers and Fluids, 22(1):77-87, 1993.

[7] R.I. Issa. Solution of the implicitly discretized fluid flow equations by operator-splitting. Journal of Computational Physics, 62:40-65, 1986.

[8] S.V. Patankar. A calculation procedure for two-dimensional elliptic situations. Numerical Heat Transfer, 4:409-425, 1981.

[9] J.P. van Doormaal and G.D. Raithby. Enhancements of the SIMPLE method for predicting incompressible fluid flows. Numerical Heat Transfer, 17:147-163, 1984.

[10] F.A. Castro. A coupled procedure for solving the Navier-Stokes equations (in Portuguese). Technical report, Faculty of Engineering, University of Porto, Portugal, 1993.

[11] J.P. van Doormaal and G.D. Raithby. An evaluation of the segregated approach for predicting incompressible fluid flows. 1985. Presented at the National Heat Transfer Conference, Denver, Colorado, USA, ASME paper 85-HT-9.

[12] P.B. Vasconcelos and F.D. d’Almeida. Direct and iterative methods in coupled discretization of fluid flow problems. Paper presented at LANCZOS Centenary Conference, Raleigh, North Carolina, USA, 12-17 December, 1993.

[13] P.B. Vasconcelos and F.D. d’Almeida. Column-wise block LU factorization using BLAS kernels. Computer Systems in Engineering, 1995 (to appear).

[14] Y. Saad. GMRES: A Generalized Minimum Residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7:856-869, 1986.

[15] A. Beguelin, J.J. Dongarra, G.A. Geist, R. Manchek, and V.S. Sunderam. A user’s guide to PVM parallel virtual machine. Technical report, ORNL/TM-11826, Oak Ridge National Laboratory, July 1991.

[16] P.B. Vasconcelos. Experience with PVM and ScaLAPACK (in Portuguese). Technical report, CICA-I01/94, Faculdade de Engenharia do Porto, 1994.

[17] G.B. Deng, J. Piquet, P. Queutey, and M. Visonneau. A new fully coupled solution of the Navier-Stokes equations. International Journal for Numerical Methods in Fluids, 19:605-639, 1994.

[18] C.M. Rhie and W.L. Chow. Numerical study of the turbulent flow past an airfoil with trailing edge separation. AIAA Journal, 21:1525-1532, 1983.

[19] T.F. Miller and F.W. Schmidt. Use of a pressure-weighted interpolation method for the solution of the incompressible Navier-Stokes equations on a nonstaggered grid system. Numerical Heat Transfer, 14:212-233, 1988.

[20] E. Anderson et al. LAPACK Users' Guide. SIAM, Philadelphia, 1992.

[21] R. Barrett et al. Templates. SIAM, Philadelphia, 1994.

[22] P.B. Vasconcelos and F.D. d’Almeida. The effects of preconditioning iterative methods in coupled discretization of fluid flow problems. 16th Biennial Conference on Numerical Analysis, University of Dundee, 27-30 June, 1995.

[23] P.B. Vasconcelos and F.D. d’Almeida. Parallel computation on fluid dynamics using PVM and domain decomposition techniques (in Portuguese). Proceedings of IV Nacional Mecânica Computacional, Vol. 2, 815-824, 1995.

Experiments with Unstructured Grid Computations

S.V. Ramakrishnan, K.Y. Szema, C.L. Chen, V.V. Shankar and S.R. Chakravarthy
Rockwell Science Center
1049 Camino Dos Rios
Thousand Oaks, CA 91360
U.S.A.

1. SUMMARY
This paper describes work done at Rockwell Science Center on the development and application of computational fluid dynamics (CFD) solvers for unstructured grids. A description of the use of “interior boundary” conditions in simulating moving bodies is also presented.

2. INTRODUCTION
The CFD group at Rockwell Science Center has been involved over the past fifteen years in the development and application of numerical techniques for the simulation of flow past complex aerodynamic shapes. Starting with small perturbation equations, codes have been developed to solve more and more complex governing equations on structured grids (Ref. 1-4). The latest version of the structured grid code solves Reynolds Averaged Navier-Stokes (RANS) equations in generalized curvilinear coordinates. It includes the ability to simulate reacting multispecies flows (Ref. 5). Simulations requiring grid movements are handled quite elegantly using this code (Ref. 6). CFD codes developed at Rockwell Science Center have played a significant role in several national projects including the Space Shuttle, B-1B and National Aerospace Plane (NASP) projects.

Time required for performing accurate numerical simulation of complex fluid-dynamics problems is still sufficiently large to discourage designers from including CFD techniques in the design cycle. Total time required for a numerical simulation consists of the time required for

a) preprocessing, which consists of modifying the CAD geometry to a form suitable for numerical simulation, (in the case of structured grids) dividing the computational domain into zones, choosing proper grid resolution at the boundaries and finally grid generation,

b) solver,

and

c) post-processing, which consists of extracting physical quantities like skin-friction and heat-transfer, from the numerical solution; and visualization of the solution.

large. Especially, the time required for preprocessing increases almost exponentially as more and more details of the geometry are included in the simulation. For example, in the case of the multibody space shuttle configuration (Ref. 7), several months were needed to generate a structured grid when the fidelity requirements for the model employed in the numerical simulation were increased considerably.

Unstructured grid methodologies appear to be very promising, since the preprocessing time could be orders of magnitude less than that required for structured grids. It is indeed the case for inviscid flows. But, our experience with unstructured grid computations has opened our eyes to several issues involved in such simulations. We propose to discuss some of those issues in this paper.

Research on the development of unstructured grid solvers for Computational Fluid Dynamics (CFD) and Computational ElectroMagnetics (CEM) has been in progress at the Rockwell Science Center for the past several years. An unstructured grid solver for CFD, called UNIV, that can handle tetrahedral, triangular prismatic and hexahedral cells, has been developed (Ref. 8). UNIV employs a finite-element-like formulation that uses piecewise polynomial interpolation for the dependent variables. The dependent variables are the cell averages of internal energy, mass, x-, y-, and z-momenta. Interpolating polynomials may be discontinuous across cell boundaries. An approximate Riemann solver is used to resolve discontinuities at cell boundaries. The domain of a dependent variable polynomial is restricted to a cell. The discretization of the governing equations is constructed directly from the integral form of the conservation laws. No variational principle or method of weighted residuals or other indirect approach is employed. The code has the option to use either a least-square polynomial or an ENO (Essentially Non-Oscillatory) reconstruction. Reconstruction is the process of constructing an interpolating function for a cell that satisfies the cell average. Please see Ref. 9 for details on ENO schemes.

Numerical formulation employed in UNIV and a new approach for simulating bodies in relative motion are discussed in the following sections. A generalized Lax-Wendroff scheme for Euler equations adapted from CEM is also presented. A pointwise turbulence model that is highly
Several years of research in structured-grid simulations and suitable for unstructured grids is discussed. Lessons learned
developments in computer software and hardware from our experience with unstructured grid computations are
technologies have considerably reduced the turnaround time elucidated.
for numerical solutions. Still the time required to simulate
flow past complex geometries is unacceptably

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

3. NUMERICAL FORMULATION
Two different approaches to solving the initial-/boundary-value problem (IBVP)
for general hyperbolic system of conservation laws in the “conservation-law
form” represented by

$$ \frac{\partial q}{\partial t} + \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z} = 0 \tag{1} $$

have been developed. Equation (1) is satisfied at all (x, y, z) belonging to
domain D with prescribed initial and boundary values for the dependent
(conserved) variable vector q. Here, the Cartesian coordinate directions
(independent variables) are x, y, and z. The components of flux tensor in the
three coordinate directions are the vectors $f_1$, $f_2$ and $f_3$. In both
approaches the domain D is divided into several cells, and the integral form
of the conservation equations in each cell given by

$$ \frac{d}{dt}\left(\bar{q}\,V\right) + \oint_S \mathbf{F}\cdot\hat{n}\,dS = 0 \tag{2} $$

is solved with prescribed initial values for $\bar{q}$ and relevant boundary
conditions. Here, $\bar{q}$ denotes the cell average of the dependent
variables;

$$ \hat{n} = n_x\,\vec{j} + n_y\,\vec{k} + n_z\,\vec{l} $$

is the outward unit normal at any point on the boundary surface S of a cell;
$\vec{j}$, $\vec{k}$, and $\vec{l}$ are the unit vectors in x, y and z
directions respectively; V is the cell volume and $\mathbf{F}$ is the tensor
of fluxes with ($f_1$, $f_2$, $f_3$) as components. Stated in words, Eqn. 2
implies that the rate of increase of a conserved quantity ($\bar{q}V$) inside
a cell is given by the net inflow (flux) of that quantity into the cell.
Therefore, as in the case of cell-centered finite-volume structured grid
solvers (Ref. 3), solving the governing equations requires evaluation of
surface integrals from known values of cell averages.

Surface integrals are evaluated using numerical quadrature formulas. In this
method an integral is written as the weighted sum of the integrand evaluated
at the quadrature points. The location and weights of quadrature points are so
chosen as to give the best possible approximation for the integral. Higher
order schemes require larger number of quadrature points. Choosing the
centroid of a surface as the quadrature point yields second order accuracy.
Since only the cell-averages of the dependent variables are known, we need to
develop a procedure for evaluating the dependent variables at the quadrature
points in order to compute the surface integrals (fluxes). The spatial
accuracy of the numerical scheme is determined by the accuracy of this
“reconstruction” procedure. The dependent variable vector q at a quadrature
point may not be uniquely specified, since the point belongs to two
neighboring cells with different polynomial representations. If the two
vectors evaluated at a quadrature point using the polynomial reconstruction in
the two “containing” cells are $q_L$ and $q_R$ (Fig. 1), then a unique value
$q^*$ is determined from the solution of a locally one-dimensional Riemann
problem with $q_L$ and $q_R$ as the “left” and “right” values. An approximate
Riemann solver suggested by Roe (Ref. 10) is employed for this purpose in the
UNIVERSE-series of codes of which UNIV is a member.

The two approaches alluded to at the beginning of this section differ in their
“reconstruction” procedure and also in the time-stepping scheme. Only explicit
time-stepping schemes are considered in both approaches. Both approaches
permit use of multiple quadrature points and curved surfaces for higher
accuracy. Codes developed using these approaches can handle hexahedral,
tetrahedral and triangular-prismatic cells.

Fig. 1 “Left” and “Right” states for locally one-dimensional Riemann problem.

3.1 The First Approach; a Finite-element Like Algorithm
The major credit for this work goes to Dr. Chakravarthy. This approach employs
a unified treatment for structured and unstructured grids. The codes developed
using this formulation are called UNIVERSE-series of codes. The
UNIVERSE-series includes “least-square” and “ENO” (Ref. 9) reconstruction
options. Both these procedures involve development of an interpolating
polynomial $P^C(x,y,z)$ for each of the conserved quantities, where

$$ P^C(x,y,z) = \sum_{i=0}^{np} p_i^C\, x^{j(i)}\, y^{k(i)}\, z^{l(i)} \tag{3} $$

where $p_i^C$ are the coefficients of the polynomial. $P^C$ is applicable only
within a given cell C. Integral of $P^C$ over C reproduces the corresponding
cell average. That is,

[...]    (4)

where,

[...]    (5)

The spatial accuracy of the numerical scheme is determined by the form of
$P^C$. A linear polynomial in x, y and z results in second-order accuracy
while a quadratic polynomial yields a third-order scheme. Linear polynomial
requires evaluation of 4 coefficients, while the quadratic polynomial requires
10. In the case of the “least-square” option, the polynomial coefficients are
computed such that the integral of $P^C$ over cell C reproduces the
corresponding cell average values (Eqn. 4), and the integrals over the
neighboring cells satisfy the corresponding cell averages in a least-square
sense. That is,

$$ \frac{\partial E}{\partial p_i^C} = 0 $$

for $0 \le i \le np$. The error term E is given by,

[...]

where n refers to a neighboring cell, nc is the number of cells in the
neighborhood of cell C, excluding C itself. Obviously, a least-square
approximation for $P^C$ can be constructed only if

$$ nc \ge np \tag{8} $$

Therefore, the neighborhood of a cell should be properly defined to satisfy
equation (8). The UNIVERSE-series CFD formulation defines a “neighbor” of a
given cell in a very flexible and useful way.

First, we consider two types of cell connectivities (Fig. 2):

(1) Node-aligned cells (NAC)

(2) Surface-aligned cells (SAC)

Fig. 2 Node-aligned and surface-aligned cells.

Next, we consider different types of neighbors:

(1) Touching neighbors (TN)
    These include

    (1a) Common-node neighbor (CNN)

    (1b) Common-face neighbor (CFN)

    (1c) Touching-face neighbor (TFN)

(2) Proximity neighbors (PN)
    This latter type is defined in terms of distance from a given cell.

A neighborhood is now defined to be a collection of neighboring cells. A
neighborhood hierarchy is defined as follows:

$H^0$ is the cell itself.

$H^1$ is the cell and its neighbors.

$H^2$ is the union of $H^1$ and the neighbors of all the cells in $H^1$.

This process may be continued recursively, and depending on the order of
$P^C$, a neighborhood may be found such that equation (8) is satisfied.

In the case of ENO (Essentially NonOscillatory) reconstruction, we seek to
obtain a “best” polynomial rather than a “least-squares” one. The “best”
polynomial corresponds to the “smoothest”. As always, the equation for cell C
must be satisfied (Eqn. 4). From the remaining nc equations, we can select any
combination of np equations and solve the resulting set of np + 1 equations.
There are

$$ \binom{nc}{np} \tag{9} $$

such combinations. The combination that yields the best polynomial in terms of
its ENO property is to be preferred. For example, when the flow field contains
a single shock wave, the neighbors selected should lie on the same side of the
shock as cell C. This approach may be termed the “best stencil” formulation
and has been applied very successfully in various forms to structured grid ENO
formulations. Reference 9 contains many different strategies for this task.
Note that the “least squares” strategy may result in a stencil that includes
cells from both sides of a discontinuity and hence not desirable.

Alternatively, a “best term” strategy has also been tried out. In this
formulation, the least-squares polynomials are first determined for all cells.
Each coefficient $p_i^C$ of the polynomial $P^C$ (Eqn. 3) in a given cell C
corresponds to the appropriate derivative of the polynomial (up to a constant
coefficient) evaluated at the centroid of the cell. That is,

[...]

where the subscript c refers to the centroid and $K_i$ is a constant. In the
case of the “best term” strategy, we replace each $p_i^C$ by that computed
from the corresponding derivative at the cell centroid evaluated from a
neighboring cell polynomial $p_i^n$, provided

[...]

where a > 1. In other words, $p_i^C$ is selected such that the corresponding
derivative at the centroid does not differ “too much” from its value in a
neighborhood. This procedure attempts to construct a reconstruction
polynomial that uses only neighboring cells on the same side of a
discontinuity.

For one-dimensional shock-tube problems, it has often been demonstrated that
it is better to select the best stencils based on comparing interpolates of
local characteristic variables and not the conserved dependent variables.
However, within the context of unstructured grid formulations this approach is
very expensive, and consideration of such issues is postponed for future work.

We have so far discussed the spatial discretization problem. As far as
temporal discretization is concerned, only explicit schemes have been
considered. A second-order time-accurate formulation is given below as an
example. This is fashioned after Heun’s method or the second-order Runge-Kutta
method (RK2). The RK4 method can be implemented in similar fashion. Higher
than second-order spatial accuracy results in reduced numerical dissipation,
and this sometimes necessitates the use of the fourth-order Runge-Kutta
formulation, which has a larger stability range than the second-order
Runge-Kutta method.

In semidiscrete form, the equations to be solved are

[...]

where RHS(q,t) is the net flux. The corresponding time stepping method can be
written as

[...]

The fourth-order accurate Runge-Kutta scheme can be written as

[...]

In the above, the explicit dependence of RHS on t is useful for time-dependent
problems where the boundary conditions or other behavior explicitly depend on
time.

3.2 The Second Approach; Generalized Lax-Wendroff Scheme
This numerical scheme was originally developed under the leadership of Dr.
Shankar (Ref. 11) for solving Maxwell’s equations and later was adapted for
Euler equations. This approach employs a multilevel time stepping scheme. The
second-order scheme uses a two time-level discretization. The first fractional
time-step employs first-order spatial ($q_L$ and $q_R$ are set equal to the
corresponding $\bar{q}^n$) and temporal discretizations to compute [...].
Here, the superscript n refers to the time-level n. For the second time-step,
$q_L$ and $q_R$ are computed from the corresponding centroidal values of
$\bar{q}^n$ and [...] as

[...]

where $\vec{r}_f$ and $\vec{r}_c$ refer to the position vectors of the
centroids of the surface and cell, respectively, and

[...]

where $q^*$ is obtained from the Roe’s approximate Riemann solver with $q_L$
and $q_R$ set equal to $\bar{q}^n$. The algorithm described above may be
considered as a generalization of Lax-Wendroff upwind integration, since it
reduces to the Lax-Wendroff scheme for uniform rectangular hexahedral cells.
Note that only details of the second order scheme are presented, and that
extension to higher order schemes is indeed straight-forward, albeit tedious.

3.3 Computation of Viscous Fluxes
Viscous fluxes at a quadrature point on a cell face are computed as the mean
of the corresponding contributions from the two adjacent cells that share the
face. That is, the average of the derivatives computed from the polynomial
reconstruction in the two adjacent cells are employed in the calculation. When
the quadrature point lies on the boundary of the computational domain, the
polynomial reconstruction employed is centered about the quadrature point.

4. BOUNDARY CONDITIONS
The implementation of boundary conditions ensures consistency in flux
computations. That is, just like in the case of any interior cell boundary,
computation of fluxes for a cell boundary that lies on the boundary of the
computational domain involves determination of “left” and “right” states and
Roe’s approximate Riemann solver. The state that corresponds to the “outside”
of the domain should satisfy the appropriate boundary conditions. For
instance, when computing fluxes for a cell on the left boundary of the domain
where inviscid tangency condition is to be satisfied, the “left” state should
be such that the corresponding velocity vector should be tangential to the
surface. This manner of imposing boundary conditions ensures that only the
information at a boundary that corresponds to waves propagating in to the
computational domain is actually used in the computation of fluxes.

4.1 Interior Boundary Condition
The concept of boundary conditions has been generalized to include
specification of boundary conditions anywhere in the computational domain
(Ref. 12). The part of the boundary condition that does not correspond to the
actual
boundary of the computational grid is referred to as “interior boundary
conditions.” In this case the user specifies, among several attributes, the
coordinate location of each boundary point as well as a vector normal
associated with the point. The need for the normal arises from the fact that
even though interior boundary points are specifiable as individual points,
they arise from boundary surfaces that they are a part of. It is the surface
normal along with its location that describes the local geometry. Note that
the surface in question could very well be a surface of discontinuity (a shock
wave), and it may not be possible to assign unique values for the dependent
variables at the corresponding boundary point. To account for such a
situation, for every interior boundary location identified by the user, two
interior boundary points are created and added to the data base of the UNIV
flow solver. One of the added points has the normal pointing one way and the
second point the other way (Fig. 3).

Fig. 3 Interior boundary points.

The user-specified boundary condition is applied to each pair of interior
boundary points. Certain boundary conditions such as surface tangency are
applied individually to both points of the pair; i.e., they are applied in a
decoupled fashion. Certain boundary conditions such as those associated with
“shock fitting” or “contact-surface fitting” are applied in a coupled fashion.
For example, the values on the supersonic side of the shock are accepted as
is, and the values on the subsonic side are computed (along with the shock
speed value) by accepting only the pressure from the subsonic side and
applying the Rankine-Hugoniot shock-jump relations. The availability of the
boundary points in pairs facilitates such transactions.

The process of computing the “left” and “right” states is modified when a cell
has an interior boundary point. The contribution of each of the boundary
points to the quadrature points is computed using the proportion of the
surface area that is in the region of influence of the boundary point. In Fig.
3, the face AB is completely in the region of influence of boundary point 1,
whereas face CA gets contributions from both the boundary points. Note that
the boundary points actually differ only in the normals associated with them,
and their coordinates are identical. For the sake of clarity, they are shown
as two different points in Fig. 3.

As part of the infrastructure necessary to implement interior boundary point
treatment, one needs the ability to associate with each (pair of) interior
boundary point the cell that contains it. This chore of searching through the
mesh to determine the one cell that contains the boundary point is efficiently
accomplished in the UNIV flow solver using an “octree” sort and search
procedure. Given an interior boundary point, an octree search of the sorted
list of node points of the mesh quickly yields the nearest mesh node. All
cells that contain the node as well as the common-node neighbors of this set
of cells are searched, in that order, to determine if the given point is in
any of those cells. If not, the “nearest” cell is identified.

In the previous paragraphs, it was convenient to describe the procedure as if
the user provides pointwise information related to interior boundaries.
Depending on the relative fineness or coarseness of the geometry description
of the interior surface with respect to the surrounding mesh, there may be two
or more user-specified (before the flow solver replaces each user-specified
point with two points, with the normals facing in opposite directions)
interior boundary points in a cell, or there may be none (Fig. 4). In Fig. 4
the cells 1, 5, 8 and 10 have two or more interior boundary points while cells
4, 7 and 9 have none. The case of multiple interior points in a cell can be
dealt with easily (e.g., by replacing them with an equivalent single point, if
necessary). But, the case of no interior point in a cell that actually
straddles the interior boundary is not acceptable. To avoid such problems, we
start with the user describing the interior surface as an unstructured grid
(triangular elements). Using an octree-based sort and search procedure, the
intersection of the mesh with this surface is identified (Fig. 5). Interior
boundary points are assigned to each such intersection. There could be
interior surface geometry elements that do not participate in such
intersections. The centroids of these elements are optionally added to the
list of interior boundary conditions.

Fig. 4 An example of user specified interior boundary points.
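The containing-cell search of Section 4.1 (nearest mesh node first, then the
cells that contain that node, then their common-node neighbors) can be
illustrated with a minimal Python sketch. It is not the UNIV implementation:
a brute-force nearest-node scan stands in for the octree search, only
tetrahedral cells are handled, and the “nearest” cell fallback is assumed to
mean the candidate with the closest centroid, which the text does not
specify. All names (`locate_cell`, `point_in_tet`, `NODES`, `CELLS`) are
hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two 3-D points."""
    return sum((a[k] - b[k]) ** 2 for k in range(3))


def signed_volume(a, b, c, d):
    """Six times the signed volume of the tetrahedron (a, b, c, d)."""
    ab = [b[k] - a[k] for k in range(3)]
    ac = [c[k] - a[k] for k in range(3)]
    ad = [d[k] - a[k] for k in range(3)]
    return (ab[0] * (ac[1] * ad[2] - ac[2] * ad[1])
            - ab[1] * (ac[0] * ad[2] - ac[2] * ad[0])
            + ab[2] * (ac[0] * ad[1] - ac[1] * ad[0]))


def point_in_tet(p, verts, tol=1e-12):
    """True if p lies inside (or on the boundary of) the tetrahedron."""
    a, b, c, d = verts
    v = signed_volume(a, b, c, d)
    # Replacing one vertex at a time by p gives the four barycentric
    # sub-volumes; p is inside when all of them share the sign of v.
    subs = (signed_volume(p, b, c, d), signed_volume(a, p, c, d),
            signed_volume(a, b, p, d), signed_volume(a, b, c, p))
    return all(s * v >= -tol for s in subs)


def locate_cell(p, nodes, cells):
    """Return (cell index, contained flag) for an interior boundary point p."""
    # Step 1: nearest mesh node (assumption: brute force instead of an octree).
    nearest = min(range(len(nodes)), key=lambda i: dist2(p, nodes[i]))
    # Step 2: all cells that contain the nearest node.
    ring1 = [c for c, conn in enumerate(cells) if nearest in conn]
    # Step 3: common-node neighbors of that set of cells.
    ring1_nodes = {n for c in ring1 for n in cells[c]}
    ring2 = [c for c, conn in enumerate(cells)
             if c not in ring1 and ring1_nodes.intersection(conn)]
    candidates = ring1 + ring2  # searched in that order
    for c in candidates:
        if point_in_tet(p, [nodes[n] for n in cells[c]]):
            return c, True

    # Fallback (assumption): the candidate with the closest centroid.
    def centroid(c):
        pts = [nodes[n] for n in cells[c]]
        return [sum(q[k] for q in pts) / 4.0 for k in range(3)]

    return min(candidates, key=lambda c: dist2(p, centroid(c))), False


# Two tetrahedra sharing the face through nodes 1, 2 and 3.
NODES = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
         (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
CELLS = [(0, 1, 2, 3), (1, 2, 3, 4)]

print(locate_cell((0.1, 0.1, 0.1), NODES, CELLS))
print(locate_cell((0.6, 0.6, 0.6), NODES, CELLS))
```

In a production solver the nearest-node step would be replaced by a spatial
index such as the octree described in the text, and the containment test
would need variants for prismatic and hexahedral cells.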

5. GRID GENERATION
The UNIVERSE-series of codes includes an unstructured grid generator, named
UNIVG. UNIVG accepts specification of surface geometry in the form of a
collection of patches. A patch geometry could be specified either in the IGES
format or by specifying sufficient number of non-intersecting lines on the
patch. Each line in turn is discretized by an ordered collection of sufficient
number of points. Triangular elements are first generated on the boundary of
the computational domain satisfying user specified clustering requirements.
The computational domain is then discretized in the form of tetrahedral cells
using the “advancing front” technique (Ref. 8).

The method of “advancing front” does not possess a good mechanism for
controlling the distribution of cells in the computational domain, and often
regions with large variations in cell sizes and shapes are encountered. Such
regions deteriorate the fidelity of numerical simulation. To overcome this
problem, grid smoothing strategies based on constrained optimization
techniques have been developed and employed successfully (Ref. 13).

UNIVG has the capability to develop an unstructured grid that includes a
specified “cloud” of points as nodes. This is sometimes useful in controlling
the distribution of cells in the computational domain.

6. STORE SEPARATION
The concept of interior boundary conditions, described in section 4.1, is used
to simplify numerical simulation of the store separation problem. The process
starts by generating an unstructured mesh for the parent vehicle. The store
geometry is discretized by generating an unstructured surface mesh. The
intersection of the store surface mesh with the unstructured volume cells of
the parent grid is determined by using an octree sort/search procedure.
Centroids of the intersecting surfaces and their normals are computed. A data
base consisting of these centroids and normals is thus generated and used as
input by the interior boundary condition routines. Note that the boundary
condition for the store accounts for its initial motion. The governing
equations are then solved with appropriate boundary conditions to obtain
solution for the next time level. Aerodynamic forces and moments for the store
are computed, and the equations for the conservation of linear and angular
momenta are solved to obtain the location and orientation of the store at the
next time level. This process also yields velocity vectors at all points on
the store. The intersection of the store geometry in its new location with the
unstructured volume mesh is determined, and the process is repeated for all
subsequent time steps. This approach is not suitable for viscous flows, since
the parent vehicle grid would be too coarse to resolve viscous regions when
the store moves away from the parent vehicle. This problem may be circumvented
by adapting the mesh as the store moves, but at present such a strategy does
not appear attractive due to the large amount of work involved in adapting the
mesh and performing required interpolations that could result in loss of
accuracy.

7. TURBULENCE MODELING
Until recently all the turbulence models employed in numerical simulations
required the knowledge of the normal distances of a point from surrounding
walls. This information is very difficult to obtain in the case of
unstructured grids. In the case of structured grids, mostly distances along
grid lines were employed. This was sufficient since the grid lines were nearly
orthogonal in the vicinity of a body where viscous effects are dominant. But
when complex geometries requiring a multizone grid topology were encountered,
it became difficult to maintain continuity of eddy viscosity at zonal
interfaces. To circumvent this problem, a pointwise turbulence model that does
not require any information regarding the distance of a point from surrounding
walls was developed at Rockwell Science Center by Goldberg and Ramakrishnan
(Ref. 14). Since then, several such models have been developed, and reliable
computation of turbulent flows on structured grids has become a possibility.

8. LESSONS LEARNED
One of the most important lessons that we have learned from our own experience
and the experience of our peers in the CFD and CEM community with unstructured
grid computations is that in spite of all the advances that have been made in
this field so far, on comparable grids structured-grid simulations yield more
accurate solutions. For complex geometries, it is indeed possible to speed up
the preprocessing stage of a numerical simulation by an order of magnitude by
employing unstructured grids. On the other hand, a structured grid computation
requires less CPU time and memory and converges in fewer number of time steps.
For example, in the case of an inviscid flow past a sphere with
$M_\infty = 0.5$, it has not been possible to obtain even two orders of
magnitude drop in the L-2 norm for the net-flux vector in reasonable number of
time steps (less than 1000) for an unstructured grid, while a structured grid
computation on a comparable grid converges by about four orders of magnitude
in less than 600 time steps. Attempts have been made to develop implicit
schemes for unstructured grids, but in our opinion, a structured grid still
performs better as far as convergence and accuracy are concerned.

During the design phase of an aerospace configuration, several possible
candidates are evaluated, and a small number of viable candidates is down
selected from the original pool for further considerations. This process is
usually carried out using a relatively low-level CFD analysis requiring less
stringent accuracy and convergence criteria. Mostly only inviscid flows are
considered. Structured grids are not suitable for this purpose, since the
preprocessing takes an unacceptably long time. Unstructured Euler solvers
offer the most viable solution. Since Euler equations, unlike Navier-Stokes
equations, do not require very fine grids in the vicinity of solid bodies,
unstructured grid development becomes much easier to handle, and several
solutions for many different configurations can be carried out in a matter of
a few weeks. This was indeed demonstrated in the case of some modifications
that were carried out for B-1B bomber. Starting with the geometry of the
aircraft in IGES format, an Euler solution was obtained for this complex
configuration (Fig. 6) in about five working days. With the use of Massively
Parallel Processing (MPP) computers, this process may be accelerated even
more. From this point of view, unstructured grid solvers have a clear edge
over their structured grid counterparts.

Fig. 6 Unstructured grid for inviscid flow past B-1B configuration.

In the case of viscous flows, stringent resolution requirements in the
direction normal to a solid body force an unstructured tetrahedral grid to
have similar resolutions on the body surface in order to maintain acceptable
shapes for the conservation cells in such regions. This results in an
unstructured grid with too many conservation cells in the vicinity of a solid
body. Such a restriction does not exist in the case of a structured grid and
thus makes it more suitable for viscous flows.

Arguably a hybrid grid, consisting of a structured grid in the viscous regions
and unstructured grid elsewhere may be the most suitable way to discretize a
computational domain. But, considering the fact that one of the main reasons
for resorting to an unstructured grid is the difficulty involved in generating
body-conforming structured grids for complex geometries, it is indeed
questionable whether much could be gained from such a strategy. At the
Rockwell Science Center we have been experimenting with a hybrid unstructured
grid consisting of triangular prismatic cells in the viscous regions and
tetrahedral cells elsewhere. This approach seems to be promising.

One aspect of unstructured-grid solvers in which real progress has been made
is the storage requirement. Whereas the structured grid solvers require only
about 30 words of storage per conservation cell, the unstructured grid solvers
used to demand about 200. This situation has been vastly improved, and the
storage requirement has been brought down to a manageable 60 words per
conservation cell.

The concept of “interior” boundary conditions described in section 4.1 is very
promising. It was used successfully in computing the trajectory of a store
released from an F-18. This concept also proved its usefulness in analyzing
the effect of mounting an additional equipment on an aircraft. In this case,
the grid and solution from an earlier computation could be used along with the
geometry of the added equipment to obtain the required information in a timely
manner. The present implementation of this concept has some shortcomings. To
minimize the number of arithmetic operations, several approximations were
introduced. Instead of computing the exact contribution of each face for
updating the interior boundary points, some simple recipes were employed. This
results in communication between the cells that lie on either side of a solid
object. That is, the interior boundary point pairs 1 and 2 in Fig. 3 interact,
resulting in an erroneous interaction between the inside and outside of the
body. It appears that shortcuts may not work, and it may be necessary to
consider the exact geometry of the intersecting surfaces when interior
boundary conditions are encountered. Since this process is very involved, it
may not be acceptable for many problems. Alternative solutions are currently
being investigated.

One burden that we have carried over from structured-grid algorithms to
unstructured grid is the use of locally one-dimensional approximate Riemann
solvers. This conscious introduction of a known problem is due to lack of a
better alternative. Several multidimensional Riemann solvers have been
considered in the structured-grid world without much success. This problem is
accentuated in the case of unstructured grids due to the difficulty in
controlling cell shapes. Of course, when sufficiently fine grids are employed,
Riemann solvers do not play a major role. But, this doesn’t happen in the real
world and hence, at least for now, we do have to reckon with errors that arise
from locally one-dimensional approximate Riemann solvers.

9. CONCLUSIONS
The most important conclusion that we have arrived at from our experiments
with unstructured grid computations is that they offer a chance to prove to
the designers that CFD is a tool not just for analysis, and that it may very
well be a better alternative to existing design tools such as panel methods.
Structured grid solver will still play a major role as an analysis tool and as
a tool for understanding some complex flow features.

REFERENCES

[1] V. Shankar, “A Conservative Full Potential, Implicit, Marching Scheme for
Supersonic Flows”, AIAA Journal, Vol. 20, Nov. 1982, pp. 1508-1514.

[2] S.R. Chakravarthy and K.Y. Szema, “An Euler Solver for Three-Dimensional
Supersonic Flows with Subsonic Pockets”, AIAA Paper No. 85-0243.

[3] S.R. Chakravarthy, “High Resolution Upwind Formulations for the
Navier-Stokes Equations”, VKI Lecture Series in Computational Fluid Dynamics,
1988-05.

[4] S. Palaniswamy, S.R. Chakravarthy and D.K. Ota, “Finite-Rate Chemistry for
USA-Series Codes: Formulation and Applications”, AIAA Paper No. 89-0200, 1989.

[5] Rockwell Science Center CFD Department, “UNIS User Manual”, Version 94.11,
November 1994.

[6] S.V. Ramakrishnan, C.L. Chen, S.R. Chakravarthy and K.Y. Szema, “Numerical
Simulation of Two Opposing High Speed Trains in a Tunnel”, AIAA Paper No.
95-0746, 1995.

[7] D.F. Dominik et al., “Navier-Stokes Solution for the Space Shuttle Vehicle
using High Fidelity Full Scale Grid Model”, AIAA-93-0419, Jan. 1993.

[8] S.R. Chakravarthy, et al., “Computational Fluid Dynamics Capability for
Internally Carried Store Separation”, Technical Report (Phase III), Contract
No. N60530-90-C-0393, Naval Air Warfare Center, Weapons Division, China Lake,
CA 93555-6001, May 1994.

[9] A. Harten and S.R. Chakravarthy, “Multi-Dimensional ENO Schemes for
General Geometry”, UCLA Computational and Applied Mathematics (CAM) Report
91-16.

[10] P.L. Roe, “Approximate Riemann Solvers, Parameter Vectors, and Difference
Schemes”, Journal of Computational Physics, Vol. 43, 1981, pp. 357-372.

[11] C. Rowell, V.V. Shankar, W.F. Hall, A.H. Mohammadian, “Algorithmic
Aspects and Computing Trends in Computational Electromagnetics Using Massively
Parallel Architectures”, Proceedings of First International Conference on
Algorithms and Architectures for Parallel Processing, Brisbane, Australia,
19-21 April, 1995, Volume 1, editor V. L. Narasimhan.

[12] S.R. Chakravarthy, K.Y. Szema and S.V. Ramakrishnan, “Unification of
Exterior and Interior Boundary Conditions for Inviscid Computational Fluid
Dynamics”, presented at the 5th International Symposium on Computational Fluid
Dynamics, Sendai International Center, Japan, Aug. 31-Sep. 3, 1993.

[13] C.L. Chen, K.Y. Szema and S.R. Chakravarthy, “Optimization on
Unstructured Grid”, AIAA Paper No. 95-0217, 1995.

[14] U.C. Goldberg and S.V. Ramakrishnan, “A Pointwise Version of
Baldwin-Barth Turbulence Model”, International Journal of Computational Fluid
Dynamics, Vol. 1, Dec. 1993.

A Second-Order Finite-Volume Scheme Solving Euler and Navier-Stokes Equations on Unstructured Adaptive Grids with an Implicit Acceleration Procedure

M. Delanaye *
Ph. Geuzaine †
J.A. Essers ‡
P. Rogiest §

Aerodynamics group, Institute of Mechanics and Aeronautics (C3)
The University of Liège
rue Ernest Solvay 21
B-4000 Liège, Belgium

* Research Assistant
† F.R.I.A. Research Assistant
‡ Professor
§ F.N.R.S. Research Assistant

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms” held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

1. SUMMARY

In this paper, recent advances in the development of a new quadratic reconstruction finite-volume scheme for unstructured polygonal meshes are presented. The scheme is used to discretize the two-dimensional compressible Euler and full Navier-Stokes equations. The quadratic reconstruction is shown to lead to a full second-order accurate discretization of the advective derivatives. The accuracy of the scheme is very weakly dependent on grid distortions, a property which is very attractive for adaptive unstructured grid computations. The pseudo-time integration of the equations is performed by an implicit scheme based on Newton-Krylov techniques. The linear system that arises from the Newton linearization is solved by the GMRES algorithm. The incomplete LU factorization is employed for the system preconditioning. The accuracy, efficiency and robustness of the method are demonstrated on various classical test cases respectively corresponding to inviscid and viscous laminar flows.

2. LIST OF SYMBOLS

$s$    conservative variables
$u$    any flow variable
$f$    advective fluxes
$F$    viscous fluxes
(symbol not legible)    edge normal components
$\Omega$    area of control volume
(symbol not legible)    contour of control volume
(symbol not legible)    length of edge k
$\mathcal{H}$    Hessian matrix
(symbol not legible)    cell gravity center
$r$    position vector
$\Gamma$    volume surrounding a cell and bounded by its neighbors
$\Delta u_i$    $u_i - u_0$
(symbol not legible)    quadratic reconstruction
(symbol not legible)    linear reconstruction
(symbol not legible)    constant reconstruction
ROE    Roe's flux difference splitting
VL    Van Leer's flux vector splitting
$\sigma$    discontinuity detector
$\Delta t$    timestep
$h$    local characteristic mesh size

3. INTRODUCTION

During this last decade, many investigations have been carried out to develop efficient numerical techniques for solving the compressible Navier-Stokes equations for complex geometries. Unstructured meshes turn out to be a useful tool to generate grids around general configurations, and offer the powerful capability of adaptation to local flow features. Nevertheless, the inherent distortions present in unstructured grids cause the classical schemes used for structured meshes to be mostly inefficient for computing accurate solutions.

In 1990, Barth and Frederickson proposed the concept of high-order polynomial reconstruction, also named k-exact reconstruction schemes. Depending on a discrete set of cell values, a high-order cell-by-cell reconstruction of the flow variables is performed. A high-order Gauss quadrature coupled with an approximate Riemann solver is used to evaluate the flux balance integrals. This approach essentially corresponds to the generalization of the Godunov method to high-order schemes on any type of meshes. It was initially designed for Essentially Non Oscillatory schemes [2,3]. However, the application of the latter to steady state computations proved to be difficult due to the large computational time requirement and to some convergence problems [4]. Barth developed a quadratic reconstruction with a fixed support stencil in the frame of a cell-vertex finite-volume scheme. At the same time, Essers et al. [6,7] proposed a non fully conservative finite-volume scheme for structured meshes which also preserved quadratic polynomials. Despite its lack of conservativity, the accuracy and the robustness of the method were demonstrated by the computations of viscous flows on very distorted meshes.

In the frame of the present research [8-10], we contribute to the work of these authors by developing a robust and accurate scheme. The accuracy of the scheme is determined by the use of an original quadratic reconstruction of the flow variables and a second-order Gauss quadrature within a cell-centered finite-volume solver. The quadratic reconstruction can be interpreted as a higher-order generalization of the robust Green-Gauss linear reconstruction [11,12] widely employed in unstructured solvers. The method is designed to deal with grids that contain general polygonal cells with an arbitrary number of edges. It provides a high flexibility. Hybrid grids, which are of high interest when solving viscous flows, can be used. Adaptation by h-refinement and coarsening is employed. The second-order accuracy of the scheme with respect to the discretized advective derivatives can be demonstrated, and is achieved regardless of the amount of grid distortions. The viscous term is discretized by using a central approximation. The monotonicity of the solution is guaranteed by using a discontinuity detector that switches the scheme to a constant reconstruction.

For large problems, convergence rates obtained by explicit methods (Runge-Kutta), even with some acceleration techniques such as local time-stepping or residual averaging, finally remain insufficient. The speed-up of the convergence can be achieved by employing a multigrid approach and/or implicit schemes [13,14]. Venkatakrishnan and Barth tested a fully implicit scheme for unstructured meshes, wherein the system arising from the Newton's linearization was solved by direct methods. However, that attempt showed that direct methods, despite their robustness, are plagued by extremely prohibitive memory and computational requirements. As an alternative, iterative implicit solvers have been studied by many authors [16-18]. The Newton-Krylov methods have turned out to be really successful for a broad class of problems. Within the frame of the Inexact Newton methods [?], iterative solvers based on Krylov subspace generation are employed to approximately solve the linear system that arises from the Newton linearization. Among others, the Generalized Minimum Residual (GMRES) algorithm of Saad and Schultz [22] has proved to be very efficient thanks to its robustness. We employ it in its finite-difference version that has the major advantage not to require the storage of the system Jacobian. To accelerate the convergence of conjugate-gradient like algorithms, preconditioning is highly recommended for clustering the eigenvalues of the matrix [?]. For that purpose, we use the incomplete LU factorization with no fill-in.

4. FINITE VOLUME DISCRETIZATION

Consider a finite-volume discretization of the Navier-Stokes equations onto a set of polygonal cells whose number of edges $N_i$ can be arbitrary:

[equation (1) not legible in the source scan]

Within the frame of the cell-centered variant of the finite-volume method, we associate to each polygonal cell a set of conservative variables ($s_i$) which refer to the unknowns at the cell gravity center (node). A second-order spatial discretization of the time derivatives of (1) can therefore be achieved without requiring any artifice such as for example the mass lumping implicitly present in the cell-vertex technique. The Navier-Stokes equations reduce to the following semi-discretized conservative system of the non-linear equations:

$$ \frac{d s_i}{dt} + \frac{1}{\Omega_i} \sum_{k=1}^{N_i} \left( f_k + F_k \right) = 0 \tag{2} $$

where $f_k$ and $F_k$ are the advective and viscous fluxes through edge k.

Fig. 1: Control volume, quadrature points

5. ADVECTIVE DERIVATIVES

Obviously the accuracy of the scheme (2) is essentially dependent on the numerical integration of the non-linear advective flux along the mesh edges. Two steps follow:

• First, a reconstruction phase reconstructs the flow variables in the cell from the discrete values at the neighboring cell gravity centers.

• Secondly, a high-order Gauss quadrature integrates the upwind numerical flux computed by a Riemann solver.

5.1 Preliminary note on the order of accuracy

Various definitions of the accuracy of a scheme exist in the CFD community. In this paper, we use a definition which is usual in the finite-difference community, i.e. we refer to the accuracy obtained in the evaluation of the advective and diffusive derivatives appearing in the equations. Hence, second-order accuracy on first order derivatives (like for the advective part of the Euler equations) means that the error on these first order derivatives for any sufficiently smooth function $u$ should decrease quadratically when the mesh is refined similarly in all space directions. The dominant term of the truncation error is therefore proportional to third-order derivatives times the square of a local characteristic mesh size:

$$ \text{T.E.} \simeq K'\, h^{2}\, \partial^{3} u \tag{3} $$

with $K'$ a constant.

A scheme discretizing advective derivatives is considered as second-order accurate if it leads to an exact evaluation of the latter when the corresponding flux vectors are any quadratic (or linear) function of the Cartesian coordinates of the physical domain. Indeed, in that case, the third and higher order derivatives appearing in the truncation error (3) obviously vanish.

In the literature, the most frequently employed cell reconstruction uses a representation of the flow variables based on linear polynomials. Clearly, it can only evaluate exactly a linear

function, but not a quadratic function. This means that the dominant term of the truncation error involves a second order derivative and can be written:

$$ \text{T.E.} \simeq K''\, h\, \partial^{2} u \tag{4} $$

with $K'' \neq 0$ a constant.

The resulting scheme is therefore first-order accurate only. However, when the mesh is sufficiently regular, it can be demonstrated that $K''$ is equal to 0, and the second-order accuracy is recovered. Similarly, for irregular meshes, the dominant error on the advective derivative computed by a constant reconstruction involves a first order derivative with a coefficient that does not tend to zero with the mesh size. For these meshes, the constant reconstruction is thus inconsistent, but consistency (i.e. first-order accuracy) is recovered on regular grids.

By extending the linear reconstruction to a quadratic reconstruction defined as a third-order truncated Taylor series expansion of the variables around the cell gravity center, a quadratic function can be reconstructed exactly provided that the numerical gradient and Hessian matrix are respectively computed with a second-order and a first-order accuracy at least:

$$ u_{rec}(r) = u_0 + \Delta r^T \nabla u_0 + \tfrac{1}{2}\, \Delta r^T \mathcal{H}_0\, \Delta r \tag{5} $$

with $\Delta r = r - r_0$, and $u$ is any flow variable.

Therefore, if the numerical integration of the numerical flux is sufficiently accurate, a second-order accurate scheme is obtained without any assumption about the grid regularity. This property is quite attractive when using unstructured grids which are usually very irregular.

The calculation of the flux $f_k$ (see eq. (2)) through each edge is performed by a high-order numerical integration of the flux functions using the n-points Gauss quadrature:

$$ f_k \simeq \sum_{j=1}^{n} w_j\, f(x_j, y_j) \tag{6} $$

where $(x_j, y_j)$ are the coordinates of the Gauss quadrature point j, and $w_j$ denotes the weight associated with this point.

By using n quadrature points, the formula (6) allows the exact integration of polynomials with degree 2n - 1 at most. To meet the second-order accuracy requirement described in section 5.1, at least two quadrature points are needed in order to compute exactly the flux integral of a quadratic polynomial of the Cartesian coordinates. Schemes employing linear reconstruction only necessitate one quadrature point (located at the mid-point of the edge), but they are usually first-order accurate only, as already mentioned above. Essers et al. however proved that a one quadrature point integration can produce a full second-order scheme even for very irregular meshes. This accuracy can only be recovered by applying a non conservative correction to the scheme, which definitely constitutes a drawback with respect to the present method.

A Riemann solver such as the Roe's flux difference splitting or the Van Leer's flux vector splitting is employed to compute the upwind numerical flux at each quadrature point.

5.2 Reconstruction phase

5.2.1 Fixed support stencil

Contrary to the Essentially Non Oscillatory schemes, the stencil that supports the reconstruction is fixed during the iterations. The major advantage is an important saving of CPU time, but an accuracy deterioration occurs in the vicinity of discontinuities. Indeed, to satisfy some monotonicity requirements in shocks or other regions with strong flow gradients, the scheme should locally reduce to a constant reconstruction whatever the methodology employed: TVD [23], LED [24] or hybrid schemes [25].

5.2.2 A classical linear reconstruction

By dropping off the quadratic terms of (5), a linear reconstruction is obtained which in fact corresponds to the extension of the classical finite-difference Fromm scheme to multiple dimensions:

$$ u_{rec}(r) = u_0 + \Delta r^T \nabla u_0 \tag{7} $$

The gradient at the cell gravity center is computed by the well-known robust Green-Gauss reconstruction widely employed in unstructured grid solvers [5,12]. The Green-Gauss theorem is applied to compute the averaged gradient of $u$ over the surface of a bounding control volume:

$$ \nabla u_0 \simeq \frac{1}{\Gamma} \oint_{\partial \Gamma} u\, n\, dl \tag{8} $$

where $\Gamma$ denotes a bounding volume surrounding $\Omega$ and is defined by the neighbors of $\Omega$ (fig. 1). The integral in (8) is discretized by a summation of the contributions of the linear segments of $\partial \Gamma$ obtained from the trapezoidal rule. It leads to a linear combination of the values of the neighboring nodes:

$$ \nabla u_0 = D_1\, \Delta u \tag{9} $$

with: $\Delta u = \left[\, u_1 - u_0,\ \ldots,\ u_{N_a} - u_0 \,\right]^T$

$N_a$ is the number of neighbors of $\Omega$, i.e. the cells connected to $\Omega$ by at least a common edge or a common vertex. $D_1$ is a $2 \times N_a$ matrix with constant coefficients.

5.2.3 Extension of the linear to the quadratic reconstruction

By using a Taylor series expansion of $u$ around node 0 in (9), the truncation error $E$ corresponding to formula (9) can be expressed as:

$$ E = E_\Gamma\, \nabla^2 u_0 \tag{10} $$

with: $\nabla^2 u_0 = \left[\, \partial^2_{xx} u_0,\ \partial^2_{yy} u_0,\ \partial^2_{xy} u_0 \,\right]^T$

Note that $E_\Gamma$ is a $2 \times 3$ matrix containing constant coefficients of $\mathcal{O}(h)$. For arbitrary meshes, second-order accuracy is nevertheless recovered by subtracting $E$ from the right-hand side of (9):

$$ \nabla u_0 = D_1\, \Delta u - E_\Gamma\, \nabla^2 u_0 \tag{11} $$
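As a rough numerical check of formulas (9) to (11), the sketch below builds a Green-Gauss operator from the trapezoidal rule on the polygon joining the neighbors, obtains the error matrix by applying that operator to the quadratic Taylor terms, and verifies that the corrected gradient (11) is exact for a quadratic function on an irregular stencil. The function names `green_gauss_operator` and `taylor_quadratic_matrix`, the six-point test stencil, and the assumption that the neighbors are ordered counter-clockwise are illustrative choices that do not come from the paper.

```python
import numpy as np

def green_gauss_operator(dr):
    """Return D1 (2 x Na) such that grad(u)_0 ~ D1 @ (u_i - u_0).

    dr: (Na, 2) array of offsets r_i - r_0, assumed ordered
    counter-clockwise so that the neighbors form a simple polygon.
    """
    na = len(dr)
    nxt = np.roll(dr, -1, axis=0)
    # Shoelace formula for the area enclosed by the neighbor polygon.
    area = 0.5 * np.sum(dr[:, 0] * nxt[:, 1] - nxt[:, 0] * dr[:, 1])
    d1 = np.zeros((2, na))
    for a in range(na):
        b = (a + 1) % na
        ex, ey = dr[b] - dr[a]
        normal = np.array([ey, -ex])      # outward normal times segment length
        d1[:, a] += 0.5 * normal / area   # trapezoidal rule on segment a-b
        d1[:, b] += 0.5 * normal / area
    return d1

def taylor_quadratic_matrix(dr):
    """Rows [dx^2/2, dy^2/2, dx*dy], matching the order [u_xx, u_yy, u_xy]."""
    dx, dy = dr[:, 0], dr[:, 1]
    return np.column_stack([0.5 * dx**2, 0.5 * dy**2, dx * dy])

# Hypothetical irregular stencil: six neighbors at uneven angles and radii.
angles = np.array([0.1, 1.0, 1.9, 3.2, 4.0, 5.3])
radii = np.array([1.0, 0.6, 1.3, 0.8, 1.1, 0.7])
dr = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

d1 = green_gauss_operator(dr)
e_gamma = d1 @ taylor_quadratic_matrix(dr)   # 2 x 3 error matrix of eq. (10)

# Quadratic test function with known gradient and second derivatives at node 0.
grad_exact = np.array([0.7, -1.2])
hess = np.array([2.0, -0.5, 1.5])            # [u_xx, u_yy, u_xy]
du = dr @ grad_exact + taylor_quadratic_matrix(dr) @ hess

grad_linear = d1 @ du                        # eq. (9): polluted by E
grad_corrected = d1 @ du - e_gamma @ hess    # eq. (11): exact for quadratics

print("exact gradient    :", grad_exact)
print("Green-Gauss (9)   :", grad_linear)
print("corrected (11)    :", grad_corrected)
```

In this check the exact second derivatives are supplied directly; in the scheme itself they are not known in advance, so the sketch only demonstrates that subtracting the error term restores exactness once sufficiently accurate second derivatives are available.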

This second-order numerical gradient does indeed depend on some sufficiently accurate (first-order at least) second-order derivatives. By replacing (11) in (5), we obtain a quadratic reconstruction for which the only unavailable coefficients are the second-order derivatives of $u$:

$$ u_{rec}(r) = \underbrace{u_0 + \Delta r^T D_1\, \Delta u}_{\text{linear part}} + \underbrace{\tfrac{1}{2}\, \Delta r^T \mathcal{H}_0\, \Delta r - \Delta r^T E_\Gamma\, \nabla^2 u_0}_{\text{quadratic part}} \tag{12} $$

The second order derivatives are computed by a technique sometimes referred to as the minimum-energy reconstruction [5,26]. It simply consists in fitting the cell quadratic polynomial $u_{rec}$ to the values of the neighboring nodes. The following functional is minimized with respect to $\nabla^2 u_0$:

$$ \sum_{i=1}^{N_a} \left[ u_{rec}(r_i) - u_i \right]^2 \tag{13} $$

which is equivalent to solving in the least square sense the following linear system of $N_a$ equations and 3 unknowns:

$$ \left( A_2 - A_1 E_\Gamma \right) \nabla^2 u_0 = \left( I - A_1 D_1 \right) \Delta u \tag{14} $$

where, from (12), row i of the $N_a \times 2$ matrix $A_1$ is $\Delta r_i^T$ and row i of the $N_a \times 3$ matrix $A_2$ is $\left[ \tfrac{1}{2}\Delta x_i^2,\ \tfrac{1}{2}\Delta y_i^2,\ \Delta x_i \Delta y_i \right]$.

The normal equations are non singular provided that the stencil contains at least 6 nodes. Although this condition is generally fulfilled for interior nodes, it may not be for boundary nodes. In that case, the stencil must be enlarged by incorporating the neighbors of the neighbors sharing an edge with the concerned node. The solution of (14) corresponds to a first-order approximation of the second order derivatives which can be expressed as a linear combination of the nodal values:

$$ \nabla^2 u_0 = D_2\, \Delta u \tag{15} $$

with $D_2$ a $3 \times N_a$ matrix with constant coefficients:

$$ D_2 = \left[ \left( A_2 - A_1 E_\Gamma \right)^T \left( A_2 - A_1 E_\Gamma \right) \right]^{-1} \left( A_2 - A_1 E_\Gamma \right)^T \left( I - A_1 D_1 \right) $$

All the matrices involved in the reconstruction, $E_\Gamma$, $D_1$ and $D_2$, are preprocessed and stored.

Note that the variables $u$ are actually the primitive variables. As a result, the gradients of the primitive variables are directly available for the computations of the viscous fluxes. For inviscid flows, the conservative variables could however be used as well.

5.3 Monotonicity of the reconstruction

High-order schemes produce oscillations in the vicinity of discontinuities. That problem can be overcome in two different ways: either by selecting another stencil for the reconstruction which does not involve the discontinuity (ENO schemes [?]), or by modifying the reconstruction within the same stencil (TVD schemes [23]).

The design of multidimensional limiters has been introduced by Barth and Jespersen [11]. However, as shown by Venkatakrishnan [28], such limiters may severely hamper the convergence to the steady state. This problem is still more dramatic when employing implicit schemes with large CFL numbers. Venkatakrishnan [28] proposed some modifications to the limiter, and obtained convergence at the price of the evaluation of an additional constant.

We employ another approach by using the rather old idea of hybrid schemes [25], however applied to the reconstruction. The quadratic reconstruction is switched to a monotone constant reconstruction in the vicinity of discontinuities, while in "smooth flow regions" it remains unaltered.

This is easily achieved with the formulation (12), but requires a discontinuity detector, that is taken of the form:

[equation (16) not legible in the source scan]

$u$ is the pressure and/or the velocity norm. The complete form of the term that acts as a filter is given in reference [?].

Formula (16) is an extension of the error indicator developed by Lohner [29] for transient finite-element computations.

By construction $\tilde{\sigma}_0$ is always bounded by 1, and provided a threshold value $\beta$, a discrete discontinuity detector $\sigma_0$ can be defined at each node 0:

$$ \begin{cases} \text{if } \tilde{\sigma}_0 < \beta \ \Rightarrow\ \sigma_0 = 1 \\ \text{if } \tilde{\sigma}_0 > \beta \ \Rightarrow\ \sigma_0 = 0 \end{cases} \tag{17} $$

where $\beta$ is usually chosen close to 0.2, and turns out to be relatively case independent.

The quadratic reconstruction (5) is finally modified as follows:

$$ u_{rec}(r) = u_0 + \sigma_0 \left[ \Delta r^T \nabla u_0 + \tfrac{1}{2}\, \Delta r^T \mathcal{H}_0\, \Delta r \right] \tag{18} $$

$\sigma_0$ is computed once at the beginning of each Newton's iteration. Unfortunately, the detector is sometimes found to fluctuate at some nodes usually located in the neighborhood of the discontinuities. At these locations, it may indeed appear difficult to decide whether the cells involve the discontinuity or not. However, after a sufficient residual decay, the iterative process can be supposed to be close enough to the solution, and the detector is frozen everywhere for the rest of the convergence. A similar but more complex strategy, which has not been tested in this paper yet, can be found in reference [?].

6. DIFFUSIVE DERIVATIVES

A centered discretization is used to discretize the viscous terms of the Navier-Stokes equations. The viscous flux is estimated

at one quadrature point located at the mid-edge, which requires the evaluation of the gradients of the primitive variables at the nodes. As pointed out in section 5.2.3, these gradients have been previously computed with a second-order accuracy during the quadratic reconstruction phase. They are obtained at the quadrature point by using a linear interpolation between the left (L) and the right (R) neighbors of the edge. Strictly speaking, that procedure is only valid if the mid-edge point lies on the line joining the left and right neighbors. If it does not, the following modified interpolation formula has to be used:

[equation (19) not legible in the source scan]

where P is the quadrature point, Q the projection of P on [remainder of the sentence not legible in the source scan]

The accuracy of that discretization is restricted to first-order for arbitrary meshes, but remains second-order when the mesh is sufficiently regular. That accuracy limitation is however not too restrictive because the truncation error is multiplied by a usually small factor (i.e. the inverse of the freestream Reynolds number).

7. IMPLICIT INTEGRATION SCHEME

7.1 Newton's method

As our purpose is to compute steady state solutions of the Euler and Navier-Stokes equations, Newton's method could be directly applied to the steady state equations. However, a well-known drawback of the Newton's method is the need for a sufficiently "close initial guess" in order to guarantee convergence. A common approach to bypass that problem is to consider the unsteady equations and to march in time. Because fully implicit schemes are known to be unconditionally stable, the time step $\Delta t$ is allowed to increase and finally tend to infinity during the time-marching in order to permit quadratic convergence when approaching the solution. The time-marching strategy can also be interpreted as the addition of an extra $1/\Delta t$ term to the diagonal of the Jacobian of the steady equations in order to increase its magnitude, which in fact corresponds to an under-relaxation procedure.

An Euler-backward time-stepping is employed to discretize the time derivative of the equations (1). A Newton-Raphson iterative process is performed at each time step $l$ to find the solution $S^{l+1}$ of the following system of non-linear equations:

$$ \mathcal{F}(S^{l+1}) = \frac{S^{l+1} - S^{l}}{\Delta t} + R(S^{l+1}) = 0 \tag{20} $$

The operator $R(Q)$ corresponds to the discretization of the spatial derivatives of the Euler and Navier-Stokes equations described in the previous sections. Finally, the whole iterative process can be summarized in two loops:

    For l = 0, 1, ... until convergence do:
        Set $Q^{(0)} = S^{l}$
        For n = 0, 1, ... until convergence do:
            Solve $J(Q^{(n)})\, \delta Q^{(n)} = -\mathcal{F}(Q^{(n)})$    (21)
            Set $Q^{(n+1)} = Q^{(n)} + \delta Q^{(n)}$
        Update $S^{l+1} = Q^{(n+1)}$

where $J(Q) = \dfrac{I}{\Delta t} + \dfrac{\partial R}{\partial Q}$ is the Jacobian of $\mathcal{F}$.

As pointed out by Kuffer [30], deciding when the Newton loop has to be stopped is not easy. A large residual decrease is not always required, since it necessitates many inner iterations and then costs a lot of computational time. Except for unsteady flow computations for which equation (20) must be solved accurately, many authors usually limit the number of inner iterations to one (n = 0). The resulting descent direction is in fact usually accurate enough to decrease the residual satisfactorily. As the time step increases to infinity, the iterative time-marching scheme tends to a Newton-Raphson linearization of the steady state equations. Restricted to one inner loop iteration, the iterative process (21) becomes:

    For l = 0, 1, ... until convergence do:
        Solve $J(S^{l})\, \delta S = -R(S^{l})$
        Update $S^{l+1} = S^{l} + \delta S$

7.2 Inexact Newton's method

Most of the computational time required by a Newton algorithm is essentially devoted to the evaluation of the Jacobian and to the solution of a linear system. The exact solution of that system is in most cases not justified when the iterate is far from the solution. It seems to be quite reasonable to solve it approximately by using an iterative solver, which of course saves a lot of CPU time with respect to direct methods. Such a method is referred to as an inexact Newton method. As shown by Dembo et al. [31], the residual on the linear system must however verify the following rule in order to preserve the quadratic convergence:

[equation (22) not legible in the source scan]

where $r_n \equiv J(Q^{(n)})\, \delta Q^{(n)} + \mathcal{F}(Q^{(n)})$, $\|\cdot\|$ denotes any arbitrary norm in $R^n$, $c$ is a constant and $\mathcal{F}$ is supposed to be suitably scaled. In our code, we use c = 0.5 and z = 0.5.

The choice of the iterative solver is obviously essential. These solvers can be grouped in two sets: stationary iterative methods (Jacobi, Gauss-Seidel, SOR, etc.), and non-stationary iterative methods (Krylov subspace algorithms, Chebyshev iteration, etc.). This last set differs from stationary methods in that the computations involve information that changes at each iteration. Iterative solvers such as Jacobi and Gauss-Seidel are attractive because they are simple and easy to vectorize. But their main drawback is that their robustness and convergence are only ensured when the matrix exhibits large diagonal terms. Unfortunately, due to the stiff problems generated by the Euler and Navier-Stokes equations, a large diagonal for the system Jacobian requires an important under-relaxation, which indeed destroys the Newton's quadratic convergence. On the contrary, Krylov subspace methods such as the conjugate gradient are known to solve complex problems in a finite number of iterations. Many algorithms derived from the conjugate gradient have been developed to deal with a broad range of problems. Among others, the Generalized Minimum

Residual (GMRES) of Saad and Schultz [22] is designed to solve non-symmetric linear systems.

7.3 The finite-difference GMRES algorithm

The GMRES is a projection method for solving a linear system

$$ A x = b \tag{23} $$

that seeks an approximate solution $x_m$ from an affine subspace $x_0 + K_m$ of dimension m by imposing the Petrov-Galerkin condition:

$$ b - A x_m \perp A K_m \tag{24} $$

$x_0$ represents an initial guess to the solution. $K_m$ is the Krylov subspace of dimension m:

$$ K_m(A, r_0) = \text{span}\{ r_0,\ A r_0,\ A^2 r_0,\ \ldots,\ A^{m-1} r_0 \} \tag{25} $$

with $r_0 = b - A x_0$.

In other words, the GMRES process successively builds an approximate solution $x_m$ at each iteration m so that $x_m$ is orthogonal to the previous search directions in the metric of the matrix A, and minimizes the residual $r_m$ in L2 norm. More details about the implementation can be found in reference [32]. In our code, a block variant of the basic algorithm with restart is used. The whole problem is indeed considered as an n × n system of 4 × 4 block matrices.

Typically, a Krylov solver such as GMRES does not require the calculation of the Jacobian $J(Q)$ but only necessitates the computation of a matrix vector product:

$$ J(Q)\, p \tag{26} $$

where $p$ denotes any vector.

For nonlinear equations, this action can be approximated by a finite-difference quotient of the form:

$$ J(Q)\, p \simeq \frac{\mathcal{F}(Q + \varepsilon p) - \mathcal{F}(Q)}{\varepsilon} \tag{27} $$

An analysis of the convergence of what is referred to as the inexact Newton/finite-difference projection methods is given by Brown [20]. The interesting feature of equation (27) is that the calculation and the storage of the Jacobian are not required. Indeed, the computation of the Jacobians of the advective and diffusive fluxes may be very complicated, and the exact Jacobian of the Roe's flux difference splitting is very expensive to compute. Furthermore, the introduction of turbulence modelling in the frame of future developments will also lead to difficulties for deriving Jacobians. The stencil of the quadratic reconstruction usually involves an average of 9 to 13 cells. Therefore, with one 4 × 4 block per stencil cell, the required storage should amount to between 144 and 208 words per cell, which is quite expensive.

A proper choice of the parameter $\varepsilon$ in (27) is given by the analysis of Dennis and Schnabel [33]:

$$ \varepsilon\, \| p \| = \sqrt{\eta} $$

where $\eta$ is the machine zero and $\|\cdot\|$ represents the RMS norm.

7.4 Preconditioning

The convergence of Krylov solvers is very dependent on the eigenvalues of the matrix. To accelerate the convergence, the use of a preconditioner that clusters the eigenvalues is strongly recommended. The preconditioner should be as close as possible to the inverse of the matrix. In practice, it should allow a fast linear system resolution. Preconditioners based on stationary methods (diagonal preconditioner or Jacobi, Gauss-Seidel, SOR) have been widely employed. A comparison of different preconditioning techniques can be found in Orkwis et al. [34], and Venkatakrishnan et al. [14]. We use the incomplete LU decomposition (ILU) [?], which has been demonstrated to be a very efficient preconditioning strategy. It is generally employed in its simpler version named ILU(0), for which no fill-in is permitted during the LU decomposition. In other words, the non zero elements of the preconditioner are located at the same location as those of the initial matrix. This has the advantage of a fixed and minimum memory requirement. However, this decomposition can turn out to be too weak for stiff problems. We have developed and actually use a block version of the ILU(0).

The choice between right or left preconditioning is of importance. The use of right preconditioning is beneficial. Indeed, when using left preconditioning, all the residual vectors and their norms correspond to preconditioned and thus scaled residuals. Hence, it could be difficult to know whether the algorithm needs to be stopped. On the contrary, right preconditioning allows the use of the actual residuals.

Although the Jacobian matrix does not need to be formed in the GMRES algorithm, an approximate form of it is however still required for the preconditioning. The support stencil of the quadratic reconstruction is large; it is therefore prohibitive to take all the neighbors into account. As suggested by many authors, an approximate Jacobian may be computed by using a constant reconstruction which only depends on the edge-neighbors or distance-one neighbors. In order to minimize the bandwidth and thus the fill-in of the decomposition, a reverse Cuthill-McKee ordering is performed in a preprocessing step.

In most of the results presented in this paper, the Roe's flux difference splitting is employed. It is quite a complex and expensive task to derive analytically even an approximate form of the Jacobian of the latter scheme. One alternative is to use the easily available Jacobian of the Van Leer's scheme in the preconditioner, which costs 2 to 3 times less computational time than the Roe's scheme Jacobian. A comparison between both preconditioners is addressed in the section devoted to the presentation of the results.

It should be mentioned that up to now the contribution of the viscous flux Jacobian is not introduced in the preconditioner.

7.5 Time step increment control

As explained in the previous section, the Newton's method is implemented in a time-stepping form. The evolution in time is monitored by the time step. During the time-marching, the time step is increased to infinity in order to ultimately achieve the Newton's quadratic convergence. Like many authors, this

is performed by employing an empirical formula in which the CFL number varies according to the inverse of a residual norm:

[equation not legible in the source scan]

There indeed subsist two different parameters to tune in order to optimize the convergence rate: the initial CFL number and the exponent p. Typical values of the latter parameters are: $CFL_0 = 10$ and p = 0.5.

8. BOUNDARY CONDITIONS

The treatment of the boundary conditions has a strong influence on the convergence of an implicit scheme. For inviscid flow computations, we use a very convenient procedure, which consists in imposing the boundary conditions in a weak manner via the modification of the advective flux through the boundary edges. Hence, according to the boundary type, some of the flow variables are imposed at the quadrature points of the edges, and others are computed from their values at interior nodes using extrapolation formulas similar to those used to evaluate left and right values at the quadrature points of inner edges. For viscous flow computations, inlet and outlet boundaries are treated in a similar way as inviscid boundary conditions. At the solid walls, the viscous flux is modified in order to impose the no-slip boundary condition:

[equation not legible in the source scan]

That method however turns out to be generally too weak to correctly satisfy the no-slip condition. Two additional procedures have been tested. The first corresponds to the introduction of dummy nodes in the stencil of the boundary cells. These dummy nodes are located at the mid-point of boundary edges. The flow variables at these nodes are extrapolated or imposed by the no-slip boundary condition before each evaluation of

number of points involved in the mesh. The edge data structure employed in the code and the relative insensitivity of the accuracy of the numerical scheme to grid distortions allow the use of very general polygonal cells, and as a result of somewhat distorted meshes. We developed a very general adaptation strategy based on mesh enrichment and coarsening. The method is based on an error indicator of the form (16). Cells whose error indicator lies above a preset threshold are candidates for refinement, while others whose error indicator lies under another preset value are to be possibly coarsened. The refinement strategy, which is implemented for any type of polygons, is described in reference [?]. In particular, triangles and quadrangles can be divided anisotropically depending on the value of an anisotropy sensor based on some standard deviations of the gradients of a flow parameter computed in the directions pointing to the different neighbors of the cell. Two types of coarsening procedures are considered. The first one is based on the refinement history. A tree containing the information between successive meshes is updated during the refinements. It is then rather easy to delete "son" cells and to recover the "parent". The second method is more general and coarsens the grid by deleting vertices and recombining others to build larger polygons.

10. RESULTS

10.1 Subsonic sine-bump

The effect of the various reconstructions (quadratic - linear - constant) has first been tested by computing the inviscid subsonic flow ($M_\infty = 0.5$) in a channel perturbed by a sine bump with a mesh of 1294 cells (fig. 2a). The geometry is defined as follows:

Lower wall:
    -0.7 < x < 0 :  y = 0
    0 < x < 1 :  y = 0.05 [1 + sin(2πx - π/2)]
    1 < x < 1.7 :  y = 0

Upper wall:
    -0.7 < x < 1.7 :  y = 0.7
the flow derivatives. That method has been implemented in a
fully implicit manner and successfully used for the flat plate The Roe’s scheme is employed as Riemann solver. The solu-
boundary layer computation. Unfortunately, the result is not so tions have been computed for an infinite value of the C F L
good for more complex flow computations. For these flows, number and %maximumnumber of GMRES iterations equal to
we have tested another procedure. The boundary nodes are 60 with a restart every 30 iterations. Figures 2c and 2d show
no longer located at the cell gravity center, but at the mid- the evolution of the Mach number and the total pressure on
boundary edge, and the noslip boundary condition is imposed the lower wall. The quadratic reconstruction clearly appears
in its strong form at each Newton iteration. to lead to the lowest spurious entropy generation (fig. 2d).
Hence, it predicts the highest peak Mach number : 0.835. For
Finally, notice that it is essential to include the contribution the sake of comparison, the peak values respectively calculated
of the boundary conditions in the preconditionner. The ja- with the linear and the constant reconstructions are equal to
cobian of the modified boundary advective flux is calculated 0.804 and 0.754. When compared to other reconstructions (re-
analytically for most of the boundary conditions except for sults not shown), the symmetry of the solution obtained with
the subsonic inlet. For the latter, it is derived from a finite the quadratic scheme is almost perfect as can be seen from the
difference formula similar to equation (27). iso-mach lines pattern (fig. 2b).

9. MESH ADAPTATION Fig. 2e illustrates the dramatic convergence improvement ob-


The possibility of using a flexible local grid adaptation pro- tained with the implicit scheme with respect to an explicit
cedure is a major advantage of unstructured meshes. The integration using a 3 steps Runge-Kutta algorithm. The pre-
objective is to improve the resolution of the flow by succes- conditionner based on the approximate Roe flux difference
sive refinements and coarsenings together with minimizing the splitting yields the fastest convergence in terms of CPU time
and number of Newton iterations (fig. 2f). The quadratic reconstruction however takes about 25 % more CPU time than the linear reconstruction scheme to achieve the same residual decay.

10.2 Subcritical NACA0012 airfoil
The second test case again illustrates the accuracy gain obtained with the quadratic reconstruction scheme. The inviscid subsonic flow over the NACA0012 airfoil has been computed at a freestream Mach number of 0.63 and an incidence of 2 deg. The mesh contains 4537 cells (fig. 3a). The far-field boundary is located at a distance of 20 chords away from the airfoil. The starting solution corresponds to the uniform flow. The solutions computed with the various reconstructions behave similarly on the lower wall (fig. 3b). However, larger discrepancies occur at the upper wall due to the strong flow acceleration. The highest peak Mach number (0.981) is again obtained with the quadratic reconstruction and agrees very well with the value computed by Paillère (0.983) 36 on the same mesh with a fluctuation splitting scheme. Figure 3d shows the evolution of the total pressure along the wall. Notice that the level of spurious entropy generated by the quadratic reconstruction is very low. The lift coefficient Cl = 0.323 compares well with the value computed by Paillère 36 (Cl = 0.322), and the purely numerical pressure drag coefficient is found to be very low, Cd = 0.00034. The lift coefficient is however slightly lower than the exact one predicted by a full potential method (Cl = 0.334). That difference can be explained by the fact that no vortex correction is imposed at the far-field boundary condition 36. The error between the present value and the exact one is equal to 3.4 %. According to the work of Thomas and Salas 37, a computation with a mesh extending to about 20 chords and with no vortex correction should underpredict the lift coefficient by about 4 %.

The influence of the exponent p of the CFL update formula (29) on the convergence has been tested (fig. 3e and 3f). The code diverges when the computation is initiated with an infinite CFL number. The convergence history obtained when the GMRES is replaced by an SOR iterative solver is provided in fig. 3f and 3e. Figure 3f clearly shows that Newton's quadratic convergence is never reached with the SOR strategy. Nevertheless, this strategy turns out to be competitive in terms of the computational cost (fig. 3e).

10.3 Transonic flow over a circular arc bump
To test the accuracy and the performance of the scheme for flows with shock waves, the code is applied to a classical test case: the inviscid transonic flow (M∞ = 0.85) over a circular arc bump in a channel. The use of implicit Newton-Krylov techniques for flows with discontinuities remains difficult because of the modifications of the reconstruction that are required to preserve the monotonicity of the solutions. As explained in section 5.3, the quadratic reconstruction (18) modified by the discontinuity detector is employed to achieve monotone solutions. For this test case, Van Leer's flux vector splitting is used. The computation is started from a solution previously computed with the constant reconstruction scheme. Indeed, one of the major reported disadvantages of implicit Newton methods is the time required by the shocks to migrate to their right location. During that phase, the residual actually stagnates. That prevents the CFL number from increasing to infinity, and therefore dramatically slows down the convergence. We actually use the two following remedies: the starting solution is obtained with a cheap low order scheme, and we use a grid sequencing strategy with mesh adaptation. The initial mesh contains 1420 rectangular cells (fig. 4a). After three adaptations, the final grid (fig. 4b) involves a lower number of cells (1296), which are very general polygons. The total computational time (not shown here) amounts to 400 CPU sec. on a HP9000/730 workstation (infinite CFL number). Fig. 4c shows the points where the detector automatically activates. In order to avoid endless switches of the latter, it is frozen after 5 Newton iterations. As can be seen in fig. 4d and 4e, a very crisp shock is captured. Different convergence histories for computations performed on the initial mesh are presented in fig. 4g and 4h. The fastest convergence is again obtained with an infinite CFL number. Those figures also show that an exponent p equal to 2 yields a similar convergence history. For the sake of comparison, the GMRES algorithm appears to be about 3 times faster than the SOR scheme in terms of the computational time.

10.4 Inviscid hypersonic flow over a double-ellipse
We now consider the inviscid flow over the double-ellipse test case proposed in the workshop of Antibes 38 at 30 deg. angle of attack and a Mach number of 8.15. The initial mesh of 2412 triangles (fig. 5a) is adapted three times (9527 cells, fig. 5b). The iso-Mach lines pattern is presented in fig. 5c. Next to it, fig. 5d shows the nodes where discontinuities are automatically detected. Convergence histories are presented for the computation on the final adapted mesh. Notice in fig. 5e the dramatic convergence improvement obtained with the implicit scheme with respect to a 4-step explicit Runge-Kutta scheme. Fig. 5h, 5g and 5i respectively give the evolution of the Mach number, the pressure coefficient and the total pressure along the windward and leeward sides. Our results are compared with those obtained by Gustafsson et al. and Khalfallah et al. published in the workshop proceedings 38. The pressure coefficient and the Mach number agree with the results of the latter authors. Notice the fair agreement between the computed total pressure and the exact one, which can be obtained from the normal shock theory (less than 0.02 % error on the leeward side).

10.5 Supersonic flow around a NACA0012 airfoil
The supersonic flow over the NACA0012 airfoil (Mach = 1.2, angle of incidence = 0 degree) illustrates the flexibility of the adaptation technique and the preservation of the accuracy of the scheme even on very distorted grids. The calculation is started on a triangular mesh shown in fig. 6a. Three adaptations are performed. The final mesh is made of polygons with a number of edges varying from 3 to 7 (fig. 6b and 6c). The detached shock and the oblique shocks attached to the trailing edge are well captured (fig. 6d). The distribution of the Mach number on the upstream and downstream parts of the x-axis as well as along the airfoil is presented in fig. 6e. The present calculation is computed with a 4-step explicit Runge-Kutta scheme. Up to now, no attempt was made to use the implicit scheme for this test case.
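The "exact" total pressure used for comparison in section 10.4 comes from the normal shock relations. As a minimal sketch, the total pressure ratio across a normal shock can be evaluated as below. The function names are hypothetical, and the ratio of specific heats γ = 1.4 is an assumption: the text above does not state the value used by the authors.

```python
def total_pressure_ratio(mach: float, gamma: float = 1.4) -> float:
    """Total pressure ratio p02/p01 across a normal shock.

    Standard relation for a calorically perfect gas; gamma = 1.4 is an
    assumed value, not one quoted in the paper.
    """
    if mach < 1.0:
        raise ValueError("a normal shock requires an upstream Mach number >= 1")
    m2 = mach * mach
    # Factor built from the density jump across the shock
    density_term = ((gamma + 1.0) * m2 / ((gamma - 1.0) * m2 + 2.0)) ** (
        gamma / (gamma - 1.0)
    )
    # Factor built from the inverse static pressure jump across the shock
    pressure_term = ((gamma + 1.0) / (2.0 * gamma * m2 - (gamma - 1.0))) ** (
        1.0 / (gamma - 1.0)
    )
    return density_term * pressure_term


def relative_error_percent(computed: float, exact: float) -> float:
    """Relative deviation of a computed value from the exact one, in percent."""
    return 100.0 * abs(computed - exact) / abs(exact)


if __name__ == "__main__":
    # Freestream Mach number of the double-ellipse test case
    print(total_pressure_ratio(8.15))
```

A computed wall total pressure, normalised by the freestream total pressure, could then be passed to `relative_error_percent` together with this ratio to obtain an error figure of the kind quoted above.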
10.6 Laminar viscous flow over a flat plate
The accuracy of the Navier-Stokes code has been assessed by investigating the development of a laminar compressible boundary layer over an adiabatic flat plate. For that calculation, the Mach and the Prandtl numbers are respectively taken equal to 0.5 and 1. The viscosity is proportional to the temperature (Crocco's viscosity law) in order to compare our results with the exact solution predicted by the boundary layer theory. The computation is performed with a full quadratic reconstruction and Roe's flux difference splitting. The initial CFL number is equal to 10 and the exponent p is 0.5. We noticed quite an important numerical influence of the downstream boundary condition (inviscid subsonic outlet with pressure imposed), which forced us to locate that boundary quite far from the leading edge of the plate, i.e. at a Reynolds number based on x equal to 10,000. The mesh contains rectangular cells. There is an average of 15 cells in the displacement thickness of the boundary layer. An excellent agreement is found between the computed and exact velocity and temperature profiles (fig. 7c). Fig. 7b shows the evolution of the skin friction coefficient along the plate, which also agrees very well with the exact one. Notice also the good agreement, especially in the outer part of the boundary layer, between the computed and exact shear stress (fig. 7d). Unfortunately, some deviation occurs near the wall. Our latest investigations show that it seems to be caused by a perturbation coming from the downstream boundary condition. The problem must be further studied. The convergence history is reported in fig. 7a. The relatively slow convergence is attributed on one hand to the fact that no contribution of the viscous terms Jacobians is introduced in the preconditioner and on the other hand to the weakness of the ILU(0) decomposition.

10.7 Laminar viscous flow over the NACA0012 airfoil
In this final test case, we consider the laminar flow over a NACA0012 airfoil at 0 deg. incidence with a freestream Mach number of 0.5 and a Reynolds number of 5000. The wall is adiabatic. The Sutherland viscosity law is employed and the Prandtl number is equal to 0.72. The flexibility of the method is illustrated by employing a hybrid grid (fig. 8a). It consists of a structured C-type part around the airfoil and in the wake, surrounded by a triangular mesh. The far-field boundary is located at a distance of 33 chords from the airfoil. The cell aspect ratio varies from 100 near the wall to 50,000 in the wake. The iso-Mach lines pattern presented in fig. 8b shows the development of the boundary layer and its separation near the trailing edge to form a small recirculation bubble. The pressure and skin friction coefficients are presented in fig. 8c and 8d. Accuracy estimates of the results may be carried out by comparing the location of the separation point (in percent of the chord) and the magnitudes of the pressure and viscous drag coefficients. We obtain x_sep = 81.7 %, Cd_p = 0.0227, Cd_v = 0.0320. These results agree with the reference values obtained by Swanson and Turkel 39 on a 518 x 128 structured mesh (x_sep = 81.4 %, Cd_p = 0.02235, Cd_v = 0.03299). Notice however that the present mesh only involves 7709 cells and is relatively coarse in the leading edge region, which is responsible for a slight underprediction of the skin friction. We obtain a maximum peak value of 0.143 instead of the reference value 0.15. Moreover, the cell longitudinal dimension in the region of the separation point is also relatively large: about 1 % of the chord.

11. CONCLUSION
In this paper, an original quadratic reconstruction finite-volume scheme for solving the Euler and full Navier-Stokes equations has been presented. The quadratic reconstruction is a higher-order extension of the robust Green-Gauss linear reconstruction. The accuracy of the resulting discretized advective derivatives is second-order, and is insensitive to grid distortions. The robustness and the high accuracy of the scheme have been demonstrated by various computations on very distorted meshes. The Newton-Krylov method based on the GMRES iterative solver has been successfully used to dramatically improve the convergence to steady state with respect to explicit methods. The implicit scheme has been tested on fully subsonic, transonic and supersonic inviscid flows, and on laminar viscous flow computations. For transonic and supersonic flows, a discrete discontinuity detector is employed to switch the scheme to a monotone constant reconstruction. This alternative does not encounter the major problems of the classical multidimensional limiters in driving the convergence to machine accuracy. For inviscid flow test cases in which Roe's flux difference splitting is employed, the preconditioner based on an approximate Jacobian of Roe's flux difference splitting always leads to a faster convergence than a preconditioner based on Van Leer's flux vector splitting, although the latter is much cheaper to compute. For viscous flow computations, the ability of the scheme to deal with hybrid grids is a real advantage. The quadratic reconstruction has led to very accurate solutions. However, the proper imposition of the boundary conditions remains a problem. Two methods have been tested. The first one, which modifies the stencil of the reconstruction for boundary cells to include the effect of the boundary conditions, has been successfully applied to a flat plate boundary layer computation. However, another procedure was required for the computation of a laminar viscous flow around the NACA0012 airfoil. It consists in locating the boundary nodes on the boundary edges rather than at the cell gravity center and then applying the boundary conditions in their strong form. This modified strategy, which is explicit, unfortunately artificially perturbs the convergence for nodes near solid walls. More effort should also be devoted to the improvement of the preconditioner, which seems to be too weak for viscous flow computations.

12. ACKNOWLEDGMENTS
The works of Ph. Geuzaine and P. Rogiest are presently supported by fellowships awarded by the Fund for the Formation in Research in Industry and Agriculture (F.R.I.A.), and by the National Fund for Scientific Research (F.N.R.S.), respectively. The authors wish to thank Prof. H. Deconinck from the Von Karman Institute (Belgium) for providing the mesh employed for the computation of the subsonic flow around the NACA0012 airfoil.
REFERENCES

1. T.J. Barth and P.O. Frederickson. “Higher Order Solution of the Euler Equations on Unstructured Grids using Quadratic Reconstruction”. AIAA paper 90-0013, 1990.
2. P. Vankeirsbilck. “Algorithmic developments for the solution of hyperbolic conservation laws on adaptive unstructured grids (Applications to the Euler Equations)”. PhD thesis, Katholiek Universiteit van Leuven (Belgium) and Von Karman Institute, 1993.
3. R. Abgrall and F.C. Lafon. “ENO schemes on Unstructured Meshes”. VKI Lecture Series 1993-04, March 1993.
4. A.G. Godfrey, C.R. Mitchell, and R.W. Walters. “Practical Aspects of Spatially High Accurate Methods”. AIAA paper 92-0054, 1992.
5. T.J. Barth. “Recent Developments in High Order k-Exact Reconstruction on Unstructured Meshes”. AIAA paper 93-0668, 1993.
6. J.A. Essers, M. Delanaye, and P. Rogiest. “An Upwind-Biased Finite-Volume Technique Solving Compressible Navier-Stokes Equations on Irregular Meshes. Applications to Supersonic Blunt-Body Flows and Shock-Boundary Layer Interactions.”. AIAA paper 93-3377, 1993.
7. J.A. Essers, M. Delanaye, and P. Rogiest. “An Upwind-Biased Finite-Volume Technique Solving Compressible Navier-Stokes Equations on Irregular Meshes.”. AIAA Journal, 33(5), 1995.
8. M. Delanaye and J.A. Essers. “Finite Volume with Quadratic Reconstruction on Unstructured Adaptive Meshes Applied to Turbomachinery Flows”. 1995 ASME IGTI Gas Turbine Conference, Houston, June 1995.
9. M. Delanaye and J.A. Essers. “An Accurate Finite-Volume Scheme for Euler and Navier-Stokes Equations on Unstructured Grids”. AIAA paper 95-1710, 12th CFD Conference, San Diego, June 1995.
10. M. Delanaye, J.A. Essers, and Ph. Geuzaine. “Euler and Navier-Stokes Calculations with a Quadratic Reconstruction Finite Volume Scheme on Flexible Unstructured Grids”. Sixth International Symposium on CFD, Lake Tahoe, Nevada, September 1995.
11. T.J. Barth and D.C. Jespersen. “The Design and Application of Upwind Schemes on Unstructured Meshes”. AIAA paper 89-0366, January 1989.
12. D. De Zeeuw and K.G. Powell. “An Adaptively Refined Cartesian Mesh Solver for the Euler Equations”. J. of Comp. Phys., 104:56-68, 1993.
13. D.J. Mavriplis and A. Jameson. “Multigrid Solution of the Navier-Stokes Equations on Triangular Meshes”. AIAA Journal, 28(8):1415-1425, 1990.
14. V. Venkatakrishnan and D.J. Mavriplis. “Implicit Solvers for Unstructured Meshes”. J. of Comp. Phys., (105):83-91, 1993.
15. V. Venkatakrishnan and T.J. Barth. “Application of Direct Solvers to Unstructured Meshes for the Euler and Navier-Stokes Equations Using Upwind Schemes”. AIAA paper 89-0364, 27th Aerospace Sciences Meeting, Reno, Nevada, 1989.
16. L.B. Wigton, N.J. Yu, and D.P. Young. “GMRES Acceleration of Computational Fluid Dynamics Codes”. AIAA paper 85-1494, 1985.
17. Z. Johan, T.J.R. Hughes, and F. Shakib. “A Globally Convergent Matrix-free Algorithm for Implicit Time-marching Schemes Arising in Finite Element Analysis in Fluids”. Comp. Meth. in App. Mec. and Eng., 87, 1991.
18. D.L. Whitaker. “Three Dimensional Unstructured Grid Euler Computations Using a Fully-implicit, Upwind Method”. AIAA paper 93-3357, 11th CFD Conference, Orlando, 1993.
19. P.N. Brown and Y. Saad. “Hybrid Krylov Methods for Nonlinear Systems of Equations”. SIAM J. Sci. Stat. Comp., 11(3), 1990.
20. P.N. Brown. “A Local Convergence Theory for Combined Inexact-Newton/Finite-Difference Projection Methods”. SIAM J. Num. Anal., 24(2), 1987.
21. Y. Saad and M.H. Schultz. “GMRES: A generalized minimal residual algorithm for solving non-symmetric linear systems”. SIAM J. Sci. Stat. Comp., 7, 1986.
22. Y. Saad. “Preconditioning techniques for nonsymmetric and indefinite linear systems”. J. of Comp. and App. Math., 24, 1988.
23. A. Harten. “High Resolution Schemes for Hyperbolic Conservation Laws”. J. of Comp. Phys., 49(3):357-393, 1983.
24. S. Tatsumi, L. Martinelli, and A. Jameson. “Flux-Limited Schemes for the Compressible Navier-Stokes Equations”. AIAA Journal, 33(2):252-261, 1995.
25. A. Harten and G. Zwas. “Self Adjusting Hybrid Schemes for Shock Computations”. J. of Comp. Phys., 9(3):568-583, 1972.
26. W.J. Coirier and K.G. Powell. “An Accuracy Assessment of Cartesian-Mesh Approaches for the Euler Equations”. J. of Comp. Phys., 117:121-132, 1995.
27. A. Harten and S.R. Chakravarthy. “Multidimensional ENO Schemes for General Geometries”. Technical Report No. 91-76, ICASE, 1991.
28. V. Venkatakrishnan. “On the Accuracy of Limiters and Convergence to Steady State Solutions”. AIAA paper 93-0880, 1993.
29. R. Löhner. “An Adaptive Finite Element Scheme for Transient Problems in CFD”. Comput. Meth. Appl. and Mech. Engrg., 61:323-338, 1987.
30. Jurg Küffer. “Fast Implicit Solvers for the Incompressible Navier-Stokes Equations”. Proceedings of Computational Fluid Dynamics '92 Conference, Brussels, 1:407-412, 1992.
31. R.S. Dembo, S.C. Eisenstat, and T. Steihaug. “Inexact Newton method”. SIAM J. Num. Anal., 19(2), April 1982.
32. Y. Saad. “Krylov Subspace Techniques, Conjugate Gradients, Preconditioning and Sparse Matrix Solvers”. VKI Lecture Series 1994-05, March 1994.
33. J.E. Dennis and R.B. Schnabel. “Numerical Methods for Unconstrained Optimization and Nonlinear Equations”. Prentice-Hall, 1983.
34. P.D. Orkwis and J.H. George. “A Comparison of CGS Preconditioning Methods for Newton's method solvers”. AIAA paper 93-3327, 11th AIAA CFD Conference, Orlando, 1993.
35. T.J. Barth. “Analysis of Implicit Local Linearization Techniques for Upwind and TVD Algorithms”. AIAA paper 87-0595, 1987.
36. H. Paillère. “Multidimensional Upwind Residual Distribution Schemes for the Euler and Navier-Stokes Equations on Unstructured Grids”. PhD thesis, Université Libre de Bruxelles (Belgium) and Von Karman Institute, 1995.
37. J.L. Thomas and M.D. Salas. “Far-Field Boundary Conditions for Transonic Lifting Solutions to the Euler Equations”. AIAA Journal, 24:1074-1080, 1986.
38. J.-A. Désidéri, R. Glowinski, and J. Périaux (Eds). “Hypersonic Flows for Reentry Problems”, volume 2. Springer Verlag, 1991.
39. R.C. Swanson and E. Turkel. “Artificial Dissipation and Central Difference Schemes for the Euler and Navier-Stokes Equations”. AIAA paper 87-1107, 1987.

Fig. 2a: Mesh (1294 cells)
Fig. 2b: Iso-Mach lines (0.4 - 0.85, Δ 0.01875)
Fig. 2c: Wall Mach number
Fig. 2d: Wall total pressure
Fig. 2e: Convergence - CPU (HP9000)
Fig. 2f: Convergence - Newton iter.
Curves: QUA-ROE, QUA-VL, LIN-ROE, CON-ROE, EXPLICIT

Subsonic sine-bump, M∞ = 0.5, CFL = ∞

Fig. 3a: Mesh (4537 cells)
Fig. 3b: Iso-Mach lines (0.022 - 0.981, Δ 0.028)
Fig. 3c: Wall Mach number
Fig. 3d: Wall total pressure
Fig. 3e: Convergence - CPU (DECalpha 250)
Fig. 3f: Convergence - Newton iter.
Curves: QUA, LIN, CON

Subcritical NACA0012 airfoil, M∞ = 0.63, incidence = 2 deg., CFL0 = 10


Fig. 4a: Initial mesh (1420 cells)
Fig. 4b: Adapted mesh (1226 cells)
Fig. 4c: Discontinuity detector
Fig. 4d: Iso-Mach lines (0.59 - 1.33, Δ 0.03)
Fig. 4e: Wall Mach number
Fig. 4f: Wall total pressure
Fig. 4g: Convergence - CPU (HP9000)
Fig. 4h: Convergence - Newton iter.

Transonic bump, M∞ = 0.85

Fig. 5a: Initial mesh (2412 cells)
Fig. 5b: Adapted mesh (9527 cells)
Fig. 5c: Iso-Mach lines (0 - 8.15, Δ 0.2)
Fig. 5d: Detector
Fig. 5e: Convergence - CPU (DECalpha 250)
Fig. 5f: Convergence - Newton iter.
Fig. 5g: Wall Cp
Fig. 5h: Wall Mach number
Fig. 5i: Wall total pressure

Inviscid double ellipse, M∞ = 8.15, angle of attack = 30 deg.

Fig. 6a: Initial mesh (6242 cells)
Fig. 6b: Final mesh (10149 cells)
Fig. 6c: Details of the final mesh
Fig. 6d: Iso-Mach lines (0 - 1.6, Δ 0.066)
Fig. 6e: Mach number distribution (versus x)
Fig. 6f: Total pressure distribution (versus x; curves: Present, Exact)

NACA0012 airfoil, M∞ = 1.2, angle of attack = 0 deg.

Fig. 7a: Convergence - Newton iter.
Fig. 7b: Skin friction coeff. (curves: Present, Exact)
Fig. 7c: Velocity and temperature
Fig. 7d: Shear stress
Boundary layer profiles at Re_x = 4000

Adiabatic flat plate, M∞ = 0.5, Re = 2000

Fig. 8a: Mesh (7709 cells)
Fig. 8b: Iso-Mach lines
Fig. 8c: Pressure coeff.
Fig. 8d: Skin friction coeff.

NACA0012 airfoil, M∞ = 0.5, Re = 5000, incidence = 0 deg.

A Second-Order Positivity-Preserving Kinetic Scheme for the Compressible Euler Equations on Self-Adaptive Unstructured Meshes
(Un Schéma Cinétique d'Ordre 2 Préservant les Positivités pour les Equations d'Euler Compressibles sur Maillages non Structurés Auto-Adaptatifs)

Ph. Villedieu, J.L. Estivalezes, J.J. Hylkema
CERT-ONERA, 2 Ave. Ed. Belin, 31055 Toulouse, FRANCE

Abstract

The aim of this contribution is to present the first numerical results that we have obtained with a new second order kinetic theory based scheme. The main interest of our approach is that density and internal energy can be proved to remain non negative under a CFL like condition. It is well known that classical approximate Riemann solvers, even first order accurate, do not satisfy this property. Our first order scheme is the classical kinetic scheme, based on the Maxwellian velocity distribution. Our second order extension consists of adding to the first order numerical flux an antidiffusive correction which has to be limited such that the constraints of positivity will be satisfied. It can be seen as a variant of the so-called corrected anti-diffusive flux approach. We have performed numerical computations for various two and three dimensional test cases on unstructured and self-adaptive meshes, in order to evaluate the accuracy and the robustness of this new method. Comparisons have been done with a second order extension of Roe's scheme (with MUSCL approach).

Introduction

The aim of this contribution is to present the first numerical results obtained with a new second-order scheme based on the kinetic theory of gases. The main interest of our approach lies in the fact that the density and the pressure can be proved to remain positive under a CFL-type condition. It is well known that classical schemes built on an approximate Riemann solver do not possess this property, even at first order [3]. This is a serious drawback when one wishes to compute flows in which the density is very low or in which the internal energy is small compared with the kinetic energy (hypersonic flows, detonation problems, ...). Note moreover that our approach can be generalized without difficulty to reactive flows [12,16].

At first order, our scheme is nothing other than the classical kinetic scheme based on the Maxwellian velocity distribution introduced by Pullin in [19]. It can be shown that this scheme preserves the positivity of the density and of the pressure under a CFL-type condition [11]. Our second-order extension consists of adding to the first-order numerical flux an anti-diffusive correction, which must be limited so that the positivities are preserved. This approach can be regarded as a variant of the so-called 'modified flux' method.

The meshes used are structured or unstructured; in addition, an automatic mesh refinement technique has been implemented in the 2D and 3D codes. We have carried out numerous numerical simulations on different types of meshes in order to evaluate the robustness and the accuracy of this new scheme. Comparisons have also been made with Roe's scheme extended to second order following Van Leer's MUSCL method.

The outline of the paper is as follows. We begin with some generalities on kinetic schemes, whose main properties are recalled. In the second part, we present the principle of our second-order extension. The third part is devoted to the presentation of a mesh refinement criterion (based on the local entropy production of the scheme) and to the description of the mesh refinement technique that we have used. Finally, in the last part, we present numerous numerical results and comparisons with Roe's scheme.

1 Generalities on kinetic schemes

The first kinetic scheme for the Euler equations was introduced by D. Pullin in [19]. It was then revisited and improved by S. Deshpande in [4,5]. Other kinetic schemes, based on equilibrium distributions different from the Maxwellian, were subsequently proposed by various authors, including Kaniel and Perthame [7,8,9]. The great interest of the work of B. Perthame is that it was the first to establish the theoretical properties of certain kinetic schemes (those associated with compactly supported equilibrium distributions): consistency with the entropy inequality, preservation of positivity, ... Let us finally mention the work of Mazet et al. concerning the links between kinetic schemes and the symmetrization of the Euler equations via entropy variables [1,12,11]. We shall return to this aspect in the third part of this article.

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
10-2

1.1 Definition of a kinetic scheme

Kinetic schemes are upwind finite-volume schemes whose numerical flux function $F(w,w',n)$ is of flux-splitting type, that is, of the form:

$$F(w,w',n) = F^+(w,n) + F^-(w',n)$$

where $w$ and $w'$ are two arbitrary states and $n$ is a unit vector of $\mathbb{R}^d$ (representing the normal to the interface between two cells). The functions $F^+(w,n)$ and $F^-(w,n)$ are expressed as an integral over what is called 'phase space' in statistical physics. For a monatomic gas, for example:

$$F^{\pm}(w,n) = \int_{\pm\xi\cdot n > 0} (\xi\cdot n)\,\bigl(1,\ \xi,\ \tfrac12|\xi|^2\bigr)^T f_w(\xi)\,d\xi \qquad (1)$$

where $f_w(\xi)$ denotes the equilibrium distribution of the particles, which by definition satisfies the relations:

$$\int_{\mathbb{R}^d} \bigl(1,\ \xi,\ \tfrac12|\xi|^2\bigr)^T f_w(\xi)\,d\xi = (\rho,\ \rho U,\ \rho E)^T$$

Formula (1) can be interpreted by considering that the particles crossing an interface consist of those coming from the left and moving in the direction of the normal (contribution to $F^+$) and those coming from the right and moving opposite to the normal (contribution to $F^-$).

figure 1: Flux of particles across an interface

For a more complete presentation of kinetic schemes, and in particular for the precise definition of an equilibrium distribution, see for example the article of B. Perthame [7].

Several choices are possible for the equilibrium function $f_w(\xi)$. The two most common are the 'crenel' function proposed by Perthame [7,8] (built from the indicator function $Y$ of $[0,1]$ and the unit ball $B_d$ of $\mathbb{R}^d$) and the 'Maxwellian' function proposed by Pullin [19]:

$$f_w(\xi) = \frac{\rho}{(2\pi rT)^{d/2}}\,\exp\Bigl(-\frac{|\xi-U|^2}{2rT}\Bigr)$$

We have chosen the latter because it is the best suited to the extension to real gas mixtures [12] and because it leads to the simplest formulas for the extension to second order [14].

Formula (1) can of course be written out explicitly when $f_w$ is a Maxwellian. The resulting expressions, for a perfect gas with equation of state $p = \rho rT$ and $\varepsilon = f(T)$ ($f$ an arbitrary smooth function), are referred to below as formulas (2).

1.2 Some properties of kinetic schemes

Expression (1) of the functions $F^+$ and $F^-$ makes it possible to prove many properties of kinetic schemes. In particular, in one space dimension, when the support of the equilibrium distribution is a compact set of the form $[-\xi_{max}, \xi_{max}]$, it can be shown [7] that the associated kinetic schemes are entropic and preserve the positivity of the density and of the temperature under the CFL condition $\Delta t < \Delta x/\xi_{max}$.

For a Maxwellian distribution (unbounded support) the problem is more delicate and, to our knowledge, consistency with the entropy inequality has only been proved formally, in [4,11]. The question of the positivity of the scheme, on the other hand, has been settled in [11,14]. We recall the main result below.

To be as general as possible, we work in a multidimensional setting ($d$ denoting the number of space dimensions) and the mesh, denoted $M_h$, is assumed arbitrary. We denote by $K$ an arbitrary element of $M_h$, $m(K)$ its Lebesgue measure in $\mathbb{R}^d$, $K_e$ its neighbour across face $e$, $n_{K,e}$ the normal to face $e$ directed from $K$ to $K_e$, and $m(e)$ the Lebesgue measure of face $e$ in $\mathbb{R}^{d-1}$ (cf. figure 2).

figure 2: Partial view of the mesh

The following proposition is proved in [14]:

Proposition 1. The finite-volume scheme

$$w_K^{n+1} = w_K^n - \frac{\Delta t}{m(K)} \sum_{e\in\partial K} m(e)\,\bigl[F^+(w_K^n, n_{K,e}) + F^-(w_{K_e}^n, n_{K,e})\bigr]$$

associated with the numerical flux defined by formulas (2) (Maxwellian distribution) preserves the positivity of $\rho$ and of $T$ under the CFL condition (3).

Remark: In practice this condition is slightly more restrictive than the usual CFL condition (for a perfect gas with $\gamma = 1.4$ it corresponds approximately to CFL = 0.5). However, it is only a sufficient condition and, in applications, no difficulty has ever been observed with CFL = 0.9.

2 Second-order extension and positivity

2.1 Principle of the second-order extension

To extend a finite-volume method to second order in space, there are at least two classical approaches:

- The first (probably the most widely used because of its simplicity and generality) is Van Leer's MUSCL method. It consists in splitting a time step into two stages: a first stage of affine interpolation of the approximate solution, and a second stage in which the finite-volume scheme is applied to the interpolated values of the approximate solution. The essential point is that, during the interpolation stage, the gradient of the approximate solution must be limited in order to avoid the appearance of oscillations.
- The second, known in the English-language literature as the 'corrected antidiffusive flux approach' (denoted CAFA below), consists in adding to the first-order numerical flux an antidiffusive correction, which must be limited for reasons of numerical stability.

These two methods have been studied very thoroughly from a theoretical point of view in the case of a scalar conservation law (see for example Godlewski-Raviart [6], Coquel-LeFloch [2], ...). In particular, precise criteria are known in that case on how the slopes must be limited for the numerical method to be stable in the BV norm (TVB scheme) or in the $L^\infty$ norm.

For general systems of conservation laws, and in particular for the Euler equations, no theory currently exists. One therefore generally argues by analogy with the scalar case in order to derive empirical stability criteria. Moreover, in gas dynamics there is the additional fact that the solution $w = (\rho, \rho U, \rho E)$ does not take its values in the whole of $\mathbb{R}^{d+2}$ but only in a subset of it, $W_{ad}$, defined by the positivity constraints $\rho > 0$, $\rho E - \tfrac12\rho|U|^2 > 0$.

Whatever the approach adopted, MUSCL or CAFA, these empirical slope or flux limitation criteria, which control the oscillations, must be supplemented (or possibly replaced) by a condition guaranteeing that the scheme leaves the set $W_{ad}$ invariant (which of course presupposes that the first-order scheme already has this property). It is interesting to note that this invariance property of $W_{ad}$ alone guarantees the stability of the scheme in the $L^1$ norm [6] and therefore constitutes a weak stability criterion.
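The limited affine interpolation stage of the MUSCL approach described in this section can be sketched in one space dimension as follows. This is a minimal illustration only, assuming a uniform grid and the classical scalar 'min-mod' limiter; the function names `minmod` and `muscl_interface_values` are hypothetical and are not taken from the authors' code.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if a and b have the same
    sign, and 0 otherwise (classical min-mod limiter)."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def muscl_interface_values(q):
    """Limited affine reconstruction of the cell averages q (uniform 1D grid).

    Returns (left, right): the values extrapolated to the left and right
    faces of every cell. The two boundary cells keep a zero slope, which
    reduces the reconstruction to first order there.
    """
    n = len(q)
    slopes = [0.0] * n
    for i in range(1, n - 1):
        # Limit the slope using the backward and forward differences.
        slopes[i] = minmod(q[i] - q[i - 1], q[i + 1] - q[i])
    left = [q[i] - 0.5 * slopes[i] for i in range(n)]
    right = [q[i] + 0.5 * slopes[i] for i in range(n)]
    return left, right
```

Across a discontinuity the two one-sided differences have different magnitudes or signs, so the limited slope falls back to the smaller one or to zero, and the extrapolated face values stay within the range of the neighbouring cell averages.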

In the case of the MUSCL method, a variant has been proposed by Perthame and Qiu which guarantees the preservation of positivity. The principle is to construct the interpolated solution at each time step so as to have both conservation of $\rho$, $\rho U$, $\rho E$ on each cell and positivity of $\rho$ and $T$ at the mesh nodes [10]. However, the numerical results are rather disappointing in terms of the gain in accuracy, and their reconstruction technique seems difficult to generalize to arbitrary meshes.

The approach we propose in this article is rather a variant of the 'CAFA' method. It can a priori be extended to any flux-splitting scheme (this aspect is developed in [14]), but kinetic schemes nevertheless have two essential advantages:

- Positivity can be proved for the first-order scheme, which is not the case, for example, for the flux-splitting schemes of Steger and Warming or of Van Leer.
- Thanks to the integral representation (1) of the numerical flux, the limitations to be imposed on the antidiffusive corrections for the scheme to preserve positivity can be written out explicitly.

We recall here the main lines of this approach, referring to [14] for details. The general idea is to replace, on each interface $e$ of the mesh, $F^+(w_K, n_{K,e})$ by $F^+(w_K, n_{K,e}) + \Delta F^+_{K,e}$ and $F^-(w_{K_e}, n_{K,e})$ by $F^-(w_{K_e}, n_{K,e}) + \Delta F^-_{K_e,e}$, where the functions $F^+$ and $F^-$ are defined by formulas (2) (corresponding to a Maxwellian distribution) and $\Delta F^+$ and $\Delta F^-$ are antidiffusive corrections which, for a monatomic gas, are expressed in an integral form (4) involving a correction $\Delta f_w(\xi)$ to the equilibrium distribution. As in the Chapman-Enskog expansion [18], this correction is a function of the local gradients of the solution. When $f_w$ is a Maxwellian, it is given by an explicit expression (5) (cf. [14]) involving the centre $x_e$ of face $e$, the centre of gravity $x_K$ of cell $K$, estimates of the spatial gradients of $s$, $U_j$ and $T$ in cell $K$ at time $n$, and estimates of the time derivatives of $s$, $U_j$ and $T$ in cell $K$ at time $n$. We shall specify in the next paragraph the choices made to compute these quantities as well as the value of $\xi_{max}$.

NB: Note that $\delta s$, $\delta U$ and $\delta T$ are quantities of order $O(h)$ ($h$ being the mesh size) when the solution is smooth.

From the kinetic point of view, formulas (4) and (5) can be interpreted by considering that the Maxwellian equilibrium distribution $f_{w_K^n}(\xi)$ has been replaced by a 'perturbed' equilibrium distribution $f_{w_K^n}(\xi) + \Delta f_{w_K^n}(\xi)$ which takes the gradients of the solution into account (this point of view is presented in [15]). This idea was already present in the work of S. Deshpande [5]. Introducing a crenel function $Y(z)$ in the definition of $\Delta f_w$ guarantees that the modified distribution $f_w(\xi) + \Delta f_w(\xi)$ always remains positive (provided $\xi_{max}$ is chosen suitably). This property plays an essential role in the positivity of the second-order scheme.

The following result is proved in [14]:

Proposition 2. If for all $K \in M_h$ and for all $e \in \partial K$

the limitations (i) and (ii) on the gradient estimates hold, then the scheme described above is second-order accurate in time and in space if $M_h$ is a Cartesian grid and if the gradients are estimated to second order. Moreover, it preserves the positivity of $\rho$ and of $T$ under a CFL condition.

Remarks:

- Note that no 'min-mod' type limitation is needed to guarantee the positivity of the scheme. A numerical illustration is given in [14]. However, limitations (i) and (ii) are not sufficient to fully control the appearance of spatial oscillations. In practice they must be combined with other, more classical limitations of 'min-mod' type, which are detailed in the next paragraph.
- The CFL condition above is somewhat too restrictive. In applications, we have never observed any difficulty with CFL = 0.9.
- All the results presented above generalize to a polyatomic perfect gas with arbitrary $\gamma$. It suffices to increase the dimension of the phase space to account for the internal degrees of freedom of the molecules. The positivity condition then involves an additional constraint on $|\delta T|$. We refer to [14,13] for details and for the explicit formulas for computing $\Delta F^+$ and $\Delta F^-$.

2.2 Principle of the computation and limitation of the gradients of the discrete solution

Many solutions have been proposed in the literature for estimating the gradients of the discrete solution from its values in each cell of the mesh. For the sake of simplicity, and also for reasons related to our data structure, we have chosen a formula, referred to below as (6), in which $q$ denotes an arbitrary component of $W$. This formula is second-order accurate in $h$ if the solution is smooth and the mesh is Cartesian.

To limit the gradients obtained, we used a multidimensional generalization of the 'min-mod' limiter, which consists in requiring that the Min and Max values of the function $\tilde q(x) = q_K + \nabla q_K\cdot(x - x_K)$ at the centres of the faces of element $K$ (and not at the vertices of element $K$, which would be more restrictive) lie between the Min and the Max of the values of $q$ on the elements neighbouring $K$. In practice, one first computes the Min and Max values of $\tilde q(x)$ on element $K$ and the Min and Max values of $q$ on the neighbouring elements. These values are then used to define the limited gradient.

The time derivatives of the discrete solution are estimated from the non-conservative form of the Euler equations. In non-conservative form, the system of gas dynamics can indeed be written:

$$\partial_t s = -U\cdot\nabla s$$

$$\partial_t U = -U\cdot\nabla U - \frac{1}{\rho}\nabla p$$

$$\partial_t T = -U\cdot\nabla T - (\gamma - 1)\,T\,\mathrm{div}\,U$$

The desired estimates of $\partial_t s$, $\partial_t U$ and $\partial_t T$ are obtained from these relations and from the values of $\nabla s$, $\nabla p$, $\nabla T$ and $\nabla U$ computed according to formula (6).

3 Refinement criterion and self-adaptive meshes

3.1 Description of the mesh refinement technique

In order to improve the accuracy of the results for the computation of steady flows, an automatic mesh refinement procedure has been implemented in the 2D and 3D computational codes. The principle is as follows:

1. A first steady solution is computed on the initial coarse mesh.
2. The value of the refinement criterion is then computed on each element of the mesh and, depending on this criterion, some elements are refined according to a procedure described below.
3. A new solution is then computed on the refined mesh, starting from the solution on the previous mesh.
4. The process is repeated if necessary.
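The loop formed by steps 2 to 4 can be sketched as follows for a mesh of triangles. This is only an illustrative sketch under stated assumptions: the flow solver of steps 1 and 3 is omitted, `criterion` is a hypothetical user-supplied function standing in for the refinement criterion evaluated on the current solution, and the subdivision by edge midpoints is an assumption chosen because it yields four children similar to the parent triangle. Each element is refined independently of its neighbours, so the resulting mesh is in general non-conforming.

```python
def refine_triangle(tri):
    """Split a triangle, given as three (x, y) vertices, into four similar
    children by joining the midpoints of its edges."""
    a, b, c = tri

    def mid(p, q):
        return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

    ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


def area(tri):
    """Area of a triangle given as three (x, y) vertices."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def adapt(mesh, criterion, threshold, passes=1):
    """Refine, `passes` times, every element whose criterion value exceeds
    `threshold`. Hanging nodes are allowed: neighbours are left untouched."""
    for _ in range(passes):
        new_mesh = []
        for tri in mesh:
            if criterion(tri) > threshold:
                new_mesh.extend(refine_triangle(tri))
            else:
                new_mesh.append(tri)
        mesh = new_mesh
    return mesh
```

In an actual computation the criterion would be re-evaluated from a new steady solution between two passes; here it depends on the geometry only, which is enough to exercise the refinement logic.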
Our computational code uses 2 types of elements in dimension 2 (triangles and quadrangles) and 3 in dimension 3 (tetrahedra, pentahedra and hexahedra). These different elements may be distributed arbitrarily within a single mesh. In particular, nodal coincidence between two neighbouring elements need not be ensured, as illustrated for example in figure 3.

figure 3: Partial view of a non-conforming mesh

To refine an element, the principle is in all cases to divide it into a number of child elements (4 in dimension 2, 8 in dimension 3), all similar to the parent element. Each element is of course refined independently of its neighbours, so that at the end of a refinement phase the mesh obtained is generally non-conforming (no nodal coincidence between some elements). Figures 4 and 5 sketch the principle of refinement of a triangle and of a quadrangle. In dimension 3, we refer to the work of J. Delaire [20] for a detailed description of the refinement procedure.

figure 4: Refinement of a triangle

figure 5: Refinement of a quadrangle

3.2 A refinement criterion based on the local entropy production

We now describe the refinement criterion used. This criterion was introduced by P. Mazet et al. in [1,11]. It relies on the links between kinetic schemes and the symmetrization of the Euler equations via the entropy variables.

Let us begin with a few reminders on the symmetrization of the Euler equations. To simplify the notation, we restrict ourselves to the case of a polytropic perfect gas. Let $s = r\log\bigl(T^{1/(\gamma-1)}/\rho\bigr)$ be the specific entropy of the gas. It is well known that the function $S(w) = -\rho s$ (where $w$ is the vector of conservative variables) is a strictly convex function of $w$ and constitutes a Lax entropy for the Euler equations, associated with the entropy flux $U\,S(w)$. The entropy variables are then defined as follows:

$$\varphi = \nabla_w S(w) \qquad (7)$$

Relations (7) define a bijective change of variables $w \to \varphi$ from $W_{ad}$ onto $\Phi_{ad} = \mathbb{R}^{d+1} \times \mathbb{R}^*_-$. The Legendre transform $S^*$ of $S$ is then defined on $\Phi_{ad}$ by:

$$S^*(\varphi) = -S(w(\varphi)) + w(\varphi)\cdot\varphi$$

Similarly, a pseudo-Legendre transform of the entropy flux in the direction of the vector $n$, denoted $C^*(\varphi,n)$, is introduced by setting:

$$C^*(\varphi,n) = -U\cdot n\,S(w(\varphi)) + \bigl(F(w(\varphi))\cdot n\bigr)\cdot\varphi$$

where $F(w)\cdot n = [\rho U\cdot n,\ \rho U\,U\cdot n + p\,n,\ (\rho E + p)\,U\cdot n]^T$ denotes the flux in the direction $n$. This function is called the symmetrization function of the system of Euler equations because, by construction, it has the following property:

$$F(w(\varphi))\cdot n = \nabla_\varphi C^*(\varphi,n) \qquad (8)$$

The flux of the Euler equations, expressed in entropy variables, is therefore the gradient of the symmetrization function $C^*(\varphi,n)$. It follows immediately that, written in entropy variables, the system of gas dynamics is symmetric. Furthermore, by (8), any decomposition of the function $C^*(\varphi,n)$ into the sum of two functions $C^{*+}(\varphi,n)$ and $C^{*-}(\varphi,n)$ induces, by differentiation with respect to $\varphi$, a decomposition of the flux $F(w)\cdot n$ into the sum of two functions $F^+(w,n)$ and $F^-(w,n)$. If in addition, for all $n$, $C^{*+}(\varphi,n)$ is convex in $\varphi$ and $C^{*-}(\varphi,n)$ is concave, then it can be shown (see [11]) that the Jacobian of $F^+(w,n)$ is diagonalizable with positive eigenvalues and that the Jacobian of $F^-(w,n)$ is diagonalizable with negative eigenvalues. The decomposition of a symmetrization function into a convex part and a concave part therefore provides a natural way of constructing correctly upwinded flux-splitting schemes.

The link with the kinetic formalism comes from the fact that the function $C^*$ can be written, up to a multiplicative constant, in the following integral form (case of a monatomic gas):

$$C^*(\varphi,n) = \int_{\mathbb{R}^d} (\xi\cdot n)\,\exp\bigl[(\varphi_\rho + \varphi_U\cdot\xi + \varphi_E\,|\xi|^2/2)/r\bigr]\,d\xi \qquad (9)$$

The function $\exp[(\varphi_\rho + \varphi_U\cdot\xi + \varphi_E\,|\xi|^2/2)/r]$ is none other than the Maxwellian. Since the function exp is convex on $\mathbb{R}$, a convex-concave decomposition of the function $C^*$ is obtained simply by setting:

$$C^{*\pm}(\varphi,n) = \int_{\mathbb{R}^d} (\xi\cdot n)^{\pm}\,\exp\bigl[(\varphi_\rho + \varphi_U\cdot\xi + \varphi_E\,|\xi|^2/2)/r\bigr]\,d\xi \qquad (10)$$

Differentiating (10) with respect to $\varphi$, one readily sees that the functions $F^+(w,n)$ and $F^-(w,n)$ thus obtained are none other than the functions $F^+(w,n)$ and $F^-(w,n)$ defined by (1). One can therefore write:

$$F^{\pm}(w,n) = \nabla_\varphi C^{*\pm}(\varphi(w),n) \qquad (11)$$

We now use property (11) and the concavity of the function $C^{*-}(\varphi(w),n)$ to establish an estimate of the local entropy production of the first-order kinetic scheme introduced in section 1, once the steady state has been reached. By definition, when the steady state is reached, one has on every cell $K$ of the mesh:

$$\sum_{e\in\partial K} \bigl[F^+(w_K,n_{K,e}) + F^-(w_{K_e},n_{K,e})\bigr]\,m(e) = 0$$

Using the fact that $\sum_{e\in\partial K} F(w_K)\cdot n_{K,e}\,m(e) = 0$ (Green's formula), and combining this quantity with the left-hand side of the previous equality, one obtains:

$$\sum_{e\in\partial K} \bigl[F^-(w_{K_e},n_{K,e}) - F^-(w_K,n_{K,e})\bigr]\,m(e) = 0 \qquad (12)$$

On the other hand, let us set:

$$C^{\pm}(w,n) = F^{\pm}(w,n)\cdot\varphi(w) - C^{*\pm}(\varphi(w),n) \qquad (13)$$

so that

$$C^+(w,n) + C^-(w,n) = U\cdot n\,S(w)$$

The pair $[C^+(w,n),\ C^-(w,n)]$ therefore constitutes a decomposition of the entropy flux. Taking the scalar product of equality (12) with $\varphi_K = \varphi(w_K)$ and using (13), (11) and Green's formula, one obtains:

$$\sum_{e\in\partial K} \bigl[C^+(w_K,n_{K,e}) + C^-(w_{K_e},n_{K,e})\bigr]\,m(e) = Q_K \qquad (14)$$

$$Q_K = \sum_{e\in\partial K} \bigl[C^{*-}(\varphi_K,n_{K,e}) - C^{*-}(\varphi_{K_e},n_{K,e}) - F^-(w_{K_e},n_{K,e})\cdot(\varphi_K - \varphi_{K_e})\bigr]\,m(e)$$

Since the function $C^{*-}(\cdot,n)$ is concave, the term $Q_K$ is negative. It can be interpreted as the local entropy production on cell $K$, the other term in (14) being simply a flux term. This is the quantity we used as refinement criterion, the explicit computation of the function $C^{*-}$ being easy thanks to definition (10).

4 Implicit scheme

For the computation of 3D steady flows, the use of a time-explicit scheme proved too costly, even with a local time-stepping technique. The scheme described in section 2 was therefore made implicit, following the classical principle of partially linearizing the nonlinear system that must be solved at each time step. From this point of view, kinetic schemes have an interesting feature: the functions $F^-$ and $F^+$ are differentiable and homogeneous of degree 1. They therefore satisfy the following relations:

$$F^+(w,n) = [\mathrm{Jac}(F^+)(w,n)]\,w$$
$$F^-(w,n) = [\mathrm{Jac}(F^-)(w,n)]\,w \qquad (15)$$

which simplifies the linearized form of the implicit scheme and leads to a linear system at each time step.

The second-order correction is not treated implicitly, in order to simplify the expression of the Jacobian matrix of the numerical flux. No stability problem was encountered as a result.

At each time step, this linear system is solved by an iterative method. Two were compared: the Jacobi method and the BICGstab method [17]. If a moderate accuracy is accepted at each solve (which is sufficient in practice), the Jacobi method is slightly more efficient in CPU time. The trend is reversed if a very high accuracy is required. Using a preconditioner for the BICGstab method markedly improves the convergence rate (by a factor of at least 2) but brings no gain in CPU time, given the cost of the preconditioning.

5 Numerical results

In order to assess the accuracy and robustness of this new scheme, we carried out several numerical experiments.

First, to illustrate numerically the preservation of positivity, we computed the solution of the Riemann problem proposed by Sjogreen in [3], for which the solution is very close to vacuum. The computations were performed without a min-mod type slope limiter (cf. figure 6) and then with the limiter (cf. figure 7). The results obtained without limitation are good but show a few oscillations, which disappear when the limiter is used. For more complex test cases (presence of discontinuities) the use of a limiter is necessary.

Figures 8 and 9 show the results for a 2D hypersonic flow at Mach 25 and 30 degrees of incidence. The mesh was obtained after three successive refinements. The Roe method could only be used at first order because at second order, even with strong slope limitations, negative temperatures appear at the afterbody. The results obtained with the second-order kinetic scheme are very good and much better than those obtained with the Roe scheme. In particular, one can observe the absence of oscillations on the curves of the pressure coefficient at the wall (figure 8), in contrast to the results obtained with the first-order Roe scheme (with entropy correction).

Figures 10 to 12 show the results for the computation of an unsteady flow entering at Mach 3 a tunnel containing a step. This test case is very classical and well documented. We compare the results obtained with the Roe MUSCL method (with entropy correction) and with the kinetic scheme (without entropy correction). The two meshes used are made of triangles (the mesh size chosen is 1/40, which is rather coarse for this case). The second mesh was refined near the wall in order to eliminate the influence of the numerical boundary layer. It can be noted that, on both meshes, the sonic expansion is better captured with the kinetic scheme than with the Roe scheme. Numerous oscillations are also observed on the iso-density curves obtained with the Roe scheme (they disappear if the slope limitations are strengthened). On the other hand, with the kinetic scheme, the position of the shock line after the second reflection is not correct. This defect seems to be due to the excessive thickness of the numerical boundary layer. It disappears when the mesh is refined near the wall. With the Roe scheme, the numerical oscillations present along the slip line are amplified when the mesh is refined near the wall.
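As a small-scale illustration of the near-vacuum Riemann test mentioned at the beginning of this section, the sketch below applies the first-order kinetic scheme in one space dimension. It is not the authors' code: it assumes the standard closed-form half-fluxes of the Maxwellian splitting for a perfect gas with $\gamma = 1.4$ (written with the error function), a uniform grid, end cells held fixed, and the strong double-rarefaction data $\rho = 1$, $u = \mp 2$, $p = 0.4$, which are commonly used for this kind of test and are an assumption here. The names `split_flux`, `step` and `near_vacuum` are hypothetical.

```python
import math

GAMMA = 1.4


def split_flux(rho, u, p, sign):
    """Half-flux F+ (sign=+1) or F- (sign=-1) of the Maxwellian kinetic
    splitting in 1D, for a perfect gas with internal energy p/(GAMMA-1)."""
    beta = rho / (2.0 * p)                       # 1 / (2 r T)
    s = u * math.sqrt(beta)
    a = 0.5 * (1.0 + sign * math.erf(s))
    b = sign * 0.5 * math.exp(-s * s) / math.sqrt(math.pi * beta)
    e = p / (GAMMA - 1.0) + 0.5 * rho * u * u    # total energy rho*E
    return (rho * (u * a + b),
            (p + rho * u * u) * a + rho * u * b,
            (e + p) * u * a + (e + 0.5 * p) * b)


def primitives(w):
    """Convert conservative variables (rho, rho*u, rho*E) to (rho, u, p)."""
    rho, m, e = w
    u = m / rho
    return rho, u, (GAMMA - 1.0) * (e - 0.5 * m * u)


def step(cells, dx, cfl):
    """One explicit first-order step; the two end cells are held fixed."""
    prim = [primitives(w) for w in cells]
    speed = max(abs(u) + math.sqrt(GAMMA * p / rho) for rho, u, p in prim)
    dt = cfl * dx / speed
    fp = [split_flux(rho, u, p, +1) for rho, u, p in prim]
    fm = [split_flux(rho, u, p, -1) for rho, u, p in prim]
    new = list(cells)
    for i in range(1, len(cells) - 1):
        # Interface flux = F+(left cell) + F-(right cell).
        new[i] = tuple(
            cells[i][k] - dt / dx * (fp[i][k] + fm[i + 1][k]
                                     - fp[i - 1][k] - fm[i][k])
            for k in range(3))
    return new, dt


def near_vacuum(n=100, t_end=0.1, cfl=0.4):
    """Double rarefaction on [0, 1]; returns the final (rho, u, p) per cell."""
    def state(rho, u, p):
        return (rho, rho * u, p / (GAMMA - 1.0) + 0.5 * rho * u * u)

    cells = [state(1.0, -2.0, 0.4) if i < n // 2 else state(1.0, 2.0, 0.4)
             for i in range(n)]
    t, dx = 0.0, 1.0 / n
    while t < t_end:
        cells, dt = step(cells, dx, cfl)
        t += dt
    return [primitives(w) for w in cells]
```

The two half-fluxes sum to the exact Euler flux and are homogeneous of degree 1 in the conservative variables, which is the property used for the implicit scheme; the value CFL = 0.4 is chosen below the sufficient bound of about 0.5 quoted for $\gamma = 1.4$.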

Figures 13 and 14 show the computation of a steady flow entering at Mach 2 a tunnel containing a ramp inclined at 15 degrees. The mesh was obtained after three successive refinements starting from a coarse mesh of 2500 cells. The refinement criterion used (cf. section 3.2) made it possible to detect all the waves present in the flow; in particular, the mesh was refined along the slip line emanating from the triple point located on the upper wall (cf. figure 13). Figure 14 shows that refining the mesh brought a marked improvement in the quality of the results.

Finally, figures 15 to 17 present 3D numerical results for the computation of a transonic flow (Mach: 0.84, incidence: 3.06 degrees) around the ONERA M6 wing. This test case is very well documented in [21]. The initial mesh consists of about 60000 tetrahedra, which is rather coarse for this type of computation. The final mesh (figure 15) was obtained after 2 successive refinements. The results obtained are in full agreement with those of the various contributors to the AGARD workshop [21]. Here again, mesh refinement markedly improves the accuracy of the results.

6 Conclusion

We have presented in this article a new second-order kinetic scheme that preserves the positivity of the density and of the temperature under a CFL condition. The first numerical results obtained on unstructured meshes are very good and confirm the theoretical robustness properties of the scheme. Moreover, the discrete entropy estimate associated with the first-order scheme naturally yields a mesh refinement criterion based on the local entropy production. This criterion appears to be an excellent candidate for capturing steady discontinuities.

The continuation of this study will consist in extending the scheme to the computation of reacting flows, for which the robustness of the numerical method is an essential criterion. This work is in progress.

References

[1] F. BOURDEL, J.P. CROISILLE, P. DELORME, P. MAZET, Sur l'approximation par éléments finis des systèmes hyperboliques K-diagonalisables, Application aux équations d'Euler et aux mélanges de gaz, La Recherche Aérospatiale, 1989-5, 15-34
[2] F. COQUEL, P.G. LEFLOCH, Convergence of finite difference schemes for scalar conservation laws in several space dimensions: the corrected antidiffusive flux approach, Math. of Comp. 57, 1991, p 169-210
[3] B. EINFELDT, C.D. MUNZ, P.L. ROE and B. SJOGREEN, On Godunov type methods near low densities, J. of Comp. Phys., Vol. 92, 1991, 273-295
[4] S.M. DESHPANDE, On the Maxwellian distribution, symmetric form and the entropy conservation for the compressible Euler equations, Tech. Rep. 2583, NASA Langley, 1986
[5] S.M. DESHPANDE, A second order accurate, kinetic theory based, method for inviscid compressible flows, Tech. Rep. 2613, NASA Langley, 1986
[6] E. GODLEWSKI, P.A. RAVIART, Hyperbolic systems of conservation laws, SMAI, 1990
[7] B. PERTHAME, Boltzmann type schemes for gas dynamics and the entropy property, SIAM J. Num. Anal., Vol 27-6, 1990, 1405-1421
[8] B. PERTHAME, F. CORON, Numerical passage from kinetic to fluid equations, SIAM J. Num. Anal., Vol 28-1, 1991, 26-42
[9] B. PERTHAME, Second order Boltzmann schemes for compressible Euler equations in one and two space variables, SIAM J. Num. Anal., Vol. 29-1, 1992
[10] B. PERTHAME, Y. QIU, A new variant of Van Leer's method for multidimensional systems of conservation laws, Rapport technique 1562, INRIA, 1991
[11] P. VILLEDIEU, P.A. MAZET, Schémas cinétiques pour les équations d'Euler hors équilibre thermochimique, to appear in La Recherche Aérospatiale
[12] P. VILLEDIEU, Approximations de type cinétique du système hyperbolique de la dynamique des gaz hors équilibre thermochimique, Thèse de l'université Paul Sabatier, 1994
[13] P. VILLEDIEU, Schémas cinétiques d'ordre 2 sur maillages non structurés, Rapport Technique DERI 213526.00, Feb. 1995
[14] P. VILLEDIEU, J.L. ESTIVALEZES, High order positivity preserving schemes for the compressible Euler equations, to appear in SIAM J. Num. Anal.
[15] P. VILLEDIEU, J.L. ESTIVALEZES, A second order positivity preserving scheme for the compressible Euler equations, Proceedings of the 14th ICNMFD, Bangalore, 1994
[16] P. VILLEDIEU, A positivity preserving second order accurate kinetic scheme for chemically reactive flows, in preparation
[17] H.A. VAN DER VORST, Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., Vol 13-2, 1992
[18] C. CERCIGNANI, The Boltzmann equation and its applications, Scottish Academic Press 1975, Reed. A.M.S. 64 (Springer)
[19] D.I. PULLIN, Direct simulations methods for compressible inviscid ideal gas-flows, J. of Comp. Phys., Vol. 34, 1980, 53-66
[20] J. DELAIRE, Algorithmes de raffinement de maillage non structuré 3D, Rapport de Stage DERI, Sept. 1995
[21] AGARD ADVISORY REPORT, Test cases for inviscid flow field methods, May 1985
[22] G. MEHLMAN, Etude de quelques problèmes liés aux écoulements en déséquilibre chimique et thermique, Thèse de doctorat de l'Ecole Polytechnique, 1991

figure 6: Sjogreen test case: with no Min-Mod limitations (panels: density, pressure, velocity)

figure 7: Sjogreen test case: with Min-Mod limitations (panels: density, pressure, velocity)

figure 8: Hermes test case: Refined Mesh (10000 cells) and Cp at the wall

figure 9: Hermes test case: Iso Mach Lines on the Refined Mesh (panels: kinetic scheme, Roe scheme)

figure 10: Medium Mesh (5000 cells) and Refined Mesh (14000 cells)

figure 11: Emery test case: Iso density Lines on the medium Mesh (panels: kinetic scheme, Roe scheme)

figure 12: Emery test case: Iso density Lines on the refined Mesh (panels: kinetic scheme, Roe scheme)

figure 13: Coarse Mesh (2500 cells) and Refined Mesh (9000 cells)

figure 14: Iso density Lines (panels: coarse mesh, refined mesh)

figure 15: Coarse Mesh and Refined Mesh on the Wing

figure 16: Cp on the Wing at different sections

figure 17: Iso Pressure Lines on the Wing (panels: coarse mesh, refined mesh)

A meshless technique for computer analysis of high speed flows

T. Fischer, E. Oñate and S. Idelsohn


International Center for Numerical
Methods in Engineering
Edificio C-1, Campus Norte UPC
C/ Gran Capitán, s/n
08034 Barcelona, Spain

1. ABSTRACT

This paper describes the results of the research carried out by the authors in the computer modelling of flow problems using an approximation based on "clouds of points" which does not require the definition of a mesh. The so-called Finite Point Method (FPM) [5] is presented, showing some examples for the solution of the 1D convection-diffusion equation and 2D compressible inviscid flows.

2. INTRODUCTION

The finite element method (FEM) and the finite volume method (FVM) are well established numerical techniques whose main advantage is their ability to deal with complicated domains in a simple manner while maintaining a local character in the approximation. Both methods seek to divide the total domain into a finite number of subdomains (or elements) wherein a volume integration is performed. For these reasons the subdomains are limited by some regularity of geometrical conditions such as having a positive volume or a limited aspect ratio between elements, angles, etc. Although this poses no serious difficulties for 2D situations, the lack of efficient 3D mesh generators makes the solution of 3D problems a difficult task.

It is widely acknowledged that efficient 3D mesh generation remains one of the big challenges in FE and FV computations. Thus, even the more complex problems in CFD, such as some 3D solutions of the Navier-Stokes equations, can be accurately tackled nowadays provided an acceptable 3D mesh is available. However, the generation of 3D meshes, despite major recent advances, is still a bottleneck and it can absorb far more time and effort than the numerical solution itself.

Different authors have recently investigated the possibility of deriving numerical methods without using meshes. Nayroles et al. [1] proposed a technique, calling it the diffuse element method (DEM), where only some nodes and a boundary description are necessary to formulate the Galerkin equations. The interpolating functions are polynomials fitted to the nodal values by a least squares approximation. Although no finite element mesh is explicitly required in this method, still some kind of "auxiliary grid" is needed in order to compute numerically the integral expressions deriving from the Galerkin approach. This requirement may preclude the successful extension of the DEM to 3D problems.

More recently, Belytschko et al. [2] have proposed an extension of the DEM which they call the element-free Galerkin (EFG) method. In that work, generalized moving least squares interpolants typically exploited in curve and surface fitting are used to define the local approximation. This provides additional terms in the derivatives of the unknown field omitted by Nayroles et al. [1]. In addition, a regular cell structure is chosen as the "auxiliary grid" to compute the integrals by means of a higher order quadrature. Finally, Lagrange multipliers are used to enforce the essential boundary conditions. The same approach has been further generalized by Liu et al. [3] by introducing concepts from wavelet theory.

The use of "clouds of points" to define local approximations is by no means new and it has enjoyed some popularity among finite difference (FD) practitioners to derive generalized FD schemes in arbitrary irregular grids. Here typically the concept of a "star" of nodes is introduced to derive FD approximations by means of a local Taylor expansion using the information provided by the number and position of nodes contained in each star. These ideas have been successfully applied in fluid mechanics under the name of

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

Smooth Particle Hydrodynamics Method. An extension of these concepts to the solution of high speed flow problems has recently been attempted by Batina [4].

In this paper a general methodology for the numerical solution of high speed flows using a finite set of arbitrary points is described. The approach proposed incorporates the main features of generalized finite difference schemes and other more recent point data based procedures such as the DEM and the EFG [5]. The theoretical basis of the method in the context of the solution of viscous and inviscid flows is described in some detail. The accuracy and applicability of this method are shown in some examples of application in 1D and 2D flow problems.

3. METHODOLOGY

3.1 The Finite Point method (FPM)

From a polynomial expansion of order m a function $u(x)$ can be approximated in a local interpolating domain $\Omega_i$ (sometimes also termed a "cloud") as

$$u(x) \cong \hat{u}(x) = a_1 + a_2 x + a_3 x^2 + \dots + a_m x^{m-1} = \mathbf{p}^T(x)\,\mathbf{a} \tag{1}$$

where the base functions are $\mathbf{p}^T = [1, x]$ for m = 2 and $\mathbf{p}^T = [1, x, x^2]$ for m = 3 in one dimension [5].

The above approximation can now be sampled at n points within $\Omega_i$ where the values of the unknown $u_j^h = u(x_j)$ are sought, i.e.

$$\mathbf{u}^h = \begin{bmatrix} u_1^h \\ \vdots \\ u_n^h \end{bmatrix} = \begin{bmatrix} \mathbf{p}_1^T \\ \vdots \\ \mathbf{p}_n^T \end{bmatrix}\mathbf{a} = \mathbf{C}\,\mathbf{a} \tag{2}$$

where $\mathbf{p}_j = \mathbf{p}(x_j)$ and C is an n x m matrix.

3.1.1 Finite element interpretation

If n = m, a standard finite element interpolation is obtained by inverting eq. (2) and substituting into (1) as

$$\hat{u}(x) = \mathbf{p}^T(x)\,\mathbf{C}^{-1}\mathbf{u}^h = \mathbf{N}^T(x)\,\mathbf{u}^h \tag{3}$$

N being the standard finite element shape functions [7].

3.1.2 Least squares interpolation

Increasing the number of nodes in $\Omega_i$ to n > m, we cannot directly invert C anymore. However, through least squares approximation a square matrix is obtained which can be inverted if C has full rank, which is assumed in what follows. Hence, the following sum of squares can be written using eq. (1):

$$J = \sum_{j=1}^{n}\left(u_j^h - \hat{u}(x_j)\right)^2 = \sum_{j=1}^{n}\left(u_j^h - \mathbf{p}_j^T\mathbf{a}\right)^2 \tag{4}$$

Figure 1: Nodal unknowns $u^h$ and the interpolated function $\hat{u}$.

Minimizing J with respect to a, $\partial J/\partial \mathbf{a} = 0$, yields

$$\mathbf{a} = \mathbf{A}^{-1}\mathbf{B}\,\mathbf{u}^h \tag{5}$$

with $\mathbf{A} = \mathbf{C}^T\mathbf{C}$ and $\mathbf{B} = \mathbf{C}^T$. The new shape functions are now obtained as

$$\mathbf{N}^T(x) = \mathbf{p}^T(x)\,\mathbf{A}^{-1}\mathbf{B} \tag{6}$$

This means that an interpolated curve $\hat{u}(x)$ is generated from some point values $u^h$ in each cloud as shown in Figure 1. Note that the fitted curve does not necessarily pass through the nodal unknowns $u^h$.

Recently, Batina [4] has used a similar type of least squares fit for fluxes and stresses in compressible flow analysis. However, he avoids the direct inversion of matrix A by doing a QR decomposition.

In the present approach the danger of A being singular is avoided by appropriately selecting the points

in the interpolation domain $\Omega_i$. This reduces both computational cost and memory.

3.1.3 Weighted Least Squares Approach

A drawback of the interpolation procedure presented so far is that equal weight is given to all the points in $\Omega_i$. This can rapidly cause a deterioration of the approximation [SI. A remedy can be the introduction of weighting functions, such as a Gauss function, which will be described next.

Following the least squares approach from above, we can directly include the weighting functions $w(x_j)$ in eq. (4):

$$J = \sum_{j=1}^{n} w(x_j)\left(u_j^h - \hat{u}(x_j)\right)^2 = \sum_{j=1}^{n} w(x_j)\left(u_j^h - \mathbf{p}_j^T\mathbf{a}\right)^2 \tag{7}$$

Again minimising J with respect to a, we obtain

$$\mathbf{a} = \mathbf{A}^{-1}\mathbf{B}\,\mathbf{u}^h \tag{8}$$

with $\mathbf{A} = \mathbf{C}^T\mathbf{W}\mathbf{C}$ and $\mathbf{B} = \mathbf{C}^T\mathbf{W}$. W is now a diagonal matrix containing the weights $w(x_j)$ at each point in $\Omega_i$.

In [6], the authors demonstrate a strong sensitivity to the number of points chosen within each cloud $\Omega_i$ if no weighting functions are used. In an example, the shape function plots show a drastic deterioration for both linear and quadratic base functions p.

3.2 The FPM in a one dimensional context

Let us now apply the theoretical background to a typical test problem, the linear 1D convection diffusion equation, and compare its results to known solutions from the FEM. Consider the 1D convection-diffusion equation:

$$\frac{\partial u}{\partial t} + A\frac{\partial u}{\partial x} - \frac{\partial}{\partial x}\left(k\frac{\partial u}{\partial x}\right) = 0 \tag{9}$$

At steady state ($\partial u/\partial t = 0$) equation (9) becomes:

$$A\frac{\partial u}{\partial x} - \frac{\partial}{\partial x}\left(k\frac{\partial u}{\partial x}\right) = 0 \tag{10}$$

with u = u(x).

Taking A and k constant, the analytical solution of this linear homogeneous differential equation on $0 \le x \le L$, with boundary values $u(0) = u_0$ and $u(L) = u_L$, is obtained as:

$$u(x) = u_0 + (u_L - u_0)\,\frac{e^{Ax/k} - 1}{e^{AL/k} - 1} \tag{11}$$

With $u_0 = 1$ and $u_L = 0$, equation (11) reduces to

$$u(x) = \frac{e^{AL/k} - e^{Ax/k}}{e^{AL/k} - 1} \tag{12}$$

A test for time marching schemes is solving equation (9) by iterating until steady state is reached to approximate the exact result. This is usually done by expanding equation (9) in time using a Taylor series:

$$u^{n+1} = u^n + \sum_{i=1}^{\infty}\frac{\Delta t^i}{i!}\,\frac{\partial^i u^n}{\partial t^i} \tag{13}$$

A discretisation in space must be performed next. First, known and proven finite element methods will be presented, and then the finite point method proposed will be described.

3.2.1 Finite element solution

It is well known that the exact solution to this problem can be nodally reproduced by the finite element method using the following Petrov-Galerkin method for all ranges of the Peclet number Pe [7]. This can be achieved by expanding equation (13) up to first order, replacing $\partial u/\partial t$ with equation (9) and discretising in space using Petrov-Galerkin shape functions:

$$W = N + \alpha\,\frac{h}{2}\,\frac{\partial N}{\partial x} \tag{14}$$

with the upwind parameter $\alpha = \coth(Pe) - 1/Pe$, which is optimal for this equation. The so called Taylor-Galerkin approach can also be used to recover exact nodal values for this problem. In fact, if equation (13) is expanded up to second order (omitting third order derivatives) and standard Galerkin linear shape functions are used, equivalence to the Petrov-Galerkin scheme can be proved [7] for

$$\Delta t_{opt} = \frac{\alpha h}{A} \quad\text{and}\quad Pe = \frac{Ah}{2k} \tag{15}$$
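The nodal exactness claimed for the optimal upwind parameter can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a uniform grid of linear elements on 0 <= x <= 1 with u(0) = 1 and u(1) = 0, for which the Petrov-Galerkin weighting of eq. (14) reduces to the central difference stencil with the balancing diffusion alpha*A*h/2 added to k. The function names `exact_solution` and `solve_steady` are hypothetical.

```python
import numpy as np


def exact_solution(x, A, k, L=1.0):
    """Exact steady solution for u(0) = 1, u(L) = 0 (constant A and k)."""
    return (np.exp(A * L / k) - np.exp(A * x / k)) / (np.exp(A * L / k) - 1.0)


def solve_steady(n_nodes, A, k, L=1.0, upwind=True):
    """Solve A u' - k u'' = 0 on a uniform grid of linear elements.

    upwind=True adds the Petrov-Galerkin balancing diffusion alpha*A*h/2
    with alpha = coth(Pe) - 1/Pe; upwind=False is the plain Galerkin
    (central difference) stencil.
    """
    x = np.linspace(0.0, L, n_nodes)
    h = x[1] - x[0]
    pe = A * h / (2.0 * k)
    alpha = 1.0 / np.tanh(pe) - 1.0 / pe if upwind else 0.0
    k_eff = k + 0.5 * alpha * A * h

    K = np.zeros((n_nodes, n_nodes))
    rhs = np.zeros(n_nodes)
    K[0, 0] = K[-1, -1] = 1.0          # Dirichlet rows
    rhs[0] = 1.0                       # u(0) = 1, u(L) = 0
    for i in range(1, n_nodes - 1):
        K[i, i - 1] = -A / (2.0 * h) - k_eff / h**2
        K[i, i] = 2.0 * k_eff / h**2
        K[i, i + 1] = A / (2.0 * h) - k_eff / h**2
    return x, np.linalg.solve(K, rhs)


if __name__ == "__main__":
    A, n_nodes = 1.0, 11
    h = 1.0 / (n_nodes - 1)
    for pe in (0.5, 1.0, 2.5):
        k = A * h / (2.0 * pe)         # choose k to hit the target Peclet number
        x, u_pg = solve_steady(n_nodes, A, k, upwind=True)
        _, u_g = solve_steady(n_nodes, A, k, upwind=False)
        u_ex = exact_solution(x, A, k)
        print(pe, np.abs(u_pg - u_ex).max(), np.abs(u_g - u_ex).max())
```

With the optimal alpha the nodal error is at round-off level for all three Peclet numbers, whereas the plain Galerkin stencil shows a visible error; the Taylor-Galerkin scheme with the time step of eq. (15) leads to the same steady stencil.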
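The least squares fit of Section 3.1 is the building block of the finite point discretisation, so a small numerical sketch may help. It assumes a 1D cloud and a monomial basis and evaluates the shape functions p^T A^{-1} B and their first derivative at one point; the function name `shape_functions` and the `weights` argument are hypothetical, and the linear system is solved directly rather than by the QR route attributed to Batina [4].

```python
import numpy as np


def shape_functions(x_cloud, x_eval, m=2, weights=None):
    """Least squares shape functions N and dN/dx of one cloud at x_eval.

    Uses a = A^{-1} B u^h with A = C^T W C and B = C^T W, so that
    u_hat(x) = p^T(x) A^{-1} B u^h. weights=None means W = I.
    """
    x_cloud = np.asarray(x_cloud, dtype=float)
    n = x_cloud.size
    if n < m:
        raise ValueError("a cloud needs at least m points")
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)

    powers = np.arange(m)
    C = x_cloud[:, None] ** powers            # row j is p^T(x_j) = [1, x_j, ...]
    B = C.T * w                               # C^T W for a diagonal W
    A = B @ C                                 # C^T W C
    a_from_u = np.linalg.solve(A, B)          # A^{-1} B without forming A^{-1}

    p = float(x_eval) ** powers               # p(x) = [1, x, x^2, ...]
    dp = np.zeros(m)
    dp[1:] = powers[1:] * float(x_eval) ** (powers[1:] - 1)
    return p @ a_from_u, dp @ a_from_u


if __name__ == "__main__":
    h = 0.1
    cloud = [0.5, 0.5 - h, 0.5 + h]           # central point first, then its neighbours
    N, dN = shape_functions(cloud, 0.5, m=2)
    print("N     =", N)
    print("dN/dx =", dN)
```

For three equally spaced points and a linear basis the derivative weights are those of a central difference, and the value weights are all 1/3, which illustrates that the fitted curve need not pass through the nodal unknowns.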

Figure 2 shows the exact nodal values obtained using a Taylor-Galerkin two-step scheme with $\Delta t_{opt}$ and Pe = 1 [7,8].

Figure 2: Exact solution to the convection diffusion equation using a Taylor-Galerkin scheme.

3.2.2 The finite point method (FPM)

Let us now analyse the finite point method in the context of the 1D convection-diffusion equation. Integrating eq. (9) in time by performing a Taylor expansion as in eq. (13) up to second order leads to:

$$u^{n+1} = u^n + \Delta t\,\frac{\partial u^n}{\partial t} + \frac{\Delta t^2}{2}\,\frac{\partial^2 u^n}{\partial t^2} \tag{17}$$

Inserting eq. (9) into (17) and omitting third order derivatives, we obtain (for constant A and k):

$$u^{n+1} = u^n - \Delta t\left[A\frac{\partial u}{\partial x} - k\frac{\partial^2 u}{\partial x^2}\right]^n + \frac{\Delta t^2}{2}\,A^2\,\frac{\partial^2 u^n}{\partial x^2} \tag{18}$$

The discretisation of the computational domain is performed locally using arbitrary points, without the need for fixed connectivities in a conventional mesh.

Performing a least squares approximation in the vicinity of a point using n points, we obtain an estimation of the necessary spatial derivatives $\partial u/\partial x$ and $\partial^2 u/\partial x^2$. Substituting these derivatives into eq. (18) leads to a system of equations from which the unknown point values $u_j^h$ can be found for each time increment. The approach is equivalent to using a point collocation scheme [SI.

As explained earlier, the unknown function $u(x)$ and its derivatives may be expanded as follows in a given cloud, for the linear case (in what follows we assume $u(x) = \hat{u}(x)$):

$$u = a_1 + a_2 x, \qquad \frac{\partial u}{\partial x} = a_2 \tag{19}$$

and for the quadratic case:

$$u = a_1 + a_2 x + a_3 x^2, \qquad \frac{\partial u}{\partial x} = a_2 + 2 a_3 x, \qquad \frac{\partial^2 u}{\partial x^2} = 2 a_3 \tag{20}$$

For the linear case, it is not possible to directly compute the necessary second order derivatives. This can be overcome by performing an accumulation of differences at the central point and the rest of the points j within the cloud. Hence,

[equation (21) is not legible in the source]

with point 1 being the central point.

It can be shown that this scheme, for equally spaced points (n = 3 and m = 2, 3), is equivalent to central differences, which in turn is equivalent to FEM with Galerkin shape functions [8].

Figure 3: Equally spaced points (ordered 2, 1, 3 along x, spacing h) and their domain of influence for n = 3. Note the derivative is equivalent to a central difference approximation.

The following shows this for the linear case (m = 2). Consider three points (1, 2, 3) with the coordinates

$(x_1, x_2, x_3)$ and at equal distance h from each other as shown in Figure 3, i.e. $x_2 = x_1 - h$ and $x_3 = x_1 + h$. Let their unknown values be $u_1, u_2, u_3$, respectively. Then, A from equation (5) can be calculated as

$$\mathbf{A} = \mathbf{C}^T\mathbf{C} = \begin{bmatrix} 3 & 3x_1 \\ 3x_1 & 3x_1^2 + 2h^2 \end{bmatrix} \tag{22}$$

Inversion of A leads to

$$\mathbf{A}^{-1} = \frac{1}{6h^2}\begin{bmatrix} 3x_1^2 + 2h^2 & -3x_1 \\ -3x_1 & 3 \end{bmatrix} \tag{23}$$

Eq. (5) gives the polynomial coefficients $a_1$ and $a_2$ as:

$$a_1 = \frac{u_1 + u_2 + u_3}{3} - x_1\,\frac{u_3 - u_2}{2h}, \qquad a_2 = \frac{u_3 - u_2}{2h} \tag{24}$$

The first derivatives in the linear case are constant. Using eq. (19), (23) and (24) they become now:

$$\frac{\partial u}{\partial x} = a_2 = \frac{u_3 - u_2}{2h} \tag{25}$$

which is exactly a central difference approximation.

The second differences $\partial^2 u/\partial x^2$ are taken as accumulated differences at the central node 1, which leads to

$$\left.\frac{\partial^2 u}{\partial x^2}\right|_1 = \frac{u_2 - 2u_1 + u_3}{h^2} \tag{26}$$

which also is a central difference approximation. By analogy, we can derive similar statements for the quadratic case (m = 3).

Having shown the equivalence of FPM (n = 3, m = 2, 3) with FEM, it should be possible to recover nodally exact values for n = 3 and m = 2, 3 using the finite point method. Figure 4 demonstrates this for Pe = 1.

Figure 4: Exact nodal values obtained by FPM for Pe = 1, n = 3 and m = 2, 3 (linear and quadratic polynomials).

However, if the number of points in the local interpolating domain $\Omega_i$ increases (n > 3), the algorithm introduces excessive diffusion and the quality of the result deteriorates, especially near strong gradients.

Figure 5: Gauss weighting function; $w(r_j)$ quickly decreases as $r_j$ increases (for $\lambda_i c = 1$).

3.2.3 Introduction of weighting functions

It is now of interest to see if the results can be improved by the use of weighting functions $w_j$ within $\Omega_i$. The idea is to give additional weight to points close to the central point and reduce it for points farther away. Within each $\Omega_i$ we define $w_j$ as a function of the distance of each point j to the central point:

$$w_j = w(r_j), \quad\text{and}\quad r_j = |x_j - x_1| \tag{27}$$

A possible choice for weighting could be a Gauss distribution which was also used in all following calculations:

$$w(r_j) = e^{-\left(\frac{r_j}{\lambda_i c}\right)^p} \tag{28}$$

where $\lambda_i$ is a characteristic length in each cloud $\Omega_i$. c and p are user defined constants to adjust the sensitivity of the weighting function. Usually, c = 1 and p = 2 are chosen. Figure 5 displays $w(r_j)$ graphically for $\lambda_i c = 1$. Further information on the FPM can be found in [5,6].

4. NUMERICAL EXAMPLES

4.1 1D Convection diffusion equation

Let us test the solution of equation (18) using the FPM without weighting and with Gaussian weighting functions.

Figure 6: Convection diffusion equation using linear base functions and 3 points per cloud for a) Pe = 0.5, b) Pe = 1 and c) Pe = 2.5 (curves: exact, without weighting, using Gauss weighting functions).

Fig. 6 shows the FP solution using linear base functions (m = 2) and 3 points per cloud for three different Peclet numbers: a) Pe = 0.5, b) Pe = 1.0 and c) Pe = 2.5. Observe that exact nodal values are obtained in all cases.

Figure 7: Convection diffusion equation using linear base functions and 4 nodes per cloud for a) Pe = 0.5, b) Pe = 1 and c) Pe = 2.5 (curves: exact, without weighting, using Gauss weighting functions).

However, as the number of nodes in the cloud is increased, a deterioration of the solution is visible when no weighting functions are employed. Figures 7

and 8 demonstrate this behavior for n = 4 and n = 5.

Using Gaussian weighting functions, the solution improves markedly. With $\lambda_i = r_{min}$, where $r_{min}$ refers to the minimum distance r in $\Omega_i$, practically exact nodal values are recovered for this 1D test problem (see Figures 7 and 8).

Figure 8: Convection diffusion equation using linear base functions and 5 nodes per cloud for a) Pe = 0.5, b) Pe = 1 and c) Pe = 2.5 (curves: exact, without weighting, using Gauss weighting functions).

The extension to quadratic base functions (m = 3) exhibits an even stronger need for weighting functions. Whereas with 3 noded clouds (n = 3) exact nodal values are computed (Figure 9), strong oscillations occur as n is increased (Figure 10). These oscillations disappear if a weighted interpolation is used.

Figure 9: Convection diffusion equation: quadratic base functions and 3 nodes per cloud for a) Pe = 0.5, b) Pe = 1 and c) Pe = 2.5.

Additionally, we have also found that the quality of the results worsens as the Peclet number increases if no weighting functions are employed. Note that again nearly exact nodal solutions are recovered by employing Gaussian weighted interpolation.

4.2 Extension to the 2D Euler equations

4.2.1 Governing equations

The ideas from the one dimensional problem are extended to the solution of the non-linear two dimensional Euler equations:

[equation not legible in the source]

where

[equation not legible in the source]

The different terms have the usual meaning [8].

Figure 10: Convection diffusion equation using quadratic base functions and 6 points per cloud for Pe = 1.

4.2.2 Time discretization

A two-step scheme is employed in order to advance the solution in time towards steady state, i.e.

[equation not legible in the source]

4.2.3 Stability

The two-step scheme leads to a conditionally stable explicit second order algorithm with the following limit for $\Delta t$:

$$\Delta t = \frac{C\,h}{|\mathbf{u}| + c} \quad\text{and}\quad C < 1$$

where C is the Courant number.

A difficulty in a multidimensional context arises from the determination of h in a given cloud of points $\Omega_i$. In finite elements, using linear triangular elements, h is defined according to the minimum height within each element [8]. In meshless methods, a clear definition has not been presented yet. In our work, h has been taken equal to the minimum distance to the center point within each interpolation domain $\Omega_i$.

4.2.4 Balancing dissipation

Since the hyperbolic Euler equations do not contain any diffusion terms, some balancing damping must be added to prevent unphysical oscillations. Following Jameson [9], 2nd and 4th order diffusion terms are added to the fluxes. These terms are constructed as follows in the FPM:

[equation (32) is not legible in the source]

where $w_j$ are the same weighting functions used in the interpolations of eq. (8) and the coefficients of eq. (32) are obtained as:

[equation (33) is not legible in the source]

$\alpha^{(2)}$ and $\alpha^{(4)}$ are user defined constants. The summation over j extends across the number of points in each cloud and is accumulated at both the central point i and the point j. In subcritical flows $\varepsilon^{(2)}$ is generally switched off.

4.2.5 Selection of points

In a multidimensional domain, the difficulty arises of how to define each local interpolating domain. Even though weighting functions are employed, it is still necessary to choose the most significant points for each $\Omega_i$. For the results of this paper, the central point plus the n - 1 closest points are chosen. However, a condition of quadrants is imposed such that

there must be at least one point in every quadrant of orthogonal axes. This leads to a minimum of 5 points per cloud.

At the boundary, the two points adjacent to the central point on the boundary plus the closest points are chosen. Another condition is that no boundary section is crossed so that points from the opposite side are not chosen. For instance, at the trailing edge of an airfoil, the closest points to a point on one side of the airfoil may lie across the wall on the other side. Since there is a physical separation of these points, they are not included in the same interpolation domain.

4.3 2D Results

4.3.1 Subsonic test case

The first 2D test case considered is a NACA0012 profile with a free stream Mach number of 0.5 and 0 degrees angle of attack, analyzed by Zienkiewicz et al [10,11]. In order to compare solutions, a finite element solution has been taken for comparison. The meshless grid of 2556 points is shown in Figure 11. The same points have been used for the FE solution on the equivalent unstructured triangular mesh obtained using a standard advancing front technique [7,8].

Figure 11: Point distribution around a NACA0012 profile.

Again, the idea is to compare the influence of the weighting functions in the finite point approximation. In previous reports [5,6] we have shown results proving the superiority of weighting functions in a 2D context, but without using weighting functions for the balancing diffusion terms (see eq. (32)). Here, the benefit of the weighted diffusion terms shall be presented.

The results of the FPM were obtained by employing 7 nodes in $\Omega_i$, $\lambda = r_{min}$, c = 1 and linear base functions (m = 3). A global comparison of the meshless solution is shown in Figure 12. In a), b), c) and d) the mesh, the Taylor-Galerkin solution, a four-stage Runge-Kutta Galerkin result and the FPM solution for the density are presented, respectively. Qualitatively, all results are very similar. In Figure 13 close-up views in the stagnation area enhance the comparison of density contours of a) FPM without weighted diffusion, b) FPM with full Gauss weighting, c) RK-Galerkin and d) Taylor-Galerkin. Note the improvement of solution b) with respect to a), not exhibiting any oscillations in the stagnation area through the use of weighted diffusion terms.

In Figure 14, a) velocity contours and b) velocity vectors in the stagnation zone are displayed, respectively.

4.3.2 Supersonic test case

The second 2D test case is a hypersonic inviscid flow of Mach 8.15 around a double ellipse, which is well documented by the proceedings of the workshop in Antibes, 1991 [12]. The flow enters at an angle of 30 degrees. The solution is characterized by a strong primary bow shock and a weaker canopy shock.

To solve this problem, a grid of approximately 11000 points was generated using again the advancing front technique. Linear base functions (m = 3) and 6 point clouds with Gaussian weighting were used. The residuals of the solution have been reduced by six orders of magnitude. Figures 15 a), b) and c) present the meshless grid, Mach number contours and density lines, respectively. Note that the solution is very smooth and the location of the shock is well captured. The numerical overshoot of about 3% in Mach number is within reasonable limits and it could be improved by increasing the balancing diffusion. The convergence of this solution was slow due to a low Courant number of 0.25 (avoiding negative pressures).

Figure 16 a) demonstrates the high quality of the solution in the vicinity of the stagnation area, showing no oscillations. Figure 16 b) displays the pressure coefficient $c_p$ on the boundary of the double ellipse, which compares well to other contributors [12].

5. REFERENCES

[1] Nayroles, B., Touzot, G. and Villon, P. "Generalizing the Finite Element Method: Diffuse Approximation and Diffuse Elements", Computational Mechanics, 10, 307-318, 1992
[2] Belytschko, T., Lu, Y. and Gu, L. "Element Free Galerkin Methods", Int. Journal for Numerical Methods in Engineering, 37, 229-256, 1994
[3] Liu, W.K., Jun, S. and Belytschko, T. "Reproducing Kernel Particle Methods", Int. Journal for Numerical Methods in Engineering (to be published)
[4] Batina, J. "A Gridless Euler/Navier Stokes Solution Algorithm for Complex Aircraft Applications", AIAA paper 93-0333, Reno NV, January 11-14, 1993
[5] Oñate, E., Idelsohn, S. and Zienkiewicz, O.C. "Finite Point Methods in Computational Mechanics", Publication CIMNE No. 67, July 1995
[6] Oñate, E., Idelsohn, S., Fischer, T. and Zienkiewicz, O.C. "A Finite Point Method for analysis of fluid flow problems", 9th Int. Conf. on Finite Elements in Fluids, Venezia, Italy, 15-21 October 1995
[7] Zienkiewicz, O.C. and Taylor, R.L. "The Finite Element Method", 4th Edition, Volume 2, McGraw Hill, 1991
[8] Peraire, J. "A Finite Element Method for Convection Dominated Flows", PhD Thesis, University College of Swansea, 1986
[9] Jameson, A., Schmidt, W. and Turkel, E. "Numerical simulation of the Euler equations by finite volume methods using Runge-Kutta time stepping schemes", AIAA paper 81-1259, AIAA 5th Computational Fluid Dynamics Conf., 1981
[10] Zienkiewicz, O.C. and Wu, J. "A General Explicit or Semi-Explicit Algorithm for Compressible and Incompressible Flows", Inst. of Num. Meth. in Eng., University College of Swansea, CR/682/91, 1991
[11] Zienkiewicz, O.C., Codina, R., Morgan, K. and Sai, S. "A General Algorithm for Compressible and Incompressible Flow", Inst. of Num. Meth. in Eng., University College of Swansea, CR/842/94, June 1994
[12] Problem 6 of the Workshop on Hypersonic Flows for Reentry Problems, Antibes, France, January 22-25 1990

Figure 12: NACA0012 profile: a) mesh, b) TG solution, c) RK solution and d) FPM solution are shown.


Figure 16: Double ellipse: a) density contours in the stagnation zone and b) pressure coefficient $c_p$ along the boundary of the body.

NUMERICAL SIMULATION OF INTERNAL AND EXTERNAL GAS DYNAMIC FLOWS ON STRUCTURED AND UNSTRUCTURED ADAPTIVE GRIDS

U.G. Pirumov, I.E. Ivanov, I.A. Kryukov
Moscow Aviation Institute
Volokolamskoe sh. 4
125871 Moscow, Russia

1. ABSTRACT

Solution algorithms for the unsteady 2D Euler equations are presented. Cell-centered upwind control volume schemes are developed which utilize two-dimensional monotone linear reconstruction procedures. A new adaptive grid procedure is proposed to cluster the grid points in regions where they are most needed. This procedure is generalized for unstructured grids. Numerical results in the two-dimensional case are presented for linear and nonlinear convection problems.

2. INTRODUCTION

The numerical simulation of many gas dynamics processes of applied significance requires the solution of the unsteady two-dimensional Euler equations in regions of complex geometry. The typical feature of inviscid gas flow about bodies, in channels of complex form or in jets is the presence of interacting shock waves and other gas dynamics discontinuities [1-3]. For computation of such flows, high order schemes of TVD or ENO type have become widespread. These schemes have a high order of accuracy in regions where the solution is smooth, capture discontinuities well and preserve monotonicity of the solution. In the present paper a high order version of Godunov's scheme [4,5] is used for the solution of the Euler equations.

To improve the efficiency of codes based on TVD and ENO methods and to resolve local features of a flow, solution-adaptive grid algorithms can be used. The authors have developed an adaptive grid algorithm suitable for structured and unstructured grids. It is based on the algebraic minimal moments scheme by Connett et al. [6,7] with cell-centered grid modifications.

The proposed method belongs to the class of moving grid methods. Using these methods for structured grids, strongly skewed cells can be obtained near large gradient regions. In this case, a 1D procedure along gridlines can yield a large error due to a decrease in the order of approximation. So it is necessary to use essentially 2D reconstruction procedures. Note that only 2D reconstruction procedures can be used on unstructured grids.

In the present paper linear reconstruction procedures are considered and one such procedure is developed. It is based on the 2D algorithm by Tillyaeva [8], well known in Russia, with a modification which takes into account a wider additional support (the set of cells needed to determine the coefficients of the polynomial).

Numerical results are presented in Section 4 to illustrate the capability of the proposed algorithms.

3. GOVERNING EQUATIONS

The governing equations are the conservation form of the Euler equations for two-dimensional, unsteady, compressible flows of a calorically perfect gas

$$q_t + f(q)_x + g(q)_y = S \tag{1}$$

where

$$q = \begin{bmatrix} \rho \\ \rho u \\ \rho v \\ E \end{bmatrix}, \qquad f = \begin{bmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ (E + p)u \end{bmatrix}, \qquad g = \begin{bmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ (E + p)v \end{bmatrix} \tag{2}$$

Here $\rho$, p and E are the density, pressure and total energy, respectively, and u and v are the Cartesian components of the velocity vector. S is the source term. The system (1)-(2) of four equations is closed with the polytropic equation of state

$$p = (\gamma - 1)\left(E - \frac{\rho}{2}(u^2 + v^2)\right) \tag{3}$$

where $\gamma$ is the ratio of specific heats.

2. CLASS OF HIGH ORDER SCHEMES FOR NUMERICAL SIMULATION OF GAS DYNAMIC FLOWS

2.1 Finite volume formulation

Let $S_{ij}$ denote a rectangular partition of the x-y plane, where

$$S_{ij} = [x_{i-1/2}, x_{i+1/2}] \times [y_{j-1/2}, y_{j+1/2}]$$

with $(x_i, y_j)$ denoting the centroid of each rectangle $S_{ij}$. With the help of the integral form of equation (1), the following equation can be obtained for each control volume $S_{ij}$:

[equation (4) is not legible in the source]

$$\bar{q}_{ij}(t) = \frac{1}{\Delta s_{ij}}\iint_{S_{ij}} q(x, y, t)\,dx\,dy \tag{5}$$

is the cell average of q over the control volume at time t. The fluxes $\hat{f}$ and $\hat{g}$ are given by

$$\hat{f}_{i+1/2,j}(t) = \int_{y_{j-1/2}}^{y_{j+1/2}} f(q(x_{i+1/2}, y, t))\,dy \tag{6}$$

$$\hat{g}_{i,j+1/2}(t) = \int_{x_{i-1/2}}^{x_{i+1/2}} g(q(x, y_{j+1/2}, t))\,dx \tag{7}$$

Equation (4) can be treated as a system of ordinary differential equations. Along any t = constant line, the right-hand side of (4) is a spatial operation in q, and we rewrite this equation in the abstract operator form

[equation (8) is not legible in the source]

for the purpose of separating the spatial and temporal discretizations.

2.1.1 Spatial discretization

To achieve the desired order of accuracy we replace the operator in (8), written $\mathcal{L}$ here, with a discrete spatial operator L, which approximates it to order r:

$$L\bar{q}(t) = \mathcal{L}q(t) + O(h^r) \tag{9}$$

We define L explicitly by

[equation (10) is not legible in the source]

where $\tilde{f}$ and $\tilde{g}$ are approximations of corresponding order to the fluxes $\hat{f}$ and $\hat{g}$ in (6) and (7).

For approximation of the integrals (6), (7) we can use "classical" K-point Gaussian quadrature. Therefore, for fixed x and t, and sufficiently smooth f, the approximation of the flux integral (6) by Gaussian quadrature with weights $w_k$ and nodes $y_k$ satisfies

$$\int_{y_{j-1/2}}^{y_{j+1/2}} f(q(x_{i+1/2}, y, t))\,dy = \sum_{k=1}^{K} w_k\,f(q(x_{i+1/2}, y_k, t)) + s(x_{i+1/2}, \eta)\,h^{2K+1} \tag{11}$$

where the function s relates to the quadrature error and $\eta \in (y_{j-1/2}, y_{j+1/2})$.

Let R be a spatial operator which reconstructs the set of cell averages and yields a 2D, piecewise polynomial $q_h(x, y)$ of degree r - 1 which approximates q(x, y, t), with a truncation error of $O(h^r)$:

[equation (12) is not legible in the source]

[equation (13) is not legible in the source]

Analysis of (11), (13) and the flux difference $\hat{f}_{i+1/2,j}(t) - \hat{f}_{i-1/2,j}(t)$ in (10) shows [9] that the number of Gauss points K must satisfy the condition $r \le 2K$. Then the error relation satisfies

$$\hat{f}_{i+1/2,j}(t) - \hat{f}_{i-1/2,j}(t) = \tilde{f}_{i+1/2,j}(t) - \tilde{f}_{i-1/2,j}(t) + O(h^{r+2}) \tag{14}$$

Noting that the area $\Delta s_{ij}$ is $O(h^2)$, upon substitution of the numerical fluxes (13) into (10) we have thus designed a spatial operator L that satisfies (9).

We now wish to modify the "abstract" numerical flux (13) such that the conditions of approximation of the desired order are satisfied in regions where the solution is smooth and, in addition, these fluxes account for possible discontinuities in q. This modification follows naturally from the reconstruction procedure, by which the function $q_h(x, y)$ in (12) can be discontinuous at cell interfaces. In order to resolve these discontinuities, the flux integrands in (13) are replaced by

$$f^{RP}\left(q_{ij}(x_{i+1/2}, y_k),\, q_{i+1,j}(x_{i+1/2}, y_k)\right), \qquad g^{RP}\left(q_{ij}(x_k, y_{j+1/2}),\, q_{i,j+1}(x_k, y_{j+1/2})\right) \tag{15}$$

where $q_{ij}(x, y)$ denotes the local representation of $q_h(x, y)$ within a cell $S_{ij}$ and $f^{RP}(q_1, q_2)$ denotes the flux, across x = 0, associated with the solution to the Riemann problem whose initial states are $q_1$ and $q_2$.

2.1.2 Temporal discretization

Equation (4) is discretized by using a Runge-Kutta method (R-K) of Shu [10]:

$$\bar{q}_{ij}^{(l)} = \sum_{m=0}^{l-1}\left[\alpha_{lm}\,\bar{q}_{ij}^{(m)} + \beta_{lm}\,\Delta t\,L\bar{q}_{ij}^{(m)}\right], \quad l = 1, 2, \dots, p, \qquad \bar{q}_{ij}^{(0)} = \bar{q}_{ij}^{\,n}, \quad \bar{q}_{ij}^{(p)} = \bar{q}_{ij}^{\,n+1} \tag{16}$$

The order of accuracy, as well as its TVD properties, is achieved by adequate sets of coefficients $\alpha_{lm}$, $\beta_{lm}$ and p [10].

The method of [11] is also used in our code. It is a predictor-corrector type method of second order accuracy, in which on the first stage the fluxes are calculated without solution of the Riemann problem and the reconstruction procedure is performed only on the first stage.

2.1.3 Riemann solvers

Using the solution of the Riemann Problem (RP) for calculation of the fluxes over the faces of a control volume (15) allows local directions of perturbation propagation to be taken into account. In the general case it is necessary to use a truly two dimensional Riemann Solver for accurate calculation of 2D flows. But recently proposed 2D Riemann Solvers [12,13] are too complicated, tedious and sometimes result in

instability, even for first order schemes. So in the present paper the solution of the RP is determined by 1D Riemann Solvers. According to common practice, which starts from the work of S.K. Godunov [SI, the normal to the face of a control volume is used as the direction along which the 1D RP is solved. This can make the solution quality dependent on the numerical grid.

The developed computer code [14] allows a choice of 1D Riemann Solver for the solution of the RP (for example the exact solution [4] or the approximate methods by Roe [15], Osher [16], Dukowicz [17], Davis [18] and others [19]).

2.1.4. Boundary conditions

Correct setting of numerical boundary conditions is one of the most important items for the numerical simulation of unsteady gas dynamics flows. Numerical boundary conditions based on characteristic relations are the most correct in a physical sense and they are implemented in the present work. For the open boundaries the "non-reflecting" boundary conditions [20] are used.

2.2. Two dimensional reconstruction algorithms

Using solution-adaptive grid methods for structured grids, strongly skewed cells can be obtained near large gradient regions. In this case, a 1D procedure along gridlines can yield a large error due to a decrease in the order of approximation. So it is necessary to use essentially 2D reconstruction procedures [8,9,21,22,25,26]. Most of these procedures are rather complicated and very costly. In the present paper the simplest linear reconstruction procedures are used [8,22]. They can be realized efficiently on structured grids.

In the reconstruction procedure by Tillyaeva [8] (denoted below as TL) the five-point stencil is decomposed into triangles. The slopes of the planes passing through the function values at the vertices of each of the four triangles with a vertex at ij are determined. The derivatives with respect to x and y of the linear reconstructed function are evaluated from the corresponding slopes by the minmod operator. As shown in [SI, for the case of structured quadrilateral grids only two opposite triangles with a vertex at ij can be used in the reconstruction procedure. Moreover, on rectangular grids the reconstruction becomes a pair of independent 1D minmod limiters.

As follows from numerical results, the TL reconstruction is too diffusive. So we modify it by taking into account the slopes of a central plane, reconstructed with an algorithm similar to [21] over all points of the stencil. We denote this algorithm by MT.

The last algorithm studied in this paper is the linear reconstruction proposed in [22]. Initially the estimate of the solution gradient in the cell is computed using the approximation of a boundary integral for some path surrounding this cell. Then the obtained gradient estimate is restricted to satisfy the monotonicity principle.

3. ADAPTATION PROCEDURE

The adaptive grid algorithm presented here is based on the algebraic minimal moment method [6,7] with cell-centered grid modifications. For exact reconstruction of a linear function the centroid of the control volume, where averaged function values are stored, has to coincide with the gravity center of this control volume. So in the presented algorithm the vertices of the control volumes are moved first and then the coordinates of the gravity centers of the new control volumes are evaluated.

An analytical expression for the movement of cell vertex P is given as

$$\Delta\vec{P} = \vec{P}_{new} - \vec{P} = \frac{\sum_{nb} m_{nb}\left(\vec{B}_{nb} - \vec{P}\right)}{\sum_{nb} m_{nb}} \tag{17}$$

where $\sum_{nb}$ means summation over all neighbour cells with a vertex at P, and $\vec{B}$ is the vector location of the gravity center of the corresponding cell.

The "mass" of cell n is defined as

[equation (18) is not legible in the source]

where j is the index of the cells adjacent to n, $\Delta s_n$ is the cell area and

[equation not legible in the source]

The first term within the parenthesis of equation (18) represents an estimate of the maximum gradient of an arbitrary function f on a grid cell. The adaptation will be sensitive to the gradient of this function. The constant c is a user specified constant which controls the adaptation strength.

For boundary point adjustment two algorithms proposed in [6,23] are used.

In the method of minimal moments [6,7] the nine-point stencil is decomposed into triangles. The gravity center for each of the four triangles with a vertex at ij is determined. A weighting function value is computed for these points. The new location of point ij is at the common center of mass of these four triangles and is given by (18) with summation over these triangles. In the presented method mainly "natural" information is used: the gravity centers and areas of the computational cells, which are computed and stored anyway for a flow field computation. In some cases an estimate of the maximum gradient in the weighting function (18) can be obtained as an auxiliary result of a reconstruction procedure. This can be rather important for the overall efficiency of a flow solver with dynamic solution-adaptive grid techniques.

The presented method can easily be generalized to unstructured cell centered grids. The illustrations in Fig. 1 were produced to show how the presented algorithm can be used on structured quadrilateral and unstructured triangular grids. Fig. 1a presents a surface plot of the given function H(x,y) = tanh(3(x - y^2))

+ tanh(3(x² + y² − 1)). Figures 1b and 1c contain the solution-adaptive grids produced by our algorithm.

4. NUMERICAL RESULTS

4.1. Linear scalar 2D problems
The numerical dissipation of explicit high-resolution schemes with various reconstruction algorithms was compared using numerical results for the two-dimensional advection equation [24]

The exact solution of (20) consists of a rotation of the initial values about (x0, y0) with angular velocity ω. Two series of calculations are presented in this paper. A cut-out cylinder and a cone were chosen as initial values (Fig. 2). We used the angular velocity ω = 0.1 and x0 = 50, y0 = 50. The computational region was [0,100]×[0,100]. The calculations were done on three types of grid with 100 grid points in each direction. The first type is a uniform rectangular grid, the second type is a smooth curvilinear grid (Fig. 3a) described by the transformation

where ξ_i = Δx(i − 1), Δx = 1, η_j = Δy(j − 1), Δy = 1.
The third type is a random grid (Fig. 3b)

x_ij = ξ_i + α_ij Δx
y_ij = η_j + β_ij Δy

where α_ij and β_ij are uniformly distributed random numbers on (−0.4, 0.4).

At time t = 20π the initial values have completed one full rotation and returned to their initial position. The approximations of the initial values on the uniform grid are shown in Fig. 2. To improve picture resolution, Figs. 2 and 4 show only the part [50,100]×[25,75] of the computational region. The size and initial position of the cut-out cylinder and the cone are the same as in [24]. We perform long-time calculations until t = 120π, which corresponds to six full rotations of the initial values. As mentioned in [24], these problems are well suited to benchmarking the numerical properties of schemes.
In this paper the three reconstruction procedures described in section 2.2 are compared. Numerical results obtained with these reconstructions on computational grids of the three types are presented in Tab. 1 for the cone and in Tab. 2 for the cut-out cylinder. The tables contain the numerical solution errors in the L1 norm and the maximum values of the computed solutions. Fig. 4 shows the numerical solutions computed on the uniform rectangular grid. TL reconstruction results are shown in Figs. 4a and 4d, MT results in Figs. 4b and 4e and BJ results in Figs. 4c and 4f. As expected, the TL reconstruction procedure is the most dissipative. The MT and BJ results are rather close, but in the MT case the errors are a little smaller and the maximum values a little greater. These problems also show that the MT reconstruction may not preserve symmetry. We had some difficulties implementing the BJ reconstruction on curvilinear grids, so results for this grid are not presented in Tabs. 1 and 2.

4.2. A channel with a 15° compression-expansion ramp
The next case is the flow through a duct with a compression-expansion ramp in the bottom wall. The conditions for this case are M∞ = 2, γ = 1.4. The computational grid is equally spaced and contains 180 cells in the streamwise direction and 60 cells in the cross-flow direction.
The computed Mach contours are presented in Fig. 4. Note that the induced and reflected shocks are quite thin and no unphysical oscillations are present. All the characteristic features of the flow are well resolved on such a fine grid without adaptation.

4.3. An oblique shock-reflection problem
One of the most popular problems for checking various elements of a numerical algorithm (such as reconstruction, adaptation etc.) is the regular reflection of an oblique shock wave by a flat plate. In Figs. 5a-f, results are shown for a case with M∞ = 2.9 and β = 29°, where β is the angle between the incident shock wave and the flat plate (Fig. 5a). First a steady solution was obtained on the uniform 60x20 rectangular grid. The corresponding pressure contours are presented in Fig. 5a. Then grid adaptation was performed by the method proposed above, with the pressure used as the adaptation function. After that a new steady state was obtained. Figs. 5a and 5b depict the adapted grid and the associated steady-state flow solution. These figures show that the pressure gradients become much better resolved.

4.4. An underexpanded jet flow
The next case is the unsteady underexpanded supersonic jet flow. The conditions for this case are M_j = 1.5, n = p_j/p_∞ = 3, T_j = T_∞ and γ_j = γ_∞ = 1.4. The computational grid is equally spaced and contains 180 points in the streamwise direction and 80 points in the crossflow direction. Fig. 6 shows the computed Mach contours at the most characteristic time moments. Note that the non-reflecting boundary conditions make it possible to calculate such a rather complicated flow almost without unphysical reflections at the open boundaries.
The steady-state solution is shown in Fig. 7a. Fig. 7b shows the steady-state solution obtained on a coarse 90x40 rectangular grid. For this solution grid adaptation was performed using the Mach number gradients as the adaptation function. In Fig. 7c the adaptive grid is presented and Fig. 7d shows the computed Mach contours. The solution computed on the coarse adapted grid is much closer to the fine-grid solution in Fig. 7a, but some flow features are not captured. The second Mach stem is located somewhat farther from the nozzle exit than in Fig. 7a.

But the first Mach stem is resolved better than on the fine grid. This is the main deficiency of moving solution-adaptive grid algorithms: if there are large gradients of parameters in one flow region, the grid becomes too coarse in regions of moderate and low gradients.

4.5. Nozzle flow
In the last example we present results of the numerical simulation of the internal axisymmetric nozzle flow of an ideal gas with γ = 1.22. The nozzle consists of two parts. The first part is a Laval nozzle and the second is a cylindrical tube adjoining the supersonic part of the Laval nozzle.

Initially the steady-state solution was obtained on a simple 130x40 grid (Fig. 8a). In Figs. 8a and 8b the top half of each figure shows the computed Mach contours and the bottom half shows the computational grid. For the computed steady-state solution grid adaptation was performed using the Mach number gradients as the adaptation function. The new steady-state solution and the adaptive grid are presented in Fig. 8b.

5. CONCLUSIONS
In the present paper an upwind monotone numerical method for the solution of the Euler equations is presented. The method is based on a high-order version of Godunov's scheme and is implemented on both structured quadrilateral and unstructured triangular grids. Essentially 2D reconstruction procedures make it possible to perform calculations on strongly skewed grids. Some features of the 2D reconstruction procedures are studied by solving a linear scalar problem. A new solution-adaptive grid algorithm is proposed.
The presented numerical results illustrate the capability of the proposed algorithms. It can be seen that the grid adaptation procedure makes it possible to obtain significantly more accurate results.

6. REFERENCES
1. Pirumov U.G., Roslyakov G.S., "Gas Flows in Nozzles", Springer-Verlag, 1985, 425 p.
2. Gorbunov V.N., Pirumov U.G., Ryzhov Yu.A., "Non-Equilibrium Condensation in High-Speed Gas Flows", Gordon and Breach Science Publishers, 1989, 290 p.
3. Pirumov U.G., Roslyakov G.S., "Numerical Method for Gas Dynamic", Moscow, Vishai Shkola, 1987, 231 p., (in Russian).
4. Godunov S.K., "Finite Difference Method for Numerical Calculation the Discontinue Solutions of Gasdynamic Equations", Mathem. Sb., 47, 3, 1959, pp. 271-306, (in Russian).
5. Godunov S.K., Zabrodin A.V., Ivanov M.Ya., Krayko A.N., Prokopov G.P., "Numerical Solution of Multidimensional Gasdynamics Problems", Moscow, Nauka, 1976, 400 pp., (in Russian).
6. Connett W.C., Agarwal R.K., Schwartz A.L., "An Adaptive Grid-Generation Scheme for Flowfield Calculations", AIAA Pap., 87-0199, 1987.
7. Connett W.C., Agarwal R.K., Schwartz A.L., Wheeler J.C., "An Algebraic Adaptive Grid Technique for the Solution of Navier-Stokes Equation", AIAA Pap., 90-1605, 1990.
8. Tillyaeva N.I., "Generalization of Modified Godunov's Scheme for Unstructured Grids", Uchenye Zapiski TSAGI, XVII, 2, 1986, pp. 18-26, (in Russian).
9. Casper J., Atkins H.L., "A Finite-Volume High-Order ENO Scheme for Two-Dimensional Hyperbolic Systems", J.Comput.Phys., 106, 1993, pp. 62-76.
10. Shu C-W., Osher S., "Efficient Implementation of Essentially Non-Oscillatory Shock-Capturing Schemes", J.Comput.Phys., 77, 1988, pp. 439-471.
11. Rodionov A.V., "High Order Godunov Scheme", Zhurnal vichislitelnoy mathem. i mathem. phys., 27, 4, 1987, pp. 585-593, (in Russian).
12. Rumsey C.L., van Leer B., Roe P.L., "A Multidimensional Flux Function with Application to the Euler and Navier-Stokes Equation", J.Comput.Phys., 105, 1993, pp. 306-323.
13. Roe P.L., "Discrete Model for the Numerical Analysis of Time-Dependent Multidimensional Gas Dynamics", J.Comput.Phys., 63, 1986, pp. 458-476.
14. Kryukov I.A., Ivanov I.E., "High Resolution Monotone Method for Computation Internal and Jet Inviscid Flows", in "Nonequilibrium processes in nozzles and jets", Proc. 1st Int. Conf., Coll. Abstr., June 1995, pp. 92-93.
15. Roe P.L., "Approximate Riemann Solver Parameter Vector and Difference Schemes", J.Comput.Phys., 43, 1981, pp. 357-372.
16. Osher S., Solomon F., "Upwind Difference Schemes for Hyperbolic Systems of Conservation Laws", Math.Comput., 38, 158, 1982, pp. 339-374.
17. Dukowicz J.K., "A General, Non-iterative Riemann Solver for Godunov's Method", J.Comput.Phys., 61, 1985, pp. 119-137.
18. Davis S.F., SIAM J.Sci.Stat.Comput., 9, 1988, pp. 445-473.
19. Einfeldt B., SIAM J.Numer.Anal., 25, 1988, pp. 294-318.
20. Thompson K.W., "Time Dependent Boundary Conditions for Hyperbolic Systems", J.Comput.Phys., 68, 1, 1987, pp. 1-24.
21. Vankeirsbilck P., Deconinck H., "Solution of the Compressible Euler Equations with Higher Order ENO-schemes on General Unstructured Meshes", in "Computational Fluid Dynamics '92", vol. 2, 1992.
22. Barth T.J., Jespersen D.C., "The Design and Application of Upwind Schemes on Unstructured Meshes", AIAA Pap., 89-0366, 1989.
23. Kania L.A., "An Adaptive Grid Algorithm for Accurate Flowfield Calculations", AIAA Pap., 90-0327, 1990.
24. Munz C-D., "On the Numerical Dissipation of High Resolution Scheme for Hyperbolic Conservation Laws", J.Comput.Phys., 77, 1988, pp. 18-39.
25. Harten A., Chakravarthy S.R., "Multidimensional ENO Schemes for General Geometries", Tech. Report 91-76, ICASE, 1991.
26. Abgrall R., "On Essentially Non-oscillatory Schemes on Unstructured Meshes: Analysis and Implementation", J.Comput.Phys., 114, 1994, pp. 45-58.

Fig. 1. Adaptive response of uniform grids to weighting function H(x,y) = tanh(3(x − y²)) + tanh(3(x² + y² − 1)).
a) weighting function; b) regular quadrilateral grid; c) unstructured triangular grid.
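
The node relocation illustrated in Fig. 1 can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: equation (18) and the cell "mass" are not reproduced in this text, so the mass is assumed here to be (g + c) times the cell area, where g is an estimate of the gradient magnitude and c is the user constant. The names `relocate_node` and `grad_magnitude`, and the central-difference gradient estimate, are hypothetical stand-ins and not the authors' definitions.

```python
import math


def H(x, y):
    """Weighting test function quoted in the caption of Fig. 1."""
    return math.tanh(3.0 * (x - y * y)) + math.tanh(3.0 * (x * x + y * y - 1.0))


def grad_magnitude(f, x, y, h=1e-4):
    """Central-difference estimate of |grad f| at (x, y) (assumed estimator)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return math.hypot(fx, fy)


def relocate_node(centroids, areas, grads, c=1.0):
    """Move a node to the centre of mass of the cells that share it.

    centroids: (x, y) centres of gravity of the surrounding cells
    areas:     areas of those cells
    grads:     gradient estimates g for those cells
    c:         user constant; ASSUMED mass of a cell is (g + c) * area
    """
    masses = [(g + c) * a for g, a in zip(grads, areas)]
    total = sum(masses)
    x_new = sum(m * cx for m, (cx, _) in zip(masses, centroids)) / total
    y_new = sum(m * cy for m, (_, cy) in zip(masses, centroids)) / total
    return x_new, y_new
```

With this assumed mass, equal gradients leave a node at the plain centroid of its neighbours, while a larger gradient on one side pulls the node towards that side; a larger c weakens the pull.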

Fig. 2. Initial values and exact solution after each full rotation. a) cone; b) cut-out cylinder.
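
Because the exact solution of the test problem in Section 4.1 is a rigid rotation of the initial values, it can be evaluated by tracing each point back along the rotation. The sketch below assumes a counter-clockwise rotation about (50, 50) with ω = 0.1, as in the text. The cone's centre, radius and height are hypothetical placeholders, since the text only says they follow [24].

```python
import math


def rotate_about(x, y, angle, x0=50.0, y0=50.0):
    """Rotate the point (x, y) counter-clockwise by `angle` about (x0, y0)."""
    dx, dy = x - x0, y - y0
    c, s = math.cos(angle), math.sin(angle)
    return x0 + c * dx - s * dy, y0 + s * dx + c * dy


def exact_solution(u0, x, y, t, omega=0.1):
    """Exact value at (x, y, t): the initial profile u0 traced back in time."""
    xb, yb = rotate_about(x, y, -omega * t)
    return u0(xb, yb)


def cone(x, y, xc=75.0, yc=50.0, radius=15.0):
    """Hypothetical unit-height cone; centre and radius are placeholders."""
    r = math.hypot(x - xc, y - yc)
    return max(0.0, 1.0 - r / radius)
```

One full rotation takes t = 2π/ω = 20π, so comparing a computed field with `exact_solution(cone, x, y, 120 * math.pi)` corresponds to the six-rotation comparison described in the text.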

Fig. 3. Examples of curvilinear (a) and random (b) calculation grids (20x20).
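
A random grid of the kind shown in Fig. 3b can be generated by displacing each node of a uniform grid by a uniformly distributed random fraction of the spacing, bounded by 0.4 in magnitude. The following is a sketch, not the authors' generator: the function name `random_grid`, the zero-based indexing, the fixed seed, and the choice to perturb boundary nodes as well are all assumptions made for illustration.

```python
import random


def random_grid(n, dx=1.0, dy=1.0, amp=0.4, seed=0):
    """Return an n-by-n list of (x, y) nodes of a randomly perturbed grid.

    Node (i, j) of the uniform grid sits at (i*dx, j*dy); each coordinate is
    shifted by a uniform random fraction of the spacing, at most amp in size.
    """
    rng = random.Random(seed)  # fixed seed so the grid is reproducible
    grid = []
    for i in range(n):
        row = []
        for j in range(n):
            x = dx * (i + rng.uniform(-amp, amp))
            y = dy * (j + rng.uniform(-amp, amp))
            row.append((x, y))
        grid.append(row)
    return grid
```

Since the perturbation amplitude stays below half a spacing, neighbouring nodes cannot cross along a grid line: the minimum separation is 0.2 of the spacing.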

Fig. 4. Computed solutions on the uniform grid after six rotations. a)-c) cone; d)-f) cut-out cylinder.

Table 1. Results of solution of the linear scalar problem for rotating cone.

Table 2. Results of solution of the linear scalar problem for rotating cut-out cylinder.

Fig. 5. Computed pressure contours for a 15° compression ramp (M=2).

Fig. 6. Pressure contours for oblique shock reflection problem (M=2.9, β=29°).
a) solution on uniform grid; b) solution on adaptive grid; c) solution-adaptive grid.

Fig. 7. Unsteady underexpanded (n=3) jet.

Fig. 8. Steady underexpanded jet. a) fine grid solution; b) coarse grid solution without adaptation; c) adaptive grid; d) adaptive coarse grid solution.

Fig. 9. Nozzle flow problem. a) solution and grid without adaptation; b) adaptive solution and grid. (Panel title: Mach number; horizontal axis: x.)

An Investigation of the Effects of the Artificial Dissipation Terms in a Modern TVD Scheme on the Solution of a Viscous Flow Problem

R. D. Briggs¹ and S. Shahpar²

The Manchester School of Engineering,
Aerospace Engineering Division,
Oxford Road, Manchester, M13 9PL
England

1. SUMMARY
The goal of the present investigation is to discover the effects of certain parameters in a modern TVD scheme on the solution of a viscous flow problem. This report includes details of the TVD scheme used in this study. The scheme is an extension of the work of H.C. Yee [1], and uses an upwind weighted dissipation term, and central differencing to calculate the viscous terms.
The entropy correction parameter and the choice of flux limiter when computing viscous flows are under investigation. The effectiveness of this TVD scheme in solving viscous flow problems has recently been questioned by Lin [2]. However, this investigation shows that by carefully selecting the limiter and the value of the entropy parameter, adequate viscous flow results can be obtained.

Solutions to the Navier-Stokes equations for an underexpanded sonic jet on a flat plate with a supersonic crossflow are used to illustrate the method. The test conditions were M∞ = 2.61 and Re = 749,000, and the boundary layer was considered to be laminar everywhere. The numerical code has been evaluated for this test case using experimental data presented by Zukoski and Spaid [3].
The study includes a qualitative analysis of the amount of artificial viscosity added by the TVD algorithm compared to the real viscosity; an investigation of the effects of the artificial viscosity term on the solution, including changes in pressure and skin friction distribution along the surface of the flat plate; and the change in the boundary layer separation point for different values of the entropy parameter.

2. INTRODUCTION
Over the past decade a vast range of Total Variation Diminishing (TVD) schemes have become available, and have been widely used. These schemes use some method of intelligent switching to put artificial dissipation into a problem where it is needed or, conversely, to remove artificial dissipation from areas of a problem where it is not. In this way the shock capturing capabilities of numerical schemes for the Euler equations have improved enormously.

When the Navier-Stokes equations are being solved, however, the interaction between the real viscosity and the artificial viscosity provided by these modern schemes must be considered and evaluated.

A great deal of work has been done on various aspects of the artificial dissipation added in a scheme, and how it affects the solution. Allmaras [4] has examined a boundary layer in a subsonic flow, and reports that upwind schemes using a matrix dissipation technique are generally better than central difference schemes using a scalar dissipation formulation for producing good boundary layer profiles. He also reports a slight velocity overshoot within the boundary layer, even using the upwind schemes. Tatsumi et al [5] have concluded that scalar switching schemes can produce good accuracy if a flux limiting technique is included. Tatsumi et al also suggest that antidiffusive schemes will produce overshoots as described in [4] unless a flux limiter is included in the algorithm. Caughey and Varma [6] have used an integral technique to measure the effects of artificial dissipation on the flow calculations around a transonic aerofoil. They found that the dissipation errors and the total numerical errors were of a comparable magnitude, but in general the dissipation errors were larger than the total numerical errors. They also found that a Mach number scaling technique, for reducing the artificial dissipation being added, did reduce the dissipation related errors in most cases. Turkel and Vatsa [7] have compared a scalar artificial dissipation model with a matrix dissipation model on a 3D transonic aerofoil. They found that the matrix dissipation model improved the accuracy of the scheme. The accuracy was comparable to that produced by an upwind TVD scheme. It has also been reported that the matrix dissipation technique in a central difference scheme can produce high resolution results in a viscous flow, and that the matrix dissipation is essential for this behaviour, Swanson and Turkel [8].
In this work, a TVD artificial dissipation switching method which is similar to a matrix dissipation technique has been studied. This switch is controlled by two parameters: the choice of flux limiter, and the value of the entropy correction parameter, which has an indirect effect.

In order to study the effects of these terms we have considered a complex viscous flow problem, as further irregularities caused by artificial dissipation may present themselves in such a problem. An underexpanded jet interaction with a supersonic crossflow has been chosen. This problem includes a separated boundary layer and regions of recirculation, see figure 1. The effect of the artificial dissipation terms on these flow phenomena has not been examined before.

Jet injection into a crossflow is an important aerodynamic problem and has many uses in the aerospace industry. Reaction control systems on space vehicles and missiles are transverse jet problems. Jets for vortex control on the forebody are now being considered for high angle of attack control systems for fighter aircraft. Fuel injection in

¹ Postgraduate Student
² Post-doctoral research fellow, M AIAA

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995. and published in CP-578.
Figure 1. Diagram of a jet interaction with a supersonic crossflow.

supersonic ramjets (scramjets) can also be considered as a jet in supersonic crossflow problem.
The flowfield around a jet injection is shown in figure 1. An underexpanded jet transversely injected from the wall into a supersonic main air flow expands rapidly through the strong Prandtl-Meyer fans and forms a Mach disk. A bow shock wave is formed upstream of the injector due to the interaction between the main and injected flows, while the main flow bends the injected flow parallel to the wall. A boundary layer separation occurs and a weak separation shock wave appears upstream near the injector due to an adverse pressure gradient caused by the injection in a boundary layer flow. Another weak shock appears downstream near the injector due to a reattachment of the injected flow with the wall.

The objectives of this present study were to measure the effect of the artificial dissipation switching algorithm on the accuracy of the solutions produced for this test case. Also, a quantitative and qualitative examination of the amount of real viscosity compared with the artificial viscosity being added in the solution has been included.

3. NUMERICAL METHOD
3.1 Governing Equations
For a Cartesian coordinate system the Navier-Stokes equations can be written in a conservative form:

$$\frac{\partial U}{\partial t} + \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y} = \frac{\partial F_v}{\partial x} + \frac{\partial G_v}{\partial y} \qquad (1)$$

where the inviscid terms are,

and the viscous terms are,

(3)

where,

$$f = u\tau_{xx} + v\tau_{xy} + k\frac{\partial T}{\partial x}, \qquad g = u\tau_{xy} + v\tau_{yy} + k\frac{\partial T}{\partial y};$$

p is pressure, ρ is density, e is total internal energy, u and v are velocity components, μ is viscosity, T is temperature, and k is the heat transfer coefficient.
For a general coordinate system the equations are transformed to the following:

(5)

where,

and similarly for the viscous terms,

J is the Jacobian of the general coordinate system

In order to construct the total variation diminishing scheme, the Navier-Stokes equations must be put into the following form,

$$\frac{\partial \hat{U}}{\partial t} + A\frac{\partial \hat{U}}{\partial \xi} + B\frac{\partial \hat{U}}{\partial \eta} = \frac{\partial \hat{F}_v}{\partial \xi} + \frac{\partial \hat{G}_v}{\partial \eta} \qquad (7)$$

where A and B are the Jacobian matrices of the inviscid flux vectors in the ξ- and η-directions respectively. Therefore A and B can be described as

where the columns of the matrices R_A and R_B are the right eigenvectors of the matrices A and B. Λ_A and Λ_B are vectors which are related to the eigenvalues of A and B, and can be found in reference [9].

3.2 Discretisation of the Equations
The Navier-Stokes equations, in the form shown in equation 5, can be simply discretised into a two step scheme. The first step solves the flow in the ξ-direction, and the second step solves in the η-direction. This method of discretisation can lead to second order accuracy in time.

where,

A global time step, Δt, is calculated with a CFL number of 0.95. The numerical flux functions, F and G, in equation 9a, are calculated using a finite volume approach.

(10)

Note that the flux functions with a superscript * are calculated using U*. In reference [1], Yee has offered several other methods of calculating the flux functions, but they are not covered here.
Roe's averaging is applied to cell vertex points in order to calculate the flow variables at (i+1/2, j) and (i, j+1/2).
The vectors Φ_A and Φ_B in equation (10) contain the anti-diffusive terms that are under investigation in this report. They provide a second order upwind weighted dissipation term, see reference [1]. The equations for the components of the vector Φ_A are shown here; a similar relation can be formulated for the vector Φ_B.

For the function γ one finds,

The function σ(z) is defined as

$$\sigma(z) = \tfrac{1}{2}\left[\psi(z) - \lambda z^2\right] \qquad (14)$$

The function ψ(z) is calculated using,

This function is introduced to prevent non-physical solutions such as expansion shocks, when one of the eigenvalues goes to zero. The function introduces a small amount of artificial viscosity. A relationship is provided in order that δ is suitably scaled for highly skewed grids. This relation is,

(15b)

where the two velocities appearing in (15b) are the co-variant velocities. A study of the actual effects of varying the value of the constant, δ, in the solution, appears later in this report. This constant is referred to as the entropy parameter throughout this report.
The term 'g', in equations (11) and (13), is the flux limiter. Five different limiters have been implemented and investigated; they are given in reference [1], and are:

The minmod function of a list of arguments is equal to the smallest argument in absolute value if the arguments are all of the same sign, or is equal to zero if any arguments are of opposite sign.
In equation (16c) ε is included to stop any division by zero. ε is given a small value, usually of the order of $10^{-7}$.

The accuracy and shock resolution of Euler solutions increases going from limiters (16a) to (16e). Throughout the rest of this report equations (16a) to (16e) are referred to as Limiters 1 to 5, respectively.

3.3 Grid Generation
The grid used for this problem was calculated using a very simple analytical approach followed by a short smoothing operation. It is necessary to have a high concentration of points at the jet exit, where large changes in the flow will occur, and near the wall, in order to accurately capture the boundary layer.
The grid generator introduces regions of highly concentrated points where specified. This leads to the production of a grid which is highly irregular. Rapid changes in grid point concentration can cause the numerical code to fail, or spurious glitches in the solution of the flow to occur.
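
The two scalar building blocks described in Section 3.2 can be sketched in Python. `minmod` follows the verbal definition given in the text. `psi` uses the widely quoted entropy-correction form associated with Harten and Yee, |z| away from zero and a parabola inside |z| < δ; this is assumed to be the form the paper uses, because the equation for ψ(z) is not legible in this text. The function names are illustrative only.

```python
def minmod(*args):
    """Smallest-magnitude argument if all share one sign, otherwise zero."""
    if all(a > 0 for a in args):
        return min(args)
    if all(a < 0 for a in args):
        return max(args)  # closest to zero among negative values
    return 0.0


def psi(z, delta):
    """Entropy-corrected absolute value (assumed Harten/Yee form).

    Returns |z| when |z| >= delta, and (z**2 + delta**2) / (2*delta)
    otherwise, so the result never drops below delta / 2.
    """
    if abs(z) >= delta:
        return abs(z)
    return (z * z + delta * delta) / (2.0 * delta)
```

Under this assumed form a larger entropy parameter δ raises the floor δ/2 on ψ, which is one way to read the later remark that δ sets the minimum amount of artificial viscosity added to the solution.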
A Laplacian smoothing operator is applied to the grid to solve these problems. Up to 15 smoothing iterations are performed on the grid. The number of iterations executed depends on the degree of irregularity of the grid, and the number of points in the regions where there is a high concentration of points. The grid generated for this test case, using this method, is shown in figure 2.

Figure 2. Grid generated for the transverse jet case. (Horizontal axis: x/L.)

4. RESULTS AND DISCUSSION
The test case for the investigation is a laminar boundary layer developing over a flat plate. A jet issues perpendicularly into the supersonic crossflow from x/L = 1, where L = x_j, the position of the jet. The inflow conditions for the flowfield are calculated for a Mach number of 2.61. The inflow conditions of the jet are as follows: the jet pressure ratio, P_j/P_∞ = 7.0; the temperature ratio, T_j/T_∞ = 1.0; and M_j = 1.0. The Reynolds number of the flow is Re_L = 749,000, and the freestream temperature is T_∞ = 300 K. These values were derived from the data provided in reference [3].

4.1 Comparison with Experiment
Other than the results obtained to test the grid dependency of the solution, all of the numerical results produced were calculated on a 100x100 grid as shown in figure 2. The grid has 6 points in the jet, and contains between 30 and 35 points in the boundary layer region. Simple boundary conditions have been used everywhere. The flat plate is modelled using no-slip conditions and is considered to be adiabatic, the inflow condition is fixed at the initial condition, and all outflow conditions use a simple linear extrapolation technique. The jet has a rectangular profile; the actual profile would be closer to a quadratic profile. The jet boundary conditions are fixed at initial conditions for the injected flow.
The pressure distribution along the surface of the plate has been compared with experimental results produced by Zukoski et al [3], for the test case described above. The experiments have been done for a three dimensional case, and the jet injection hole was circular. The comparison is shown in figure 3. This numerical solution has been calculated using limiter 1, and with δ = 0.001.

Figure 3. Pressure distribution along the surface of the flat plate. (Horizontal axis: x/L.)

Upstream of the jet, the results produced by the code compare well with experiment. Downstream of the jet there is a large discrepancy between experimental and numerical data. This discrepancy is probably due to three dimensional effects which are coming into play around the jet. Some of the flow will be passing around the jet, reducing the mass flow through, and just downstream of it. Another possibility can be attributed to a turbulent region forming just downstream of the jet. At this stage of the work, the numerical code does not include a turbulence model and this part of the flow can not be accurately represented. However, as an initial foray into this problem and for comparing the effects of changing different parameters, the resolution of the results, especially near and upstream of the jet, is adequate.

4.2 The Entropy Parameter
In order to examine the effects of the entropy parameter, δ, as defined in equation (15), numerical solutions for the test case were found for seven different values of δ, varying from 0.001 to 1. All of the cases in which the entropy parameter was being studied were calculated using Limiter 1. This limiter is the most robust one, and was least likely to fail when the more extreme values of the entropy parameter were being tested.
The skin friction and pressure distributions across the flat plate, for different values of δ, are presented in figures 4 and 5, respectively. These plots show the effect of δ on the numerical solution. The pressure plots show that as the entropy parameter is increased the shock wave becomes less well defined. For the highest values of the entropy parameter the shock wave has smeared all the way to the jet injection point. Also, by increasing the entropy parameter the low pressure region after the jet, denoting a recirculation region, becomes damped out and the pressure plateau related to the separation region upstream of the jet reduces considerably.
The skin friction distributions show the change in the point of boundary layer separation, x_sep (i.e. where the skin friction profile first crosses zero), with δ. As the entropy parameter increases, x_sep moves closer to x_j. It can be clearly seen from both of these plots that the choice of δ must be considered carefully.

Figure 4. A graph comparing the effects of different values of δ on the skin friction distribution across the flat plate. (Horizontal axis: x/L.)

Figure 5. A graph comparing the effects of different values of δ on the pressure distribution along the flat plate. (Horizontal axis: x/L.)

In essence, changing the value of δ alters the minimum amount of artificial viscosity that is added to the solution. This has the undesirable effect of altering the shape of the boundary layer. Boundary layer profiles plotted for locations x/L = 0.319 and x/L = 0.879 are presented in figures 6 and 7. It can be seen from figure 6, in which the boundary layer has not yet separated, that as δ is increased, in general, the boundary layer becomes thicker. The variation is complex, such that the boundary layer thickness reaches a peak, and then it begins to decrease slightly. This is caused by the addition of artificial dissipation to the real viscosity, introduced by the Navier-Stokes equations, in the boundary layer. Figure 7 shows velocity profiles near the wall, after the boundary layer has separated. The change in the boundary layer separation point has a great effect on these profiles. The recirculation region increases and the separated boundary layer moves further from the flat plate as δ decreases, until δ reaches a value of 0.01. Further lowering of δ moves the separated boundary layer slightly closer to the flat plate. However, it can be seen that, when values of δ below 0.01 are used, the boundary layer profiles seem to become less sensitive to changes in δ, and have converged.

Figure 6. Velocity profiles in the boundary layer for different values of δ.

Figure 7. Velocity profiles in the boundary layer for different values of δ. (Horizontal axis: velocity, U/U∞.)

An examination of the amount of artificial dissipation and real dissipation added to the solution near the flat plate is shown in figures (8a) to (8c). These figures represent dissipation profiles in the boundary layer. The values plotted are the artificial and real second-order terms for the x- and y-momentum parts of G (see equation 10), respectively artificial G2, real G2, artificial G3 and real G3. These terms have been used because high gradients of variables are expected in the direction normal to the wall. All of the graphs show the amount of artificial dissipation added to the scheme compared to the real dissipation. Not surprisingly, the addition of the artificial dissipation seems to cause a considerable change in the real dissipation profile, which also must have an effect on the boundary layer thickness. Figures 8a-1 to e-1 are plots for the unseparated boundary layer. They show that even for small values of δ, the amount of artificial dissipation added to the y-momentum equation is considerable compared to the real viscosity. For the x-momentum equation the artificial and real dissipation is comparatively small for low values of δ, but it increases to an overwhelming amount for δ = 1.0. Figures 8a-2 to c-2 show the dissipation profiles for a separated boundary layer. These plots show that even for very low values of the entropy parameter, the artificial viscosity is still high enough to be interfering with the solution. Artificial dissipation has been introduced by the scheme to sharpen the definition of the boundary layer as if it was a shock wave. Obviously this isn't necessary in the case of the boundary layer. However, the effect of increasing δ seems to be an increase in the thickness of the separated boundary layer.

Many of the methods used to limit the introduction of the artificial dissipation in the boundary layer would not work in the separated region. Caughey and Varma [6] provide two methods of limiting the artificial dissipation added by the scheme. The first method simply sets the artificial dissipation on surfaces to zero. The other method scales the dissipation using a function of the local Mach number. Neither of these methods is likely to have a great effect on limiting the artificial dissipation term, as the artificial dissipation being added in the separated region is neither near the flat plate, nor in a relatively low velocity region. Also, for most values of the entropy parameter, the artificial dissipation added at the wall is relatively close to zero.
Figures 9a to c give a more qualitative view of the artificial dissipation being added near the wall. Contour plots
of G2-real and G2-artificial for three values of δ have been included. These plots cannot be directly compared with each other, as the variation of the contour lines is different for each case. They show general trends only. Of particular interest is the small region just upstream of the jet where the artificial and real dissipations are being added in comparable amounts.

Figure 8. A comparison of dissipation profiles, for different values of δ and different limiters, through an unseparated (a-1 to e-1) and separated (a-2 to e-2) boundary layer.

Figure 9. Contour plots of real dissipation (a-1 to e-1) and artificial dissipation (a-2 to e-2) for the flowfield in the vicinity of the flat plate using several different limiters and values of δ. Panels: a) Limiter 1, δ=0.001; b) Limiter 1, δ=0.05; c) Limiter 1, δ=1.0; d) Limiter 3, δ=0.05; e) Limiter 5, δ=0.05. Horizontal axis: x/L.

Figure 10. Contour plots of pressure across the whole flowfield (a-1 to e-1) and Mach number near the flat plate (a-2 to e-2) using several different limiters and values of δ. Panels: a) Limiter 1, δ=0.001; b) Limiter 1, δ=0.05; c) Limiter 1, δ=1.0; d) Limiter 3, δ=0.05; e) Limiter 5, δ=0.05. Horizontal axis: x/L.

Figure 11. Pressure distribution along the flat plate for five different limiters.

Figure 12. Skin friction plots for the flat plate, for each limiter.

Figures 10a-1 to c-1 show pressure contours for the whole flowfield. These plots illustrate the effects of increasing δ on the whole solution. The separation shock wave becomes less well defined until it is not even clear that there is a shock wave, and the shock wave caused by the reattachment of the boundary layer, downstream of the jet, also becomes less easy to recognise.

Figures 10a-2 to c-2 are contour plots of Mach number for the flow near the wall. These plots illustrate the Mach disk, the separation and reattachment of the boundary layer, and regions of recirculation. The plots also show the slight increase in boundary layer thickness with increasing δ. The small pocket of recirculation just downstream of the jet is reduced in size when large values of δ are used.

In conclusion, when δ is increased to a value of 0.5 and beyond, the solution becomes highly inaccurate. However, for values of the entropy parameter equal to and below 0.01, the solution seems to be less sensitive to changes in δ. This profile represents the solution where very little artificial dissipation is being added to the boundary layer, and it is also the solution closest to the experimental pressure data.

Other methods of modelling the entropy parameter, see equation (15b), have been defined by Muller [10,11] and Lin [2]. Muller has used an entropic function of the local spectral radii to model the entropy parameter. A brief examination of this technique showed that for δ=0.005, and using limiter 1, the results produced were very similar to the results produced using the method described by equation (15b). It is possible that Muller's entropy function will be more effective when used with other limiters, and this will be investigated in later work.

Lin [2] states that the viscous flow results using the scheme described in this investigation are unacceptable if a value of δ=0.25 is used. This agrees with the results presented here. Lin suggests using a form of the entropy function which is similar to that shown in equation (15b). Lin also uses different values of the entropy parameter for the linear and non-linear waves. This means a much smaller value for the entropy parameter can be used for the linear waves, and hence the boundary layer will be less affected by the artificial dissipation term. An investigation of how effective this concept is, is underway. The results presented by Lin are encouraging.

4.3 Comparing Limiters
The choice of flux limiter is an important factor in TVD schemes. Five different limiters have been investigated here. Limiter 1 is the well known minmod limiter. Limiter 2 is the limiter formulated by Van Leer, and limiter 5 is known in the literature as "Roe's Superbee" and is highly compressive.

It was found that the numerical scheme failed when limiter 5 was used if the entropy parameter was set below a value of 0.05. Therefore, in order that the limiters could be compared, the computed solution was found for each limiter with the entropy parameter set at a value of 0.05. In all likelihood this means that the best possible results for limiters 2, 3 and 4 have not been found.

Figures 11 and 12 show pressure and skin friction distributions along the flat plate for each of the limiters. The pressure distributions show that the major effect of using different limiters is to change the point of boundary layer separation. The shock definition on the surface of the flat plate, denoted by the pressure gradient of the shock wave, is not greatly improved by using a more compressive limiter.

From figures 11 and 12 it can be seen that limiters 2, 3 and 4 produce similar results. Also, these results compare well with the results obtained using limiter 1 with values of δ below 0.01, as shown in figures 6 and 7, but generally it seems limiters 2, 3 and 4 are better than limiter 1.

From figure 12, one can see that limiter 5, the "Superbee," does not seem to be well conditioned for this problem, and is probably unacceptable for use when solving any viscous flow problem. The corresponding skin friction distribution is considerably different from those obtained by the other limiters, both upstream and downstream of the jet.

Figures 13 and 14 present velocity profiles for each of the limiters. The plots given in figure 13 are for the unseparated
boundary layer. As expected, the data suggest that limiters with a more compressive nature reduce the thickness of the boundary layer. In the case of the scheme using limiter 5, the boundary layer thickness has been significantly reduced. Again, limiters 2, 3 and 4 produce similar profiles. Limiter 1 produces a slightly thicker boundary layer than the others.

Figure 13. Velocity profiles in the unseparated boundary layer for different limiters (position on flat plate, x/L=0.319).

Figure 14. Velocity profiles for the separated boundary layer for different limiters.

The results in figure 14 are velocity profiles for the boundary layer after it has separated. Limiters 2, 3 and 4 produce similar profiles; however, there is now a more significant degree of difference in the results being obtained. The variation in boundary layer thickness for different limiters is reversed compared to the unseparated case. The more compressive limiters produce a thicker boundary layer. Primarily this is caused by the change in x_sep for the different limiters. These results are similar to those obtained for low values of the entropy parameter using limiter 1, as shown in figure 7.

Dissipation profiles for limiters 1, 3 and 5, shown in figures 8b, d and e respectively, indicate that for the unseparated boundary layer the artificial dissipation introduced into the y-momentum equation is high for limiter 3, but low for the other two limiters. Limiter 5 introduces artificial dissipation of the opposite sign to the real dissipation in the x-momentum equation. This result explains the decrease in the boundary layer thickness when using limiter 5. Also, the amount of real dissipation being added when limiter 5 is being used is slightly higher than for the other limiters.

The dissipation profiles for the separated boundary layer show a marked difference between each of the limiters used. The amount of real dissipation being added when limiter 5 is being used is much smaller than that added when the other limiters are used. The artificial dissipation being added to the x-momentum equation also seems to be reduced, although there is a region at the edge of the boundary layer where large amounts of dissipation of the opposite sign are being added. Each of the other limiters also adds this opposing dissipation, but not in such a comparably vast quantity. However, the other limiters add more dissipation in other regions of the boundary layer, e.g. near the wall. Also, in contrast to limiters 1 and 3, the artificial dissipation added to the y-momentum equation by limiter 5 is of the opposite sign to the real dissipation being added. These graphs go some way towards explaining why the results from limiter 5 are so different to the results obtained from the other limiters.

The contour plots of dissipation for limiters 1, 3 and 5 are shown on figures 9b, d and e. Two main conclusions can be drawn from these graphs. Firstly, on the real dissipation plots, the concentration of the contours upstream of the jet is far higher for limiter 5 than for the other limiters, and there is a small region of concentrated contours downstream of the jet for limiters 1 and 3 which is not as prominent for limiter 5. Secondly, the artificial dissipation added by limiter 5 is quite different in pattern to the dissipation added by the other limiters.

The pressure contours shown in figure 10 show how the more compressive limiters produce better defined shock waves and expansion fans. The movement of the separation shock wave further upstream of the jet, for limiter 5, is also clearly illustrated here. The Mach number contours show the change in the thickness of the boundary layer for different limiters. The plot for limiter 5 most clearly illustrates the reattachment of the boundary layer. The recirculation region downstream of the jet is much bigger for limiter 5 than for the other two limiters presented.

The results presented here show that limiter 5 is not a good choice of limiter for a viscous flow problem, because of its highly compressive nature. Limiter 1 produces acceptable results if the value of the entropy parameter is limited to 0.01 for this test case. Limiters 2, 3 and 4, the mid-range limiters including the Van Leer limiter, produce similar results and are probably best suited for this problem.

4.4 Effects of Grid Size
In order to check the dependence of the solution on the available number of grid points, the numerical code was run on three different grids with different point concentrations. The grid sizes used for this study were 50x50, 100x100 and 200x200. These calculations were all done using limiter 3 and the entropy parameter δ=0.05.

A comparison of the pressure distribution along the flat plate for each grid is shown in figure 15. The plots show a severe degradation of results between the different meshes used. The coarsest grid does not define the shock wave well, and the pressure valley and plateau upstream of the jet do not reach the values produced by the other two grids. The main differences between the 200x200 and the 100x100 grid are the improved shock definition at the wall, and a slight increase in the pressure plateau for the higher grid concentration.

The skin friction distribution comparison given on figure 16 shows a marked difference in profile for each grid type. The skin friction given by the 200x200 grid is far higher than the
other grids, and the recirculation region just downstream of the jet is much better defined when more points are used. The differences can be attributed to the fact that, for a higher point concentration, the boundary layer is better defined. The location at which the boundary layer separates does not vary greatly over a change in grid point concentration.

Figure 15. A graph showing the effects of grid size on the pressure distribution along a flat plate. Horizontal axis: distance along flat plate, x/L.

Figure 16. A graph showing the skin friction profile along the flat plate for different grids. Horizontal axis: distance along flat plate, x/L.

Figure 17. Velocity profiles for an unseparated boundary layer for different grids.

Boundary layer velocity profiles are provided on figures 17 and 18 for each of the grids. The unseparated boundary layer profiles, presented in figure 17, show the simple improvement in boundary layer definition as more points are put near the wall. Table 1 gives the number of points in the boundary layer for each case. The grid used earlier in the report does not provide an accurately defined boundary layer, but it was of acceptable quality for the comparative study being performed.

Table 1. Grid points in the boundary layer.
Grid size    Points in the boundary layer
50x50        23
100x100      [illegible]
200x200      [illegible]

Figure 18 shows velocity profiles in the separated region for each of the grids. In the recirculation region near the wall, the 100x100 and 200x200 grids produce similar results. The 50x50 grid produces a solution with a larger amount of recirculation than the other two grids.

Both sets of velocity profiles show that the grid concentration has a large effect on the calculation of the boundary layer, and hence the rest of the solution.

Figures 19 and 20 show the artificial dissipation terms added to the x-momentum equation profiled through the unseparated and separated boundary layer, respectively. For the unseparated boundary layer, the 100x100 and 200x200 grids produce a similar profile, and near the wall the amount of artificial dissipation being added for each of these cases is quite close. The real differences in the profiles occur towards the edge of the boundary layer. The amount of dissipation provided by the scheme using the 50x50 grid is sometimes twice as much as that introduced when the other grids are being used.

In the case of the separated boundary layer, the 50x50 grid seems to add less artificial dissipation than the other two grids. Although it is not clear from figure 20, the dissipation being added by the 100x100 and 200x200 grids is almost identical in
the recirculation region. It is only when the edge of the separated boundary layer is reached that large differences become apparent. In the region of the separated boundary layer, the 100x100 grid seems to add far more dissipation than the other two grids. It can be clearly seen that in this region the grid is becoming more coarse, and numerical truncation errors are becoming dominant.

Figure 19. Dissipation profiles through an unseparated boundary layer for different grids.

Figure 20. Dissipation profiles through a separated boundary layer for different grids.

It is clear from this brief study that grid point density has a great effect on the viscous regions of the solution, and the amount of artificial dissipation being added by the numerical scheme should not be analysed in isolation.

5. CONCLUSIONS
The numerical scheme described in this study has been used to model a transverse jet interacting with a supersonic flow. This test case differs from others that have been used to study artificial dissipation, as the problem includes a separated boundary layer and reverse flow regions.

This scheme can be used to compute viscous supersonic flows adequately. It has been shown here that the choice of limiter and value for the entropy parameter is of great importance for producing good results. For the transverse jet test case, limiters 2, 3 and 4 produce good results, and limiter 4 is probably best suited. Although the examination of the entropy parameter was not done for each limiter separately, when limiter 1 was used it was found that the best results were produced when the entropy parameter was no greater than 0.01. It is reasonable to assume that this range of values will produce acceptable results with the other limiters.

6. REFERENCES
1. Yee, H.C., Klopfer, G.H., Montagne, J.L. "High-Resolution Shock-Capturing Schemes for Inviscid and Viscous Hypersonic Flows", J. Comput. Phys., 88, 1990, pp31-61.
2. Lin, H-C. "Dissipation Additions to Flux-Difference Splitting", J. Comput. Phys., 117, 1995, pp20-27.
3. Zukoski, E.E., Spaid, F.W. "Secondary Injection of Gases into a Supersonic Flow", AIAA J., 2, Oct 1964, pp1689-1696.
4. Allmaras, S.R. "Contamination of laminar boundary layers by artificial dissipation in Navier-Stokes solutions", Proceedings of the Conference on Numerical Methods in Fluid Dynamics, Reading, UK, 1992.
5. Tatsumi, S., Martinelli, L., Jameson, A. "Design, Implementation, and Validation of Flux Limited Schemes for the Solution of the Compressible Navier-Stokes Equations", AIAA-94-0647.
6. Caughey, D.A., Varma, R.R. "Evaluation of Navier-Stokes Solutions Using the Integral Effect of Numerical Dissipation", AIAA J., 32, Feb 1994, pp294-300.
7. Turkel, E., Vatsa, V.N. "Effect of Artificial Viscosity on Three-Dimensional Flow Solutions", AIAA J., 32, Jan. 1994, pp39-45.
8. Swanson, R.C., Turkel, E. "Aspects of a High-Resolution Scheme for the Navier-Stokes Equations", AIAA-93-3372-CP.
9. Yee, H.C., Kutler, P. "Application of Second order Accurate Total Variation Diminishing Schemes to the Euler Equations in General Geometries", NASA TM-85845, 1985.
10. Muller, B. "Implicit Upwind Finite Difference Simulation of Laminar Hypersonic Flow over Flared Cones", Notes on Numerical Fluid Mechanics, 29, 1990.
11. Muller, B. "Comparison of Upwind and Central Finite Difference Methods for the Compressible Navier-Stokes Equations", Notes on Numerical Fluid Mechanics, 30, 1991.
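
The limiters named in Section 4.3 (minmod, Van Leer, Superbee) and the role of the entropy parameter δ can be illustrated with a minimal Python sketch. It rests on assumptions that are not taken from the paper: the limiters are written in the common textbook form as functions of a slope ratio r, the entropy correction uses the widely quoted Harten form (which may differ from the paper's equation (15b), not reproduced in this section), and all function names are hypothetical. Limiters 3 and 4 are not identified in the text and are therefore omitted.

```python
def minmod(r):
    """Minmod limiter (limiter 1 in the text): the least compressive choice."""
    return max(0.0, min(1.0, r))


def van_leer(r):
    """Van Leer limiter (limiter 2 in the text): a smooth mid-range choice."""
    return (r + abs(r)) / (1.0 + abs(r))


def superbee(r):
    """Roe's Superbee limiter (limiter 5 in the text): highly compressive."""
    return max(0.0, min(2.0 * r, 1.0), min(r, 2.0))


def harten_entropy_fix(z, delta):
    """Assumed Harten-type entropy correction with entropy parameter delta.

    Returns |z| away from zero and a smooth parabola for |z| < delta, so a
    larger delta adds more numerical dissipation near vanishing wave speeds.
    """
    if abs(z) >= delta:
        return abs(z)
    return (z * z + delta * delta) / (2.0 * delta)
```

For any positive ratio r the three limiters are ordered minmod(r) <= van_leer(r) <= superbee(r), which matches the ranking from least to most compressive used in the discussion above.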
A Flux Filter Scheme Applied to the Euler and Navier Stokes Equations

A. Vinckier
J. Jacobsen
S. Wagner
Institut für Aerodynamik und Gasdynamik
Universität Stuttgart
Pfaffenwaldring 21
70550 Stuttgart, Germany

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

1 SUMMARY

In this contribution, we present a multi-dimensional upwind scheme. In contrast to the Flux-Vector or the Flux-Difference-Splitting method, where an upwind operator is used before the residual is calculated, this scheme applies an operator to the discrete flux integration or flux balance and then assigns filtered parts of the residuals to the vertices of a cell.
The so-called Flux-Filter operator will be derived in a consistent manner on a one-dimensional basis with the purpose of allowing a stable updating. The scheme is linearity preserving and should therefore lead to an improved accuracy.
The Flux-Filter scheme has been successfully implemented for the Euler and Thin Layer Navier Stokes equations, for structured and unstructured grids. The unstructured grids are made of triangular and quadrilateral cells.

List of Symbols

A            flux Jacobian matrix
d(2), d(4)   artificial viscosity vector
[illegible]  energy
F            inviscid flux
I            identity matrix
P            pressure source vector
R            flux residual with p order of integration
S            surface
U            conservative vector
X            eigenvector matrix
a, b         convection velocity
[illegible]  pressure
q            preferential factor
[illegible]  time
[illegible]  velocity
[illegible]  scalar value on point i,j at time level n
T            flux-filter matrix
[illegible]  volume
α            coefficient
δ            Kronecker delta
λ            eigenvalue
[illegible]  CFL number in x and y direction
[illegible]  root of scheme
[illegible]  root of spatial discretization

2 INTRODUCTION

Upwind methods have become very popular over the last decade and can be categorized into two major methods, the Flux-Vector-Splitting and the Flux-Difference-Splitting method.
The schemes of Steger & Warming [1] and Van Leer [2] are representative of the Flux-Vector-Splitting schemes. Here, the fluxes are split into two parts, a positive and a negative part. The positive, respectively negative, fluxes have purely positive, respectively negative, eigenvalues and can therefore be differenced with backward, respectively forward, upwinding.
The Flux-Difference-Splitting methods or Riemann solvers are another group of schemes. Here the conservative variables are taken to be piecewise constant between the cell faces. At the faces there is a fluid state on the left side and a different fluid state on the right side, which results in an interaction. This interaction, seen in one dimension, has a mathematically and physically exact solution. It is equivalent to the shock tube problem, also known as the Riemann problem. The most popular approximate Riemann solvers are those of Roe [3] and Osher [4].
Those schemes solve the upwinding by treating each space dimension separately along the gridlines or along the normals of the cell faces. This has the disadvantage that contact discontinuities which are not aligned with the grid are not properly resolved. To overcome this problem, a new group of methods has emerged since the early 1990's, the so-called multi-dimensional upwinding schemes. Two distinct methods have been developed up to now: the Flux-Function methods [5] and the Flux-Fluctuation methods. The Flux-Function scheme will calculate the flux through a face in a Riemann manner but independently from the
grid. The variation in Flux-Function methods lies in the 'Riemann' directions chosen, which mainly consist of the convection and pressure wave directions, in contrast to the main grid directions. The Flux-Fluctuation methods are based on the flux integration, upon which an upwinding method distributes the residual values to the cell's vertices.
The Flux-Filter scheme is a variant of the Flux-Fluctuation methods. Similar approaches have been developed by Rossow [6], Giles [7] etc., but this scheme differs in the way the flux residual is calculated and distributed to the vertices. The distribution is based on the characteristic propagation directions along the grid lines. The Flux-Filter is an operator which selects those quantities of a flux residual that propagate towards the cell vertex of interest. This scheme can be applied to structured and unstructured grids and can solve the Euler and Navier Stokes equations. Stability analysis has shown that preferential direction flux integration and artificial viscosity are required. In the case of the structured grids a combined second and fourth order viscosity is implemented; for the unstructured grid method only a second order artificial viscosity has been introduced.
In the following section the basic idea of the Flux-Filter scheme is explained for the quasi one-dimensional Euler equations, followed by the extension to two dimensions. An analysis of the scalar model equations will highlight some problems associated with the trapezoidal flux integration. In the last section a number of results will be presented.

3 ONE DIMENSIONAL FLUX FILTER SCHEME
The discrete quasi 1-D Euler equations for a cell which is located between the grid points $i$ and $i-1$ are given by

[equation illegible]

where $U$ is the conservative solution vector, $F$ the flux vector and $S$ the cross sectional area.

The residual $\Delta U / \Delta t$ needs to be distributed over the grid points $i$ and $i-1$. The question is what part of the residual must be imposed on the left and on the right grid points. The distribution is obtained through an operator $T$ which must be linearity preserving. Hence, the equation for a grid point $i$ is given by

[equation (3) illegible]

3.1 The Flux-Filter Operator
The primary purpose of the Flux-Filter operator is to extract those elements of the residual which allow a stable updating [8]. The solution of a numerical flow problem is reached when the numerical process has converged, which can be seen as a numerical equilibrium state. On a local scale in CFD, the changes imposed by the flux balance or residual must result in the reduction of the discrepancy in the flux balance. The flux balance for a cell $i-1/2$ (between grid points $i-1$ and $i$) is given by

$\frac{\Delta t}{\Delta x}\left(F_i - F_{i-1}\right) = -\Delta U \qquad (4)$

and suppose the flux balance for cell $i+1/2$ is satisfied, hence

$\frac{\Delta t}{\Delta x}\left(F_{i+1} - F_i\right) = 0 \qquad (5)$

The equation which imposes the condition that the summing up of a filtered portion of the residual on grid point $i$ reduces the discrepancy in flux balance is

[equation (6) illegible]

The influence of $T(\Delta U)$ on equation (5) is

$\frac{\Delta t}{\Delta x}\left(F(U_{i+1}) - F(U_i + T(\Delta U))\right) = -\alpha\,\Delta U \qquad (7)$

Stability demands that for all grid points the discrepancies disappear, or at least remain bounded by $\Delta U$. Hence the stability conditions are

$0 \le \alpha \le 2 \qquad (8)$

for equation (6) and

$-1 \le \alpha \le 1 \qquad (9)$

for equation (7). Both conditions lead to the requirement

$0 \le \alpha \le 1 \qquad (10)$

The Euler equations have the property that $F(U) = A(U)\,U$ where $A = \partial F / \partial U$. Rewriting equation (6) and assuming $\Delta U$ to be small leads to
[equation (11) illegible]

Subtracting equation (4) from equation (11) gives

[equation illegible]

The Jacobian matrix $A$ possesses a complete set of real eigenvectors, hence

$A = X \Lambda X^{-1}$

where $X$ is the eigenvector matrix and $\Lambda$ is the eigenvalue matrix. The operator $T$ is now defined as

$T = X \delta X^{-1}$

where $\delta$ is a diagonal matrix with entries 0 or 1. Hence

[equation illegible]

The solution of this eigenvalue problem is

$\alpha_i = \frac{\Delta t}{\Delta x}\,\lambda_i\,\delta_i$

Imposing condition (10) on all eigenvalues gives

$0 \le \frac{\Delta t}{\Delta x}\,\lambda_i\,\delta_i \le 1$

Therefore, to obtain a stable scheme, $\delta_i$ must be zero when $\lambda_i$ is negative, and $\frac{\Delta t}{\Delta x}\left(\lambda_i \delta_i\right)_{max} \le 1$. The first condition defines the Flux-Filter operator:

$T^{+} = X I^{+} X^{-1}$

where $I^{+}$ is a diagonal matrix whose elements are 1 where the corresponding eigenvalues $\lambda_i$ are positive; $I^{-}$ is defined analogously. The second constraint imposes the well-known CFL condition for an explicit scheme.

3.2 Implementation of the Flux-Filter operator
The previous section has defined an operator which will theoretically allow a stable iteration process. For its implementation three facts must be taken into account:

1. The additional term $P$.
2. The distribution of the residual may not lead to any sources or sinks.
3. The filters may not numerically block the propagation of information.

The introduction of the $P$-vector should require a reformulation of the filter based upon the eigenvalues and eigenvectors of $\frac{\partial F}{\partial x} + P$. The determination of those eigenvalues and eigenvectors would lead to a severe increase in the numerical workload, and therefore the Flux-Filter will be based on the eigensystem of the flux.

In order to assure the conservation of the scheme, the introduction of the filter operators may not lead to additional sources, hence

[equation illegible]

where the positive, respectively negative, Flux-Filter is set by a still undefined $U_a$, respectively $U_b$. This implies that

$T^{+}(U_a) + T^{-}(U_b) = I$

which is always true when $U_a$ is identical to $U_b$. The obvious choices are

$U_a = U_b = \frac{U_i + U_{i-1}}{2}$

This leads to the quasi one-dimensional Flux-Filter scheme:

[equation illegible]

where

[equation illegible]

with $\bar{U} = \frac{U_i + U_{i-1}}{2}$.

The trivial steady state solution is

[equation illegible]

for all cells of the computational domain. Unlike the traditional methods, the Flux-Filter formulation leads to the solution of the original discrete quasi one-dimensional Euler equations (Fig. 1).
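
The one-dimensional construction above can be sketched in Python for the simplest possible case. This is an illustration under stated assumptions rather than the authors' implementation: the scalar linear advection equation u_t + a u_x = 0 stands in for the Euler system, so the single eigenvalue is a and the filter reduces to a number (1 when a is positive, 0 otherwise); the grid is taken as periodic; and the name `flux_filter_step` and its parameters are hypothetical.

```python
def flux_filter_step(u, a, dt_dx):
    """Advance u_t + a*u_x = 0 by one explicit step on a periodic grid.

    Cell i spans nodes i-1 and i. Its residual is split between those two
    nodes by the scalar filters t_plus (sent to node i) and t_minus (sent
    to node i-1), mimicking the positive and negative Flux-Filters.
    """
    t_plus = 1.0 if a > 0 else 0.0  # keep only right-running information
    t_minus = 1.0 - t_plus          # filters sum to one: no sources or sinks
    new_u = list(u)
    for i in range(len(u)):
        # Flux balance of cell i; u[i - 1] wraps around for i == 0.
        residual = dt_dx * a * (u[i] - u[i - 1])
        new_u[i] -= t_plus * residual
        new_u[i - 1] -= t_minus * residual
    return new_u
```

Calling `flux_filter_step([0.0, 1.0, 0.0, 0.0], a=1.0, dt_dx=1.0)` moves the pulse one node to the right, and the same call with `a=-1.0` moves it one node to the left. Because the two filters sum to one, the sum of the nodal values is unchanged by the update, which corresponds to the conservation requirement listed as point 2 above; the constraint `dt_dx * abs(a) <= 1` plays the role of the CFL condition.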
4 TWO DIMENSIONAL FLUX-FILTER SCHEME
In this section, the construction of the two-dimensional Flux-Filter [9] scheme will be outlined. The numerical equation for point $i,j$ of a structured grid becomes:

[equation illegible]

where $T^{--}$ is the two-dimensional Flux-Filter, which models the upwinding. The residual $R_{i+1/2,j+1/2}$ is the surface flux integration of the computational cell defined by the points $(i,j)$, $(i+1,j)$, $(i+1,j+1)$, $(i,j+1)$. Each cell which belongs to point $i,j$ contributes a filtered portion of its residual to the temporal change of point $i,j$. The purpose of the Flux-Filter is to extract the portion of the residuals which allows a stable updating.

4.1 Two-Dimensional Flux-Filter Operator
The two-dimensional Flux-Filter is based on the one-dimensional operators defined in the last section. The following constructions have been tested:
Analysis has led to the following requirements for the two-dimensional Flux-Filter:

• for conservation, the sum of all Flux-Filters for one cell must add up to the identity matrix: $\sum T = I$. Since for a steady state solution all residuals should disappear, this seems to be an unnecessary requirement. However, at a shock the temporal change becomes zero with residuals which are non-zero. If this constraint is not applied, the shock position and strength are incorrect.
• if the cell has a supersonic velocity component pointing away from a grid point, then the corresponding Flux-Filter must be the null matrix. This reflects the perception that in the above case no information can propagate towards this point.

This leads to the following scheme for quadrilateral cells (Fig 6.3):

if $T^{-}_{\xi} \ne [0]$ and $T^{-}_{\eta} \ne [0]$ then

$T^{--} = \frac{1}{4}\left(T^{-}(\bar{U}, \vec{n}_{\xi}) + T^{-}(\bar{U}, \vec{n}_{\eta})\right)$

else

$T^{--} = [0] \qquad (24)$

where $\bar{U}$ is the averaged flow vector for the cell. This formulation reflects the region of dependency. The other three filters, $T^{-+}$, $T^{++}$ and $T^{+-}$, are defined similarly.
The conservation requirement is imposed with

$Q = I - \left(T^{--} + T^{-+} + T^{++} + T^{+-}\right) \qquad (25)$

if $T^{--} \ne [0]$ then

$T^{--} = T^{--} + Q/n$

where $n$ is the number of non-null matrices.

The formulation for a triangular cell is

[equation illegible]

else

$T_1 = [0]$

where $\bar{U}$ is the averaged flow vector for the cell. $T_2$ and $T_3$ are defined similarly.
The conservation requirement is imposed with

$Q = I - \left(T_1 + T_2 + T_3\right) \qquad (29)$

if $T_1 \ne [0]$ then

$T_1 = T_1 + Q/n$

where $n$ is the number of non-null matrices.

4.2 The Flux Residual
The flux residual or flux balance is the flux integral over the cell's circumference.

$\frac{\partial U}{\partial t} + \frac{1}{\Omega}\oint \vec{F}\cdot\vec{n}\,dS = 0$

where $\Omega$ is the cell's surface. The first order discrete flux integration based on point $i,j$ for a quadrilateral cell is

[equation illegible]

The second order flux integration or trapezoidal integration is given by
[…]

Note the inclusion of the information attached to the vertex $i+1,j+1$. The first order integration violates the criterion which requires that the flux calculation at a cell face be independent of the cell under consideration, but it has the advantage that the resulting scheme is stable. The second order integration fulfills this criterion but has a stability problem, which will be analyzed and clarified in detail. For these reasons a blended first and second order integration is used, called the preferential direction integration, given by

[…]

where the calculations have shown that for accuracy and stability the optimum value of $q$ is 0.25.

For reasons of stability, artificial viscosity must be introduced, which is of course an unwelcome feature. The artificial viscosity is a second and fourth order damping given by

[…]

4.3 Thin Layer Navier Stokes

The terms for the viscous fluxes are introduced on the right hand side of equation (22). Hence

[…]

To obtain a meaningful viscous solution, the influence of the viscous term must exceed the influence of the artificial viscosity in the viscous dominant flow regions.

5 SCALAR MODEL EQUATION

In this section, we will analyze the accuracy and stability aspects of the Flux-Filter scheme based on the model equation. The model equation for the multi-dimensional scheme will be the two-dimensional convection equation:

$\frac{\partial u}{\partial t} + a\frac{\partial u}{\partial x} + b\frac{\partial u}{\partial y} = 0$ (39)

In this equation a quantity $u$ is convected with a velocity $a$ in the x-direction and a velocity $b$ in the y-direction. The theoretical solution preserves the initial function along the convection direction. The accuracy with which a numerical solution approaches this theoretical solution depends entirely on the numerical scheme.

Equation (39) represents the numerical model equation for a one-by-one dimensional first order upwind scheme and for the Flux-Filter scheme with the first order flux integration

[…]

for positive values of $a$ and $b$. The multi-dimensional upwinding scheme makes use of the trapezoidal integration of fluxes. Hence, the numerical model equation is given by

[…] (41)

The numerical behaviour of both methods is analyzed for convection directions of 0, 22.5 and 45 degrees for a delta function on the y-axis. Figure 2 presents the result of a 22.5 degree convection using the first order integration and using the trapezoidal integration method. With the first order integration the diffusion is dominant for the convection direction of 45 degrees, while for a convection direction aligned with the x-axis the scheme produces the theoretical result. Because the numerical scheme produces considerable cross-diffusion when the convection direction is not aligned with the grid, the first order integration scheme is highly grid dependent.

The second order approach leads to solutions which match the theoretical solution for convection directions of 0 and 45 degrees. The solution for the 22.5 degree convection is dispersive. The accuracy levels of both methods are given in the next table, where the error is the mean square deviation from the theoretical solution:
Error at angle (deg)   first order   second order
0                      0.00          0.00
22.5                   0.16          0.12
45                     0.23          0.00

Although the second order flux integration produces better results and is therefore less grid dependent, the transient phase can be extremely chaotic. The following case will clarify a peculiar problem. Setting

$u_{0,j} = 0 \;\; \forall j, \qquad u_{1,0} = 1$ (43)

$\alpha = \vartheta_y / \vartheta_x$ with $\vartheta_x = a\,\Delta t/\Delta x$ and $\vartheta_y = b\,\Delta t/\Delta y$

the values for all $u_{1,j}$ can be determined as a function of $\alpha$ with the second order integration scheme

$u_{1,j} = -\frac{1-\alpha}{1+\alpha}\, u_{1,j-1}$ (44)

hence

$u_{1,j} = \left(-\frac{1-\alpha}{1+\alpha}\right)^{j}$ (45)

When $\alpha$ approaches zero the scheme produces an undamped oscillatory profile in the complete domain. This reduces the robustness of the Flux-Filter scheme, since even a slight disturbance is immediately transmitted, in an unfavorable manner, throughout the numerical domain. In contrast, the profile for the first order integration scheme is

[…]

which does not have an oscillatory behaviour. The problem can be remedied with the addition of artificial viscosity and/or the use of a preferential integration direction. The preferential integration is a blended form between the trapezoidal integration and the first order integration (Eq.34). Hence the preferential integration is a first order integration:

[…] (47)

For the critical case that $\vartheta_y = 0$ the profile becomes

[…]

Error at angle (deg)   artificial viscosity   preferential integr. q=0.25   preferential + a.v.
0                      0.069                  0.000                         0.038
22.5                   0.097                  0.093                         0.073
45                     0.081                  0.115                         0.095

Hence, modifications such as artificial viscosity and preferential flux integration must be included to stabilize the scheme. The pure second order flux integration scheme will not work.

5.1 Von Neumann Stability Analysis

The results of a Von Neumann stability analysis [10] are presented in figure 5, where the maximum amplification factor is plotted as a function of $\vartheta_x, \vartheta_y$ for $q = 0.25$ and a small amount of artificial viscosity. Stability is obtained when $\vartheta_x, \vartheta_y \le 0.3$, which is the upper limit for the CFL number.

Each time discretization method has its own stability contour within which the roots of the space discretization must lie. Figure 4 presents the spatial roots for the Flux-Filter scheme for CFL numbers of 0.3, 0.5, 2.0 and 10.0. The conclusion is that the gain from using Runge-Kutta is minor and that in theory large CFL numbers can be used with an implicit scheme. In practice, however, the implicit scheme worked only up to a maximum CFL number of 1.0. This gain of a factor 3 is not sufficient to justify the use of an implicit scheme, due to the significant increase in workload.

6 RESULTS

The supersonic wedge

The geometry is a two-dimensional channel with a 15° wedge followed by a 15° expansion corner (Fig 6). The inflow Mach number is set to 2. This configuration induces interactions between shock and expansion waves [11]. A shock wave is produced at the wedge and reflected at the upper boundary. The reflected shock wave is weakened by the expansion fan. The expansion fan is also reflected by the upper boundary. Depending on the length of the channel, the shock wave and expansion fan are reflected several times. Analytical results predict a Mach number of 1.454 behind the first shock wave. The maximum deflection for this Mach number is 10.5°, which implies that the first reflection will induce a subsonic flow with an entropy layer (or slip stream). This
problem is a widely used standard test case [11]. The grids are a structured 180 by 60 grid, an unstructured grid with 1664 quadrilateral elements and an unstructured grid with 3633 triangular elements. Figure 6 presents the Mach number distribution for the 3 grids.

NACA0012 Transonic

This standard AGARD test case [12] consists of a NACA0012 profile in a transonic flow. The angle of attack is 1.25° and the free stream Mach number is 0.8. The main features are a strong shock located at x=0.62 on the upper surface and a weak shock at x=0.37 on the lower side. The Mach number distribution is given in figure 7.

This problem is solved with a Runge-Kutta scheme on a 160 by 50 structured grid. The maximum CFL number was 0.4. After 3000 iteration steps a pseudo convergence was obtained, where the upper shock position continued to move back and forth by about one grid cell. The predicted Cl (0.3708) and Cd (0.0213) coefficients correspond well to those given by the AGARD test [12].

NACA0012 Mach 0.5 Reynolds 5000

This test case [13] demonstrates the use of the Flux-Filter scheme on the Navier Stokes equations. The test case consists of a subcritical flow (Ma=0.5) over a NACA-0012 with a Reynolds number of 5000. The angle of attack is 0 degrees. The flow has a recirculation at the trailing edge. The location of separation predicted in the references is at x=0.82. This scheme predicts the flow separation to be at x=0.92 (Fig. 8), which indicates that the Flux-Filter scheme has an incorrect degree of dissipation.

7 CONCLUSIONS

The Flux-Filter scheme has been applied to the Euler and Navier Stokes equations, on structured and unstructured grids, and with Euler stepping, Runge-Kutta and an implicit time integration scheme. The results from the computations have demonstrated several features. First, it has been shown that the Flux-Filter method is capable of obtaining highly accurate solutions on the basis of a truly multidimensional approach. Second, the stability analysis has shown that the scheme does not allow large time steps.

The integration, spatial and temporal, could be improved to allow much larger time steps and convergence rates. This would be advantageous for the 3-dimensional version. One way is the use of higher order flux integration or the use of flux limiters. The temporal integration can be improved by solving the implicit method more accurately or by implementing a multigrid scheme. The improvements should also be aimed at reducing the level of artificial viscosity or eliminating its use.

Extension to a 3-dimensional Flux-Filter scheme is in progress.

ACKNOWLEDGEMENT

This research project was sponsored by the DFG under contract Wa424/10.

REFERENCES

[1] Steger, J.L., Warming, R.F.: Flux Vector Splitting of the Inviscid Gasdynamic Equations with Application to Finite-Difference Methods. Journal of Computational Physics, Vol 40, pg 263-293 (1981)

[2] Van Leer, B.: Flux Vector Splitting for the Euler Equations. Proc. 8th International Conference on Numerical Methods in Fluid Dynamics, Berlin, Springer (1982)

[3] Roe, P.L.: The Use of the Riemann Problem in Finite Difference Schemes. Lecture Notes in Physics, Vol 141, pg 354-359, Berlin, Springer Verlag (1981)

[4] Osher, S.: Numerical Solution of Singular Perturbation Problems and Hyperbolic Systems of Conservation Laws. Mathematical Studies, Vol 47, Amsterdam, North Holland (1981)

[5] Powell, K.G., Barth, T.J., Parpia, I.F.: A Solution Scheme for the Euler Equations Based on a Multidimensional Wave Model. AIAA 93-0065 (1993)

[6] Rossow, C.C.: Efficient Cell-Vertex Upwind Scheme for the Two Dimensional Euler Equations. AIAA Journal, Vol 32, pg 278-284 (1994)

[7] Giles, M., Anderson, W., Roberts, T.: Upwind Control Volumes: A New Upwind Approach. AIAA 90-0104 (1990)

[8] Vinckier, A.: An Upwind Scheme Using Flux Filters Applied to the Quasi 1-D Euler Equations. ZFW 15 (1991)

[9] Vinckier, A., Jacobsen, J., Wagner, S.: A Flux Filter Scheme Applied to the Euler and Navier Stokes Equations. ECCOMAS Proceedings, Stuttgart 1994. J. Wiley (1994)

[10] Lomax, H.: Finite Difference Methods for Fluid Dynamics, Lecture Notes AA214A. Stanford University (1987)

[11] Levy, D.W., Powell, K.G., Van Leer, B.: An Implementation of a Grid Independent Upwind Scheme for the Euler Equations. AIAA 89-1931 (1989)

[12] Viviand, H.: Numerical Solutions of Two-Dimensional Reference Test Cases. AGARD-AR-211, pg 6.1 (1985)

[13] Venkatakrishnan, V.: Viscous Computations Using a Direct Solver. Computers & Fluids, Vol 18, No 2, pg 191-204 (1990)
Figure 1 Solution of a supersonic-subsonic flow in a diverging nozzle (panels: Mach number; mass flux and enthalpy error).

Figure 2 Solution of the scalar model equation with first order and second order flux integration, respectively.

Figure 3 Solution of the scalar model equation with preferential flux integration and artificial viscosity.

Figure 4 Spatial roots of the flux filter scheme vs stability contours of different time integration methods (panels include RK Modified, CFL = 2.0, CFL = 10.0).

Figure 5 Amplification factor (CFL = 0.3) for the flux filter scheme.

Figure 6 Supersonic flow over a 15 degree wedge solved on different grid types (panels: structured grid 180x60; unstructured grid, 1664 quad elements, 1750 nodes, advancing front technique; unstructured grid, 3633 tri elements, 1911 nodes, advancing front technique).
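The oscillatory transient discussed in Section 5 (equations (43) to (45)) can be checked with a few lines of Python. This is only a sketch, not the authors' code: it assumes the steady trapezoidal cell residual of the scalar convection equation, an upstream column held at zero, and $\alpha = \vartheta_y/\vartheta_x = b\,\Delta x/(a\,\Delta y)$. The function names are hypothetical.

```python
def trapezoidal_cell_residual(alpha, u_prev, u_cur):
    """Steady trapezoidal residual of the cell between rows j-1 and j.

    Assumes u_{0,j} = 0 on the upstream column and scales the residual
    by a*dy/2, so only alpha = (b*dx)/(a*dy) remains as a parameter.
    """
    return (u_cur + u_prev) + alpha * (u_cur - u_prev)


def steady_profile(alpha, n_rows):
    """Profile u_{1,j}, j = 0..n_rows, from the recursion of equation (44)."""
    ratio = -(1.0 - alpha) / (1.0 + alpha)
    profile = [1.0]  # u_{1,0} = 1 as in equation (43)
    for _ in range(n_rows):
        profile.append(ratio * profile[-1])
    return profile


if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, steady_profile(alpha, 4))
```

With $\alpha = 0$ the profile alternates between +1 and -1 without decay, which is the undamped oscillation noted in the text. With $\alpha = 1$ (45 degree convection on a square grid) the profile vanishes after the first row.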
Implicit Multidimensional Upwind Residual Distribution Schemes on Adaptive Meshes
H. Paillère, J.-C. Carette, E. Issman, E. van der Weide, H. Deconinck and G. Degrez
von Karman Institute for Fluid Dynamics
Chaussée de Waterloo 72, B-1640 Rhode-St-Genèse, Belgium

Abstract

The paper reviews recent developments in multidimensional upwind schemes based on the residual decomposition or fluctuation splitting approach. Unlike the standard finite volume approach, the upwinding is based on multidimensional physics, e.g. convection of entropy and total enthalpy along the streamline and convection of acoustic Riemann invariants along the Mach lines in steady supersonic flow. The resulting schemes on triangles and quadrilaterals are very compact, with stencils consisting of nearest neighbours only, and can be made monotonic and second order, like the TVD schemes in finite volumes. Numerical examples show the improved performance compared to state-of-the-art methods. The paper further describes the introduction of convergence acceleration techniques which exploit the compactness of the stencils, and the implementation of solution adaptive error control. The latter is based on scalar finite element a posteriori error estimates which are applied to the Euler system in decoupled form thanks to the multidimensional residual decompositions.

1 THE CASE FOR MULTIDIMENSIONAL UPWINDING

At the heart of present day upwind schemes for computing compressible flows is the solution of the one-dimensional Riemann problem: it describes the evolution of the flow which results from bringing into contact two fluids at constant but different states.

Conservative Finite Volume Methods use this building block as follows: writing the conservation law for a given cell, the cell-face fluxes in the spatial operator are evaluated by solving for each face in turn the one-dimensional Riemann problem defined by the cell averages (or a reconstruction at the cell face) on either side, thereby assuming a series of one-dimensional problems in the direction of the cell-face normals.

Although this is extremely successful, the question arises how truly multidimensional physics could be brought into this picture, and what benefits could be expected from doing so.

Consider therefore the case of the steady Euler equations. Choosing entropy, total enthalpy and the components of velocity as the independent variables, the equations take the following familiar quasilinear form

$\vec u \cdot \nabla S = 0$
$\vec u \cdot \nabla H = 0$
$\left(1 - \frac{u^2}{c^2}\right) u_x - \frac{uv}{c^2}(u_y + v_x) + \left(1 - \frac{v^2}{c^2}\right) v_y = 0$ (1)
$v_x - u_y + \beta_1 \partial_n S + \beta_2 \partial_n H = 0.$

The first two equations express that entropy $S$ and total enthalpy $H$ are Riemann invariants along the streamlines, which are the fieldlines of the velocity vector $\vec u$ with Cartesian components $u$ and $v$. The third equation is the familiar compressible potential equation written in primitive variables $u$ and $v$, with $c$ the soundspeed. The fourth equation is the vorticity equation (Crocco's equation), which is coupled to entropy and total enthalpy through derivatives in the direction $n$ normal to the streamline, $\partial_n = -\frac{v}{q}\partial_x + \frac{u}{q}\partial_y$, where $q = \sqrt{u^2+v^2}$ is the norm of the velocity vector. The coefficients $\beta_1$ and $\beta_2$ are given by

[…] (2)

corresponding to the definition $\partial S = \partial p - c^2 \partial\rho$. For shock free flow with uniform homentropic and homenthalpic inlet conditions, the first two equations have the trivial solution $S(x,y) = \text{const}$, $H(x,y) = \text{const}$, leading to the potential formulation for steady irrotational flow.

At this stage of the analysis, it is instructive to recall that the first two equations are ordinary differential equations, which can be integrated by marching along the streamline starting at the inlet boundary, commonly known as the method of characteristics. It is in fact remarkable that this idea of upwinding entropy and total enthalpy along the streamline is totally absent in the state-of-the-art conservative methods for solving the steady Euler equations. One of the key aims in multidimensional upwinding methods is precisely to reintroduce this idea in a conservative formulation. Note further that these two equations are equally valid in 3D.

Turning our attention back to the two remaining equations, the analysis simplifies by considering a streamline-aligned coordinate system with coordinates $X$ in the streamline and $Y$ in the normal direction. In the new axes, the velocities are denoted by $U = q$, $V = 0$, giving for the Euler equations (1):

[…] (3)

For supersonic flow, the latter two equations can be expressed as ordinary differential equations along the Mach lines $r^+$ and $r^-$ (see Figure 1), by introducing the acoustic characteristic variables

[…] (4)

In these variables the Euler system for steady supersonic flow can be written as

$\partial_X S = 0$
$\partial_X H = 0$
$\partial_X c^+ + \frac{1}{\sqrt{M^2-1}}\,\partial_Y c^+ = 0$ (5)
$\partial_X c^- - \frac{1}{\sqrt{M^2-1}}\,\partial_Y c^- = 0,$

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
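The coefficient $1/\sqrt{M^2-1}$ in the acoustic equations of system (5) is the tangent of the Mach angle, so in streamline-aligned coordinates the Mach lines have slopes $dY/dX = \pm 1/\sqrt{M^2-1}$. The short Python sketch below evaluates these directions for a supersonic Mach number. The function names are hypothetical and only the standard `math` module is used.

```python
import math


def mach_angle(mach):
    """Mach angle mu in radians, from tan(mu) = 1 / sqrt(M^2 - 1)."""
    if mach <= 1.0:
        raise ValueError("real Mach lines exist only for M > 1")
    return math.atan(1.0 / math.sqrt(mach * mach - 1.0))


def mach_line_slopes(mach):
    """Slopes dY/dX of the r+ and r- Mach lines in streamline-aligned axes."""
    t = math.tan(mach_angle(mach))
    return t, -t


if __name__ == "__main__":
    for m in (1.5, 2.0, 3.0):
        print(m, math.degrees(mach_angle(m)), mach_line_slopes(m))
```

The guard for $M \le 1$ mirrors the remark in the text that the characteristic directions are real only in supersonic flow. In the subsonic case the acoustic subsystem is elliptic and no real Mach lines exist.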
Fig. 1: Mach angles $\pm\mu$ and streamline coordinate system $(X, Y)$

Here $\mathrm{tg}\,\mu = \frac{1}{\sqrt{M^2-1}}$ defines the Mach angle $\mu$ between Mach lines and streamline, see Figure 1. Again, the physics dictate a multidimensional upwinding discretization of the acoustic equations by a space operator upwinded along the Mach lines, as in the method of characteristics. The point of deviation from the method of characteristics is to achieve this in a conservative formulation, such that shocks and contacts can be handled without any special treatment. This indeed is the hallmark and basis of success of the state-of-the-art Finite Volume approach. The price paid in these methods, however, is that the upwinding is based on locally one-dimensional physics, by considering the states adjacent to each finite volume face as the initial data for a one-dimensional Riemann problem in the direction of the face normal. Such an approach precludes the use of Mach lines or the streamline as the upwinding directions.

For subsonic flow, the two acoustic equations form an elliptic subset, and it is less clear what should be the optimal space discretization.

To fix the ideas, consider again the Euler system in the form of eq. (3), but assume that the inlet conditions are such that the flow is irrotational, so that the third and fourth equations decouple from $S$ and $H$:

[…]

For subsonic flow, the eigenvalues of the system matrix (and hence the Mach angles) become complex ($\pm i/\sqrt{1-M^2}$). For $M = 0$ this is the set of Cauchy-Riemann equations governing incompressible potential flow.

Different ways of discretizing such a system will be discussed in section 2. All these methods are based on an unsteady version of eqn(5) or (6), whereby the unsteady terms are in general chosen not to be physical, but such that the resulting system is hyperbolic in time and achieves fast convergence to the steady state.

Such a choice [1] is given by the following system, valid for both subsonic and supersonic flow, and called the hyperbolic/elliptic splitting. Defining the characteristic variables

[…] (7)

it is given by

[…]

or in Cartesian coordinates

[…]

Here $\nu^+$ and $\nu^-$ are defined as $\nu^+ = \frac{M^2-1+\beta^2}{2\beta^2}$ and $\nu^- = \frac{M^2-1-\beta^2}{2\beta^2}$, and $\beta$ and $\chi$ are given by $\beta = \sqrt{\max(\epsilon^2, |M^2-1|)}$ and $\chi =$ […]. To circumvent the singularity at the sonic point, $\epsilon$ is different from zero and given a small value (typically 0.05).

Clearly, the third and fourth characteristic equations decouple for all flow regimes, implying as before that entropy and total enthalpy are conserved along streamlines in the steady state. Considering the first and second equations, $\nu^- = 0$ and $\nu^+ = 1$ for supersonic flow, and the equations are fully diagonal; the system is in fact identical to eqn(5), where the acoustic variables are made to propagate along the Mach lines. In the subsonic case, the system is no longer diagonal and the two acoustic equations become coupled and form a system which is elliptic at steady state.

The residual in conservative variables is obtained by transforming eqn(9), giving

[…]

where $R$ is the transformation matrix from characteristic to conservative variables.

2 RESIDUAL DISTRIBUTION SPACE DISCRETIZATION

The finite volume setting with its underlying discontinuous solution representation naturally leads to the definition of 1D Riemann problems at the discontinuous interfaces, although some progress has been made in the solution of three-state two-dimensional Riemann problems [2].

Therefore, in this work, we concentrate on approaches based on a continuous representation of the solution over structured or unstructured meshes, with the solution stored at the vertices. Such a framework, which allows easy incorporation of upwinding concepts, is provided by the residual distribution approach:

- in a first step, the conservative flux balance (cell residual) is evaluated over a cell with unknowns located in the vertices by a simple contour integration (e.g. trapezium or midpoint rule).
- in a second step, the cell residuals are distributed to the vertices to form the nodal space operator (or nodal residual), which becomes a weighted average of the adjacent cell residuals.

The space discretization is consistent and conservative, which is easily shown by summing up the discrete equation for all nodes and observing that the interior fluxes
vanish, leaving only the contributions from the boundaries.

Various residual distribution schemes have been proposed, such as the central scheme of Jameson [3] or the Lax-Wendroff schemes of Ni [4], Hall [5] and Morton [6].

In the present context the residual distribution framework has been used to formulate conservative multidimensional upwind advection schemes. At the scalar level, linear and non-linear advection schemes are obtained by distributing the cell residual to the downstream nodes only. In this way, properties such as positivity and second order accuracy (linearity preservation) [7, 8, 9] can be built in. As long as the distributed parts sum up to the conservative cell residual, the schemes satisfy discrete conservation.

The application to the Euler equations, e.g. for advection of entropy, total enthalpy and acoustic variables, is straightforward, provided that a conservative linearization can be found which ensures that the flux balance over a cell can be written exactly in terms of the quasilinear equations discussed in section 2.

2.1 Distribution schemes for scalar advection

The subject of multidimensional shock-capturing advection schemes on triangles has been extensively treated in previous publications, and the reader is referred to [9, 10], as well as to the work of Roe and Sidilkover [8, 11] for details. Only the most important aspects are recalled here, and the extension to quadrilaterals is briefly discussed.

Consider the linear advection equation, with constant $\vec\lambda$:

$\frac{\partial u}{\partial t} + \vec\lambda \cdot \nabla u = 0$ (11)

The corresponding integral form of eqn(11) is obtained by integrating over a control volume $\Omega$ (triangle or quadrilateral). This leads to the definition of the cell residual or fluctuation,

$\phi^{\Omega} = -\iint_{\Omega} \vec\lambda \cdot \nabla u \, d\Omega = \oint_{\Gamma} u\, \vec\lambda \cdot \vec n \, d\Gamma$ (12)

where $\Gamma$ is the boundary of the control volume $\Omega$ and $\vec n$ is the inward normal. Because the solution is stored in the vertices of the cell, the contour integral can easily be evaluated by the trapezium rule. In the fluctuation-splitting approach, fractions of $\phi^{\Omega}$ are sent to the cell vertices, which after assembling contributions from all cells leads to the nodal update. The semi-discretization at point $i$ is then

$S_i \frac{du_i}{dt} = \sum_{\Omega} \beta_i^{\Omega} \phi^{\Omega} = -R(u_i)$ (14)

where $S_i$ is the area of the median dual cell around node $i$, and the $\beta_i^{\Omega}$ are the distribution coefficients, which sum up to one for each cell. The way these coefficients are evaluated determines the properties of the scheme. The most important of these are:

Positivity

A monotonic scheme can be obtained by demanding positivity. Suppose that the numerical solution at mesh point $i$ is $u_i$. Then the positivity property requires that in the discrete form of eqn(11)

$u_i^{n+1} = u_i^n + \sum_{k \neq i} c_k (u_k^n - u_i^n)$ (15)

the coefficients $c_k$ are all non-negative for $k \neq i$. Stability and monotonicity preservation is then guaranteed under the CFL-like condition

$\sum_k c_k \le 1$ (16)

Indeed, if $u_i^n$ is a local maximum, i.e. $(u_k^n - u_i^n) \le 0 \;\forall k$, then $(u_i^{n+1} - u_i^n) \le 0$. Consequently a local maximum cannot increase and similarly a local minimum cannot decrease. The condition(15) is called global positivity and is difficult to impose for fluctuation splitting schemes. Therefore a more restrictive property is introduced, namely local positivity, see [12]. This means that condition(15) is imposed for each contributing control volume in eqn(14), which is very easy to check. Positivity will in general be linked to some upwinding. In the fluctuation splitting context upwind biasing is obtained by limiting the distribution of the cell residual to the downstream nodes.

Linearity Preservation or Residual Property

Second order truncation error in the steady state is obtained by demanding that no updates are sent to the vertices if the cell residual is zero. This is obtained when the distribution coefficients $\beta_i^{\Omega}$ are bounded, such that

$\beta_i^{\Omega} \phi^{\Omega} \to 0$ when $\phi^{\Omega} \to 0$ (17)

It can be proven that only non-linear schemes (a scheme is called linear if the coefficients $c_k$ in eqn(15) are independent of $u_k$) can satisfy both properties.

2.1.1 Schemes on triangles

Considering the triangle with inward normals $\vec n_i$ shown in figure 2, the fluctuation $\phi^T$, eqn(12), can be written as

$\phi^T = -\sum_{i=1}^{3} k_i u_i, \qquad k_i = \tfrac{1}{2}\,\vec\lambda \cdot \vec n_i$ (18)

Fig. 2: Triangle and inward normals $\vec n_i$

The $k_i$ are convenient parameters in the design of upwind schemes. Since the inward normals $\vec n_i$ sum up to zero, one also has $\sum_i k_i = 0$. Four important distribution schemes are:

The N and PSI or limited N scheme

Define $k_i^+ = \max(0, k_i)$ and $k_i^- = \min(0, k_i)$, then the distribution to the nodes for the N scheme is given by:

$\beta_i^{N} \phi^T = -k_i^+ (u_i - u_{in})$ (19)
This scheme is positive but not linearity preserving. However, among the linear positive schemes, it has the lowest cross-diffusion. It is also the scheme with the narrowest stencil, hence the name N scheme.

From this scheme the contributions for the non-linear PSI scheme can be constructed by limiting the N scheme distribution as follows

[…] (21)

This is identical to Sidilkover's general limiter formula [13] when the MinMod function, $\Psi(r) = \tfrac{1}{2}(1 + \mathrm{sgn}(r))\min(r, 1)$, is chosen. This scheme is both positive and linearity preserving.

The FV and limited FV scheme

For the first order upwind finite volume (FV) scheme on the median dual mesh the normals of the dual grid are needed, see figure 3. In terms of these normals the fluctuation is given by

$\phi^T = k_a(u_3 - u_1) + k_b(u_3 - u_2) + k_c(u_2 - u_1)$ (22)

Fig. 3: Normals for the triangle FV scheme

Depending on the signs of the dot products of the advection vector with the normals of the dual grid, $\phi^T$ is distributed to the nodes according to the formula

$\beta_1^{FV}\phi^T = k_a^-(u_3 - u_1) + k_c^-(u_2 - u_1)$
$\beta_2^{FV}\phi^T = k_b^-(u_3 - u_2) + k_c^+(u_2 - u_1)$ (23)
$\beta_3^{FV}\phi^T = k_a^+(u_3 - u_1) + k_b^+(u_3 - u_2)$

Again this scheme is positive, but not linearity preserving, and it is more diffusive than the N scheme. The limited second order version of this scheme is obtained by applying eqn(21) to the distribution coefficients $\beta_i^{FV}$.

The LDA scheme

For the linear LDA (Low Diffusion A) scheme, the contributions are given by:

[…] (24)

This scheme is linearity preserving, however it is not positive.

The Lax-Wendroff scheme

The distribution coefficients for the classical Lax-Wendroff scheme are

[…] (25)

where $S_T$ is the area of the triangle, and $\Delta t$ is the Lax-Wendroff dissipation coefficient with the dimension of a time.

2.1.2 Schemes on quadrilaterals

The extension of these schemes to quadrilaterals is rather straightforward. The inner scaled normals $\vec n_i$ are given in figure 4. The parameters $k_i$ are calculated as in eqn(18). The distribution coefficients of the LDA and Lax-Wendroff schemes are very similar to those on triangles and will therefore not be repeated.

Fig. 4: Quadrilateral and inward normals $\vec n_i$

The quadrilateral N and PSI scheme

The distribution coefficients for the quadrilateral N scheme are:

$\beta_i^{NQ}\phi^Q = -k_i^+ (u_i - u_{in,i})$ (26)

Compared with the scheme on triangles, eqn(19), every point has its own inflow state, given by

[…] (27)

Again this scheme is positive but not linearity preserving. The distribution coefficients for the quadrilateral PSI scheme are obtained by applying the limiter function eqn(21) to $\beta_i^{NQ}$.

The quadrilateral FV and limited FV scheme

As on triangles the normals of the dual grid are needed, see figure 5. The fluctuation is
17-5

3 CONSERVATIVE LINEARIZATION
We now consider system (30) in conservative form

au aF
-+-+-=O,
aG
or
au
-+VF=O
-- (34)
at az ay at

To maintain discrete conservation, the cell residual ha+


to be evaluated as the flux balance of tlie conservative
variables over the cell, for a triangle :

(3.5)
Fig. 5 : Normals for the quadrilateral FV scheme
On the other hand, the positive advection distribution
schemes require a quasi-linear form of the residual. A
conservative linearization is defined such that the quasi-
aiid the dist.ribution to the nodes linear form integrated over the surface is identical t.o the
= k,(u4 - UI) + k,(u2
y,$Q - UI) flux integration over t,he boundaries obtained by a par-
ticular integration rule. For the Euler equations on trian-
= k.(u3 - u2) + k,+(u* - 11,)
:f"Q,$Q gles, this is easily achieved by assuming that the Roe pa-
&.Q,$Q = k,+(u, - uz) + k,+(u3 - u4) (29)
rameter vector $Z = \sqrt{\rho}\,(1, u, v, H)^T$ varies linearly over each element. Since U, F and G are quadratic in the components of Z, the Jacobian matrices $\partial U/\partial Z$, $\partial F/\partial Z$ and $\partial G/\partial Z$ are linear in the components of Z, making the integration over a triangle trivial.

d:"Q,$" = k,+(u, - U,) + k,(U3 - U,)

The limited version of this scheme is obtained as before.

2.2 System distribution schemes

As explained in section 1 the two-dimensional supersonic Euler equations can be completely decoupled and the scalar schemes of the previous chapter can be applied. However, for subsonic flow the two acoustic equations, eqn (8), form an elliptic subset, which cannot be decoupled. One way to treat such a system is to introduce the coupling terms as source terms and distribute them with the LDA or Lax-Wendroff scheme. By doing this the positivity property will be lost and therefore positive system distribution schemes are to be preferred. Among the system distribution schemes we mention the Lax-Wendroff and SUPG distribution. Recently, positive system distribution schemes have been explored, generalizing the scalar FV and N scheme discussed before.
Consider the unsteady hyperbolic system of equations given by

$\frac{\partial W}{\partial t} + A_W \frac{\partial W}{\partial x} + B_W \frac{\partial W}{\partial y} = 0$

Extending the ideas of the scalar schemes we define the matrices $K_i$ as

$K_i = \frac{1}{2}\,(A_W\, n_{x,i} + B_W\, n_{y,i})$    (31)

Because the system is hyperbolic, $K_i$ can be written as

$K_i = R_i \Lambda_i L_i$    (32)

where the columns of $R_i$ contain the right eigenvectors, $\Lambda_i$ is a diagonal matrix of the eigenvalues and $L_i = R_i^{-1}$.
The matrices $K_i^+$ and $K_i^-$ are given by

$K_i^+ = R_i \Lambda_i^+ L_i, \quad K_i^- = R_i \Lambda_i^- L_i$    (33)

Here $\Lambda_i^+$ contains the positive and $\Lambda_i^-$ the negative eigenvalues.
With these definitions the system schemes on triangles can be obtained just by replacing the scalar $k_i$ by the matrix $K_i$ in the equations (19), (23), (24) and (25). On quadrilaterals the N-scheme in the form (26) does not generalize to systems and only system versions of the FV, LDA and Lax-Wendroff schemes can be obtained.

Defining the average state $\bar{Z}$ over the cell:

$\bar{Z} = \frac{1}{3}\,(Z_1 + Z_2 + Z_3)$    (36)

the flux balance over element T may be expressed in quasilinear form as

$\Phi_T = \oint_{\partial T} F(Z)\,dy - G(Z)\,dx$    (37)

$\quad\; = S_T\,[\bar{A}\,\hat{U}_x + \bar{B}\,\hat{U}_y]$    (38)

where $\bar{A}$ and $\bar{B}$ are the analytical flux Jacobians evaluated at the average state $\bar{Z}$. Because the exact Jacobians are used, one can transform (38) into any quasilinear form as long as the transformation matrices are evaluated at the average state $\bar{Z}$.
On quadrilaterals it is more difficult and for the moment a linearization is used which is only exact for parallelograms.
The global update for the system, analogous to the scalar case eqn (14), is then given by

(42)

where $D_i^T$ is the cell distribution matrix
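
The splitting of $K_i$ into $K_i^+$ and $K_i^-$, eqs. (31)-(33), can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the 2x2 symmetric model system (matrices `A` and `B`) stands in for the Euler Jacobians, the normals are taken as inward normals scaled by the edge length, and the helper names are hypothetical. For a symmetric matrix the right eigenvectors are orthogonal, so $L_i = R_i^T$.

```python
import math


def inward_normals(tri):
    """Inward normals of a counterclockwise triangle, scaled by edge length.

    normals[i] belongs to the edge opposite vertex i (assumed convention).
    """
    normals = []
    for i in range(3):
        xj, yj = tri[(i + 1) % 3]
        xk, yk = tri[(i + 2) % 3]
        normals.append((yj - yk, xk - xj))
    return normals


def k_matrix(a, b, n):
    """K_i = 1/2 (A n_x + B n_y) for 2x2 matrices A and B, as in eq. (31)."""
    nx, ny = n
    return [[0.5 * (a[r][c] * nx + b[r][c] * ny) for c in range(2)]
            for r in range(2)]


def split_symmetric(k):
    """Return (K_plus, K_minus) of a symmetric 2x2 matrix, eqs. (32)-(33)."""
    a, b, d = k[0][0], k[0][1], k[1][1]
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    lam = (mean + rad, mean - rad)               # eigenvalues
    theta = 0.5 * math.atan2(2.0 * b, a - d)     # principal axis angle
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]                        # columns: right eigenvectors

    def rebuild(values):
        # R diag(values) R^T
        return [[sum(r[i][m] * values[m] * r[j][m] for m in range(2))
                 for j in range(2)] for i in range(2)]

    k_plus = rebuild([max(v, 0.0) for v in lam])   # positive eigenvalues only
    k_minus = rebuild([min(v, 0.0) for v in lam])  # negative eigenvalues only
    return k_plus, k_minus


# Hypothetical symmetric hyperbolic model system W_t + A W_x + B W_y = 0
A = [[1.0, 0.0], [0.0, -1.0]]
B = [[0.0, 1.0], [1.0, 0.0]]
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

k_mats = [k_matrix(A, B, n) for n in inward_normals(triangle)]
splits = [split_symmetric(k) for k in k_mats]
```

In this sketch $K_i^+ + K_i^- = K_i$ holds for every node, and the three $K_i$ of a cell sum to zero because the scaled normals of a closed triangle do.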

4 NUMERICAL RESULTS USING EXPLICIT TIME STEPPING

Results of three inviscid computations are given. In figure 6 the structured mesh, the Mach number isolines and the entropy distribution on the airfoil are shown for the subcritical, $M_\infty = 0.63$, $\alpha = 2^\circ$, flow over a NACA-0012 airfoil. The scalar quadrilateral PSI scheme is used for the convection of entropy and total enthalpy along the streamlines, while the system Lax-Wendroff scheme is used for the coupled acoustic subsystem.
The unstructured mesh, Mach number isolines and the entropy distribution on the airfoil for the transonic NACA-0012, $M_\infty = 0.85$, $\alpha = 1^\circ$, can be found in figure 7. The distribution scheme is the system PSI scheme, which allows monotonic capturing of the shock in one or two cells.
The third test case is the toughest, namely the hypersonic ($M_\infty = 8.7$), axisymmetric flow around a hyperboloid flare. The mesh, a triangulated structured Navier-Stokes mesh with aspect ratios over 100, and the Mach number isolines are given in figure 8. The solution is monotonic, the shock is captured very well and the carbuncle phenomenon, seen in Finite Volume solvers with Roe's approximate Riemann solver, is not present. Again the system PSI scheme was used.

(a) Structured grid for the NACA-0012, 32 x 128 cells

5 IMPLICIT ACCELERATION
Explicit time-integration of the semi-discrete equations (42), although straightforward and robust, suffers from stability limits for some classes of problems, such as subsonic flows with stagnation regions and viscous flows. Implicit time-integration is in turn less limited by restrictions on the time-step but requires on the other hand large non-linear systems of equations to be solved.

(b) Mach number isolines
(c) Entropy distribution on the airfoil

Fig. 6: Mesh, Mach number isolines and entropy distribution on the airfoil for the subsonic NACA-0012 ($M_\infty = 0.63$, $\alpha = 2^\circ$), for the hyperbolic/elliptic splitting. PSI on entropy and total enthalpy, Lax-Wendroff on acoustics.

5.1 Time-stepping strategy

As we are only interested in the steady state solution, we restrict our attention to the linearized backward Euler time-stepping scheme, which can be written as:

LOOP OVER TIME (for k = 0, 1, ...) until convergence:
- Choose time increment $\Delta^k t$,
- Compute increment $\Delta^k$ as the solution of:

  $J_P(U^k)\,\Delta^k = -R(U^k)$    (43)

- UPDATE: $U^{k+1} = U^k + \Delta^k$

where $J_R(U) = \partial R/\partial U$ is the Jacobian of the residual R(U), a sparse and non-symmetric matrix, and where $J_P$ denotes the augmented Jacobian $1/\Delta^k t + J_R$. An overview of different approaches to solve the steady state equations can be found in [14].
At each time-step k, the main ingredients of the algorithm can be listed as:
- computing a Jacobian matrix $J_R(U^k)$,
- solving the linear system (43),
- choosing a time increment $\Delta^k t$ and a non-linear update strategy.

The next three subsections will be devoted to the description of each of these tasks.
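
The time-stepping loop above can be sketched on a toy problem. This is only an illustration under stated assumptions: the 2x2 `residual` is a hypothetical stand-in for the flow residual, the Jacobian is approximated by a first-order finite difference with a step of the form $\sqrt{\eta}\,\max(|U|, typ)\,sign(U)$, the linear system is solved directly rather than iteratively, and the time increment is simply doubled at every step.

```python
import math
import sys


def residual(u):
    """Hypothetical 2x2 residual R(U); its root is U = (1, 2)."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 2 - 5.0]


def fd_jacobian(res, u, eta=sys.float_info.epsilon, typ=1.0):
    """First-order finite-difference Jacobian, one perturbation per column."""
    r0 = res(u)
    n = len(u)
    jac = [[0.0] * n for _ in range(n)]
    for m in range(n):
        eps = math.sqrt(eta) * max(abs(u[m]), typ) * math.copysign(1.0, u[m])
        up = list(u)
        up[m] += eps
        rp = res(up)
        for i in range(n):
            jac[i][m] = (rp[i] - r0[i]) / eps
    return jac


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def backward_euler(res, u0, dt0=0.5, growth=2.0, tol=1e-10, max_iter=50):
    """Linearized backward Euler: (1/dt + J_R) delta = -R, U <- U + delta."""
    u, dt = list(u0), dt0
    for k in range(max_iter):
        r = res(u)
        if math.sqrt(sum(v * v for v in r)) < tol:
            return u, k
        jp = fd_jacobian(res, u)                # J_R(U^k)
        for i in range(len(u)):
            jp[i][i] += 1.0 / dt                # augmented Jacobian J_P
        delta = solve(jp, [-v for v in r])      # linear system (43)
        u = [ui + di for ui, di in zip(u, delta)]
        dt *= growth                            # assumed time-step ramp
    return u, max_iter


u_final, n_steps = backward_euler(residual, [2.0, 1.0])
```

As the time increment grows, the $1/\Delta^k t$ term vanishes and the iteration approaches Newton's method, which is why the last steps converge much faster than the first ones.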
(a) Triangulated structured grid, 2170 nodes
(b) Mach number isolines
(b) Mach number isolines, general view and zoom of leading edge

Fig. 8: Mesh and Mach number isolines for the hypersonic, axisymmetric hyperboloid flare, $M_\infty = 8.7$. Hyperbolic/elliptic splitting with the system PSI scheme.

Fig. 7: Mesh, Mach number isolines and entropy distribution on the airfoil for the transonic NACA-0012 ($M_\infty = 0.85$, $\alpha = 1^\circ$), for the hyperbolic/elliptic splitting. System PSI scheme.

5.2 Jacobian computation

5.2.1 Differentiating the Residual

As the spatial discretization stencil involves only distance-one neighbours, each individual component of the Jacobian can be computed at reasonable cost. Limiting the Taylor expansion of $R_i(U + \varepsilon 1_{j,m})$, the nodal residual at node i with the m-th component of U at node j perturbed by a small quantity $\varepsilon$, to the first order terms, one has:

$\frac{\partial R_i}{\partial U_{j,m}} \simeq \frac{R_i(U + \varepsilon 1_{j,m}) - R_i(U)}{\varepsilon}$    (44)

It shows how each entry of the Jacobian $(J_R)_{ij}$, a 4 x 4 matrix with m as the column index, can be computed by a first order finite difference. Because of the compactness of the scheme, this computation requires only twelve (twenty in 3D) additional explicit residual evaluations. Following the same steps as the explicit solver (i.e. loop over the cells, in each cell compute fluctuation and distribute contributions to be assembled at the nodes), the algorithm to compute the Jacobian is:

INITIALIZE $R(U) = 0$, $J_R(U) = 0$,
LOOP over triangles (T = 1, 2, ..., nbr of cells):
- Compute fluctuation and distribute contributions to the 3 nodes (i = 1, 2, 3): $R_i \leftarrow R_i + \Phi_i^T$
- LOOP over the 3 nodes of the cell (j = 1, 2, 3):
  - LOOP over the 4 components of $U_j$ (m = 1, 2, 3, 4):
    - Perturb m-th component of $U_j$: $U_j \leftarrow U_j + \varepsilon 1_m$
    - Compute new fluctuation,
    - Distribute the 3 contributions (i = 1, 2, 3) to column m of $(J_R)_{ij}$:

      $(J_R)_{ij} \leftarrow (J_R)_{ij} + \frac{\Phi_i^T(U_j + \varepsilon 1_m) - \Phi_i^T(U_j)}{\varepsilon}$

where $\Phi_i^T(U_j + \varepsilon 1_m)$ denotes the residual contribution to node i when the m-th component of U at node j has been perturbed.
A key issue in the numerical computation of the Jacobian as a finite-difference approximation is the proper choice of $\varepsilon$, which can be determined here on a component-by-component basis. The question is treated by Schnabel [15] who advocates:

$\varepsilon = \sqrt{\eta}\,\max[\,|U_{j,m}|,\ typ(U_{j,m})\,]\ sign(U_{j,m})$,    (45)

with $typ(U_{j,m})$ a typical user-defined order of magnitude for the m-th component of U at node j and $\eta$ a lower bound on the inaccuracy in the residual R(U) evaluation (relative noise). This lower bound is at best the machine-epsilon of the computer and can be larger if R(U) is computed by a lengthy piece of code. Should $\eta$ be worse or if R(U) is not differentiable everywhere, one might rather resort to the secant method, known for multidimensional problems as Broyden's update.

5.2.2 Broyden's method

Broyden's update method is the multidimensional extension of the secant method used for univariate problems, avoiding the need for computing any derivative. If the kth Newton-Raphson step is denoted (1) by:

$J_R(U^k)\,\Delta^k U = -R(U^k)$

with $\Delta^k U = U^{k+1} - U^k$, the generalization of the one-dimensional secant condition is that $J_R(U^{k+1})$ satisfies:

$J_R(U^{k+1})\,\Delta^k U = \Delta^k R$    (46)

where $\Delta^k R = R(U^{k+1}) - R(U^k)$. However, this does not determine $J_R(U^{k+1})$ uniquely in more than one dimension. In Broyden's update approach, $J_R(U^{k+1})$ is chosen by making the least change (see [15] for proper matrix norms) to $J_R(U^k)$, consistent with the condition (46). As such, the method suffers a major drawback as it entails a complete fill-in of the Jacobian matrix whereas the true Jacobian matrix is sparse. Alternatively, we can look for the solution to the same least change problem under the additional condition $B \in S(J_R)$ where $S(J_R)$ represents the set of n x n matrices with the same sparsity pattern as $J_R$. The resulting update is given by:

$J_R(U^{k+1}) = J_R(U^k) + P_{S(J_R)}\{\,D^{-1}\,[\Delta^k R - J_R(U^k)\,\Delta^k U]\,(\Delta^k U)^T\,\}$,

where $P_{S(J_R)}$ is the matrix operator which maps any matrix onto the same matrix but restricted to the sparsity pattern of $J_R$ and D an n x n diagonal matrix which accounts for the sparsity structure of the Jacobian matrix:

$D_{ii} = (\delta_i)^T \delta_i$ with $(\delta_i)_j = 0$ if $(J_R)_{ij} = 0$, $(\delta_i)_j = (\Delta^k U)_j$ otherwise    (47)

Broyden's method allows the Jacobian matrix to be updated without the twelve additional residual evaluations. On the other hand, non-linear convergence will be at most linear and more iterations will be needed at the non-linear level.

(1) The time-step has been excluded from the formulation. However, the argumentation which follows still holds, as backward Euler discretization in time amounts to a classical Newton's method where the increment $\Delta^k$ has been under-relaxed for the update.

5.3 Solution of the linear system

Following the linearization process, the linear system (43) is iteratively solved with left (or right) preconditioning:

$\tilde{J}_P^{-1}(U^k)\,J_P(U^k)\,\Delta^k = -\tilde{J}_P^{-1}(U^k)\,R(U^k)$,    (48)

with $\tilde{J}_P(U^k)$ obtained by some incomplete approximate factorization of $J_P(U^k)$. Block ILU factorization is used in our numerical experiments. Krylov subspace acceleration techniques have been considered to accelerate the convergence of the iterative solve. In the framework of this paper, we have favoured GMRES [16] among other solvers because of its optimality and since it does not represent a severe limitation for 2D medium size problems on today's computers despite its storage requirements. We refer to [17] for a description and assessment of alternate preconditioners and other Krylov subspace techniques, such as QMR and TFQMR [18]. A constant Krylov subspace dimension of 30 is used in the numerical experiments and the linear solver is stopped when the normalized linear residual drops below lo-'. This linear convergence criterion is easily met within the 30 Krylov subiterations in the early stages of the convergence process when the CFL number is not too large.

5.4 Global convergence and fixed-point method

The choice of an optimal time-step is a key issue to ensure a fast and robust convergence. It seems logical to increase the time-step when approaching the converged solution as the likelihood to be within the radius of convergence of the Newton method increases. Automatic time-increment control algorithms have been set up to relieve the user from explicitly monitoring the CFL number following the convergence level. Some experiments with such algorithms can be found in [17]. We present now a technique which consists, after some approximation, in accelerating a fixed-point method. The technique never reaches any Newton-like convergence, but shows, for a constant limited CFL number, a good global convergence behaviour. The technique consists in solving the steady-state Euler/Navier-Stokes equations with an infinite CFL number, i.e. full Newton time integration, but using a finite CFL number in the preconditioning matrix at each linearization: $J_P(U^k) = J_R(U^k)$ and $\tilde{J}_P(U^k)$ obtained by some factorization of $1/\Delta^k t + J_R$. It should be pointed out that, since the Krylov subspace dimension is not increased, this results also in solving the linear system less accurately.
The scheme, already used in [19], builds up the main features of the flow at the very early stages of the convergence process much faster than the classical backward Euler discretization in time. Asymptotically though, the method shows a monotonic linear convergence behaviour and never reaches the convergence rate, possibly quadratic, of backward Euler. The method appears therefore as complementary to backward Euler as

it can be used for the first non-linear iterations and provide, so doing, a well-featured initial guess for backward Euler.
The scheme can be viewed as an accelerated fixed-point method. The basic implicit technique consists in a simple relaxation procedure immediately followed by a non-linear update. The relaxation procedure is based on some approximation $\tilde{J}_P(U^k)$ of the augmented Jacobian of the residual $J_P(U^k) = 1/\Delta^k t + J_R(U^k)$, and the overall process reads:

$U^{k+1} = U^k - \tilde{J}_P^{-1}(U^k)\,R(U^k) = M(U^k)$.

This formulation can be seen as a particular case of the linearized backward Euler time-stepping where only one single iteration is performed at the linear level, with $\tilde{J}_P(U^k)$ as the preconditioning matrix. Then, let us define $G(U) = U - M(U)$ and apply Newton-GMRES to solve $G(U) = 0$:

$U^{k+1} = U^k + \Delta^k$ with $J_G(U^k)\,\Delta^k = -G(U^k)$.    (49)

If $J_G$ is approximated by $\tilde{J}_P^{-1} J_R$ (which is only true at the non-linear convergence), one has:

$\tilde{J}_P^{-1}(U^k)\,J_R(U^k)\,\Delta^k = -\tilde{J}_P^{-1}(U^k)\,R(U^k)$

which is nothing else than full Newton iterations to solve $R(U) = 0$, where the system arising at each linearization has been left-preconditioned by $\tilde{J}_P^{-1}$. In practice, the technique amounts indeed to adding $1/\Delta^k t$ only in the preconditioning matrix.
Numerical experiments have shown that the accelerated fixed-point method requires CFL numbers of order O(1)-O(10).

5.5 Numerical results

Numerical results are presented for subsonic and transonic viscous computations. Tests were performed on a DEC Alpha AXP 3000/400 workstation. The subcritical flow around a NACA-0012 airfoil at free-stream Mach number $M_\infty$ of 0.63 and $2^\circ$ angle of attack is first considered. The grid is made of 5249 cells with far-field boundary conditions located 50 chords away from the body. Iso-Mach contours are depicted in Fig. 9. The space discretisation used the hyperbolic/elliptic splitting model and a detailed view of the grid is shown in Fig. 7.
Implicit time integration was performed by updating the Jacobian with Broyden's formula, with a maximum CFL number of 200. Convergence history is shown in Fig. 10 and was achieved in about 750 CPU-seconds. In comparison, about 40000 CPU sec were needed to reduce the residual to lo-' using explicit Euler time-stepping.

Fig. 9: Subcritical flow over a NACA-0012: Iso-Mach contours

Fig. 10: Subcritical flow over a NACA-0012: Convergence history obtained with Broyden's update, 750 CPU sec

Fig. 11: NACA-0012 $M_\infty = 2.0$, Re = 106: Density isoline contours

The second test case is the viscous flow over the same airfoil with $M_\infty = 2.0$ and Re = 106, which belongs to the test suite of the GAMM workshop on compressible viscous flow solvers [20]. Fig. 11 shows the density contours of the solution computed with the hyperbolic/elliptic splitting model and convergence history is depicted in Fig. 12. Convergence starting from a uniform flow field, with the accelerated fixed-point method for the first two iterations followed by backward Euler, is achieved within about 12 iterations and 350 CPU-seconds. For backward Euler, the initial CFL number of 100 was increased at every iteration by a factor $C_2 = 2.0$ up to $10^6$. GMRES with a Krylov subspace dimension of 50 was used for this test.
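
The CFL ramp quoted for the backward Euler phase of the second test case (start at 100, multiply by 2.0 at every iteration, cap at $10^6$) can be written as a small helper. This is a sketch only: the function name and the number of iterations listed are assumptions, and the actual solver may apply the cap or the switch from the fixed-point phase differently.

```python
def cfl_schedule(cfl_start=100.0, factor=2.0, cfl_max=1.0e6, n_iter=16):
    """CFL number used at each non-linear iteration of a geometric ramp."""
    schedule = []
    cfl = cfl_start
    for _ in range(n_iter):
        schedule.append(cfl)
        cfl = min(cfl * factor, cfl_max)   # grow, but never beyond the cap
    return schedule


schedule = cfl_schedule()
first_capped = schedule.index(1.0e6)       # iteration at which the cap is hit
```

With these values the cap is only reached at the fifteenth backward Euler iteration (index 14), so a run that converges in about a dozen iterations never sees the maximum CFL number.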

Fig. 12: NACA-0012 $M_\infty = 2.0$, Re = 106: Convergence history (residual versus number of iterations), 350 CPU sec

6 MESH ADAPTIVITY

In [21], it was proposed to use the residual decomposition technique developed in the context of multidimensional upwind methods as a tool to extend the SUPG method to compressible flows. This idea was shown to lead to increased performance and robustness compared to the standard system extensions of SUPG [22, 23].
In the present section, we report the continuation of this work with focus on mesh adaptivity [24, 25] and we will show that the use of the multidimensional residual decompositions introduced to generalize the SUPG scheme to hyperbolic systems allows for the derivation of an error estimation procedure for the Euler equations in a very natural and inexpensive way.

6.1 SUPG a posteriori error estimate

The main ingredient of the proposed error estimation is the a posteriori error estimate developed by Johnson and Eriksson [26, 27] for the SUPG scheme applied to the following convection-diffusion equation:

$\vec{\lambda}\cdot\nabla u - \nabla\cdot(\kappa\nabla u) = f$ in $\Omega$,    (50)

with Dirichlet boundary conditions on the boundary $\Gamma$ of the computational domain $\Omega$. If we assume that the advection vector $\vec{\lambda}$ is constant, the a posteriori error estimate for the scalar shock capturing SUPG scheme applied to the stationary problem (50) can be written from [27] as:

$\|\hat{u} - U\|_{L_2(\Omega)} \le C\,\|\min(1, \hat{\kappa}^{-1}h^2)\,R(U)\|_{L_2(\Omega)} + \dots$,    (51)

where

$R(U) = |\vec{\lambda}\cdot\nabla U - f| + \max_{S \subset \partial T} \dots$ on $T \in \mathcal{T}$,    (52)

with T a triangle of mesh $\mathcal{T}$, $\hat{\kappa}$ the artificial viscosity of the SUPG scheme and $n_S$ the normal to side S of T. Note that, for simplicity, the computed solution U is compared with the solution $\hat{u}$ of a perturbed continuous advection-diffusion problem obtained by replacing $\kappa$ by $\hat{\kappa}(U)$ in eq. (50). In general, $\|u - \hat{u}\|$ is expected to be dominated by $C\|\hat{u} - U\|$, where C is a constant, so that control of $\|\hat{u} - U\|$ suffices.

6.2 Extension of the error estimation to the Euler equations

Once we are equipped with such a reliable and efficient error estimate, it is quite natural to apply this error estimate to each individual scalar equation resulting from a residual decomposition step as described in section 1. Let $\phi^k$ be the scalar fluctuation corresponding to equation k and $\phi_T^k$ the contribution of triangle T to this fluctuation. We consider then the following adaptive algorithm:
Given a tolerance TOL and an initial triangulation $\mathcal{T}_0$, determine successively triangulations $\mathcal{T}_j$ with $N_j$ elements, mesh spacings $h_j = h_j(\phi^k)$ and corresponding approximate solutions $U_j$ (j = 1, ..., J), such that $h_j$ is maximal under the local condition, for k = 1, ..., 4:

(53)

on $T \in \mathcal{T}_{j-1}$ until (on the final mesh) the global norm be such that:

(54)

which is the stopping criterion. Notice that (53) seeks to equidistribute the contribution from each element to the global error bound.
From the adaptivity criterion (53), one can isolate $h_j$ for each triangle T which provides us with a new "reference" size for each triangle. Then, it is easy to decide whether a given triangle has to be refined, coarsened or kept as it is. Of course, when dealing with the 2D Euler equations, one can compute four different required mesh sizes $h_j(\phi^k)$ for the next triangulation $\mathcal{T}_j$. At that point, several options can be taken. It could be decided for instance to control the error only on one of the 4 equations but this is risky because one could miss some of the flow features which are not "seen" by the corresponding variable. Our preferred choice therefore consists in taking the minimum of the four mesh sizes,

$h_j = \min_{k=1,\dots,4} h_j(\phi^k)$    (55)

which ensures an equation-by-equation control of the error over the mesh under the required tolerance TOL.

6.3 Adaptivity technique

The adaptivity technique developed in the present research is inspired by the innovative work of Richter [28]. It consists in non-hierarchical h-refinement/derefinement allowing efficient mesh optimization operations such as edge swapping and Laplacian smoothing.
The refinement operation is achieved by the introduction of an additional node for each edge of an element for which the calculated spacing is less than the element parameter h. For interior edges, the additional node is placed at the mid-point of the edge and the solution at the new vertex is interpolated from the solution at the extremities, whereas for boundary edges, the geometrical location of the new node is determined through a spline interpolation involving the four closest existing points. For any edge that is subdivided in this manner, the two adjacent triangles associated to this edge both have to be divided in order to preserve the consistency of the final grid.
Our coarsening strategy is based on the use of a non-hierarchical data structure which enables the deletion of nodes of the initial grid and the use of the structural optimization techniques described below. The coarsening is achieved in two steps. First, given the set of elements flagged to be deleted, a list of nodes to remove is constructed. Then, the deletion of these nodes is performed simultaneously with the reconnection of the remaining nodes to obtain a conformal mesh. This is done by identifying each element involved in the coarsening with one of the three possible derefinement cases: triangles with 1, 2 or 3 nodes to be deleted (see fig. 13).
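
The selection of the reference size by eq. (55), followed by the refine/coarsen/keep decision of section 6.2, can be sketched as follows. The decision thresholds (`refine_ratio`, `coarsen_ratio`), the function names and the sample data are all hypothetical; the text only states that the required size is compared with the current one.

```python
def target_size(sizes_per_equation):
    """Eq. (55): minimum of the required mesh sizes over the four equations."""
    return min(sizes_per_equation)


def classify(h_current, h_target, refine_ratio=0.5, coarsen_ratio=2.0):
    """Flag one triangle; both ratios are illustrative thresholds only."""
    if h_target < refine_ratio * h_current:
        return "refine"
    if h_target > coarsen_ratio * h_current:
        return "coarsen"
    return "keep"


# (current size, required sizes h(phi^k) for k = 1..4) per triangle, made up
triangles = [
    (0.10, [0.30, 0.04, 0.20, 0.25]),
    (0.10, [0.30, 0.25, 0.40, 0.22]),
    (0.10, [0.12, 0.30, 0.09, 0.15]),
]

flags = [classify(h, target_size(required)) for h, required in triangles]
```

Taking the minimum means that a triangle is refined as soon as any one of the four decoupled equations asks for it, which is the equation-by-equation control described above.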

Fig. 13: Possible coarsening configurations and their associated treatments

After the adaption step itself, a series of mesh optimization operations are performed in order to improve the quality of the grid. The first operation consists in a standard Laplacian smoothing modified in order to reduce the clustering around nodes with degree lower than 6 and the dispersion of nodes around nodes with degree higher than 6. The second operation is an edge swapping procedure which aims at minimizing the number of nodes with a high degree. This increases the number of nodes which have an optimal degree. The final operation consists in setting a minimum value to the degree of the nodes by removing undesirable low degree configurations as shown in fig. 14.

Fig. 14: Three "pathological" low degree node configurations and their associated treatments

For more details about the adaptivity technique we refer the reader to [24].

6.4 Numerical results

The transonic flow around a NACA-0012 airfoil ($M_\infty = 0.85$, $\alpha = 1^\circ$) is computed. The initial mesh is a coarse triangulation with 587 nodes and 1094 elements obtained with the frontal Delaunay method by Müller et al. [29]. The constant C appearing in (53) was chosen equal to 1, the tolerance level TOL was fixed at TOL = 0.10 and the error estimation was performed on all equations. Three adaption steps have been achieved before meeting the stopping criterion. Fig. 15 shows the final mesh as well as the Mach number isolines of the corresponding solution. The final mesh (fig. 15a) indicates clearly that all features of the flow, i.e. the stagnation zone and expansions near the leading edge, the two normal shocks and the slip line emanating from the trailing edge have been detected.

(a) Final adapted mesh (5032 nodes)
(b) Mach number isolines

Fig. 15: Mesh adaptivity for transonic NACA-0012 ($M_\infty = 0.85$, $\alpha = 1^\circ$). Scalar shock-capturing SUPG scheme associated with the hyperbolic/elliptic splitting, TOL = 0.10

ACKNOWLEDGMENTS

The second and third authors are supported by a fellowship of the Belgian Fund F.R.I.A. Part of the research was supported by the CE through Brite/EuRam contract AERO-CT-0040 and the ESA MSTP program.

REFERENCES

[1] van Leer, B.; Lee, W.-T.; Roe, P.: Characteristic Time-Stepping or Local Preconditioning of the Euler equations. 1991, AIAA paper 91-1552-CP.

[2] Abgrall, R.: Approximation of the Multidimensional Riemann Problem in Compressible Fluid Mechanics by a Roe Type Method. SIAM Journal of Numerical Analysis, 1994, Submitted for publication.

[3] Jameson, A.; Baker, T.; Weatherill, N.: Calculation of Inviscid Transonic Flow over a Complete Aircraft. 1986, AIAA-86-0103.
[4] Ni, R.-H.: A Multiple Grid Scheme for Solving the Euler Equations. AIAA Journal, Vol. 20, 1981, pp 1565-1571.

[5] Hall, M.: Cell-vertex multigrid schemes for solution of the Euler Equations. Numerical Methods for Fluid Dynamics, II (K.W. Morton and M.J. Baines, eds.). Oxford University Press, 1986.

[6] Morton, K.; Paisley, M.: On the cell-centre and cell-vertex approaches to the steady Euler equations and the use of shock fitting. Lecture Notes in Physics, Vol. 264. Springer-Verlag, 1986.

[7] Roe, P.: Linear advection schemes on triangular meshes. Cranfield Institute of Technology report, November 1987. CoA 8720.

[8] Roe, P.: "Optimum" upwind advection on a triangular mesh. ICASE Report 90-75, 1990.

[9] Struijs, R.; Deconinck, H.; Roe, P.: Fluctuation Splitting Schemes for the 2D Euler Equations. VKI LS 1991-01, Computational Fluid Dynamics.

[10] Deconinck, H.; Struijs, R.; Bourgois, G.; Roe, P.: Compact advection schemes on unstructured grids. VKI LS 1993-04, Computational Fluid Dynamics, 1993.

[11] Sidilkover, D.; Roe, P.: Unification of Some Advection Schemes in Two Dimensions. ICASE Report 95-10, 1995.

[12] Deconinck, H.; Struijs, R.; Bourgois, G.; Roe, P. L.: Compact advection schemes on unstructured grids. Proc. of the VKI Lecture Series on Computational Fluid Dynamics. VKI LS 1993-04, 1993.

[13] Sidilkover, D.: Multidimensional upwinding and multigrid. Proc. of the 12th AIAA CFD Conference, San Diego, June 19-22, 1995. AIAA paper 95-1759-CP, 1995.

[14] Degrez, G.; Issman, E.: Solving steady compressible flow problems with subspace iteration methods. July 3-7, 1995, 3rd Int. Congress on Industrial and Applied Mathematics, Hamburg.

[15] Schnabel, R. B.; Dennis, J. E.: Numerical methods for unconstrained optimization and non-linear equations. Prentice-Hall Series in Computational Mathematics. Prentice-Hall, 1983.

[16] Saad, Y.; Schultz, M. H.: GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., Vol. 7, No 3, July 1986, pp 856-869.

[17] Issman, E.; Degrez, G.; Deconinck, H.: Implicit iterative methods for a multidimensional upwind Euler/Navier-Stokes solver on unstructured meshes. 1995, AIAA Paper 95-1653, AIAA 12th CFD Conference, San Diego, CA.

[18] Freund, R. W.; Golub, G. H.; Nachtigal, N. M.: Iterative solution of linear systems. Acta Numerica, 1991, pp 57-100.

[19] Issman, E.; Degrez, G.: Convergence acceleration of a 2D Euler/Navier-Stokes solver by Krylov subspace methods. ECCOMAS 94, 1994.

[20] Numerical simulation of compressible Navier-Stokes flows. Bristeau, M.-O., Glowinski, R., Periaux, J., Viviand, H. eds., Proceedings of the GAMM workshop, held at INRIA, Sophia-Antipolis (France), on December 4-6, 1985, Vol. 18 of Notes on Numerical Fluid Mechanics. Vieweg, 1987.

[21] Carette, J.-C.; Deconinck, H.; Paillere, H.; Roe, P. L.: Multidimensional upwinding: Its relation to finite elements. Int. J. of Num. Meth. Fluids, Vol. 20, 1995, pp 935-955.

[22] Hughes, T. J. R.; Mallet, M.: A new finite element formulation for computational fluid dynamics: III. The generalized streamline operator for multidimensional advective-diffusive systems. Comp. Meth. Appl. Mech. Engrg., Vol. 58, 1986, pp 305-328.

[23] Hansbo, P.: Explicit streamline diffusion finite element methods for the compressible Euler equations in conservation variables. J. of Comp. Phys., Vol. 109, 1993, pp 274-288.

[24] Carette, J.-C.; Deconinck, H.: Unstructured mesh adaptivity for SUPG formulations based on residual decomposition of the Euler equations. Proc. of the VKI Lecture Series on Computational Fluid Dynamics. VKI LS 1995-02, 1995.

[25] Carette, J.-C.; Deconinck, H.: A posteriori finite element error estimation for the Euler equations based on multidimensional residual decomposition. Venice, Proc. 9th Int. Conf. Finite Elements in Fluids, 1995.

[26] Johnson, C.: Adaptive finite element methods for diffusion and convection problems. Comp. Meth. Appl. Mech. Engrg., Vol. 82, 1990, pp 301-323.

[27] Eriksson, K. E.; Johnson, C.: Adaptive streamline diffusion finite element methods for stationary convection-diffusion problems. Math. Comp., Vol. 60, 1993, pp 167-188.

[28] Richter, R.: Schémas de capture de discontinuités en maillage non-structuré avec adaptation dynamique. Applications aux écoulements de l'aérodynamique. PhD thesis, IMHEF, Ecole Polytechnique Fédérale de Lausanne, Switzerland, 1993.

[29] Müller, J.-D.; Roe, P. L.; Deconinck, H.: Delaunay-based triangulations for the Navier-Stokes equations with minimum user input. Lecture Notes in Physics, Vol. 414. Springer, 1992.

Multidimensional Upwind Dissipation for 2D/3D Euler/Navier-Stokes Applications

P. Van Ransbeeck and Ch. Hirsch
Department of Fluid Mechanics, Vrije Universiteit Brussel,
Pleinlaan 2, 1050 Brussels, Belgium

1. SUMMARY
Genuinely multidimensional upwind dissipation models are developed for the 2D/3D Euler/Navier-Stokes equations using a cell-centered finite-volume approach on structured grids. The numerical flux is formulated using the artificial dissipation concept. An overview is given for 2D/3D compact upwind dissipation for stencils up to respectively 6 and 8 points. A classification is set up for first and second order accurate schemes that have respectively minimum and zero cross diffusion. Second order monotone schemes are developed using the concept of non-linear limiter functions applied on multidimensional ratios of flux differences. A classification is presented for different families of 2D ratios. 3D multidimensional limiters based on 3D ratios of flux differences are introduced. The scalar dissipation models are extended and applied to the Euler/Navier-Stokes equations based on a characteristic decomposition of the inviscid operator. The resulting characteristic compatibility equations, consisting of convective and source terms, depend on a set of 3 propagation directions. An overview is given for different choices of directions. The multidimensional discretisation is considered for both the convective and source terms along its associated advective speed.

2. INTRODUCTION
In the last ten years extensive research has been ongoing towards the development of genuinely multidimensional upwind schemes. The main motivation is to reduce the mesh dependency appearing in classical dimensional-split schemes and as a result to capture the physics more accurately. Two main approaches are found in literature: the fluctuation splitting schemes and the finite volume schemes, for a review see ref. 19, 23, 30.
The fluctuation splitting schemes consist of an upwind distribution of a fluctuation (residual) over the nodes of a triangular or tetrahedral cell. In the finite volume methods the numerical flux is determined using multidimensional extrapolation. Application of both methods to the Euler/Navier-Stokes equations consists of two basic elements: (1) a suitable wave modelling or characteristic decomposition of the inviscid operator and, (2) a scalar convection scheme.
The concept of artificial dissipation associated to central schemes became a key element in Euler and Navier-Stokes calculations during the last 15 years. The family of upwind schemes, which can be considered as a rational way of defining dissipation in a numerical algorithm, has led to a matrix dissipation form, as opposed to scalar dissipation.
One of the essential elements of the upwind dissipation is the concept of non-linear limiters, leading to high resolution, 2nd order schemes, satisfying some condition of monotonicity, such as the one-dimensional concept of 'Total Variation Diminishing' (TVD) schemes.
Very recently a more formal approach towards a general formulation of artificial dissipation terms, applicable to structured as well as unstructured meshes, has been developed, based on the concept of Local Extremum Diminishing (LED) schemes, by way of generalised limiters (ref. 14). All these developments however still remain in the dimensional splitting approach.
In this framework, 2D multidimensional upwind schemes have been reformulated as a way of defining dissipation terms, with the requirements of positivity and classical limiter concepts. In contrast to the dimensional-split models, the multi-D dissipation depends on the direction of the convection speed and on variations of the solution or fluxes in different mesh directions. The corresponding numerical flux for a cell face is determined by a multidimensional interpolation inside an upstream triangle. As a result the multi-D dissipation is more compact than the models with a one-dimensional interpolation along the mesh lines. Recently a comparison and unification was performed for the underlying scalar linear and non-linear positive convection schemes for both the finite volume and fluctuation methods (refs. 11, 27, 32).
The idea of multidimensional limiters was first introduced for a 2D scalar convection problem. Different classes of 2D limiters have been classified and applied to the 2D Euler/Navier-Stokes equations. In the present paper an overview is given for compact 2D convection schemes for stencils up to 6 points. Different classes of 2D ratios are determined by the choice of i) a triangular interpolation domain and ii) variations along meshlines or diagonals. The analysis is extended for 3D convection schemes as basis for the development of dissipation models including 3D limiters and ratios. A classification is given concerning first and second order schemes with respectively minimum and zero cross diffusion for stencils up to 8 points.
The scalar dissipation models are extended to the Euler/Navier-Stokes equations based on a characteristic decomposition of the inviscid operator. The resulting characteristic compatibility equations represent the convection of an entropy, a shear and 2 acoustic waves. They consist of convective and source terms that depend on a set of 3 propagation directions. An overview is given of different strategies concerning the choice of the directions. The resulting equations are discretized using the scalar dissipation models. The multi-D dissipation models are considered for both the convective and source terms based on its associated advective speed.

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

3. DIMENSIONAL-SPLIT UPWIND DISSIPATION

3.1 Numerical flux formulation
Consider a 3D scalar hyperbolic conservation law based on the fluxes $\vec{F}=(f,g,h)$,

$$\frac{\partial u}{\partial t} + \vec{\nabla}\cdot\vec{F} = \frac{\partial u}{\partial t} + \vec{a}\cdot\vec{\nabla}u = 0 \tag{1}$$

with $\vec{a}=(a,b,c)=(\partial f/\partial u,\ \partial g/\partial u,\ \partial h/\partial u)$ the convection speed. A cell-centered conservative finite-volume semi-discretization of (1) on a Cartesian mesh yields,

[...]    (2)

where e.g. the numerical flux on cell face i+1/2,j,k is expressed by

$$f^*_{i+1/2,j,k} = \tfrac{1}{2}\,(f_{i,j,k} + f_{i+1,j,k}) - d_{i+1/2,j,k} \tag{3}$$

consisting of a central part being the average of the fluxes in the cell-centers left and right of the cell face. The numerical dissipation on cell face i+1/2,j,k is represented by $d_{i+1/2,j,k}$. All classical central and upwind dimensional-split dissipation models can be formulated as a function

$$d_{i+1/2,j,k} = d(\ldots,\ \delta u_{i-1/2,j,k},\ \delta u_{i+1/2,j,k},\ \delta u_{i+3/2,j,k},\ \ldots) \tag{4}$$

depending on 1D consecutive differences of the solution along the corresponding mesh line with e.g. $\delta u_{i+1/2,j,k} = u_{i+1,j,k} - u_{i,j,k}$. For example consider the 1st and 2nd order upwind Flux Difference Splitting schemes (FDS1, FDS2) where the dissipation (4) is specified by, for a > 0,

[...]    (5)

The first term in (5) is a diffusive contribution and corresponds to first order upwinding (FDS1, L=0). Function L represents an antidiffusive term that introduces higher order accuracy,

[...]    (6a)

with flux limiter $\Phi$ depending on a 1D ratio based on the sign of a as shown in figure 1,

[...]    (6b)

Figure 1 Second order dimensional-split upwind dissipation (a>0).

3.2 Monotonicity condition
To prevent oscillations, L is limited by use of a non-linear limiter function $\Phi$, in order to fulfil monotonicity conditions. It assures that local maxima can not increase and local minima can not decrease. The approach in ref. 28 is used and is recently defined as Local Extremum Diminishing (LED) condition (ref. 14). Rewriting the residual of (2),

[...]    (7)

the positivity condition is defined by

[...]    (8)

For scheme (5)-(6) this is fulfilled if the flux limiter $\Phi$ satisfies

[...]    (9)

Conditions (9) are valid for all classical TVD limiters. In 1D the monotonicity concept is equivalent with the Total Variation Diminishing (TVD) approach.

4. MULTIDIMENSIONAL UPWIND DISSIPATION

4.1 2D Scalar Upwind Dissipation
In the following an overview is given for compact 2D scalar upwind schemes, including linear and non-linear classes having first and second order monotone schemes. This study is based on a theoretical analysis of 2D linear convection schemes of which the basic elements are in ref. 6. It is a generalisation of the analysis of ref. 24 where only first order optimum schemes are considered. This general study is set up for cell-centered molecules with a finite volume and structured approach. It is based on a general 9-point stencil that is derived in Cartesian and streamline coordinates. Conditions concerning second order accuracy, cross diffusion, monotonicity and relations between some of these are investigated.
In contrast to the classical dimensional-split dissipation, the multidimensional upwind dissipation is based on variations of the solution in different mesh directions and on the total convection speed (a,b). The domain of dependence of the multi-D upwind dissipation models is taken in an upstream direction to the convection speed.
In the following, the linear form of (1) is considered on a uniform mesh with mesh spacing $\Delta x=\Delta y=1$. The fluxes are f=au and g=bu with constant convection speeds a,b>0. A general form of the multi-D upwind dissipation on face i+1/2,j can be written as

$$d_{i+1/2,j} = \tfrac{1}{2}\,a\,\delta u_{i+1/2,j} + L(\delta u_x, \delta u_y) \tag{10}$$

The first term is the classical first order upwind dissipation from equation (5). The second term L represents the multi-D dissipation that is a function of differences of the solution in both mesh directions.
Two options have been investigated for the choice $(\delta u_x,\delta u_y)$ in (10) leading to compact 2D upwind schemes, and are illustrated in figure 2. Both variations in x- and y-direction determine a triangular domain of dependence for the multi-D artificial dissipation.
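As a baseline for the multi-D models, the dimensional-split dissipation of section 3 can be sketched in 1D. The following is a minimal sketch, not the authors' code: it assumes a > 0, a periodic mesh with unit spacing, the minmod limiter, the ratio r = δu(i+1/2)/δu(i-1/2) and an antidiffusive term -½ a Φ(r) δu(i-1/2). The exact forms of (5)-(6) are not legible in the source, so these choices and all function names are assumptions for illustration.

```python
def minmod(r):
    """Minmod limiter, Phi(r) = max(0, min(1, r))."""
    return max(0.0, min(1.0, r))


def dissipation(u, i, a, limiter=None):
    """Dissipation d_{i+1/2} for a > 0 on a periodic mesh with unit spacing.

    limiter=None gives first order upwinding (FDS1); passing a limiter adds
    an assumed antidiffusive correction (FDS2-like).
    """
    n = len(u)
    du_plus = u[(i + 1) % n] - u[i]        # delta u_{i+1/2}
    du_minus = u[i] - u[(i - 1) % n]       # delta u_{i-1/2}
    d = 0.5 * a * du_plus                  # diffusive part (first order upwind)
    if limiter is not None and du_minus != 0.0:
        r = du_plus / du_minus             # assumed 1D ratio for a > 0
        d -= 0.5 * a * limiter(r) * du_minus   # assumed antidiffusive term
    return d


def numerical_flux(u, i, a, limiter=None):
    """Central average of f = a*u minus the dissipation on face i+1/2, as in (3)."""
    n = len(u)
    return 0.5 * a * (u[i] + u[(i + 1) % n]) - dissipation(u, i, a, limiter)


def step(u, a, dt, limiter=None):
    """One forward Euler step of du/dt + f*_{i+1/2} - f*_{i-1/2} = 0."""
    n = len(u)
    flux = [numerical_flux(u, i, a, limiter) for i in range(n)]
    return [u[i] - dt * (flux[i] - flux[i - 1]) for i in range(n)]


def total_variation(u):
    """Total variation of a periodic grid function."""
    n = len(u)
    return sum(abs(u[(i + 1) % n] - u[i]) for i in range(n))
```

With `limiter=None` and a time step equal to the mesh spacing divided by a, one step shifts the data by exactly one cell. With `minmod` and a CFL number of 0.5 the total variation does not grow, which is the 1D monotonicity property referred to in section 3.2.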

4.1.1 Linear 6-point upwind schemes
For the linear subclass of (10) the numerical flux is determined by a linear interpolation in the corresponding triangles of figure 2, e.g. for configuration (I),

$$f^*_{i+1/2,j} = a\,u_{i,j} - \alpha\,\delta u_{i,j-1/2} + \beta\,\delta u_{i+1/2,j-1} \tag{11}$$

Figure 2 Interpolation domain I and II for the scalar multi-D upwind dissipation (a,b>0): (a) config. (I), (b) config. (II)

Using the definition of the numerical flux (3), the multi-D upwind dissipation model is determined from (11),

$$d_{i+1/2,j} = \tfrac{1}{2}\,a\,\delta u_{i+1/2,j} + \alpha\,\delta u_{i,j-1/2} - \beta\,\delta u_{i+1/2,j-1} \tag{12}$$

with positive interpolation coefficients $\alpha$ and $\beta$ depending on a and b. Similar formulas are valid for the fluxes on cell faces i,j+1/2 introducing analogous coefficients $\delta$ and $\gamma$.
It is interesting to notice the sign of the multi-D contributions in (12). The term based on coefficient $\beta$, defined in the same direction as the 1st order term, reduces the dissipation as for classical higher order schemes, while the term depending on $\alpha$ in the other mesh direction increases the dissipation (12). This addition of dissipation is not a loss of accuracy; on the contrary, it reduces the diffusion in the cross flow direction as shown in ref. 10.
Writing out the residual, the resulting 6-point families are determined by 3 parameters $A=\alpha+\delta$, $\beta$ and $\gamma$. Figure 3 shows for both configurations the interpolation triangles for the four cell faces.

Figure 3 Six-point linear schemes, config. I and II

Configuration (I) (figure 3a) consists of only 3 triangles because the interpolation domains for cell faces i-1/2,j and i,j-1/2 are identical. Configuration (II) in figure 3b consists of 4 triangles where the continuously shaded areas refer to cell faces i+1/2,j.
Both families of schemes (figure 3) have subclasses of 2nd order accurate schemes, illustrated in figure 4. The subclass of zero cross diffusion schemes, or second order accurate schemes for the homogeneous convection equation, is a two parameter family for both options I and II. A comparative study between the fluctuation splitting unstructured multidimensional upwind schemes of Deconinck and co-workers, ref. 3, and the present dissipation models is performed in ref. 32. It shows that the Low Diffusion schemes A and B (LDA and LDB) are 5-point zero cross diffusion schemes of the 6-point family of configuration (II) (figure 3b).
The 5-point continuous interpolation scheme (config. I) mentioned in figure 4 is investigated in ref. 10, 15, 33 and is based on a continuous interpolation for the numerical flux inside the polygon formed by the 6 surrounding cell centers of the cell face.
A more severe constraint is the condition for general second order accuracy, in the classical sense, defining a unique member of the class of compact zero cross diffusion second order schemes for the non-homogeneous convection equation. Notice that for configuration I the scheme is an upwind scheme while for configuration II the scheme is the classical central scheme.

4.1.2 Linear 4-point upwind schemes
Both 6-point families have a subclass of 4-point stencils in common with the choice of $\beta=\gamma=0$ in (12) and figure 3, yielding the numerical dissipation for a,b>0

$$d_{i+1/2,j} = \tfrac{1}{2}\,a\,\delta u_{i+1/2,j} + L(\delta u_{i,j-1/2}) \tag{13}$$

with $L = \alpha\,\delta u_{i,j-1/2}$.

The resulting 4-point schemes are actually a one-parameter ($A=\alpha+\delta$) family of schemes, although the parameters $\alpha$ and $\delta$ can be chosen independently. The general 4-point scheme is split in a central part and dissipation term:



$$\mathrm{Res}_{i,j} = \tfrac{1}{2}\,(a\,\delta_x + b\,\delta_y)\,u_{i,j} - D_{i,j} \tag{14}$$

where the upwind dissipation is formulated as

$$D_{i,j} = \tfrac{1}{2}\,(a\,\delta_x^+\delta_x^- + b\,\delta_y^+\delta_y^- + 2A\,\delta_x^-\delta_y^-)\,u_{i,j} \tag{15}$$

with $\delta$, $\delta^+$ and $\delta^-$ denoting respectively central, forward and backward differences (on the unit mesh, $\delta_x u_{i,j} = u_{i+1,j} - u_{i-1,j}$). The first 2 terms in (15) represent the 1st order dimensional-split upwind dissipation. The additional 2nd difference term represents the multidimensional dissipation. The parameter A determines the amount of multidimensional upwind dissipation.

Several interesting schemes are recovered by choosing a particular value of A as illustrated in figure 5. Concerning the subclass of 4-point monotone first order schemes, the lower limit (A=0) corresponds to the 1st order classical upwind scheme that has maximum cross diffusion (FDS1). The upper limit (A=min(a,b)) corresponds to the minimum cross diffusion scheme. This scheme has been investigated in different formulations, ref. 2, 7, 20, 24, 25. The 4-point family has a unique non-monotone zero cross diffusion scheme being second order accurate for the homogeneous convection equation, ref. 6, 25. Since this scheme is a linear second order scheme it can not be monotone and shows oscillations near discontinuities.
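The role of the parameter A can be made concrete by writing the 4-point residual (14)-(15) as a stencil. The sketch below assumes a, b > 0 and unit mesh spacing, in which case (14)-(15) reduce to Res = a δx⁻u + b δy⁻u - A δx⁻δy⁻u. The cross diffusion expression in `cross_diffusion` is not quoted from the paper; it comes from a short Taylor expansion of this residual and is an assumption of the sketch, as are the function names.

```python
def stencil_coefficients(a, b, A):
    """Coefficients of the 4-point residual, keyed by the offset (di, dj).

    Res_ij = sum over offsets of coefficient * u[i + di][j + dj].
    """
    return {
        (0, 0): a + b - A,
        (-1, 0): -(a - A),
        (0, -1): -(b - A),
        (-1, -1): -A,
    }


def is_positive(a, b, A, tol=1e-12):
    """Positivity (monotonicity): every off-centre coefficient is <= 0."""
    coeffs = stencil_coefficients(a, b, A)
    return all(v <= tol for offset, v in coeffs.items() if offset != (0, 0))


def cross_diffusion(a, b, A):
    """Leading-order diffusion normal to the streamline (assumed derivation).

    The modified equation has diffusion tensor [[a/2, A/2], [A/2, b/2]];
    projecting it on the unit normal (-b, a)/|(a, b)| gives this value.
    """
    return a * b * (a + b - 2.0 * A) / (2.0 * (a * a + b * b))
```

For a = 2, b = 1 this gives a positive stencil for A = 0 and A = min(a,b) = 1, with the cross diffusion dropping from 0.6 to 0.2. The value A = (a+b)/2 removes the cross diffusion in this estimate but leaves a positive off-centre coefficient, in line with the statement that the zero cross diffusion member of the 4-point family is not monotone.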

Figure 4 Overview 2nd order 6-point upwind schemes (2D linear 6-point upwind schemes with parameters A, β, γ; 2nd order for homogeneous convection: 2-parameter class with zero cross diffusion; config. I: continuous interpolation, e.g. 5-point linearity preserving schemes; config. II: LDA, LDB)

Figure 5 2D 4-point upwind schemes (4-point upwind schemes with parameter A; monotone (1st order) for 0 ≤ A ≤ min(a,b), from dimensional-split to minimum cross diffusion; 2nd order: zero cross diffusion)
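The difference between the two limits of the monotone 4-point family can be illustrated on a steady oblique step. The sketch below is an illustration under stated assumptions, not a test case from the paper: a, b > 0, unit mesh, inflow value 1 on the left boundary and 0 on the bottom boundary (0.5 in the corner cell). The steady state Res = 0 is obtained by a single upwind sweep. The function names are hypothetical.

```python
def steady_sweep(a, b, A, n):
    """Solve Res_ij = 0 cell by cell for the 4-point scheme with a, b > 0.

    u[i][j] depends only on upwind neighbours, so one sweep in increasing
    i and j gives the steady solution for the given inflow data.
    """
    u = [[0.0] * n for _ in range(n)]
    for j in range(n):
        u[0][j] = 1.0                      # inflow on the left boundary
    for i in range(n):
        u[i][0] = 0.0                      # inflow on the bottom boundary
    u[0][0] = 0.5                          # corner cell sits on the step
    for i in range(1, n):
        for j in range(1, n):
            u[i][j] = ((a - A) * u[i - 1][j]
                       + (b - A) * u[i][j - 1]
                       + A * u[i - 1][j - 1]) / (a + b - A)
    return u


def smeared_cells(u, tol=1e-12):
    """Number of cells whose value lies strictly between the two states."""
    return sum(1 for row in u for x in row if tol < x < 1.0 - tol)
```

For a = b = 1 on an 8x8 mesh, A = min(a,b) transports the step along the mesh diagonal without smearing: only the 8 diagonal cells carry the intermediate corner value. With A = 0 (dimensional-split upwinding) all 49 interior cells plus the corner cell take intermediate values.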



For more details about the 2D convection schemes and dissipation including 4-point, 5-point and 6-point schemes, see e.g. ref. 1.

4.1.3 2D Multidimensional Limiters
The idea of multidimensional limiters was first introduced by Sidilkover, ref. 25, in the framework of a scalar convection problem. Hirsch and Van Ransbeeck considered various multidimensional dissipation formulations based on positivity and classical limiters. To illustrate the methodology the unique 4-point zero cross diffusion scheme (fig. 5) is considered below.
The definition of multidimensional limiters follows the 1D methodology. Starting with a 1st order monotone scheme, limited antidiffusive terms are added. One of the main differences with the dimensional-split limiters is that the minimum cross diffusion scheme from fig. 5 is selected as 1st order scheme, having a higher accuracy than the classical 1st order scheme, e.g. ref. 31. The 2nd order dissipation is rewritten as the first order dissipation plus an anti-diffusive limited correction term,

$$d^{(2)}_{i+1/2,j} = d^{(1)}_{i+1/2,j} + L(\delta u_x, \delta u_y) \tag{16}$$

with

$$L_{i+1/2,j} = \Delta\alpha\,\delta u_{i,j-1/2}\,\Phi(r_{i+1/2,j}), \qquad \Delta\alpha = \alpha^{(2)} - \alpha^{(1)} = \tfrac{1}{2}\,(b - \min(a,b))$$

where $\Delta\alpha$ represents the difference in interpolation coefficient between the 2nd and 1st order scheme. Near discontinuities the limiter is switched off ($\Phi=0$) and the 1st order multi-D dissipation is applied. In smooth regions $\Phi=1$ and then the linear 2nd order scheme is applied.
The definition of the multi-D ratio in (16) and the corresponding variations $\delta u_x$ and $\delta u_y$ are related to the choice of triangular interpolation domains I or II from figure 2. The following definition of a general 2D ratio is used,

[...]    (17)

and is related to the choice of a triangular interpolation. Two triangle configurations are shown in figure 6. For each configuration two options are considered when fixing $\delta u_y$, with the numerator of (17) taken as a variation along the x-direction or a variation along the diagonal.

Figure 6 Four classes of 2D multi-D ratios (panels (a)-(d): classes Ia, Ib, IIa, IIb)

In ref. 10 monotonicity conditions on the coefficients in (17) are derived for the 4 classes. All 2D ratios found in the literature fit in one of these classes, e.g. ref. 25, 26, 27, 32.

4.2 3D Scalar Upwind Dissipation
In the following a brief overview is given of a theoretical analysis of 3D linear convection schemes of which the basic elements are in ref. 32 and which will be discussed in more detail elsewhere. This study is based on the extension of the 2D analysis discussed in section 4.1. A general form of the multi-D upwind dissipation on face i+1/2,j,k, similar to (10), can be written as

$$d_{i+1/2,j,k} = \tfrac{1}{2}\,a\,\delta u_{i+1/2,j,k} + L(\delta u_x, \delta u_y, \delta u_z) \tag{18}$$

The first term is the classical first order upwind dissipation from equation (5). The second term L represents the multi-D dissipation that is a function of differences of the solution in the three mesh directions.

4.2.1 Linear 8-point upwind schemes
In the following, the linear form of (1) is considered on a uniform mesh with mesh spacing $\Delta x=\Delta y=\Delta z=1$. The fluxes are f=au, g=bu and h=cu with constant convection speeds a,b,c>0. The 8-point molecules, for a,b,c>0, are defined by the following extrapolation formula on e.g. face i+1/2,j,k,

$$f^*_{i+1/2,j,k} = a\,u_0 - \alpha_x\,(u_0 - u_2) - \beta_x\,(u_0 - u_4) - \gamma_x\,(u_2 - u_5) \tag{19}$$

referring to figure 7 for the overall configuration of the scheme. Using the definition of the numerical flux (3), the multi-D upwind dissipation model (18) is determined from (19),

$$d_{i+1/2,j,k} = \tfrac{1}{2}\,a\,\delta u_{i+1/2,j,k} + L(\delta u_x, \delta u_y, \delta u_z) \tag{20}$$

with

$$L = \alpha_x\,\delta u_{i,j-1/2,k} + \beta_x\,\delta u_{i,j,k-1/2} + \gamma_x\,\delta u_{i,j-1,k-1/2}$$

with interpolation coefficients $\alpha_x$, $\beta_x$ and $\gamma_x$ depending on a, b and c. Similar formulas are valid for the fluxes on cell faces i,j+1/2,k and i,j,k+1/2 using respectively the sets $(\alpha_y,\beta_y,\gamma_y)$ and $(\alpha_z,\beta_z,\gamma_z)$.
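For the 3D 8-point family, the positivity bounds listed in the paper's classification (figure 8) can be checked mechanically against the residual form (22)-(23). The sketch below assumes a, b, c > 0 and unit mesh spacing, so that the residual reduces to backward differences combined with the four parameters A, B, C, D. Operators are stored as offset-to-coefficient dictionaries. All names are hypothetical and the check is an illustration, not the authors' derivation.

```python
from itertools import product

ZERO = (0, 0, 0)
DX = {ZERO: 1.0, (-1, 0, 0): -1.0}     # backward difference in x
DY = {ZERO: 1.0, (0, -1, 0): -1.0}     # backward difference in y
DZ = {ZERO: 1.0, (0, 0, -1): -1.0}     # backward difference in z


def multiply(p, q):
    """Compose two difference operators stored as {offset: coefficient}."""
    out = {}
    for (o1, c1), (o2, c2) in product(p.items(), q.items()):
        offset = tuple(x + y for x, y in zip(o1, o2))
        out[offset] = out.get(offset, 0.0) + c1 * c2
    return out


def combine(terms):
    """Linear combination of operators given as (weight, operator) pairs."""
    out = {}
    for weight, op in terms:
        for offset, coeff in op.items():
            out[offset] = out.get(offset, 0.0) + weight * coeff
    return out


def residual_stencil(a, b, c, A, B, C, D):
    """8-point residual: first order upwind part minus mixed differences."""
    return combine([
        (a, DX), (b, DY), (c, DZ),
        (-A, multiply(DX, DY)),
        (-B, multiply(DY, DZ)),
        (-C, multiply(DX, DZ)),
        (D, multiply(multiply(DX, DY), DZ)),
    ])


def is_positive(stencil, tol=1e-12):
    """Positivity: every off-centre coefficient is <= 0."""
    return all(v <= tol for offset, v in stencil.items() if offset != ZERO)


def within_bounds(a, b, c, A, B, C, D):
    """Monotonicity bounds as listed in the classification of figure 8."""
    return (0 <= A <= min(a, b) and 0 <= B <= min(b, c) and 0 <= C <= min(a, c)
            and max(0, A + B - b, A + C - a, B + C - c) <= D <= min(A, B, C))
```

On a grid of parameter values the two predicates agree, which suggests that the listed bounds are exactly the positivity conditions of this stencil. For a ≥ b ≥ c the upper limit A = min(a,b), B = min(b,c), C = min(a,c), D = min(a,b,c) leaves only three upwind neighbours with non-zero weight: (i-1,j,k), (i-1,j-1,k) and (i-1,j-1,k-1).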



Figure 7 3D upwind schemes for 8-point stencils

It is important to observe that the resulting molecule actually represents a four-parameter family of schemes when the following set of parameters is chosen:

$$A = \alpha_x + \alpha_y + \gamma_y, \quad B = \beta_y + \beta_z, \quad C = \beta_x + \gamma_x + \alpha_z + \gamma_z, \quad D = \gamma_x + \gamma_y + \gamma_z \tag{21}$$

Based on (19)-(21) the following general 8-point scheme is recovered, which is split in a central part and dissipation term:

$$\mathrm{Res}_{i,j,k} = \tfrac{1}{2}\,(a\,\delta_x + b\,\delta_y + c\,\delta_z)\,u_{i,j,k} - D_{i,j,k} \tag{22}$$

where the upwind dissipation is formulated as

$$D_{i,j,k} = \tfrac{1}{2}\,(a\,\delta_x^+\delta_x^- + b\,\delta_y^+\delta_y^- + c\,\delta_z^+\delta_z^-)\,u_{i,j,k} + (A\,\delta_x^-\delta_y^- + B\,\delta_y^-\delta_z^- + C\,\delta_x^-\delta_z^- - D\,\delta_x^-\delta_y^-\delta_z^-)\,u_{i,j,k} \tag{23}$$

The first 3 terms in (23) represent the 1st order upwind dissipation. The additional terms represent the multidimensional dissipation, which consists of mixed 2nd and 3rd differences. The parameters A, B, C and D determine the amount of multidimensional upwind dissipation. Choosing a specific scheme corresponds to fixing the 4 parameters A, B, C and D. Since the cell face values are determined by 9 interpolation coefficients, every scheme has 5 degrees of freedom in choosing the interpolation coefficients in (21).

4.2.2 First order monotone schemes
Figure 8 shows a classification for the 8-point family of upwind schemes including monotonicity and zero cross diffusion conditions. The lower limit of the monotonicity condition corresponds with the classical first order upwind scheme with a maximum amount of cross diffusion. The upper limit corresponds to the minimum cross diffusion scheme, also identified in the literature. In the 2D case (e.g. c=0) this scheme reduces to the 2D minimum cross diffusion scheme of fig. 5 investigated before in e.g. ref. 6, 24, 25. Different interpolation strategies are investigated in ref. 32.

Figure 8 Classification of 3D upwind schemes for 8-point stencils
monotone (1st order): 0 ≤ A ≤ min(a,b), 0 ≤ B ≤ min(b,c), 0 ≤ C ≤ min(a,c), max(0, A+B-b, A+C-a, B+C-c) ≤ D ≤ min(A,B,C); upper limit (minimum cross diffusion): [...], C = min(a,c), D = min(a,b,c)
2nd order (zero cross diffusion): bilinear interpolation: D = (a²b² + b²c² + a²c²)/(4abc); related to min. cross diffusion: D = min(a,b,c)

4.2.3 Subclass of zero cross diffusion schemes
Evaluating the zero cross diffusion condition in fig. 8 one can notice that there is no condition on parameter D. As a result there is a one-parameter subclass of zero cross diffusion schemes. Different interpolation strategies in (20) lead to different values of D. The arithmetic average procedure corresponds with the scheme used by Roe and Sidilkover as starting point in their theoretical analysis for first order

optimum linear schemes in two and three dimensions in ref. 24. In the present approach we choose the scheme that corresponds with the value of D which is identical with the value for the first order minimum cross diffusion scheme: D=min(a,b,c).

4.2.4 3D Multidimensional Limiters
In this section the 2D multidimensional limiters discussed before are extended for the 3D upwind schemes. The 2nd order zero cross diffusion scheme related to D=min(a,b,c) (fig. 8) is rewritten as the first order minimum cross diffusion scheme plus an anti-diffusive limited correction term,

[...]    (24)

with

$$\Delta\alpha_x = \alpha_x^{(2)} - \alpha_x^{(1)} = \tfrac{1}{2}\,(b - \min(a,b)), \qquad \Delta\beta_x = \beta_x^{(2)} - \beta_x^{(1)} = \tfrac{1}{2}\,(c - \min(a,c))$$

Remark that only one limiter is applied in (24). An alternative possibility would be to add a different limiter/ratio to each component of the correction term. Near discontinuities the limiter is switched off ($\Phi=0$) and the 1st order multi-D dissipation is applied. In smooth regions $\Phi=1$ and then the linear 2nd order scheme is applied.
Notice that this definition of L is not the same as for the linear multi-D models (20) because the reference dissipation has been changed to the minimum cross diffusion scheme instead of the classical 1st order upwind scheme. The definition of the 3D ratio is based on the variations $\delta u_y$ and $\delta u_z$ in the correction term of (24) and some extra variation in the third direction. Thus for face i+1/2,j,k a variation $\delta u_x$ is introduced in the 3D ratio,

[...]    (25)

Equation (25) has the same form as the definition of a 3D ratio in the formulation of a new fluctuation splitting scheme in ref. 26. Different possibilities can be considered for $\delta u_x$, as shown in ref. 12 where three different classes of 3D ratios are defined. Each definition corresponds with the variations in a tetrahedron constructed by the three variations along the x-, y- and z-axis. For more details concerning monotonicity conditions see ref. 12.

5. EXTENSION FOR THE EULER/NS EQUATIONS
The conservative form of the 3D Navier-Stokes equations is written as:

[...]    (26)

with conservative variables $U = (\rho, \rho u, \rho v, \rho w, \rho E)^T$, the inviscid fluxes (F,G,H) and the viscous fluxes (Fv,Gv,Hv). The latter are approximated by a central discretization and will not be considered below. Application of the multidimensional upwind dissipation models from section 4 to the inviscid fluxes consists of 3 consecutive steps:
1) Characteristic decomposition of the Euler system
2) Multi-D discretisation of the characteristic equations
3) Re-transformation to conservative numerical flux.

5.1. Characteristic Decomposition

5.1.1 Characteristic variables / compatibility equations
The 2D Euler equations are expressed by

$$\frac{\partial U}{\partial t} + \vec{\nabla}\cdot\vec{F} = \frac{\partial U}{\partial t} + \vec{A}\cdot\vec{\nabla}U = 0 \tag{27}$$

where $\vec{A}=(A,B)$ are the jacobian matrices. The eigenvalues of the matrix $K = \vec{A}\cdot\vec{\kappa}$ associated to an arbitrary unit propagation direction $\vec{\kappa}$ define for a large part the behaviour of the solutions to the Euler equations. Wave-like solutions exist if the eigenvalues of K are real and the corresponding eigenvectors linearly independent (ref. 5). The latter define a similarity transformation which diagonalizes matrix K,

$$P^{-1}\,(\vec{A}\cdot\vec{\kappa})\,P = \Lambda \tag{28}$$

with the left eigenvectors being the rows of $P^{-1}$, the right eigenvectors being the columns of P and the diagonal matrix $\Lambda$ consisting of the eigenvalues,

$$\lambda^{(1)}=\lambda^{(2)}=\vec{v}\cdot\vec{\kappa}, \qquad \lambda^{(3)}=\vec{v}\cdot\vec{\kappa}+c, \qquad \lambda^{(4)}=\vec{v}\cdot\vec{\kappa}-c \tag{29}$$

Using the left eigenvectors, a set of characteristic variables can be constructed,

$$\delta W = P^{-1}\,\delta U \quad\text{or}\quad \delta U = P\,\delta W = \sum_{k=1}^{4} \delta w^{(k)}\,r^{(k)} \tag{30}$$

or

$$\delta w^{(1)} = \delta\rho - \delta p/c^2, \qquad [\ldots], \qquad \delta w^{(4)} = -\vec{\kappa}_3\cdot\delta\vec{v} + \delta p/\rho c \tag{31}$$

with $\mu$ being a free parameter. Eq. (31) is not the only possible definition of characteristic variables (ref. 5), but the above choice is well suited to our purpose and is based on 3 arbitrary propagation directions,

$$\vec{\kappa}_i = (\kappa_{ix},\kappa_{iy}) = (\cos\theta_i,\sin\theta_i), \qquad \vec{l}_i = (\kappa_{iy},-\kappa_{ix}) \quad\text{for } i=1,2,3 \tag{32}$$

In order to identify appropriate wave decompositions, the characteristic variables are defined by different propagation directions: $w^{(1)}$, $w^{(2)}$ are related to $\vec{\kappa}_1$ and $w^{(3)}$ and $w^{(4)}$ are related to respectively $\vec{\kappa}_2$, $\vec{\kappa}_3$. Multiplying eq. (27) by the matrix $P^{-1}$ and introducing the characteristic variables (30), (31) leads to the characteristic compatibility equations:

$$\frac{\partial W}{\partial t} + P^{-1}AP\,\frac{\partial W}{\partial x} + P^{-1}BP\,\frac{\partial W}{\partial y} = 0 \tag{33}$$

or after working out (33) explicitly,

[...]    (34)
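The eigenvalue structure (28)-(29) that underlies the first step can be checked numerically for a single propagation direction. The sketch below makes two assumptions that differ from the paper: it works in primitive variables (ρ, u, v, p) instead of conservative variables, which leaves the eigenvalues unchanged, and it uses one direction κ for all waves instead of the three directions of (32). The ratio of specific heats 1.4 and all function names are assumptions for illustration.

```python
import math


def jacobian_K(rho, u, v, p, kx, ky, gamma=1.4):
    """K = A*kx + B*ky for the 2D Euler equations in primitive variables."""
    vk = u * kx + v * ky
    return [
        [vk, rho * kx, rho * ky, 0.0],
        [0.0, vk, 0.0, kx / rho],
        [0.0, 0.0, vk, ky / rho],
        [0.0, gamma * p * kx, gamma * p * ky, vk],
    ]


def matvec(M, x):
    """Matrix times column vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def vecmat(x, M):
    """Row vector times matrix."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


def right_eigenpairs(rho, u, v, p, kx, ky, gamma=1.4):
    """Eigenvalues with right eigenvectors: entropy, shear, two acoustic waves."""
    c = math.sqrt(gamma * p / rho)
    vk = u * kx + v * ky
    return [
        (vk, [1.0, 0.0, 0.0, 0.0]),
        (vk, [0.0, ky, -kx, 0.0]),
        (vk + c, [rho, c * kx, c * ky, rho * c * c]),
        (vk - c, [rho, -c * kx, -c * ky, rho * c * c]),
    ]


def left_eigenpairs(rho, u, v, p, kx, ky, gamma=1.4):
    """Left eigenvectors matching dw = d(rho) - dp/c^2 and dw = -k.dv + dp/(rho c)."""
    c = math.sqrt(gamma * p / rho)
    vk = u * kx + v * ky
    return [
        (vk, [1.0, 0.0, 0.0, -1.0 / (c * c)]),
        (vk - c, [0.0, -kx, -ky, 1.0 / (rho * c)]),
    ]
```

Multiplying K by each right eigenvector reproduces the eigenvalues v·κ (twice), v·κ+c and v·κ-c of (29). The two left eigenvectors correspond to the legible entries of (31).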

The corresponding 4-wave model consists of one entropy wave, one shear wave and two acoustic waves (ref. 8). The first two terms in each equation of (34) represent the convection of the associated wave in the characteristic direction,

[...]    (35)

The subscript c in (35) refers to the convective part. The third terms in (34) are the coupling or source terms, and their presence results from the fact that the jacobian matrices in (27) are not simultaneously diagonalizable by matrix P. Notice that the coupling terms also show an advective behaviour associated to the directions,

[...]    (36)

that are the normal directions to the propagation directions $\vec{\kappa}_1,\vec{\kappa}_2,\vec{\kappa}_3$. The subscript s refers to the source terms.

5.1.2 Propagation directions
The choice of the propagation directions $\vec{\kappa}_1,\vec{\kappa}_2,\vec{\kappa}_3$ with related normals $\vec{l}_1,\vec{l}_2,\vec{l}_3$ in (34) is still to be defined. A main constraint on these directions comes from the factor $\vec{\kappa}_1\cdot(\vec{\kappa}_2+\vec{\kappa}_3)$ that shows up in the denominator of P. To prevent an ill-defined transformation this factor should be maximized. Other conditions to impose on the design of the propagation directions are the continuity from subsonic to supersonic flow range and robustness. Different possibilities of propagation directions have been examined.

Diagonalisation approach
The source terms in the system of compatibility equations (34) can be eliminated by the following set of propagation directions (ref. 8),

$$\vec{l}_1\cdot\vec{\nabla}p = 0, \qquad \vec{l}_2\cdot(\vec{l}_2\cdot\vec{\nabla})\vec{v} = 0, \qquad \vec{l}_3 = \vec{l}_2 \tag{37}$$

The first direction $\vec{\kappa}_1$ is taken along the pressure gradient while $\vec{\kappa}_2,\vec{\kappa}_3$ are taken equal and defined by the strain rate tensor. The use of this set of directions, depending on gradients of the solution, shows a lack of robustness in Euler calculations (ref. 7).

Combination pressure gradient/velocity
In non-uniform regions $\vec{\kappa}_1,\vec{\kappa}_2,\vec{\kappa}_3$ are taken along the pressure gradient,

$$\vec{l}_1\cdot\vec{\nabla}p = 0, \qquad \vec{l}_1 = \vec{l}_2 = \vec{l}_3 \tag{38}$$

In smooth regions a continuous switch between the pressure gradient (38) and the streamline direction is introduced. This model shows good accuracy in both subsonic and supersonic regime (ref. 7, 9, 10). A better convergence behaviour than with (37) is obtained, especially in supersonic flow. In some cases (e.g. subsonic flows) convergence can only be obtained by freezing the directions after a certain residual drop of 1 or 2 orders (ref. 9).

Streamline direction
A much simpler choice is taking the directions along the streamline,

$$\vec{l}_1\cdot\vec{v} = 0, \qquad \vec{l}_1 = \vec{l}_2 = \vec{l}_3 \tag{39}$$

This choice seems to have good convergence behaviour but has a poor accuracy, especially in supersonic inviscid flows near discontinuities (ref. 9). On the contrary, very good results are obtained in a laminar boundary layer on very coarse meshes (ref. 9, 10).

Convection of entropy and enthalpy
The first characteristic direction $\vec{\kappa}_1$ is taken perpendicular to the velocity,

[...]    (40)

Using the definition of specific entropy and total enthalpy,

[...]    (41)

the first two characteristic equations of (34) are rewritten as

[...]    (42)

Eq. (42) shows that in steady state the entropy and total enthalpy are constant along the streamline, see also ref. 17, 18, 19 (where $\mu=0$). The 2nd equation of (42) can further be simplified by choosing the parameter $\mu$ as [...], leading to

[...]    (43)

As a result the Euler system (34) is split in a hyperbolic part that represents the convection of entropy and enthalpy along the streamline in steady state (43) and a remaining acoustic subsystem with source terms, as in the hyperbolic/elliptic splitting in ref. 16, 18.

Machangle splitting
In the framework of the fluctuation splitting a machangle splitting was developed. The first direction is taken perpendicular to the velocity and the 2nd and 3rd directions are respectively perpendicular to the positive and negative characteristics or machlines,

$$\theta_1 = \theta + \frac{\pi}{2}, \qquad \theta_2 = \theta_1 + \mu, \qquad \theta_3 = \theta_1 - \mu \tag{44}$$

with $\theta$ the flow angle and $\mu = \arctan(1/\sqrt{M^2-1})$ the machangle. A fully decoupled system of characteristic equations is obtained in steady state,

$$\vec{v}\cdot\vec{\nabla}S = 0, \qquad \vec{v}\cdot\vec{\nabla}H = 0, \qquad (\vec{v}+c\,\vec{\kappa}_2)\cdot\vec{\nabla}R_3 = 0, \qquad (\vec{v}-c\,\vec{\kappa}_3)\cdot\vec{\nabla}R_4 = 0 \tag{45}$$

using the steady Riemann variables:

[...]    (46)
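The machangle directions (44) are simple to evaluate. The sketch below assumes supersonic flow (M > 1) and the angle convention of (32), κ = (cos θ, sin θ). The function names are hypothetical. The properties noted after the code follow from elementary trigonometry and are not quoted from the paper.

```python
import math


def mach_angle(M):
    """Machangle mu = arctan(1 / sqrt(M^2 - 1)), defined for M > 1 only."""
    if M <= 1.0:
        raise ValueError("machangle splitting as sketched here needs M > 1")
    return math.atan(1.0 / math.sqrt(M * M - 1.0))


def propagation_directions(u, v, c):
    """Unit vectors kappa_1, kappa_2, kappa_3 of the machangle splitting (44)."""
    M = math.hypot(u, v) / c
    theta = math.atan2(v, u)               # flow angle
    mu = mach_angle(M)
    theta1 = theta + 0.5 * math.pi         # perpendicular to the velocity
    angles = (theta1, theta1 + mu, theta1 - mu)
    return [(math.cos(t), math.sin(t)) for t in angles]
```

With these directions κ1 is normal to the velocity, and the acoustic eigenvalues v·κ2 + c and v·κ3 - c both vanish, which is what one expects for directions normal to the steady machlines. The factor κ1·(κ2+κ3) from the denominator of P equals 2 cos μ, for example √3 at M = 2, and tends to zero as M approaches 1.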

Pseudo-Machangle splitting
This model is an algebraic continuation of the supersonic machangle decomposition in the subsonic range with continuity at M=1, developed in the fluctuation splitting approach (ref. 17, 18, 19). The corresponding directions are,

[...]    (47)

with $\mu = \arctan(1/\sqrt{|M^2-1|})$ defined as the pseudo-machangle. Notice the 2 sets of propagation directions leading to 2 splittings of the residual. For the final residual the average of both splittings is taken.

Considerable research is still being performed to identify the most suitable directions; see the literature for a recent survey.

5.2. Numerical Flux Formulation
The space operators in the characteristic compatibility equations (34), (43) and (45) discussed in section 5.1 are discretised. In the case that there is no source term, the space operator is expressed by

$$\text{no source term:}\quad \vec{a}_c^{(k)}\cdot\vec{\nabla}w^{(k)} \quad\text{or}\quad \vec{a}_c^{(k)}\cdot\vec{\nabla}R^{(k)} \tag{48}$$

where the gradient acts on the characteristic variable or a steady Riemann invariant. When a source term is present, the space operator can be written as,

[...]    (49)

where the convective and source terms are written as an advection of respectively a characteristic variable and a 'source' variable along the associated directions (35) and (36). In both cases the source and convective terms have the same form and can be treated by the same multi-D scheme or dissipation model.

In the formulation used in previous work (ref. 10, 12, 31, 33) the scalar multi-D dissipation models were applied only to the convective terms while the source terms were discretised by a central approximation without artificial dissipation. In the present approach the source terms can also be treated with a multi-D scheme based on the associated speed (36). Discretising both convective and source terms leads to two numerical fluxes or dissipations for every scalar equation.
Next the scalar multi-D discretisation is re-transformed to the conservative residual by use of the right eigenvectors. The resulting inviscid numerical flux on cell face i+1/2,j is defined by

[...]    (50)

where $d_c$ and $d_s$ represent respectively the scalar multi-D dissipation of the convective and source part for each of the 4 characteristic equations. The old formulation, where the convective term is treated by a multi-D scheme and the source term by a central scheme without artificial dissipation, is easily recovered by putting the dissipation for the source term in (50) to zero.

6. RESULTS

6.1. 2D supersonic Laval nozzle
The inviscid supersonic flow in a Laval nozzle is calculated at a Mach number of 2.91 on an H-type mesh with 128x32 cells. The first order minimum cross diffusion scheme (4MCD) and the 5-point continuous zero cross diffusion scheme (5ZCD) combined with minmod limiter and the ratio of subclass (Ia) are investigated. The multi-D schemes are compared with the classical 2nd order Flux Difference Splitting scheme (FDS2) with minmod limiter. The classical scheme is also tested on a finer mesh of 256x64 cells, as reference solution. The extension to the Euler equations is based on the 2D characteristic decomposition (34) with the 3 characteristic directions $\vec{\kappa}_1,\vec{\kappa}_2,\vec{\kappa}_3$ defined by the machangle splitting. Both the convective and source terms of (34) are discretised with the same scalar multi-D dissipation model.

Figure 9 shows the isomachlines for the 4 solutions. The first order multi-D scheme performs well in comparison with the classical 2nd order scheme up to the 2nd reflection of the shock structure. The 2nd order multi-D scheme is superior to the classical scheme on the same mesh. It compares very well with the reference solution on the finer mesh. Figures 10 and 11 show respectively the Mach number and total temperature distribution along the symmetry axis. The total temperature or total enthalpy should be constant in the whole field. The errors for the multi-D schemes are much smaller than for the classical results, even on the finer mesh.
Figure 12 shows the convergence history. Both first and second order 2D results show a good convergence behaviour obtained with a 3 level multigrid acceleration combined with a 5-stage Runge Kutta procedure and residual smoothing with a CFL of respectively 10.0 and 8.0.

6.2. 3D supersonic corner flow
An inviscid supersonic corner flow (ref. 4) (M=3.0) is considered, which is generated by two unswept compression ramps with 9.5 deg. wedge angle as illustrated in figure 13. The first order 3D minimum cross diffusion scheme is tested in comparison with classical first and second order (minmod limiter) upwind schemes on a uniform mesh with 32x32x32 cells. The accuracy of the 3D scheme is investigated for both 3D and 2D flow phenomena appearing in this testcase. The extension to the Euler equations is performed using the 3D extension of the characteristic variables (31) and equations (34), see ref. 12. The three characteristic directions $\vec{\kappa}_1,\vec{\kappa}_2,\vec{\kappa}_3$ are taken along the pressure gradient direction. When the pressure gradient goes to zero a blending is performed with the velocity direction.
Figure 14 shows the convergence history. No freezing of the directions was needed to reach convergence with the 3D scheme. Convergence is obtained with the use of multigrid acceleration and residual smoothing with a 5 stage Runge Kutta procedure with CFL = 10.

Isomach lines are shown in figure 15. The classical first order scheme shows smeared out shocks and no contact discontinuities while the first order 3D result shows an accuracy comparable with classical 2nd order. From the isomachlines near the solid walls one can conclude that the multi-D result shows less entropy creation than the classical schemes.

7. CONCLUSIONS

Genuinely 3D/2D multidimensional upwind schemes are developed for the Euler/Navier-Stokes equations. The schemes are formulated in the framework of dimensional-split central or upwind dissipation models, leading to a new concept of compact 3D/2D multidimensional upwind dissipation.

A unification of 2D compact linear schemes is shown based on two classes of 6-point stencils. Each class has a two-parameter subclass of zero cross diffusion schemes and a unique second order scheme. Both families have a 4-point subclass in common that has a unique minimum cross diffusion scheme and a zero cross diffusion scheme.

A class of 3D scalar convection schemes based on an 8-point compact stencil is derived that reduces to the 4-point subclass in 2D. It has a unique first order scheme with minimum cross diffusion and a one-parameter subclass of zero cross diffusion schemes being second order accurate for a homogeneous convection equation.

Second order monotone schemes are explored. The dissipation is written as the 1st order minimum cross diffusion dissipation plus additional anti-diffusive terms. Three- and two-dimensional limiters, depending on ratios of flux differences in different mesh directions, are introduced. In 2D two families of ratios related to two types of triangles are defined. In each class two sub-families are considered, related to variations along the mesh line or along a diagonal.

Extensions to the Euler/Navier-Stokes equations are obtained through a characteristic decomposition using characteristic variables with 3 different propagation directions. A review is given of different choices for the directions. For supersonic flow the combination of pressure gradient and velocity seems to be an accurate choice but the Mach angle splitting is more robust. Application of the 3D minimum cross diffusion scheme in combination with the pressure gradient approach to a 3D supersonic testcase shows comparable accuracy with a classical 2nd order dimensional-split upwind scheme.

For subsonic and supersonic flow the streamline direction is not yet accurate enough and so research is still needed to identify more effective choices.

A new formulation is introduced where both the convective and source terms can be discretised with a multi-D scheme using its associated characteristic speed. Application of this new approach in combination with the Mach angle splitting directions to a 2D supersonic testcase shows better accuracy with lower total temperature error than classical 2nd order dimensional-split upwind schemes.

ACKNOWLEDGEMENTS

This research is supported by the Commission of the European Community under Contract AERO-CT-0040/PL-2037 in Area 5 (Aeronautics) of the Brite/Euram Programme.
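The idea stated in the conclusions, a first order dissipation plus limited anti-diffusive terms, can be sketched in one dimension. The example below is a generic MINMOD-limited upwind step for periodic linear advection with positive wave speed; it is not the multidimensional limiter of the paper, and the function names and the test set-up are assumptions chosen for illustration.

```python
import numpy as np

def minmod(a, b):
    """Return the argument of smaller magnitude when a and b share a sign, else zero."""
    return np.where(a * b > 0.0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def upwind_step(u, cfl, limited=True):
    """One explicit step of u_t + a u_x = 0 (a > 0, periodic grid, 0 < cfl <= 1).

    limited=False gives the plain first order upwind scheme; limited=True
    adds a MINMOD-limited anti-diffusive correction to the face value.
    """
    du_down = np.roll(u, -1) - u          # u[i+1] - u[i]
    du_up = u - np.roll(u, 1)             # u[i] - u[i-1]
    slope = minmod(du_down, du_up) if limited else np.zeros_like(u)
    # Face value at i+1/2: upwind value plus the limited anti-diffusive term.
    u_face = u + 0.5 * (1.0 - cfl) * slope
    return u - cfl * (u_face - np.roll(u_face, 1))
```

Advecting a square pulse with both variants shows the expected behaviour: neither creates new extrema, and the limited variant stays closer to the exact shifted profile.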

FIGURES

Figure 9 Supersonic Laval nozzle (M=2.91), isomach lines, 1.41<M<2.91, ΔM=0.02. [Panels: 2nd order FDS + MINMOD, 128x32; 2nd order FDS + MINMOD, 256x64; Multi-D 1st order, 128x32; Multi-D 2nd order + MINMOD + RATIO IA, 128x32.]

Figure 10 Supersonic Laval nozzle (M=2.91), Mach number distribution along symmetry axis. [Panels: 2nd order FDS, 128x32; 2nd order FDS, 256x64; Multi-D 1st order, 128x32; Multi-D 2nd order, 128x32.]

Figure 11 Supersonic Laval nozzle (M=2.91), total temperature distribution along symmetry axis. [Panel: Multi-D 1st order, 128x32.]

Figure 12 Supersonic Laval nozzle (M=2.91), convergence history. [Curves: FDS2 256x64; FDS2 128x32, CFL=8.0; 4YCD 128x32, CFL=10.0; 5ZCD 128x32, CFL=8.0. Horizontal axis: cycle.]

Figure 13 Supersonic corner flow, flow behaviour.

Figure 14 Supersonic corner flow, convergence history RMS.

Figure 15 Supersonic corner flow, isomach line comparison (2.0<M<3.0, ΔM=0.02). [Panels: (a) FDS1; (b) BMCD; (c) FDS2-Minmod limiter.]


A PCG/E-B-E ITERATION FOR HIGH ORDER AND FAST SOLUTION
OF 3-D NAVIER-STOKES EQUATIONS

A. Rüstem Aslan, Ülgen Gülçat and Aydın Mısırlıoğlu


Faculty of Aeronautics and Astronautics
ITU, 80626, Maslak, Istanbul, Turkey

SUMMARY

A second order accurate (both in time and space) explicit/implicit scheme is implemented for the solution of three-dimensional incompressible Navier-Stokes equations involving high Reynolds number flows about complex configurations. A fourth order accurate artificial dissipation term on the momentum equations is used for stabilizing. Finite Element Method (FEM) with an explicit time marching scheme is used for the solution, and element by element (E-B-E) technique is employed in order to ease the memory requirements needed by the storage of the stiffness matrix of FEM. The cubic cavity problem, laminar flow past a sphere at a high Reynolds number and an incompressible viscous flow around the fuselage of a helicopter are successfully solved using the first and the second order accurate schemes. Comparison of the results is also provided.

1. INTRODUCTION

Recent advances in iterative solution techniques enabled CFD researchers to solve large scale problems in acceptable computation times with the fast processors of the 90's. The iterative solvers have become convenient tools for CFD which do not require excessive memory on computers for either implicit time marching schemes or inversion of elliptic equations. For finite element computations, element by element (E-B-E) iteration schemes demand the least amount of memory. The conjugate gradient (CG) method, which is the Krylov subspace technique applied on symmetric operators, becomes an efficient, indeed the fastest converging, iterative method when applied with preconditioning (PCG) to the discrete form of the equations.

During the last two decades, solution of three-dimensional Navier-Stokes equations received considerable attention. However, for a numerical technique to fulfill the demands of the 90's, the accuracy of the scheme must be at least second order for both time and space discretizations.

In this study, a second order accurate (both in time and space) explicit/implicit scheme is implemented for the solution of three-dimensional incompressible Navier-Stokes equations involving high Reynolds number flows about complex configurations. To do so, a fourth order accurate artificial dissipation term on the momentum equations is used for stabilizing. Finite Element Method (FEM) with an explicit time marching scheme is used for the solution, and element by element (E-B-E) technique is employed in order to ease the memory requirements needed by the storage of the stiffness matrix of FEM[1]. Since the scheme is time accurate, the transient nature of the flow field is properly resolved.

For the calibration of the code the cubic cavity problem is solved using the first and the second order accurate schemes. The comparison with the existing literature[2] is satisfactory even for a coarse grid. The solution with the fourth order artificial viscosity adequately resolves the recirculating region as opposed to the solution with the second order artificial viscosity.

As the second study, laminar flow past a sphere at a high Reynolds number, Re = 162 000, is solved with both schemes. Finally, in order to test the capabilities of the code, an incompressible viscous flow of Re = 50 000 around the fuselage of a helicopter is studied.

All the computations are performed on a personal computer equipped with an i860 Number Smasher board with 32 Mbytes of memory.

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
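The combination of preconditioned conjugate gradients with an element by element matrix-vector product, as mentioned in the introduction, can be sketched on a small model problem. The example below solves a 1-D Poisson equation with linear elements and a diagonal (Jacobi) preconditioner; the model problem, the preconditioner and all function names are assumptions made for illustration and are not the authors' 3-D implementation.

```python
import numpy as np

def ebe_matvec(x, elements, k_elem, fixed):
    """Return A @ x by summing element contributions; no global matrix is stored."""
    xm = np.where(fixed, 0.0, x)           # Dirichlet nodes do not couple to the rest
    y = np.zeros_like(x)
    for nodes in elements:                 # gather, multiply by the element matrix, scatter
        y[nodes] += k_elem @ xm[nodes]
    return np.where(fixed, x, y)           # identity rows on Dirichlet nodes

def pcg(matvec, b, diag, tol=1e-10, max_iter=200):
    """Conjugate gradients with a diagonal preconditioner."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = r / diag
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for it in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        ap = matvec(p)
        alpha = rz / (p @ ap)
        x += alpha * p
        r -= alpha * ap
        z = r / diag
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

def solve_poisson_1d(n_el=16):
    """Solve -u'' = 1 on (0, 1) with u(0) = u(1) = 0 using linear elements."""
    h = 1.0 / n_el
    elements = [np.array([e, e + 1]) for e in range(n_el)]
    k_elem = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    fixed = np.zeros(n_el + 1, dtype=bool)
    fixed[[0, -1]] = True
    diag = np.zeros(n_el + 1)
    for nodes in elements:                 # the diagonal is also assembled element by element
        diag[nodes] += np.diag(k_elem)
    diag[fixed] = 1.0
    b = np.where(fixed, 0.0, h)            # load vector for a unit source term
    u, iters = pcg(lambda v: ebe_matvec(v, elements, k_elem, fixed), b, diag)
    return np.linspace(0.0, 1.0, n_el + 1), u, iters
```

Only the small element matrix and a few nodal vectors are kept in memory, which is the storage saving the E-B-E technique aims at; the price is that each matrix-vector product loops over all elements.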

2. FORMULATION

2.1 Governing Equations
The Navier-Stokes and the continuity equations for the unsteady, incompressible flow of a viscous fluid, in the absence of body forces, are:

$$\frac{D\mathbf{V}}{Dt} = -\nabla p + \frac{1}{Re}\nabla^2\mathbf{V} \qquad (1)$$

$$\nabla\cdot\mathbf{V} = 0 \qquad (2)$$

The equations are written in vector form (boldface type symbols denote vector or matrix quantities). The variables are non-dimensionalized using a reference velocity and a characteristic length, as usual. Re is the Reynolds number, $Re = Ul/\nu$, where $U$ is the reference velocity, $l$ is the characteristic length and $\nu$ is the kinematic viscosity of the fluid. $\mathbf{V}$ corresponds to the Cartesian velocity components $u$, $v$ and $w$. Pressure is denoted by $p$ and time by $t$.

For a well posed problem, the governing equations are complemented with the following initial ($t=0$)

$$\mathbf{V}(\mathbf{x},0) = \mathbf{V}^0(\mathbf{x}) \quad\text{and}\quad p(\mathbf{x},0) = p^0(\mathbf{x}) \qquad (3)$$

and boundary conditions, which have to be specified on related surfaces:

$$\mathbf{V} = \mathbf{G} \quad\text{and}\quad -p\,\mathbf{n} + \frac{1}{Re}(\nabla\mathbf{V})\cdot\mathbf{n} = \mathbf{F} \qquad (4)$$

where $\mathbf{x}$ is the position vector, $\mathbf{G}$ and $\mathbf{F}$ are prescribed boundary values, and $\mathbf{n}$ is the unit vector normal to the boundary.

2.2 Numerical Methods
The governing equations are integrated in time using both first and second order accurate schemes. The first order scheme follows that of [3], which constitutes a time marching scheme based on Helmholtz decomposition. A potential function with a single degree of freedom at each node is introduced and a Poisson equation for the potential is directly discretized. Eight-node isoparametric brick elements and trilinear interpolation functions for the velocity and the auxiliary potential are used. The pressure is defined at the centroid of each element. In contrast to the potential and velocity, pressure values are interpolated using piecewise constant functions at each element. Application of the conventional Galerkin integral[4] to the equations and the boundary conditions gives integral finite element formulations for one brick element[3,5].

2.2.1 First order explicit formulation
Let $\mathbf{V}_1$ and $\mathbf{V}_2$ denote the following velocity differences in vector form:

$$\mathbf{V}_1 = \mathbf{V}^{m+1/2} - \mathbf{V}^m, \qquad \mathbf{V}_2 = \mathbf{V}^{m+1} - \mathbf{V}^{m+1/2}$$

Using a forward difference operator for the time derivative in equation (1) and letting $\mathbf{V}^m$ and $p^m$ be solutions at the known time level $m$, the first order explicit fractional step algorithm, over a single time step and in fully discrete matrix form, is given in the $\alpha$ direction as follows:

$$\frac{M}{\Delta t} V_1^{\alpha} = \left[ B_\alpha + p_e C_\alpha - \left(\frac{A}{Re} + D\right) V_\alpha \right]^m \qquad (5)$$

[equations (6) to (8) illegible in source]

where $\phi$ is the auxiliary potential function, $M$ is the lumped element mass matrix, $D$ is the advection matrix, $A$ is the stiffness matrix, $C$ is the coefficient matrix for pressure, $B$ is the vector due to boundary conditions and $E$ is the matrix which arises due to incompressibility. The element potential $\phi_e$ is defined as

[equation illegible in source]

where $\Omega$ is the flow region to be solved, $\Gamma$ is the boundary of $\Omega$ and $N_i$ are the shape functions. Details of the formulation can be found in [5].

2.2.2 Second order explicit formulation
The second order time accurate scheme is somewhat similar to that of [6], wherein a new intermediate velocity field is introduced. Both explicit and implicit versions of the algorithm are devised. The explicit formulation resembles the first order explicit scheme except that the fractional step velocities are calculated in two steps. Let $\mathbf{V}_1$, $\mathbf{V}_2$ and $\mathbf{V}_3$ denote the following velocity
vector differences:

$$\mathbf{V}_1 = \mathbf{V}^{m+1/2} - \mathbf{V}^m, \qquad \mathbf{V}_2 = \mathbf{V}^{*} - \mathbf{V}^m, \qquad \mathbf{V}_3 = \mathbf{V}^{m+1} - \mathbf{V}^{*}$$

The second order explicit fractional step algorithm, in fully discretised matrix form, over a single time step is defined by:

[equations (10) to (13) illegible in source]

The factor 1/2 appearing in (12) and (13) is used for second order accuracy in time. In the formulation given above, $\mathbf{V}^{*}$ is a velocity vector which is not solenoidal.

2.2.3 Second order implicit formulation
The implicit fractional step formulation follows the same steps as does the explicit one. However, the formulation is obtained by adopting a Crank-Nicolson representation for the diffusion terms, but otherwise retaining the explicit formulation as before.

Using the same velocity difference formulas defined for the second order explicit formulation above, the second order implicit Galerkin fractional step algorithm, in fully discretised matrix form, over a single time step is defined by a set of equations of which only the one for $\mathbf{V}_2$ is legible in the source:

$$\left(\frac{M}{\Delta t} + \frac{A}{2Re}\right) V_2^{\alpha} = \left( B_\alpha + p_e C_\alpha - \frac{A}{Re} V_\alpha \right)^m - \left( D\,V_\alpha \right)^{m+1/2}$$

[remaining equations of the implicit algorithm illegible in source]

For the implicit solutions of equations (14)-(17), the Element By Element (E-B-E) technique[1] is employed in order to ease the memory requirements needed by the storage of the stiffness matrix of FEM. The iterative solution is fully vectorized[7]. The right hand side values of these equations are scaled with the square of the time step to increase accuracy. This scaling is found to reduce the number of iterations by almost 50%.

2.3 Artificial Dissipation
In the present study, a fourth order accurate artificial dissipation term on the momentum equations is used for stabilizing. The diffusion term is added explicitly to the right hand side of equation (1). The formulation given in reference[8] is extended to three dimensions. The artificial viscosity term is computed in two steps at element level. First a second-order differencing is accomplished:

[equation illegible in source]

These values give the second-order distributions to cell corners (i) for the momentum equations. Then, fourth order distributions to cell corners are formed using the above values:

[equation illegible in source]

These fourth order viscosity terms are multiplied by a certain coefficient when added to the momentum equations. All the velocity components are multiplied by the same coefficient, $\epsilon \le 1/24$. No dissipation term is added to the Poisson equation for the potential.

3. RESULTS AND DISCUSSION
For the calibration of the code the cubic cavity problem is solved using the first and the second order accurate schemes. The grid used is fairly coarse, 11x11x11. The first order scheme, with
second order artificial dissipation, gives low velocity gradients in the vicinity of the walls as seen in Fig.1 a and b. The second order accurate scheme, on the other hand, predicts the velocity profiles, even with a coarse grid, in agreement with the results given with spectral methods[2].

Shown in Fig.1 c and d are the symmetry plane velocity vectors obtained with the first and second order schemes respectively. The flow Reynolds number is 1000 and the dimensionless time level is 30, at which the steady state is practically reached.

[Figure 1 panels: a) vertical centerline; b) horizontal centerline; c) Solution with second order dissipation; d) Solution with fourth order dissipation.]

Fig.1 Cubic cavity velocity profiles for Re=1000 in comparison with the results of Ref[2] on the symmetry plane at steady state (a-b). Present solutions with fourth and second order artificial dissipations are shown. Flow velocity vectors on the symmetry plane (c-d). 11x11x11 stretched grid.
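The two-pass construction of the fourth order dissipation described in Section 2.3 (a second-order difference, then a second-order difference of that result, scaled by a coefficient) can be illustrated in one dimension. The sketch below is a 1-D structured-grid analogue, not the element-level 3-D formulation of the paper; the zero boundary treatment, the sign convention and the function names are assumptions.

```python
import numpy as np

def second_difference(u):
    """Undivided second difference u[i-1] - 2 u[i] + u[i+1]; zero at the two end points."""
    u = np.asarray(u, dtype=float)
    d = np.zeros_like(u)
    d[1:-1] = u[:-2] - 2.0 * u[1:-1] + u[2:]
    return d

def fourth_order_dissipation(u, eps=1.0 / 24.0):
    """Term to add to the right hand side: -eps times the fourth difference of u.

    The fourth difference is formed in two passes, as the second difference
    of the second difference. eps = 1/24 is the upper bound quoted in the text.
    """
    d2 = second_difference(u)     # first pass: second-order distribution
    d4 = second_difference(d2)    # second pass: fourth-order distribution
    return -eps * d4
```

Away from the boundaries the term vanishes for a smooth quadratic profile and opposes a grid-scale sawtooth, which is the behaviour wanted from a stabilizing term that should not degrade second order accuracy.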

[Figure 2 panels: a) Full grid; b) Near body detail.]

Fig.2 Numerical grid on the symmetry plane of the sphere.

As the second study, laminar flow past a sphere at a high Reynolds number, Re = 162 000, is solved with both the first and the second order schemes. The full grid around the sphere consists of 19127 points and 17640 brick elements as shown in Fig.2a. Fig.2b shows the details of the grid around the body. In Fig.3 a and b the velocity vectors at the symmetry plane at about time level of 4 are plotted. The length of the separation bubble predicted with both approaches is almost the same; however, the width differs significantly. The second order accurate scheme predicts the separation angle close to the value given in [9]. As seen in Fig.3 a and b, the flow is symmetric with respect to the mid plane and, at the upper half of the plane, there is a major clockwise recirculating bubble. The details in the separation region, however, are predicted with the accurate scheme as seen in Fig.3 b, wherein a smaller bubble with clockwise rotation is present upstream of the major one. A more detailed picture of the region right after the shoulder is given in Fig.4 b, where there is a very small counterclockwise rotating bubble in between the major and the minor clockwise rotating bubbles. All these details are smeared out with the first order method as seen in Fig.4 a, even with finer resolution in the radial direction.

The third problem solved is related to an institutional project for developing a generic helicopter. The grid around the fuselage is shown in Fig.5, where 11280 brick elements with 12915 nodes are used to resolve the symmetric half of the flow field. The flow Reynolds number, based on the height of the body taken as a characteristic length, is 50 000. Shown in Fig.6 a and b are the symmetry plane velocity vector fields at about the steady state obtained with the first and second order schemes, respectively. The results of the second order scheme indicate a longer separation region in the wake, Fig.6 b. A detailed picture of the wake is depicted in Fig.7 a and b, wherein the separation bubble obtained with the second order scheme is twice as long as the bubble obtained with the first order scheme. Also seen in Fig.6 b is a small separation region at the bottom of the fuselage where there is an unfavorable pressure gradient. The first order scheme cannot predict that separation region because of high artificial diffusion. The detailed picture of this unfavorable pressure region is provided in Fig.8 a and b for the first and second order schemes, respectively.

The pressure distribution on the body surface at the symmetry plane is given in Fig.9. According to this figure the pressure values follow, in general, the same trend for both solutions; however, the unfavorable pressure gradient region at the bottom surface indicates where the two solutions do not agree.

[Figure 3 panels: a) Second order dissipation; b) Fourth order dissipation.]

Fig.3 Velocity vectors on the symmetry plane of the sphere, Re=162 000.

[Figure 4 panels: a) Solution with second order dissipation; b) Solution with fourth order dissipation.]

Fig.4 Velocity vector details at the symmetry plane on the right shoulder of the sphere.

[Figure 5 panels: a) Full grid; b) Near body detail.]

Fig.5 Numerical grid on the symmetry plane of the helicopter fuselage.

[Figure 6 panels: a) Solution with second order dissipation; b) Solution with fourth order dissipation.]

Fig.6 Velocity vectors on the symmetry plane of the fuselage, Re=50 000.

Shown in Table 1 are the drag coefficient values for the sphere compared with the experimental data[9]. As seen from the values, the first order scheme overestimates the coefficient values whereas the second order scheme underestimates them as compared to experimental values. The drag coefficient values evaluated for the helicopter fuselage with both schemes are also given in Table 1.

[Figure 7 panels: a) Solution with second order dissipation; b) Solution with fourth order dissipation.]

Fig.7 Velocity vector details at the wake region of the fuselage, Re=50 000.

[Figure 8 panels: a) Solution with second order dissipation; b) Solution with fourth order dissipation.]

Fig.8 Velocity vector details at the unfavorable pressure region of the fuselage.

[Figure 9 plot annotations: T=3.5, Re=50000, explicit scheme; horizontal axis: X/D, distance.]

Fig.9 Pressure coefficient (Cp) values on the body surface at the symmetry plane.

Table 1

Geometry    Re number    Scheme            Cd
Sphere      162 000      Experiment [9]    0.47
                         First order       0.52
                         Second order      0.38
Fuselage    50 000       First order       0.20
                         Second order      0.11

CONCLUSION
A computer code based on a second order accurate scheme is developed and implemented for flows involving large separations and strong recirculations about arbitrary shapes.

The results obtained for various test cases are in good agreement with the existing numerical and experimental data.

The code is implemented satisfactorily to predict the drag coefficient of a generic helicopter fuselage.

ACKNOWLEDGEMENT
This work is partially supported by ITU, Research Fund under project number 494.

REFERENCES
[1] Ü. Gülçat, An Explicit FEM for 3-D General Viscous Flow Studies Based on E-B-E Solution Algorithms, Comp. Fluid Dyn., vol.4, pp.73-85, 1995.

[2] Hwar C. Ku, Richard S. Hirsh and Thomas D. Taylor, "A Pseudospectral Method for Solution of the Three-Dimensional Incompressible Navier-Stokes Equations", Journal of Computational Physics, 70, pp.439-462, 1987.

[3] A. Mizukami and M. Tsuchiya, A Finite Element Method for the Three-Dimensional Non-Steady Navier-Stokes Equations, Int. J. Num. Meth. in Fluids, vol.4, pp.349-357, 1984.

[4] O.C. Zienkiewicz and R.L. Taylor, The Finite Element Method, volume 1, McGraw-Hill Book Company, London, 1989.

[5] A.R. Aslan, F.O. Edis, Ü. Gülçat and E. Gurgey, Prediction of General Viscous Flows Using a Finite Element Method, Proceedings of the 8th International Conference on Numerical Methods in Laminar and Turbulent Flows, Swansea, U.K., July 18-23, 1993.

[6] M.F. Webster and P. Townsend, Development of a Transient Approach to Simulate Newtonian and Non-Newtonian Flow, NUMETA'90, Numerical Methods in Engineering: Theory and Applications, Edts: G.N. Pande and J. Middleton, Vol.2, Elsevier Publications, U.K., pp.1003-1012, 1992.

[7] Ü. Gülçat, A.R. Aslan and A. Mısırlıoğlu, Aerodynamics of Fuselage and Store-Carriage Interaction using CFD, AGARD 76th Fluid Dynamics Panel Symposium on Aerodynamics of Store Integration, 24-27 April, Ankara, Turkey, 1995.

[8] Y. Kallinderis and K. Nakajima, Finite Element Method for Incompressible Viscous Flows with Adaptive Hybrid Grids, AIAA Journal, vol.32, No.8, pp.1617-1625, August, 1994.

[9] S.W. Churchill, Viscous flows, the practical use of theory, pp.360-400, Butterworths series in chemical engineering, Butterworth publishers, USA, 1972.

Convergence Acceleration of the Navier-Stokes Equations
through Time-Derivative Preconditioning

Charles L. Merkle, Sankaran Venkateswaran, and Manish Deshpande

Propulsion Engineering Research Center
The Pennsylvania State University
University Park, PA 16802

SUMMARY
Chorin's method of artificial compressibility is extended to both compressible and incompressible fluids by using physical arguments to define artificial fluid properties that make up a local preconditioning matrix. In particular, perturbation expansions are used to provide appropriate temporal derivatives for the equations of motion at both low speeds and low Reynolds numbers. These limiting forms are then combined into a single function that smoothly merges into the physical time derivatives at high speeds so that the equations are left unchanged at transonic, high Reynolds number conditions. The effectiveness of the resulting preconditioning procedure for the Navier-Stokes equations is demonstrated for wide speed and Reynolds number ranges by means of stability results and computational solutions. Nevertheless, the preconditioned equations sometimes fail to provide a solution for applications for which the non-preconditioned equations converge. Often this is because the reduced dissipation in the preconditioned equations results in an unsteady solution while the more dissipative non-preconditioned equations result in a steady state. Problems of this type represent a computational challenge: it is important to distinguish between non-convergence of algorithms and the non-existence of steady state solutions.

1 INTRODUCTION

Time-marching techniques have proven to be very effective for the computation of high Reynolds number flows in the transonic, supersonic and hypersonic regimes. These methods, however, become inefficient at low speed or low Reynolds number conditions, including the near wall regions of high Reynolds number flows. For this reason, incompressible and low speed computations were dominated by pressure-based procedures [1] for many years. Chorin's pseudo-compressibility method [2], which has become widely accepted for incompressible flows [3], opened one avenue for applying time-marching procedures to incompressible flows, but there was little realization that this procedure could be broadened to enable computations at all speeds until recently.

Extensions of time-marching methods to low Mach number compressible flows became possible with the realization that it was the stiffness of the eigenvalues that slowed convergence at low speeds. Low Mach number perturbation procedures were first used to remove these problems [4] and were used in pressure-based methods to compute low speed compressible solutions. The implementation of time-marching methods to the low Mach number perturbation equations was first reported by Gustafsson [5], followed by extensive applications by the present authors [6].

the Mach number, while others [5,8] expanded the equations in terms of the first power of the Mach number.

In parallel with these perturbation procedures, local preconditioning methods in which the time derivatives of the equations of motion are multiplied by a matrix to control the eigenvalues have also been used to enhance convergence [8-16]. Unlike the perturbation equations, these preconditioned equations are valid at all speeds, and so have a potential for providing uniform convergence over all Reynolds and Mach number regimes. Two distinct philosophies have been followed in developing these preconditioning methods. One uses the perturbation procedures described above and deals with the full Navier-Stokes equations and includes the Euler equations as a special case [11-14]. The intent of this approach is to improve convergence at low speeds and Reynolds numbers only, while leaving it unaltered at high Reynolds numbers and high speeds (transonic and above) where it is already quite efficient. This method has been applied extensively to a wide variety of practical applications.

The second approach [15,16] provides a rigorous method for developing a preconditioning matrix for the Euler equations, but equally rigorous extension to the Navier-Stokes equations appears doubtful. This preconditioning procedure is intended to provide optimum convergence over the entire Mach number regime, but limited applications have thus far demonstrated convergence enhancement only in the low Mach number regime [16]. Even there, this second method is generally less effective than that provided by the perturbation-expansion-based methods. Further, the convergence enhancement to be had at transonic and supersonic speeds is very limited because time-marching methods are already efficient there, so that substantial improvements are unlikely.

The purpose of the present paper is to demonstrate how a viscous preconditioning procedure can be developed from the basic physics of the flow using low speed and low Reynolds number perturbation expansions. As a part of this development, the link between our compressible preconditioning method and the artificial compressibility method of Chorin is shown. Following some representative examples of convergence enhancement for a wide variety of problems, the paper closes by addressing the issue of the robustness of preconditioning methods. One specific example is given in which the preconditioned methods fail to provide convergence to a steady state. Detailed investigation shows that the physical problem is unsteady and a steady solution fails to exist. The reduced artificial dissipation in the preconditioned solution makes this unsteadiness more apparent. The prospect of
Perturbation expansion methods have also been extended distinguishing non-convergence from the non-existence of
to combustion problems [7]. Of these perturbation steady state solutions is thus raised as a challenge facing
expansion methods, some (6.71 used the more CFD techniques.
conventional expansion procedures based on the square of

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
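
The eigenvalue stiffness that the Introduction names as the cause of slow low-speed convergence can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes the one-dimensional inviscid wave speeds u, u + c and u - c, normalises them by the speed of sound, and reports the ratio of the largest to the smallest magnitude. The function name `wave_speed_ratio` is hypothetical.

```python
def wave_speed_ratio(mach: float) -> float:
    """Ratio of largest to smallest 1-D Euler wave-speed magnitude.

    Assumption (illustrative only): the wave speeds are u, u + c and u - c.
    Dividing by the speed of sound c gives M, M + 1 and M - 1.
    """
    if mach <= 0.0:
        raise ValueError("Mach number must be positive")
    speeds = [abs(mach), abs(mach + 1.0), abs(mach - 1.0)]
    smallest = min(speeds)
    if smallest == 0.0:
        return float("inf")  # sonic point: the u - c wave is stationary
    return max(speeds) / smallest


if __name__ == "__main__":
    # The ratio grows roughly like 1/M as the Mach number decreases.
    for mach in (0.7, 0.1, 0.01, 0.001):
        print(f"M = {mach:<6} ratio = {wave_speed_ratio(mach):.1f}")
```

Under these assumptions the ratio is of order one at high subsonic speeds and grows roughly as 1/M toward low speeds, which is the disparity that preconditioning is intended to remove.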

2 THE EQUATIONS OF MOTION

The equations of motion can be written in conservative form as:

$$\frac{\partial Q}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} + \frac{\partial G}{\partial z} = L_v(Q_v) \qquad (1)$$

where the viscous terms are given by the operator $L_v$, and the vectors $Q$, $Q_v$, and $E$ are

$$Q = (\rho,\ \rho u,\ \rho v,\ \rho w,\ e)^T, \quad Q_v = (p,\ u,\ v,\ w,\ T)^T, \quad E = (\rho u,\ \rho u^2 + p,\ \rho u v,\ \rho u w,\ \rho u h_0)^T \qquad (2)$$

with an analogous definition for $F$ and $G$. Here, $\rho$ represents the density, $p$ is the pressure, and $u$, $v$ and $w$ are the Cartesian velocity components in the $x$, $y$ and $z$ directions respectively. The total energy, $e$, is the sum of the internal energy, $\varepsilon$, and the kinetic energy,

$$e = \rho\varepsilon + \tfrac{1}{2}\rho\,(u^2 + v^2 + w^2) \qquad (3)$$

The enthalpy, $h$, is related to the internal energy and the pressure,

$$\rho h = \rho\varepsilon + p \qquad (4)$$

and for a perfect gas can be expressed as a function of the temperature alone, $h = h(T)$. The stagnation enthalpy is defined as $h_0 = h + (u^2 + v^2 + w^2)/2$. The formulation is completed by the perfect gas equation of state, which we write as,

$$\rho = \rho(p, T) \qquad (5)$$

to emphasize that the density depends on the temperature and pressure. This form makes it possible to include incompressible fluids and perfect gases in a single procedure.

The "viscous" vector, $Q_v$, that appears in Eqs. 1 and 2 represents the dependent variables that appear naturally in the diffusion terms. Because the first cell of this variable (corresponding to the continuity equation) is null, we choose to fill it with the pressure, $p$. This choice makes $Q_v$ a unique function of the conservative variable $Q$. For convenience, we use this set of primitive variables as our primary dependent variable set, but we retain the conservative fluxes.

The variables in the time derivative can easily be changed from $Q$ to $Q_v$ by means of the chain rule,

$$\frac{\partial Q}{\partial Q_v}\frac{\partial Q_v}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} + \frac{\partial G}{\partial z} = L_v(Q_v) \qquad (6)$$

where $\partial Q/\partial Q_v$ represents the Jacobian,

[Equation (7), the Jacobian matrix $\partial Q/\partial Q_v$, is not legible in the source.]

where $\rho_p$, $\rho_T$, and $h_T$ are partial derivatives. For a perfect gas $\rho_T = -\rho/T$, $\rho_p = 1/(RT)$, and $h_T = \gamma R/(\gamma - 1)$, where $\gamma$ is the ratio of specific heats. Note that $h_T$ is the specific heat at constant pressure.

Other matrices of interest include the Jacobian, $A_v = \partial E/\partial Q_v$,

[Equation (8), the matrix $A_v$, is not legible in the source.]

with analogous expressions for $B_v = \partial F/\partial Q_v$ and $C_v = \partial G/\partial Q_v$.

3 LOW MACH NUMBER SCALING

The eigenvalues of (6) determine the convergence rate of the time-marching algorithm. These eigenvalues are obtained from the roots of the fifth order polynomial:

$$\det\left(\lambda\,\frac{\partial Q}{\partial Q_v} - A_v\right) = 0 \qquad (9)$$

which are readily found to be $u$, $u$, $u$, $u \pm c$, where the acoustic speed, $c$, is given by,

[Equation (10), the definition of $c$, is not legible in the source.]

The speed of sound reduces to the familiar relation, $c^2 = \gamma R T$, for a perfect gas, while for an incompressible fluid, where $\rho_p = \rho_T = 0$, the speed of sound becomes infinite, causing the time derivatives in the continuity equation to vanish so that continuity reduces to $\nabla \cdot V = 0$.

To ensure uniform, efficient convergence over all speed ranges, we replace the matrix $\partial Q/\partial Q_v$ in (9) by a preconditioning matrix, $\Gamma_v$, and consider the solution of the modified equation,

$$\Gamma_v\frac{\partial Q_v}{\partial t} + \frac{\partial E}{\partial x} + \frac{\partial F}{\partial y} + \frac{\partial G}{\partial z} = L_v(Q_v) \qquad (11)$$

We define $\Gamma_v$ in a form analogous to $\partial Q/\partial Q_v$, by replacing the fluid properties, $\rho_p$, $\rho_T$ and $h_T$, by the artificial quantities, $\rho_p'$, $\rho_T'$ and $h_T'$ respectively. These quantities represent a three-parameter preconditioning system whose values can be chosen to ensure well-conditioned

eigenvalues at all speeds, thereby ensuring fast, efficient convergence. The definition of the three parameters in the preconditioning matrix, $\Gamma_v$, will be obtained from perturbation analyses of the equations of motion at low speeds and low Reynolds numbers. Their presence introduces an artificial speed of sound, $c'$, that ensures that eigenvalue stiffness is avoided. Additional restrictions on the preconditioning matrix have been given by Viviand [9] and Choi and Merkle [12].

To overcome the difficulties at low speeds, we expand $Q_v$ in the power series,

$$Q_v = Q_{v0} + \epsilon\,Q_{v1} + \dots \qquad (12)$$

where $\epsilon = M^2$. Upon substituting this expression into (11), we obtain to order $1/\epsilon$, $p_0$ = constant, which says that the thermodynamic pressure is externally imposed. Scaling the temporal and spatial derivatives of pressure to order unity then causes the term, $p_1$, to appear in the zeroth order equations. To reflect this, we define the vector, $Q_{v0}$,

[The definition of $Q_{v0}$ is not legible in the source.]

so that the equation system that is valid to order unity becomes (for simplicity here, we write only the one-dimensional equations),

[The order-unity equation system is not legible in the source.]

The corresponding matrices $\Gamma_{v0}$ and $A_{v0}$ are given by evaluating $\Gamma_v$ and $A_v$ with the values $Q_{v0}$.

Requiring that the temporal pressure derivative be retained in the continuity equation in the low speed limit implies that $\rho_p'$ must be of order one, or that $\rho_p'$ is given by

$$\rho_p' = k_p / V_r^2 \qquad (15)$$

where $k_p$ is a constant of order unity and $V_r$ is an appropriate reference velocity.

To ensure that the energy equation is uncoupled from the continuity and momentum equations in the incompressible limit, we make the variable $\rho_T'$ proportional to $\rho_T$,

$$\rho_T' = k_T\,\rho_T \qquad (16)$$

where $k_T$ is a quantity whose value is less than or equal to one. This replacement causes $\rho_T'$ to vanish when $\rho_T$ goes to zero. Specific values for $k_T$ are 0 or 1.

Placing Eqs. 15 and 16 in $\Gamma_v$ gives well conditioned eigenvalues in the limit of low Mach numbers,

The quantity $h_T'$ is, as yet, free.

With the special values for $\rho_p'$ and $\rho_T'$ given in (15) and (16), two eigenvalues of $\Gamma_v^{-1}A_v$ become equal to the particle speed $u$. The third eigenvalue also equals $u$ if $h_T = h_T'$ or if the physical properties $\rho_p$ and $\rho_T$ are zero as in incompressible flow. For these conditions, the full set of eigenvalues of $\Gamma_v^{-1}A_v$ is:

[Equation (19), the eigenvalue set, is not legible in the source.]

where the quantities $a$ and $b$ are given by,

[The definitions of $a$ and $b$ are not legible in the source.]

Inspection of the generalized acoustic eigenvalues in (19) shows that the physical properties, $\rho_p$ and $\rho_T$, that cause the speed of sound in incompressible flows to be infinite no longer appear in the denominator; only the artificial properties $\rho_p'$ and $\rho_T'$ do. Replacing these physical properties by properly defined artificial properties alleviates the decoupling between the pressure and momentum terms in incompressible flows, and makes time-marching practical for both incompressible and low speed compressible flow computations. For incompressible flows, this approach leads to the artificial-compressibility method of Chorin as is shown below.

Replacing $\rho_p'$ by $k_p/V_r^2$ and $\rho_T'$ by $k_T\rho_T$ as suggested by the low Mach number scaling provides eigenvalues that are well-conditioned for low speeds. For a perfect gas, the coefficients, $a$ and $b$, become:

[Equations (20a) and (20b) are not legible in the source.]

where $M_r = V_r/c$ is the reference Mach number. The behavior of this eigenvalue is difficult to determine from

the algebraic form, but it is seen that as the Mach number goes to zero, the eigenvalues approach a constant times the particle speed. Numerical checks verify that this constant is of order unity and the eigenvalues are well-conditioned. Stability results for this condition are presented later.

For incompressible flow, the coefficients $a$ and $b$ become:

$$a = \tfrac{1}{2}, \qquad b = \frac{V_r^2}{k_p u^2} + \frac{1}{4} \qquad (21)$$

which are clearly well-behaved. Choosing [symbol illegible in the source] = 2 gives eigenvalues whose ratio is no worse than 2. This choice is identical to the artificial compressibility method of Chorin [2]. We also note that for incompressible flow, it is not necessary to set $h_T = h_T'$ to obtain simple algebraic eigenvalues, and the third "particle" eigenvalue becomes $\lambda = u\,h_T/h_T'$, so the parameter $h_T'$ can be selected to control convergence in the energy equation if desired.

4 LOW REYNOLDS NUMBER EQUATIONS

Having obtained some understanding of the way the Euler equations scale, we now turn to the Navier-Stokes equations and consider their proper scaling in the limit of low Reynolds numbers. Here, we use a similar perturbation expansion, but we let the small parameter, $\epsilon$, be the Reynolds number, Re. We begin by premultiplying (11) by the matrix $P^{-1}$,

$$P^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -u & 1 & 0 & 0 & 0 \\ -v & 0 & 1 & 0 & 0 \\ -w & 0 & 0 & 1 & 0 \\ -h + \tfrac{1}{2}V^2 & -u & -v & -w & 1 \end{pmatrix}$$

This multiplication gives the matrix

$$P^{-1}\Gamma_v = \begin{pmatrix} \rho_p' & 0 & 0 & 0 & \rho_T' \\ 0 & \rho & 0 & 0 & 0 \\ 0 & 0 & \rho & 0 & 0 \\ 0 & 0 & 0 & \rho & 0 \\ -1 & 0 & 0 & 0 & \rho h_T' \end{pmatrix}$$

and results in the convective terms,

$$\left(\frac{\partial(\rho u)}{\partial x},\ \rho u\frac{\partial u}{\partial x} + \frac{\partial p}{\partial x},\ \rho u\frac{\partial v}{\partial x},\ \rho u\frac{\partial w}{\partial x},\ \rho u\frac{\partial h}{\partial x} - u\frac{\partial p}{\partial x}\right)^T$$

for the x-direction. Multiplication by $P^{-1}$ does not affect the viscous terms in the momentum equations, but the corresponding terms in the energy equation reduce to the conduction term plus the viscous dissipation. The modified energy equation becomes:

[The modified energy equation is not legible in the source.]

where $\Phi$ is the viscous dissipation, and we have omitted convective terms in $y$ and $z$. These equations can then be scaled for low Reynolds numbers to see how our three parameters $\rho_p'$, $\rho_T'$ and $h_T'$ must behave in the diffusion-dominated limit.

For low Reynolds numbers, we scale the momentum equations such that the temporal derivatives and the pressure gradient remain of the same order as the viscous terms as the Reynolds number goes to zero. This defines the proper scaling for the pressure ($p_r = \mu_r V_r/L$) and the time ($t_r = \rho_r L^2/\mu_r$), but imposes no conditions on any of our three preconditioning parameters.

Using this reference pressure and time and requiring that the temporal term in the continuity equation balance the convective terms at low Reynolds numbers results in the condition on $\rho_p'$,

$$\rho_p' = k_p'\,Re^2 / V_r^2$$

To prevent the temporal derivative of the temperature from appearing in the momentum equation, we also require,

$$\rho_T' = k_T'\,\rho_r\,Re / T_r \qquad (27)$$

In these expressions, $k_p'$ is a constant of order unity, while $k_T'$ is less than or equal to one.

Scaling the energy equation in a manner consistent with these definitions results in the requirement,

[The requirement on $h_T'$ is not legible in the source.]

where $c_{pr}$ is a reference specific heat, Pr is a reference Prandtl number, and $k_h$ is another constant of order one. The resulting low Reynolds number equations in one dimension then become:

[The one-dimensional low Reynolds number continuity, momentum and energy equations are not legible in the source.]

Note that in the energy equation, we have assumed that the quantity $V_r^2/(c_{pr}T_r)$ is small. Retaining it adds a pressure gradient term to the energy equation, but does not affect the requirements placed upon our three parameters. Equations (30) are the creeping flow equations.

5 SUMMARY OF PRECONDITIONING PROCEDURE

The correct asymptotic form of the three parameters, $\rho_p'$, $\rho_T'$ and $h_T'$, as determined from the low Mach number scaling and low Reynolds number scaling is summarized in Table I. Use of these values ensures that the equations are properly scaled in these two limits. To use these parameters for computations at other than the limiting

transitions smoothly between these limits while also approaching the non-preconditioned equations for high Reynolds number, transonic flows. This functional form will be developed by combining these limiting values into a single continuous function and then verifying the results first by means of stability theory, then by simplified computational problems, and finally by practical applications at low speed, low Reynolds number and transonic conditions.

Preconditioning the Euler equations is relatively easy, but preconditioning the Navier-Stokes equations is more difficult for several reasons. First of all, the appropriate Reynolds number must be determined. Stability results show that the cell Reynolds number, $u\,\Delta x/\nu$ (where $u$ represents the local velocity, $\Delta x$ represents the grid spacing and $\nu$ is the kinematic viscosity), is the appropriate viscous scale, and that diffusive effects become dominant at cell Reynolds numbers less than unity. The transition from inviscid- to viscous-dominated flows thus depends on both the flowfield and the grid. Viscous flows can switch from convection-dominated to diffusion-dominated because of increased grid resolution or stretching. The second reason for difficulty arises because the presence of boundary layers at high Reynolds numbers requires high aspect ratio grids with fine resolution normal to the walls. Correspondingly, there are two cell Reynolds numbers of widely differing magnitude. The one based on the normal grid spacing is generally diffusion-dominated, while the one based on the streamwise spacing is generally convection-dominated. The issue in viscous preconditioning is to deal with near-wall cells that are viscously-dominated in one direction and convectively-dominated in the other, while simultaneously treating convectively-dominated cells in regions away from the walls.

We demonstrate two ways in which the limiting forms of the artificial properties in Table I can be combined into a single function that can be used over the full Reynolds-Mach number domain. The parameter $\rho_p'$ is the primary quantity in controlling eigenvalues, and we begin by considering this quantity. The simplest procedure is to choose $\rho_p'$ as the minimum of the viscous and inviscid values,

$$\rho_p' = \left(k_p / V_r^2\right)\,\mathrm{Min}\{1,\ Re^2\} \qquad (30)$$

where we have used the same constant at both conditions. This is equivalent to using the smaller of an inviscid or viscous time step and has proven effective in many problems [11]. Clearly, this corresponds to switching from the inviscid to the viscous value when the Reynolds number goes below unity. The most appropriate Reynolds number for this switch is the cell Reynolds number.

Table I. Preconditioning Parameters Dictated by Reynolds and Mach Number Scaling

Term          Low Mach Number     Low Reynolds Number
$\rho_p'$     $k_p / V_r^2$       $k_p' Re^2 / V_r^2$
$\rho_T'$     $k_T \rho_T$        $k_T' \rho_T Re$
[The $h_T'$ row of the table is not legible in the source.]

The function in Eq. 30 can likewise be made to merge smoothly with the physical properties at transonic conditions by noting that at Mach one, $V_r^2 = c^2$, so that if we choose $k_p = k_p' = \gamma$, Eq. 30 degenerates to $\rho_p' = \rho_p$ at Mach one. (In computations for incompressible flow we have generally chosen $k_p = k_p'$ = L33 [sic].) The remaining artificial properties can be made continuous by setting $k_T = k_T' = 0$, and by setting $k_h = 1$ and $k_h' = Pr$. This latter choice does not precisely satisfy the viscous matching condition, but since the Prandtl number for most gases is near one, it is close enough to give good results. All the examples we give are based on this combination of artificial properties.

The second procedure is similar, but instead of using a function with a discontinuous slope for $\rho_p'$ we make both the function and its derivatives continuous. Here we define the three parameters as:

[The three continuous parameter definitions are not legible in the source.]

These functions reach the proper limits at low Reynolds numbers, low Mach numbers, and at high Reynolds number, transonic conditions. In particular, the function for $h_T'$ switches continuously from unity to 1/Pr as the cell Reynolds number goes through unity. When the Mach number approaches unity, $\Gamma_v$ approaches the physical Jacobian, $\partial Q/\partial Q_v$, and the preconditioned equations become identically the physical equations. Choosing $k_T = 0$, as in the first example, gives simpler preconditioned equations, but only causes the modified eigenvalues of the equations to approach the physical eigenvalues as the Mach number goes to unity. The equations remain distinct.

In summary, we scale the time derivatives at high cell Reynolds numbers to keep the convective eigenvalues well-conditioned, whereas at low cell Reynolds numbers, we scale so that the equations reduce to simple diffusive equations. We also scale the dominant convective speed so that it is the same order as the diffusive time-scale. The low Reynolds number scaling causes the convective terms to become stiff, but because they are small, this doesn't slow convergence. To assess this scaling, we use Fourier stability theory for the full Navier-Stokes equations using Reynolds number and Mach number as parameters.

6 STABILITY AND CONVERGENCE OF THE PRECONDITIONED EQUATIONS

We begin by comparing the stability characteristics of the two-dimensional Euler equations with and without preconditioning at a Mach number of 0.01 and a flow angle

of 45°. Figure 1 shows results for the non-preconditioned case, while Fig. 2 is for the case with preconditioning. Both of these stability predictions are for central differencing in space and ADI approximate factorization in time. The stability results without preconditioning indicate the amplification factor is nearly unity (0.9999) over the mid- and low-wave-number regions, thereby vividly demonstrating the stiffness that is encountered at low speeds.

[Figure 1: Euler: CD-ADI, M=0.01, No Preconditioning, CFL=5]

[Figure 2: Euler: CD-ADI, M=0.01, With Preconditioning, CFL=5]

By contrast, the amplification factors in Fig. 2 for the preconditioned case are quite reasonable, with damping rates of around 0.9 over most of the mid- and low-wave-number ranges with sharp fall-off along the axes (except at the corners), indicating that the preconditioned system will provide fast, efficient convergence at this low Mach number condition. We do note that the amplification factor goes to unity in all four corners, but these peaks are easily removed by a small amount of artificial dissipation. Companion stability results (not shown) indicate that this preconditioned stability result is independent of Mach number, and that it is nearly identical to that for the non-preconditioned equations at a Mach number of 0.7, suggesting that convergence with the preconditioned system will be similar to the efficient convergence observed with the non-preconditioned system at high subsonic Mach numbers. The non-preconditioned eigenvalues in Fig. 1, however, indicate that this case will converge very slowly, an indication that is verified by computations. This demonstrates the ease with which the stiffness in the Euler equations can be removed.

[Figure 3: Euler: LGS-4, I/III, M=0.01, No Preconditioning, CFL=20]

[Figure 4: Euler: LGS-4, I/III, M=0.01, With Preconditioning, CFL=20]

To further demonstrate the effectiveness of the preconditioning, we show stability results for similar conditions in Figs. 3 and 4, except that upwind differencing is used for the spatial discretization and line Gauss-Seidel approximate factorization is used for the solution procedure. Figure 3 shows the non-preconditioned stability results for M = 0.01. These eigenvalues again contain an unacceptable stiffness in the low-wave-number region. This stiffness is, however, removed by the preconditioning as shown in Fig. 4. Again, this preconditioning renders the stability results essentially

independent of Mach number, and gives uniform convergence at all Mach numbers.

[Figure 5: Effect of inviscid preconditioning on the number of iterations to converge to machine zero versus the Mach number. Curves: Original, Inviscid Preconditioning. Horizontal axis: Mach Number.]

The significance of these stability results is easily shown by applying them to a simple flowfield consisting of inviscid flow in a straight duct. Figure 5 shows the convergence of the ADI system from an initial condition corresponding to a small perturbation from the exact (uniform flow) solution. Although this problem appears trivial, the return to uniform flow at low Mach numbers takes thousands of iterations without preconditioning, but is independent of Mach number with preconditioning. The number of iterations required for convergence without preconditioning is inversely related to the square of Mach number, and at M = [value illegible in the source] some [count illegible in the source] iterations are required to reach convergence to machine error. When preconditioning is used, the number of iterations required for convergence is independent of Mach number and is similar to the number required for the non-preconditioned case at transonic conditions. The actual convergence rates for some of these cases are shown in Fig. 6. Similar preconditioned and non-preconditioned results are observed for the line Gauss-Seidel, upwind system. Applications to a wide range of practical problems have been demonstrated elsewhere [11-14], giving ample evidence that the Euler equation problem is well in hand.

[Figure 6: Convergence of the inviscid straight duct case at various Mach numbers using the original equations and the preconditioned equations. Horizontal axis: No. Iterations.]

To complete the stability survey for the Euler equations, we note that the stability results for the artificial compressibility version of the incompressible equations are identical to the low Mach number results in Figs. 2 and 4. These results clearly demonstrate the ability of the preconditioning method to apply equally well to compressible and incompressible solutions.

Stability results for the preconditioned Navier-Stokes equations are given on Fig. 7. These results are for the central-differenced ADI system at a cell Reynolds number of 0.1 and a Mach number of 0.01. (Note results for high cell Reynolds numbers are identical to the Euler results given in Fig. 2.) The viscous preconditioning is not quite as effective in controlling the stability eigenvalues as in the case of the Euler equations, but it still improves the stability map dramatically as compared to non-preconditioned results. Eigenvalues over most of the domain are around 0.9 with some increase toward the higher wavenumbers that arises because of the absence of diffusion in the continuity equation. The addition of artificial diffusion in continuity eliminates this difficulty and provides good viscous convergence as is shown next.

[Figure 7: Navier-Stokes Eqns.: CD-ADI, M=0.001, cell Reynolds number 0.1, Viscous Preconditioning, CFL=5, VNN=5]

Comparison with stability results based on the non-preconditioned equations or preconditioning with $\rho_p'$ set to its inviscid value rather than its viscous value shows a substantial deterioration in eigenvalues for either case. Without preconditioning, the eigenvalues become very stiff, and while inviscid preconditioning changes the stability eigenvalues, it doesn't improve them. Clearly, viscous preconditioning is needed as the cell Reynolds number decreases.

Figure 8 demonstrates the effectiveness of the viscous preconditioning for the Navier-Stokes equations for a second simple problem, that of fully developed flow in a pipe. Again, the initial condition corresponds to the exact solution plus a small perturbation. The figure shows the number of iterations required to converge to machine accuracy for cell Reynolds numbers ranging from [value illegible in the source] to 10. With viscous preconditioning, convergence is seen to be independent of cell Reynolds number over the entire spectrum. Solutions with inviscid preconditioning show a dramatic slowdown in convergence at the smaller Reynolds numbers, while computations with no preconditioning were

very irregular, and frequently did not converge. The flow Mach number for these calculations is taken as [value illegible in the source].

[Figure 8: Convergence of the viscous straight duct case at various Cell Reynolds numbers using inviscid preconditioning or viscous preconditioning.]

7 REPRESENTATIVE SOLUTIONS FOR PRACTICAL PROBLEMS

Thus far we have shown how to transform the time derivatives so that the equations are well conditioned over the entire Reynolds-number / Mach-number regime. We have then used stability results for both the Euler and the Navier-Stokes equations to verify that these preconditioned equations provide effective damping factors. In addition, we have also shown that the preconditioned equations provide uniform convergence for simple problems at all Reynolds numbers and Mach numbers. The ultimate proof of convergence enhancement must, however, rest upon demonstration of effectiveness in practical problems. Over the past several years we have applied these systems to a broad variety of applications including low speed compressible flows, combustion problems, incompressible flows, supercritical fluids and extrusion modeling. Space does not permit a complete demonstration of all these examples, but we present some representative results to demonstrate the capabilities.

Figure 9 shows results for laminar flow over a backstep at a Reynolds number of 200. The v-velocity contours are shown, along with the convergence rate with the preconditioned and non-preconditioned cases. These computations are done with the line Gauss-Seidel algorithm. Clearly, viscous preconditioning provides a major enhancement to the convergence rate.

[Figure 9: Contours of velocity and convergence for the backward-facing step at a Reynolds number of 100 using the four-sweep Line Gauss-Seidel scheme.]

As a second example, we consider the flow through a converging-diverging rocket nozzle. The turbulent boundary layers in this nozzle are very thin because of the high Reynolds number and strong wall cooling. The corresponding strong grid stretching (aspect ratios larger than $10^6$) required near the wall introduces important low Reynolds number effects in this otherwise high Reynolds number flow. With standard algorithms, the solution converges at a reasonable rate for about four orders of magnitude (which would appear to be sufficient), and switches to a very slow rate of convergence. With preconditioning, the convergence continues to machine zero at a rate that is faster than the initial convergence of the non-preconditioned solution. The heat flux to the wall is shown in Fig. 10 as a function of axial distance for both calculations at several time steps. The lower plot shows that the preconditioned solution gives reasonable results after only 200 iterations, while after 400 iterations, the heat flux is indistinguishable from the machine-accuracy results. The standard algorithm produces very different results. After 2000 iterations (which corresponds to three orders of magnitude reduction in the global residuals), the wall heat flux is only about half its final converged value, and it takes more than 20,000 time steps to come within plotting accuracy of the fully converged result. When converged to machine accuracy, both the standard and the preconditioned algorithms give identical results. This shows that low Reynolds number cells near the wall (which determine the wall heat flux) can also totally control the overall convergence of the solution.

As a final example, we present a simulation of the flow in a uni-element gaseous rocket. The flowfield is generated by two co-annular jets entering through the left end of a cylinder whose diameter is 50 mm. The gas in the inner stream is oxygen, while that in the outer stream is hydrogen. The outer diameter of the hydrogen jet is 12 mm, giving a 38 mm backstep past which the jets expand. The diameter of the oxygen jet is 8.4 mm, the hydrogen annulus is 1 mm, and the two are separated by a sleeve of thickness 0.8 mm. An overall picture of the flowfield is given in Fig. 11. The back step generates a large recirculating region near the outer wall. The two gaseous streams begin to mix upon exiting the injector, but the finite thickness of the sleeve generates a small wake in which the hydrogen and oxygen first start to mix. This mixing region is very important to the computation because it acts as the primary flame-holding mechanism for the resulting diffusion flame (although the present results are for non-reacting flow).

Initial attempts at computing this flowfield with preconditioning showed very poor convergence. Although

the non-preconditioned system did not converge well, it did converge slightly better than the preconditioned one.

[Figure 10: Temporal convergence of the heat flux along the nozzle wall for both the standard algorithm and the enhanced algorithm. Panels: Standard Algorithm, Enhanced Algorithm. Horizontal axis: Axial Distance.]

Careful investigation revealed that the reason for the convergence difficulty was that a steady solution failed to exist. The resulting flowfield was oscillatory in nature as determined by experimental observations. In addition, the computations showed the unsteadiness increased in strength as the grid was refined. The reason the preconditioned system showed poorer convergence was that it introduced a smaller amount of artificial dissipation than did the non-preconditioned system. (All computations were run with upwind flux difference splitting.) The velocity contours in the resulting unsteady solution are presented in Fig. 11 at three different instants of time.

[Figure 11: Unsteady Velocity Field near Injector Post for Hydrogen/Oxygen at O/F = 4.]

The source of the unsteadiness appears to originate in the recirculating zone in the wake of the finite thickness sleeve between the two inlet streams. A close-up view of this region is given in Fig. 12. The details show that in this recirculating region, the heavier oxygen from the lower stream makes up the primary content of the recirculating region. The hydrogen mixes with the oxygen only along the upper side of the recirculating region. When the recirculation region sheds a vortex, it induces a substantial unsteadiness in the lighter hydrogen stream, which is then propagated into the recirculation region downstream of the backstep so that the entire flowfield oscillates in response to this narrow wake region.

[Figure 12: Velocity Vector/Streamline Field near Injector Post for Hydrogen/Oxygen Calculation]

Plots of the time rate of change of the velocity at a particular point near the wake region are given on Fig. 13. Even in the unsteady solution, the preconditioned results show larger amplitudes than do the non-preconditioned solutions. This is again because of a diminished amount of artificial damping. The impacts of increased artificial dissipation are shown by the first-order upwind results, which are nearly steady. Comparisons with analytical solutions for simple shear layers indicate that the preconditioned results are more accurate.
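
The distinction drawn above between an iteration that has not yet converged and a flow that has no steady state can be probed with a simple check on a point-probe history, such as the velocity signal near the wake. The sketch below is illustrative only and is not the procedure used in the paper: it assumes a recorded probe signal and compares the peak-to-peak amplitude in the two most recent windows. The names `looks_unsteady`, `window` and `decay_tol` are hypothetical, and the threshold of 0.5 is an arbitrary assumption.

```python
import math
from typing import Sequence


def peak_to_peak(values: Sequence[float]) -> float:
    """Difference between the largest and smallest sample."""
    return max(values) - min(values)


def looks_unsteady(probe: Sequence[float], window: int, decay_tol: float = 0.5) -> bool:
    """Heuristic check (an assumption, not the paper's method).

    Returns True when the oscillation amplitude in the latest window has not
    dropped below decay_tol times the amplitude in the preceding window,
    which suggests a sustained oscillation rather than a decaying transient.
    """
    if window <= 0 or len(probe) < 2 * window:
        raise ValueError("need at least two full windows of samples")
    earlier = peak_to_peak(probe[-2 * window:-window])
    latest = peak_to_peak(probe[-window:])
    if earlier == 0.0:
        return latest > 0.0
    return latest / earlier > decay_tol


if __name__ == "__main__":
    steps = range(400)
    decaying = [math.exp(-0.05 * n) * math.sin(0.3 * n) for n in steps]
    sustained = [math.sin(0.3 * n) for n in steps]
    print("decaying transient flagged unsteady:", looks_unsteady(decaying, 100))
    print("sustained oscillation flagged unsteady:", looks_unsteady(sustained, 100))
```

The window should span several oscillation periods for the amplitude comparison to be meaningful; a check of this kind only flags candidates for the more careful investigation described in the text.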

One issue with regard to preconditioned systems is their impact on code robustness. We have encountered many experiences in which preconditioning improves robustness (i.e., cases where the non-preconditioned code fails to converge while preconditioning makes convergence very reliable). It must, however, be recognized that preconditioning increases the local time step dramatically and this large time step may require some restriction at early stages of the computation (although the restricted time step may still be larger than the corresponding non-preconditioned time step). The present example, however, demonstrates that there are some cases for which the preconditioning may not improve convergence because a steady solution does not exist. In these cases, the preconditioned system proves its worth in an unsteady, iterative solution procedure.

[Figure 13: Time History of Axial Velocity at one point in the Injector Flowfield for various differencing schemes]

8 CONCLUSIONS

The proper limiting forms of the equations of motion at low speeds and in diffusion-dominated regions have been obtained by perturbation expansions and used as the basis for defining a preconditioning matrix for convergence enhancement. The expansion results show that $\rho_p$ is the most important variable in controlling convergence while $\rho_T$ is of secondary importance. Convergence control can be obtained by replacing these physical derivatives by artificial ones in the time derivatives, while retaining the physical quantities in the flux terms so the solutions are unchanged. Appropriate replacement terms for these quantities obtained from the expansion procedures are then generalized so that they approach the physical quantities in the transonic and supersonic regimes.

Following the development of a generalized preconditioner that ensures that the condition number of the Jacobian matrices of the equations of motion remains of order one at all Mach numbers, the resulting convergence characteristics are first checked by means of stability theory. The effectiveness of the methods is then verified by computations of a variety of problems, starting first with simple applications and then going to practical examples. Efficient, uniform convergence is demonstrated for a variety of applications covering a range of Reynolds and Mach number conditions. Overall, it is demonstrated that convergence enhancement of the Euler equations at low speeds is quite easy and can be readily achieved. Extension to the Navier-Stokes equations requires more care, but the present procedure provides much improved convergence in some of the traditional problem areas for time marching methods, while having no adverse effects in regimes where the methods already work efficiently.

ACKNOWLEDGMENT

This work was supported under NASA grants NAGW-1399, NAS 8-3886, and NCC 8-46.

REFERENCES

[1] Patankar, S.V. (1980) Numerical Heat Transfer and Fluid Flow. Series in Computational Methods in Mechanics and Thermal Sciences. McGraw-Hill Book Company.
[2] Chorin, A.J. (1967) A Numerical Method for Solving Incompressible Viscous Flow Problems. Journal of Computational Physics, 2: 12-26.
[3] Rogers, S.E., Kwak, D. and Kiris, C. (1989) Numerical Solution of the Incompressible Navier-Stokes Equations for Steady-State and Time-Dependent Problems. AIAA Paper 89-0463.
[4] Rehm, R.G. and Baum, H.R. (1978) J. Res. of the National Bureau of Standards, 83, 297.
[5] Guerra, J. and Gustafsson, B. (1986) A Numerical Method for Incompressible and Compressible Flow Problems with Smooth Solutions. Journal of Computational Physics, 63, pp. 377-396.
[6] Merkle, C.L. and Choi, Y.H. (1985) Computation of Low-Speed Compressible Flows with Time-Marching Procedures. International Journal for Numerical Methods in Engineering, 25: 293-311.
[7] Withington, J.P., Shuen, J.S. and Yang, V. (1991) A Time Accurate, Implicit Method for Chemically Reacting Flows at All Mach Numbers. AIAA Paper 91-0581.
[8] Fiterman, A., Turkel, E. and Vatsa, V. (1995) Pressure Updating Methods for the Steady-State Fluid Equations. AIAA Paper 95-1652.
[9] Viviand, H. (1985) Pseudo-Unsteady Systems for Steady Inviscid Calculations. Numerical Methods for the Euler Equations of Fluid Dynamics, SIAM, pp. 334-368.
[10] Turkel, E. (1987) Preconditioned Methods for Solving the Incompressible and Low Speed Compressible Equations. Journal of Computational Physics, 72: 277-298.
[11] Venkateswaran, S., Weiss, J.M., Merkle, C.L. and Choi, Y.H. (1992) Propulsion-Related Flowfields Using the Preconditioned Navier-Stokes Equations. AIAA Paper 92-3437.
[12] Choi, Y.H. and Merkle, C.L. (1993) The Application of Preconditioning to Viscous Flows. Journal of Computational Physics, 105: 207-223.
[13] Shuen, J.S., Chen, K.H. and Choi, Y.H. (1992) A Time-Accurate Algorithm for Chemical Non-Equilibrium Viscous Flows at All Speeds. AIAA Paper 92-3639.
[14] Weiss, J.M. and Smith, W.A. (1994) Preconditioning Applied to Variable and Constant Density Flows on Unstructured Meshes. AIAA Paper 94-2209.
[15] van Leer, B., Lee, W.T. and Roe, P.L. (1991) Characteristic Time-Stepping or Local Preconditioning of the Euler Equations. AIAA Paper 92-1552-CP.
[16] Godfrey, A.G., Walters, R.W. and van Leer, B. (1993) Preconditioning for the Navier-Stokes Equations. AIAA Paper 93-0535.
Practical Aspects of Krylov Subspace Iterative Methods in CFD
Thomas H. Pulliam
Advanced Computational Methods Branch

Stuart Rogers
Design Cycle Technologies Branch

Timothy Barth
Advanced Computational Methods Branch

NASA Ames Research Center, Moffett Field, CA 94035-1000, USA.

September 13, 1995

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

Abstract

Implementation issues associated with the application of Krylov subspace iterative methods, such as Newton-GMRES, are presented within the framework of practical CFD applications. This paper will categorize, evaluate and contrast the major ingredients (function evaluations, matrix-vector products and preconditioners) of Newton-GMRES Krylov subspace methods in terms of their effect on the local linear and global nonlinear convergence, memory requirements, and accuracy. The discussion will focus on Newton-GMRES in both a structured multi-zone incompressible Navier-Stokes solver and an unstructured mesh finite-volume Navier-Stokes solver. Approximate vs. exact matrix-vector products, effective preconditioners and other pertinent issues will be addressed.

1 Introduction

Interest in iterative methods in CFD has been motivated not only by the requirement for better convergence and speed of numerical codes, but also by the availability of faster, larger memory serial and parallel machines. The coupling of Newton's method with iterative solvers is an effective approach for solving the large systems of nonlinear equations which arise from discretized forms of the Euler and Navier-Stokes equations. One of the main motivations for the use of Newton's method is the possibility of superlinear (and in some cases quadratic) asymptotic convergence. References [7, 21] are examples of successful implementations of exact Newton's method for two-dimensional Navier-Stokes codes. Most of the conventional implicit schemes used today are effectively approximate-Newton methods. The approximations appear in the form of simplifications in the functional Jacobian or some form of under/over relaxation strategy, see e.g. [11] or [14]. In practice these simplifications are employed for reasons such as efficiency, implementation ease, or non-analyticity of operators (e.g., discrete limiters in differencing schemes based on Riemann solvers). Over a wide range of numerical methods developed for the solution of the multidimensional Navier-Stokes equations, the rigorous application of Newton's method would require the inversion of a large block banded matrix, which even by today's standards, poses many obstacles in terms of memory requirements and speed. An alternative to direct matrix inversion is the use of iterative matrix solution methods. In particular, the class of Krylov subspace methods known as GMRES [19] will be considered. Wigton [23] was the first to successfully implement GMRES for a two-dimensional Navier-Stokes code.

The difficulties associated with iterative methods such as GMRES lie in the rapid expansion of memory requirements inherent in the embedded Arnoldi process (storing the Krylov subspace vectors), the need to perform the matrix vector "Ap" products (which sometimes requires the storage of the matrix A), and the preconditioning of the system of equations by some approximate inverse of A to improve the convergence of GMRES.

The purpose of this paper is to focus on the implementation details and specific results from the application of Newton-GMRES to both a structured multi-zone incompressible Navier-Stokes code (INS2D, Rogers [17]) and an unstructured mesh Navier-Stokes code due to Barth [5]. We use these two codes as case studies, but the lessons learned, approximations assumed, results obtained, and general conclusions are applicable to most implementations of Newton-GMRES to systems of PDE's.

2 General Formulation

In a general form, we cast the discrete approximation to the steady multi-dimensional Navier-Stokes equations as

$$R(Q) = R(\ldots, Q_{j,k-1}, Q_{j-1,k}, Q_{j,k}, Q_{j+1,k}, Q_{j,k+1}, \ldots) = 0, \qquad (1)$$

where we are representing the support of the operator as involving neighboring points in a computational mesh and Q is the solution variable (typically the conserved variables). Although this representation appears in a structured mesh form and is rather compact (involving only three points in each computational direction), we intend it to also represent an unstructured mesh template and possibly higher order - higher dimensional - more broadband support. We shall refer to this as the Function Evaluation step in the overall process. For fixed point solutions (steady-state) we require the solution of $R(Q) = 0$. A time accurate approach to the solution assumes the form of

$$\frac{\partial Q}{\partial t} + R(Q) = 0, \qquad (2)$$

which can either represent the artificial-compressibility scheme for the incompressible Navier-Stokes equations [17] or the full Navier-Stokes unstructured mesh scheme [5]. Applying implicit Euler time differencing with the usual Taylor series linearization in time we have

[Eq. (3) did not survive extraction from the source.]

with $\Delta Q = Q^{n+1} - Q^n$, $\partial R/\partial Q$ the Jacobian (A) of the Function Evaluation, $R(Q)$, and D a positive diagonal matrix. For $\Delta t \to \infty$ this is exactly Newton's method and for finite $\Delta t$ a relaxed form of Newton's method. In many applications of Newton-like methods to the Euler and Navier-Stokes equations this time-like relaxation is used to start the solution process. A finite time step $\Delta t$ is used initially to get past the rather violent nonlinear startup and then increased to $\Delta t \to \infty$ leading to rapid linear, super-linear or quadratic convergence depending on the characteristics of the Newton solver, e.g. [21] or [5]. It is convenient to recast Eq. 3 in the general form

$$b - Ax = 0, \qquad (4)$$

where $b = R(Q)$, $x = \Delta Q$ and A is a matrix operator.

The numerical process involved in solving Eq. 3 will be referred to here as the Inner Iteration at a particular step n. The overall iteration of the nonlinear system will be referred to as the Outer Iteration. There are a number of successful approaches to the Inner Iteration. In the case of structured mesh applications, A represents a sparse block banded matrix which can be solved with various methods such as point or line relaxation [17] or approximate factorization, e.g. [14]. In the unstructured mesh case, A may not have a simple underlying structure, but the Inner Iteration can be successfully solved with a wide variety of relaxation techniques [2], [5]. For the present discussion we shall focus on the GMRES Krylov projection technique for solving the Inner Iteration.

The GMRES (Generalized Minimal RESidual) method was introduced by Saad and Schultz [19] for solving large sparse systems of linear equations. The GMRES algorithm is a Krylov subspace method where given a matrix $A \in \mathbb{R}^{N \times N}$, a vector $u \in \mathbb{R}^N$ and an integer $m \ge 1$, the Krylov subspace associated with A, u and m is defined as

$$K_m(A, u) = \mathrm{span}\{u, Au, A^2u, \ldots, A^{m-1}u\}. \qquad (5)$$

In the GMRES algorithm an initial guess $x_0$ to the solution of the linear system is given from which the initial residual is defined

$$r_0 = b - A x_0. \qquad (6)$$

The GMRES method then attempts to find $x_m \in K_m(A, r_0)$ such that the residual vector $b - A(x_0 + x_m)$ is small. This is done so that at each iteration the residual norm is minimized. One important parameter for the GMRES method is the size of the subspace m. As m increases, the memory increases linearly and the computation quadratically. The parameter m is usually chosen based on storage requirements and effectiveness of the Inner Iteration. In the discussion below, we will have more specific things to say about this requirement and its effect on the overall process. To avoid the increasing memory and computation requirements with increasing m, a common modification of GMRES is to apply restarts. An upper limit on m is chosen and if convergence is
not reached, the Krylov subspace process is restarted with the current residual $r_m$ replacing $r_0$. In this case, the memory requirements are traded off against the convergence of the Inner Iteration process, and this will definitely affect the Outer Iteration.

Numerical experience has shown that success or failure of GMRES hinges critically on adequate preconditioning of the linear systems to be solved. A preconditioning matrix M is usually applied in either a left-preconditioning $M(b - Ax) = 0$ or right-preconditioning $b - AMy = 0$ (with $y = M^{-1}x$) fashion. Ideally M should be chosen to be an approximation to $A^{-1}$. Although the most successful and popular form of M appears to be ILU [13] (Incomplete Lower-Upper Factorization), we will also consider alternate preconditioners in the next section.

The important ingredients of the Newton-GMRES method which we will focus on in this paper are the Ap products required to form the Krylov subspace vectors $K_m(A, u)$, the choice of the preconditioner M, the size of the subspace m and restart size $m_r$, and the storage requirement influenced by all these factors. We will attempt to put the various trade offs in terms of memory requirements, convergence and efficiency in perspective (in particular for the two approaches discussed here, but also in general).

3 Structured Mesh Incompressible Navier-Stokes

Rogers [17] has implemented the Newton-GMRES algorithm into a two-dimensional incompressible Navier-Stokes code (INS2D) and has made some significant comparisons with the conventional techniques of implicit point and implicit Gauss-Seidel line relaxation. The INS2D flow code [18] solves the Reynolds-averaged incompressible Navier-Stokes equations using the method of artificial compressibility [9]. It is capable of handling multiple-zone structured grids using either a patched multi-block (point-wise continuous) interface, or an overlaid (chimera) interface between zones. The boundary conditions at the physical boundaries and at zonal boundaries are applied in an implicit fashion during the solution process. A third-order, upwind-differencing scheme based on the method of Roe [16] is used to discretize the convective terms, and the viscous terms are differenced using second-order central differences. The system of equations is integrated in pseudo-time using an implicit Euler time discretization. Typically, the timestep is set to infinity ($10^8$) which results in a Newton's method approach where the implicit point or line relaxation schemes are used for the Inner Iteration or, more specific to this paper, GMRES is used for the Inner Iteration.

In the current implementation, the Jacobian A is formed based on a first-order differencing of the convective terms, whereas third-order differencing is used for $R(Q)$. In addition, approximate Jacobians of the Roe flux differences from the upwind-differencing scheme are used in the definition of A, see [1] for more details. The first-order difference operator is used to reduce the bandwidth of the resulting A matrix, which has lower memory and computational requirements for the solution of Eq. 3. However, this use of approximate Jacobians can also slow the convergence to a steady state, that is, the Outer Iteration nonlinear Newton process is affected.

The GMRES implementation is preconditioned using block ILU(0) [13] and the matrix A is stored so that Ap products can be efficiently formed and the ILU process streamlined. For comparison, block point relaxation and block line relaxation are used as both the Inner Iteration solver and as preconditioners for the GMRES Inner Iteration process. Including a subspace size typically on the order of m = 10 leads to additional storage requirements as discussed below which are somewhat of a burden in two dimensions and would be a significant hindrance in three dimensions. The use of the approximate Jacobian (due to the first order form and the linearization errors associated with the Roe solver) produces an approximate Newton's method and therefore linear convergence is realized as opposed to the potential for quadratic convergence.

Rogers [17] examines a wide range of cases and options in his paper on the Newton-GMRES implementation. Table 1 shows the characteristics of the cases presented and itemizes the costs of the various schemes for each case broken down by the fundamental steps in the algorithm. Base memory (B MW) includes all overhead storage for the algorithm including memory for either L (line relaxation), P (point relaxation) or the Ap product in G (GMRES) ($\approx 76$ words/point). The additional memory (A MW) is composed of subspace size ($\approx 3 \times (m + 4)$ words/point) and preconditioner ($\approx 9$ words/point) contributions for GMRES. The timings are in ms/pt: milliseconds/point to convergence (maximum divergence below a threshold whose exponent is illegible in the source). The standard approaches of point relaxation and line relaxation are compared directly with the Newton-GMRES scheme and are also assessed as preconditioners for Newton-GMRES. The first few cases are for a NACA 4412 airfoil at an angle of attack $\alpha = 13.87^\circ$ and a Reynolds number $Re = 1.5 \times 10^6$ and are computed on a set of refined grids. The multi-element case is a three element airfoil at $\alpha = 8^\circ$ and $Re = 9 \times 10^6$; a schematic of the grid system is shown in Figure 1.

[Figure 1: Grid around a three-element airfoil.]

Figures 2, 3, 4, 5 show comparisons for the above cases, where the symbols represent every 50 Outer Iterations. It seems obvious from these results that the GMRES-ILU combination is the more efficient in terms of computation time. On the other hand, the negative aspect of the GMRES-ILU combination is the memory requirements. By examining the trade offs between CPU time to convergence and memory requirements optimal choices can be made.

For example, Figure 2 shows the effect on CPU time to convergence for various choices of m. A subspace size m = 10 seems to be optimal in terms of computational costs, including reasonable added memory requirements. Also, note that for the converging cases of m = 10, 20, 40, it required 50 Outer Iterations to reach the same level of convergence (CPU times are larger reflecting the added computational costs of a larger subspace size). This is not surprising since the inexact Jacobian used in this scheme limits the Inner Iteration process to linear convergence. Therefore, after some point, it does not pay to converge the Inner Iteration past some tolerance level without incurring additional cost in terms of CPU time and memory.

B MW    A MW    ms/pt
0.28    0.006    1.98
0.28    0.001    2.31
0.28    0.161    3.17
0.28    0.156    2.12
0.28    0.188    1.14
1.10    0.011    3.68
1.10    0.002    5.13
1.10    0.618    3.77
1.10    0.609    4.48
1.10    0.737    1.45
4.35    0.023    8.79
4.35    0.004   12.56
4.35    2.920    3.91
5.17    0.015   49.7
5.17    0.003   14.7
5.17    3.468    5.37

Table 1: Cost comparisons of iterative methods for INS2D for various cases and schemes. L(n): Line Relaxation for n iterations, P(n): Point Relaxation for n iterations, G(m)+X: GMRES with subspace size m using scheme X for preconditioner. (The case and method labels of the rows are not recoverable from the source.)

Figure 6 shows the effect of decreasing the Inner Iteration tolerance level $\epsilon$, thereby solving the GMRES step more accurately. In this case, reaching a given Outer Iteration residual level in the least number of iterations requires decreasing $\epsilon$; the tightest tolerance gets there in 50 Outer Iterations. On the other hand, the CPU time cost (shown in milliseconds/point) and average subspace size m (which leads to additional memory requirements proportional to m) indicate that a loose tolerance and small m produce the most efficient combination. This leads to m = 10 as the optimal choice both in terms of CPU efficiency and memory requirements.

Memory estimates for the three-dimensional code INS3D include a base memory of 146 words/point, additional GMRES memory of $4 \times (m+4)$ words/point and preconditioner memory of 16 words/point. Thus for GMRES(10)+ILU(0) the total memory is 218 words/point. Examples include a simple wing: 0.2 million points (Mpoints): 43.6 MW, a wing+slat+flap: 1.6 Mpoints: 349 MW, and a C17 Aircraft: 25 Mpoints: 5450 MW = 5.45 GW. These requirements are excessive in three dimensions and need to be reduced if these codes are to be used in practice.
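The storage figures quoted for INS3D and INS2D follow from simple per-point arithmetic. The short Python sketch below reproduces them as a check; the function and parameter names are illustrative only and are not taken from either code.

```python
def words_per_point(m, base=146, words_per_vector=4, extra_vectors=4, precond=16):
    """Storage per grid point, in words, for GMRES(m) plus an ILU(0) preconditioner.

    The defaults are the INS3D estimates quoted in the text:
    base + 4*(m + 4) + 16 words/point.
    """
    return base + words_per_vector * (m + extra_vectors) + precond


def total_megawords(million_points, m=10, **kwargs):
    """Total storage in megawords (MW) for a grid of `million_points` million points."""
    return million_points * words_per_point(m, **kwargs)


if __name__ == "__main__":
    cases = [("simple wing", 0.2), ("wing+slat+flap", 1.6), ("C17 aircraft", 25.0)]
    for name, mpoints in cases:
        print(f"{name}: {total_megawords(mpoints):.1f} MW")
    # Two-dimensional INS2D estimate from the text: 76 + 3*(m + 4) + 9 words/point.
    print(words_per_point(10, base=76, words_per_vector=3, precond=9), "words/point (2D)")
```

Because the subspace term is linear in m, halving m from 10 to 5 would save only 20 of the 218 words/point in three dimensions; the base storage dominates.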


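The trade-off between subspace size m and restarts described in Sections 2 and 3 can be made concrete with a small restarted GMRES(m). The following is a minimal pure-Python sketch under simplifying assumptions (dense matrix stored as nested lists, no preconditioning, zero initial guess); it is not the INS2D implementation, and all names are hypothetical.

```python
import math


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gmres_restarted(A, b, m=10, tol=1e-10, max_restarts=50):
    """Solve A x = b with GMRES restarted every m Krylov vectors."""
    x = [0.0] * len(b)
    for _ in range(max_restarts):
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        beta = math.sqrt(dot(r, r))
        if beta < tol:
            break
        V = [[ri / beta for ri in r]]  # Krylov basis: storage grows linearly with m
        H = []                         # rotated Hessenberg columns (upper triangular)
        cs, sn = [], []                # Givens rotation coefficients
        g = [beta]                     # rotated right-hand side
        k_used = 0
        for k in range(m):
            w = matvec(A, V[k])
            h = []
            for j in range(k + 1):     # modified Gram-Schmidt: work grows like m^2
                hj = dot(w, V[j])
                h.append(hj)
                w = [wi - hj * vi for wi, vi in zip(w, V[j])]
            hk1 = math.sqrt(dot(w, w))
            for j in range(k):         # apply the previous rotations to the new column
                t = cs[j] * h[j] + sn[j] * h[j + 1]
                h[j + 1] = -sn[j] * h[j] + cs[j] * h[j + 1]
                h[j] = t
            denom = math.hypot(h[k], hk1)
            c, s = h[k] / denom, hk1 / denom
            cs.append(c)
            sn.append(s)
            h[k] = denom               # the new rotation annihilates hk1
            g.append(-s * g[k])        # |g[k + 1]| is the current residual norm
            g[k] = c * g[k]
            H.append(h)
            k_used = k + 1
            if abs(g[k + 1]) < tol or hk1 == 0.0:
                break
            V.append([wi / hk1 for wi in w])
        y = [0.0] * k_used             # back substitution on the triangular system
        for i in reversed(range(k_used)):
            acc = g[i] - sum(H[j][i] * y[j] for j in range(i + 1, k_used))
            y[i] = acc / H[i][i]
        x = [xi + sum(y[j] * V[j][i] for j in range(k_used)) for i, xi in enumerate(x)]
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
    b = [6.0, 10.0, 8.0]
    print(gmres_restarted(A, b, m=2))  # restarted: several cycles of two vectors each
    print(gmres_restarted(A, b, m=3))  # full subspace: a single cycle suffices
```

With m = 2 only three basis vectors are ever held at once, at the price of several restart cycles; with m = 3 the subspace spans the whole space and one cycle is enough. This is the memory-versus-convergence exchange the text describes.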

[Figure 2: Results for Grid 1 showing effect of subspace size GMRES(m) on Newton-GMRES. Horizontal axis: CPU time.]

[Figure 3: Results for Grid 1 Comparing Various Schemes. Horizontal axis: CPU seconds.]

[Figure 4: Results for Grid 2 Comparing Various Schemes. Horizontal axis: CPU seconds.]

[Figure 5: Results for Multi-Element Airfoil Comparing Various Schemes. Horizontal axis: CPU seconds.]

[Figure 6: Comparison for various Inner Iteration tolerance levels for Grid 2 case. Tolerance levels are given along with milliseconds per point to convergence and average subspace size used, m. Horizontal axis: N.]

4 Unstructured Mesh Compressible Navier-Stokes

Barth [5] has implemented the Newton-GMRES algorithm into a two and three-dimensional unstructured mesh Navier-Stokes approach. In this case the flow equations are solved using an edge-based unstructured mesh quadrature scheme characterized as an approximate and/or exact Roe Riemann solver based on piecewise polynomial reconstruction; this defines the Function Evaluation. Details of the flow algorithm can be found in Barth [6, 2, 3]. The details of the Newton-GMRES implementation include exact Ap products for the second order discretization with a first or second order approximate A used only to construct the ILU preconditioner. Barth presents three methods to compute Ap products: one in which the exact Jacobian is stored (requiring a significant increase in memory requirements), a numerical evaluation using Fréchet derivatives [8] which is a matrix-free approach, and another matrix-free approach using an exact product form where proper linearization of the Riemann solvers and the reconstruction/quadrature mechanism of the residual vector assembly are used to produce the Ap product. Since exact Ap products are used, quadratic convergence can be realized. The preconditioned Inner Iteration is also fairly efficient, employing a subspace size on the order of $m_r = 12$ and a modest number of restarts. The resulting scheme can be mapped very successfully onto a parallel processor environment.

Figures 7, 8, 9, 10 show an example computation for viscous flow with turbulence about the multiple-element airfoil geometry. This geometry has been triangulated using the Steiner triangulation algorithm described in [4], see Figure 7. The mesh contains approximately 22,000 vertices with cells near the airfoil surface attaining aspect ratios greater than 1000:1. This example provides a demanding test case for CFD algorithms. The experimental flow conditions are $M_\infty = 0.20$, $\alpha = 16^\circ$, and a Reynolds number of $9 \times 10^6$. Experimental results are given in [20] and computed results are shown in Figure 8. Even though the wake passing over the main element is not well resolved, the surface pressure coefficient shown in Figure 9 agrees quite well with experiment.

[Figure 7: Multi-element airfoil triangulation, 22,000 vertices.]

[Figure 8: Multi-element airfoil solution isomach contours, $M_\infty = 0.2$, $\alpha = 16.0^\circ$, Re = 9.0 million.]

The convergence history shown in Figure 10 is typical for aerodynamic high lift computations.

Some of the more practical aspects from Barth's [5] implementation of Newton-GMRES are discussed below.

4.1 Storage Requirements

In practice we will be solving systems of $l$ coupled equations so that each nonzero entry of the matrix is actually a small $l \times l$ block. The schemes employed require data from distance-one neighbors in
the graph (mesh). In addition, the higher order accurate schemes require distance-two neighbors in building the scheme, see Barth [5, 3, 6]. First consider the situation in which the scheme requires only distance-one neighbors. The number of nonzero entries in each row of the matrix is related to the number of edges incident to the vertex associated with that row. Equivalently, each edge $e(v_i, v_j)$ will guarantee nonzero entries in the i-th column and j-th row and similarly the j-th column and i-th row. In addition, nonzero entries will be placed on the diagonal of the matrix. From this counting argument we see that the number of nonzero block entries, nnz, in the matrix is exactly twice the number of edges plus the number of vertices, $2E + N$ (approximately $7N$ in 2D). Table 2 (based on a similar counting argument) shows approximate requirements for storing distance-one and distance-two neighboring information as a sparse matrix.

[Figure 9: Comparison of computational and experimental surface pressure coefficients. Horizontal axis: x/c.]

[Figure 10: Solution convergence history. Horizontal axis: Newton Iteration.]

Note that the entries of the sparse matrix associated with Newton's method (for solution of the Navier-Stokes equations and an associated 1 equation turbulence model) are actually small 5 x 5 and 6 x 6 blocks in two and three dimensions respectively. At first glance, this storage requirement appears prohibitively large. While this may be true to some extent today, the memory capacity of computers is expanding at a rapid rate. It is quite reasonable to expect that in the foreseeable future sufficient memory will be available for solving most problems of engineering interest. Even so, it is possible to reduce, and in some cases eliminate, the explicit storage of the Jacobian matrix without compromising the favorable convergence characteristics of Newton's method.

Dim.    nnz (Distance-1)    nnz (Distance-2)
2       7N                  19N
3       14N                 55N

Table 2: Storage Estimates for Sparse Matrices.

4.2 Calculating Analytic Jacobian Derivatives

In this section we address the task of computing Jacobian derivatives for Newton's method. In the following section we consider the related task of multiplying an arbitrary vector by the Jacobian matrix.

A major task in the overall calculation of the Jacobian derivatives for the finite-volume discretization is the linearization of the numerical flux vector with respect to the two solution states, e.g. given the Roe flux function [16]

$$\begin{aligned} h(u^R, u^L; n) &= \tfrac{1}{2}\left(f(u^R, n) + f(u^L, n)\right) \qquad (7) \\ &\quad - \tfrac{1}{2}\,|A(u^R, u^L; n)|\,(u^R - u^L) \qquad (8) \end{aligned}$$

we require the Jacobian terms $\partial h/\partial u^R$ and $\partial h/\partial u^L$. Here, $f$ is the flux function, $n$ a geometric normal, $A = \partial f/\partial u$,
the physical flux Jacobian, evaluated at some combination of the right and left states of the flow variables, $u^R$, $u^L$. Exact analytical expressions for these terms are available [1]. In constructing the Jacobian matrix for the entire scheme it is useful to conceptualize the finite-volume scheme in composition form:

$$R(Q) = \mathcal{L}_1(\mathcal{L}_2(Q)), \qquad (9)$$

with $\mathcal{L}_1$ representing the flux quadrature and accumulation step and $\mathcal{L}_2$ representing the data reconstruction step. In this form, each operator requires distance-1 information. The Jacobian matrix can then be written as

$$\frac{dR}{dQ} = \frac{d\mathcal{L}_1}{d\mathcal{L}_2}\,\frac{d\mathcal{L}_2}{dQ} \qquad (10)$$

with the critical observation that the Jacobian matrix can be calculated as the sparse product of two matrices. This could potentially be an expensive task, but because of the special form of $\mathcal{L}_1$ and $\mathcal{L}_2$, the resulting sparse product produces at most distance-2 fill and can be computed at reasonable cost.

4.3 Exact and Approximate Jacobian Matrix-Vector Products

Consider the standard matrix equation $b - Ax = 0$. Iterative matrix solution algorithms for this problem require the computation of matrix-vector products of the form $Ap$ for some arbitrary vector $p$. In the approximate Newton algorithm

$$A = \left[\frac{D}{\Delta t} - \frac{\partial R}{\partial Q}\right]$$

where D is a positive diagonal matrix. In practice the diagonal entries are locally scaled as an exponential function of the norm of the residual

[The scaling expression did not survive extraction from the source.]

so that when $\|R(Q)\| \to 0$ the CFL number grows without bound and the scheme approaches Newton's method. It should be emphasized that by using this strategy, the scheme is technically an approximate Newton method which becomes exact only in the final few iterations of the computation.

A major step in the matrix-vector product $Ap$ is the computation of Jacobian derivatives in the direction of $p$ (a Fréchet derivative)

$$Ap = \frac{D}{\Delta t}\,p - \frac{\partial R}{\partial Q}\,p.$$

Several possible strategies exist for computing the needed Fréchet derivatives:

4.3.1 Sparse Matrix-Vector Multiply

The most straightforward strategy is to analytically compute and store the Jacobian matrix using a compressed storage scheme designed for sparse matrices. This strategy has the added benefit that a copy of the matrix can also be used as a preconditioner for the iterative solver. In addition, the explicit storage also permits the formation of the transposed matrix problem which is often encountered in optimization procedures coupled with Newton's method. Obviously, a drawback of this approach is the large storage requirement.

4.3.2 Approximate Fréchet Derivatives

An alternative to analytically calculating Fréchet derivatives is to approximate them using finite differences [12], [8], [10]. The required Fréchet derivative is a limiting form of the difference approximation

$$\frac{\partial R}{\partial Q}\,p \approx \frac{R(Q + \epsilon p) - R(Q)}{\epsilon}.$$

The primary concern with this approach is the accuracy of derivatives and the optimal choice for $\epsilon$. If derivatives are not computed accurately then methods such as GMRES iteration may stall or fail. Using a forward difference approximation, $\epsilon$ must be carefully chosen. In general it is insufficient to choose $\epsilon$ as a constant such as the square root of machine precision. Johan [12] also mentions this fact and gives some analysis for choosing $\epsilon$, but this analysis assumes that $R(Q)$ is well scaled. A common choice for $\epsilon$ is given by

$$\epsilon = \epsilon_0 + \epsilon_1\,\frac{\|Q\|}{\|p\|}$$

with suitably chosen constants $\epsilon_0$ and $\epsilon_1$. An alternative to forward differencing is to use a higher order accurate formula such as central differencing at double the computational cost.

The clear attraction of this approach is the low memory requirement. On the other hand, the numerical computation of Fréchet derivatives does not produce a matrix approximation which can be used to precondition the system.

4.3.3 Exact Product Forms

In this section we will present a technique for constructing matrix-vector products which is an exact calculation of the Fréchet derivative. Extension to systems and the inclusion of diffusion terms are also handled using this technique.
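The counting argument of Section 4.1 (nnz = 2E + N for distance-one coupling) and the distance-2 fill produced by the sparse product of Eq. 10 can both be checked on a tiny mesh. The sketch below works only with sparsity patterns, stored as sets of (row, column) pairs; the six-vertex strip triangulation and all function names are made up for illustration.

```python
def distance_one_pattern(n_vertices, edges):
    """Sparsity pattern of a distance-one operator: the diagonal plus both
    (i, j) and (j, i) for every mesh edge."""
    pattern = {(i, i) for i in range(n_vertices)}
    for i, j in edges:
        pattern.add((i, j))
        pattern.add((j, i))
    return pattern


def product_pattern(p1, p2):
    """Structural pattern of the matrix product P1 * P2 (numerical values ignored)."""
    cols_by_row = {}
    for k, j in p2:
        cols_by_row.setdefault(k, set()).add(j)
    return {(i, j) for i, k in p1 for j in cols_by_row.get(k, ())}


# Two squares, each split into two triangles: vertices 0-1-2 on the bottom row
# and 3-4-5 on the top row.
EDGES = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5), (0, 4), (1, 5)]
N_VERTICES = 6

if __name__ == "__main__":
    d1 = distance_one_pattern(N_VERTICES, EDGES)
    d2 = product_pattern(d1, d1)
    print("distance-1 nonzeros:", len(d1), "expected 2E + N =", 2 * len(EDGES) + N_VERTICES)
    print("distance-2 nonzeros:", len(d2))
    print("vertices 2 and 3 coupled:", (2, 3) in d2)
```

Vertices 2 and 3 are three edges apart in this mesh, so the product of two distance-one patterns leaves that entry empty: the fill stops at distance two, as the text states.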
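The matrix-free forward-difference approximation of Section 4.3.2 can be sketched in a few lines. The example below is an illustration under stated assumptions: the two-equation residual is made up, and the constants `eps0` and `eps1` are placeholder values, since the text says only that they are "suitably chosen".

```python
import math


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def residual(q):
    """Hypothetical two-equation nonlinear residual, used only for illustration."""
    return [q[0] ** 2 + q[1] - 3.0, q[0] + q[1] ** 3 - 9.0]


def exact_jacobian_times(q, p):
    """Analytic (dR/dQ) p for residual(), kept for comparison."""
    return [2.0 * q[0] * p[0] + p[1], p[0] + 3.0 * q[1] ** 2 * p[1]]


def frechet_fd(R, q, p, eps0=1e-12, eps1=1e-7):
    """Forward-difference approximation of the Frechet derivative (dR/dQ) p.

    The step follows the form eps = eps0 + eps1 * ||Q|| / ||p||; the default
    constants are assumptions, not values from the paper.
    """
    p_norm = norm(p)
    if p_norm == 0.0:
        return [0.0] * len(q)
    eps = eps0 + eps1 * norm(q) / p_norm
    r0 = R(q)
    r1 = R([qi + eps * pi for qi, pi in zip(q, p)])
    return [(a - b) / eps for a, b in zip(r1, r0)]


if __name__ == "__main__":
    q, p = [1.0, 2.0], [0.5, -1.0]
    print("finite difference:", frechet_fd(residual, q, p))
    print("analytic         :", exact_jacobian_times(q, p))
```

Each product costs one extra residual evaluation and no Jacobian storage. Scaling the step by the ratio of norms keeps the perturbation of Q at a fixed relative size whatever the length of p, which is why a fixed constant step is usually not enough.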

Let $G(E,V)$ denote the triangulation in 2D or 3D with $n$ vertices and $m$ edges. Next we define the incidence matrix

$$C_{il} = \begin{cases} -1 & \text{if } v_i \text{ is the origin of edge } l \\ \phantom{-}1 & \text{if } v_i \text{ is the destination of edge } l \\ \phantom{-}0 & \text{otherwise} \end{cases} \qquad (14)$$

Let $h = h(u^L, u^R; n)$ denote the numerical flux function as defined by Equation 8. For a system of $l$ coupled differential equations, the Jacobian matrix entries are actually small $l \times l$ blocks. For ease of exposition, we tacitly treat these small blocks as scalar entries. Under these simplifications, the desired matrix-vector product is given by

$$\frac{dR}{dQ}\,p = C\left(\left[\frac{dh}{du^L}\right]\left[\frac{du^L}{dQ}\right] + \left[\frac{dh}{du^R}\right]\left[\frac{du^R}{dQ}\right]\right)p \qquad (15)$$

where $[dh/du] \in \mathbb{R}^{m \times m}$ with nonzero diagonal elements, and $[du/dQ] \in \mathbb{R}^{m \times n}$. If we do not incorporate monotonicity enforcement into the reconstruction procedure then a considerable simplification occurs in the calculation of matrix-vector products. The main idea is given in the following almost trivial lemma.

Lemma: Let $v = R(u) = R(u_1, u_2, \ldots, u_n)$ denote an arbitrary order reconstruction operator. If $R$ depends linearly on $u_i$ then

$$\sum_{i=1}^{n} \frac{\partial R}{\partial u_i}\,p_i = R(p).$$

Proof: Linearity implies that

$$v = R(u_1, u_2, \ldots, u_n) = \sum_{i=1}^{n} \alpha_i u_i$$

so that $\partial R / \partial u_i = \alpha_i$. The desired result follows immediately:

$$\sum_{i=1}^{n} \frac{\partial R}{\partial u_i}\,p_i = \sum_{i=1}^{n} \alpha_i p_i = R(p).$$

This lemma suggests the following procedure for calculation of matrix-vector products, from Eq. 15:

$$p^L = \left[\frac{du^L}{dQ}\right]p, \qquad p^R = \left[\frac{du^R}{dQ}\right]p.$$

This amounts to a reconstruction of the vectors $p^L$ and $p^R$ from $p$ using the same reconstruction operator used in the residual computation. Next, the linearized form of the flux function is computed:

$$\left[\frac{dh}{du^L}\right]p^L + \left[\frac{dh}{du^R}\right]p^R.$$

Finally, the linearized fluxes are assembled using the same procedure as the residual vector assembly. In actual calculations, the conservative flow variables are not reconstructed, thereby necessitating that a change of variable transformation be embedded in the formulation. This is not a serious complication.

4.4 Matrix Preconditioning

In the present applications, we consider a preconditioning matrix based on the incomplete lower-upper (ILU) factorization of the matrix $A$. ILU preconditioning is a popular and robust preconditioning procedure for use in iterative matrix solvers. ILU factorization is a modification to the standard Gaussian elimination for which the nonzero fill pattern is either preimposed or determined dynamically based on the size or location of fill elements. In this way the amount of storage required can be specified and in some instances minimized. Technical aspects of ILU factorization such as existence and spectral properties have been proven for M-matrices, but the general applicability is much broader and well documented in the literature. The triangular solves required in the application of ILU preconditioning generally give the method global support. This is usually considered a favorable characteristic of the method.

The finite-volume scheme with high order data reconstruction suggests two possible matrices suitable for incomplete factorization.

Distance-1 matrix preconditioning. Construct the preconditioning matrix from the Jacobian matrix associated with the lower (first) order accurate discretization of the flow equations. This matrix involves distance-1 neighbors in the triangulation. Matrix-vector products are computed "exactly" using the Jacobian matrix associated with the full second order accurate scheme.

Distance-2 matrix preconditioning. Use the Jacobian matrix of the entire second order accurate scheme for both matrix-vector products and preconditioning.

4.5 Performance of GMRES

The viscous multi-element test problem given above provides representative matrices for evaluating the GMRES algorithm. We construct approximate Newton matrices corresponding to flow CFL numbers of $10^3$ and $10^8$. In addition, distance-1 and distance-2 preconditioning matrices are used to accelerate the algorithms.
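The exact product form of Section 4.3.3 can be illustrated on a small model problem. The sketch below assumes a periodic one-dimensional ring of vertices, a fixed-weight linear reconstruction without limiting, and a scalar Burgers-type flux with a constant dissipation coefficient `ALPHA`; all of these choices, and every function name, are assumptions made for the example and are not the discretization used in the paper. The key step is that `p` is reconstructed with the same routine as `q`, which is what the lemma permits when the reconstruction is linear.

```python
ALPHA = 1.0  # assumed constant dissipation coefficient of the model flux


def reconstruct(q, edges):
    """Linear, unlimited reconstruction of left/right edge states."""
    n = len(q)
    left, right = [], []
    for i, j in edges:
        left.append(q[i] + 0.25 * (q[j] - q[(i - 1) % n]))
        right.append(q[j] - 0.25 * (q[(j + 1) % n] - q[i]))
    return left, right


def flux(ul, ur):
    """Model flux: average of u^2/2 plus a fixed dissipation term."""
    return 0.25 * (ul * ul + ur * ur) - 0.5 * ALPHA * (ur - ul)


def flux_jacobians(ul, ur):
    """Derivatives dh/du^L and dh/du^R of the model flux."""
    return 0.5 * ul + 0.5 * ALPHA, 0.5 * ur - 0.5 * ALPHA


def assemble(edge_values, edges, n):
    """Apply the incidence matrix: -1 at the origin, +1 at the destination."""
    out = [0.0] * n
    for value, (i, j) in zip(edge_values, edges):
        out[i] -= value
        out[j] += value
    return out


def residual(q, edges):
    """Residual R(Q) = C h(u^L, u^R)."""
    ul, ur = reconstruct(q, edges)
    return assemble([flux(a, b) for a, b in zip(ul, ur)], edges, len(q))


def exact_product(q, p, edges):
    """Exact (dR/dQ) p using the linearity of the reconstruction."""
    ul, ur = reconstruct(q, edges)
    pl, pr = reconstruct(p, edges)  # same operator applied to p (lemma)
    dh = []
    for a, b, x, y in zip(ul, ur, pl, pr):
        dl, dr = flux_jacobians(a, b)
        dh.append(dl * x + dr * y)  # linearized flux on each edge
    return assemble(dh, edges, len(q))
```

No Jacobian entries are stored: the product costs roughly one extra reconstruction and one linearized flux evaluation per edge, and it is assembled by the same loop as the residual.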
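The ILU(0) preconditioner referred to in Section 4.4 keeps the nonzero pattern of the matrix being factored and discards all fill. A minimal sketch, assuming a small dense list-of-lists storage purely for readability (a production code would use a sparse format, and the function names here are hypothetical), is:

```python
def ilu0(a):
    """ILU(0): Gaussian elimination restricted to the nonzero pattern of a.

    Returns one matrix holding the unit lower factor L (below the
    diagonal) and the upper factor U (diagonal and above).
    """
    n = len(a)
    lu = [row[:] for row in a]
    pattern = [[a[i][j] != 0.0 for j in range(n)] for i in range(n)]
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue  # entry outside the pattern: no multiplier
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                if pattern[i][j]:  # update only existing entries, drop fill
                    lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def ilu_solve(lu, b):
    """Apply the preconditioner: solve L U x = b by two triangular solves."""
    n = len(b)
    y = b[:]
    for i in range(n):  # forward substitution with unit lower L
        for k in range(i):
            y[i] -= lu[i][k] * y[k]
    x = y[:]
    for i in reversed(range(n)):  # back substitution with U
        for k in range(i + 1, n):
            x[i] -= lu[i][k] * x[k]
        x[i] /= lu[i][i]
    return x
```

In the distance-1 strategy described above, `ilu0` would be applied to the first order Jacobian while the Krylov solver uses products with the full second order operator; in the distance-2 strategy the same matrix serves both purposes.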

Figures 11-12 graph the convergence histories for the GMRES algorithm and the two choices of preconditioner. Since the matrix-vector products and preconditioning solves dominate the iterative calculation, convergence histories are plotted against the number of matrix-vector products required. Each GMRES iteration requires one matrix-vector product. The GMRES algorithm is clearly adversely affected by the distance-1 preconditioning. For this case the distance-1 preconditioned system requires roughly twice as many iterations as the distance-2 preconditioned system. In fact for CFL = $10^8$, the convergence is unacceptably slow. In general we find that when using the distance-1 preconditioning matrix, an optimal CFL number exists for convergence and efficiency, which is large but not infinite.

Figure 11: Viscous Flow matrix solution convergence histories for the GMRES(30) algorithm at CFL = $10^3$ using ILU(0) distance-1 and distance-2 preconditioning matrices.

Figure 12: Viscous Flow matrix solution convergence histories for the GMRES(30) algorithm at CFL = $10^8$ using ILU(0) distance-1 and distance-2 preconditioning matrices.

5 Summary

Practical aspects for Newton-GMRES algorithms from working Navier-Stokes codes have been presented. In particular, implementation issues, such as memory requirements, accuracy requirements for $Ap$ products, tradeoffs between full Newton and relaxed Newton and other pertinent approximations, have been discussed. Two approaches have been highlighted. In the incompressible Navier-Stokes code, the best strategy appears to be an inexact Jacobian (a first order accurate approximation to the third order accurate Function Evaluation) for the $Ap$ products, a consistent ILU(0) preconditioner, a small subspace size and fairly loose tolerances for Inner Iteration convergence. In the unstructured mesh approach, exact $Ap$ products are successfully coupled with a first order approximate ILU(0) preconditioner and tighter tolerance levels for Inner Iteration convergence. In both cases, an optimal strategy is found producing enhanced efficiencies. Although these conclusions are not universal, they do provide guidelines and practical suggestions for general implementations.

References

[1] T. J. Barth. Analysis of implicit local linearization techniques for upwind and TVD algorithms. Technical Report AIAA-87-0595, Reno, NV, 1987.
[2] T. J. Barth. A Three-Dimensional Upwind Euler Solver of Unstructured Meshes. Technical Report AIAA 91-1548, Honolulu, Hawaii, 1991.
[3] T. J. Barth. Aspects of Unstructured Grids and Finite-Volume Solvers for the Euler and Navier-Stokes Equations, March 1994. von Karman Institute Lecture Series 1994-05.
[4] T. J. Barth. Steiner Triangulation for Isotropic and Stretched Elements. Technical Report AIAA 95-0213, Reno, NV, 1995.
[5] T. J. Barth. An unstructured mesh Newton solver for compressible fluid flow and its parallel implementation. Technical Report AIAA-95-0221, Reno, NV, 1995.
[6] T. J. Barth and D. C. Jespersen. The Design and Application of Upwind Schemes on Unstructured Meshes. Technical Report AIAA 89-0366, Reno, NV, 1989.
[7] R. M. Beam and H. E. Bailey. Newton's method applied to finite-difference approximations for the steady-state compressible Navier-Stokes equations. Journal of Computational Physics, 93(1):108-127, 1991.
[8] P. Brown and Y. Saad. Convergence Theory of Nonlinear Newton-Krylov Algorithms. SIAM J. Optimization, 4:297-330, 1994.
[9] A. J. Chorin. A numerical method for solving incompressible viscous flow problems. Journal of Computational Physics, 2(1):12-26, Aug. 1967.
[10] S. Eisenstat and H. Walker. Globally Convergent Inexact Newton Methods. SIAM J. Optimization, 4:393-422, 1994.
[11] D. C. Jespersen and T. H. Pulliam. Approximate Newton methods and flux vector splitting. Technical Report AIAA Paper 83-1899, AIAA 6th Computational Fluid Dynamics Conference, Danvers, MA, 1983.
[12] Z. Johan. Data Parallel Finite Element Techniques for Large-scale Computational Fluid Dynamics. PhD thesis, Stanford University, Department of Mechanical Engineering, 1992.
[13] J. A. Meijerink and H. A. van der Vorst. Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems. Journal of Computational Physics, 44(1):134-155, 1981.
[14] T. H. Pulliam. Efficient solution methods for the Navier-Stokes equations. 1985. Lecture Notes for the von Kármán Institute for Fluid Dynamics Lecture Series: Numerical Techniques for Viscous Flow Computation in Turbomachinery Bladings, von Kármán Institute, Rhode-St-Genese, Belgium, 1985.
[15] T. H. Pulliam. Implicit Methods in CFD. Clarendon Press, March 1985. The Institute of Mathematics and Its Publications Conference Series, Proceeding of the ICFD 1988 Conference on Numerical Methods for Fluid Dynamics.
[16] P. L. Roe. Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes. J. Comput. Phys., 43, 1981.
[17] S. E. Rogers. A comparison of implicit schemes for the incompressible Navier-Stokes equations with artificial compressibility. Technical Report AIAA-95-0567, Reno, NV, 1995.
[18] S. E. Rogers and D. Kwak. An upwind differencing scheme for the time accurate incompressible Navier-Stokes equations. AIAA Journal, 28(2):253-262, Feb 1990.
[19] Y. Saad and M. H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7(3):856-869, 1986.
[20] W. Valarezo, C. Dominik, R. McGhee, and W. Goodman. High Reynolds Number Configuration Development of a High-Lift Airfoil. Technical Report AGARD Meeting In High-Lift Aerodynamics 10-01, 1992.
[21] V. Venkatakrishnan. Newton solution of inviscid and viscous problems. Technical Report AIAA Paper 88-0413, AIAA 26th Aerospace Sciences Conference, Reno, NV, 1988.
[22] V. Venkatakrishnan. A perspective on unstructured grid flow solvers. Technical Report AIAA Paper 95-0667, AIAA 33rd Aerospace Sciences Conference, Reno, NV, 1995.
[23] L. B. Wigton, N. J. Yu, and D. P. Young. GMRES acceleration of computational fluid dynamics codes. Technical Report AIAA Paper 85-1494, AIAA 7th Computational Fluid Dynamics Conference, Cincinnati, OH, 1985.

HEXAHEDRON BASED GRID ADAPTATION FOR FUTURE LARGE EDDY SIMULATION

J.J.W. van der Vegt and H. van der Ven


National Aerospace Laboratory NLR
P.O. Box 90502, 1006 BM Amsterdam, The Netherlands

SUMMARY

This paper discusses a new numerical method which enables the future application of Large Eddy Simulation to high Reynolds number aerodynamic flows. The new numerical method uses local grid refinement of hexahedral cells and the discontinuous Galerkin finite element method. This method offers maximum flexibility in grid adaptation and maintains accuracy on highly irregular grids. The method is demonstrated with calculations of inviscid transonic flow on a generic delta wing. The calculations are done on two parallel shared memory computers and the performance results are used to give estimates of the computing time and memory requirements for a Large Eddy Simulation of a clean wing on a NEC SX-4 supercomputer.

LIST OF SYMBOLS

$b_K$    external boundary face of element $K$
$B$    boundary operator
$C^1[0,T]$    space of one time differentiable functions on the interval $[0,T]$
$\delta_{ij}$    Kronecker delta symbol
$E$    specific total energy
$e_K$    face of polyhedron $K$
$F^j$    flux vector in Cartesian coordinate direction $j$
$\hat{F}(U)$    inner product of $n^T$ and $\mathcal{F}$
$\mathcal{F}(U)$    matrix with columns $F^j$
$F_K$    mapping between elements $\hat{K}$ and $K$
$\gamma$    ratio of specific heats
$\Gamma_U$    path in phase space between $U^{int(K)}$ and $U^{ext(K)}$
$h(U^{int(K)}, U^{ext(K)})$    monotone Lipschitz flux
$K$    polyhedron element in $\mathcal{T}_h$
$K'$    neighboring elements of polyhedron $K$
$\hat{K}$    master element of polyhedron $K$
meas($K$)    measure of polyhedron $K$
$\partial K$    boundary of polyhedron $K$
$M$    maximum number of polynomial terms in expansion of $U_h$
$[M_K]$    mass matrix of element $K$
$N^+$    set of positive natural numbers
$N(K)$    set of neighboring elements of $K$
$N^\xi(K)$    indices of neighboring elements of $K$ in the $\xi$-direction
$n$    unit outward normal vector
$\Omega$    flow domain
$\partial\Omega$    boundary of $\Omega$
$p$    pressure
$P^k(\hat{K})$    space of polynomial functions of degree $\le k$ on $\hat{K}$
$P^k(K)$    space of functions whose images under $F_K$ are functions in $P^k(\hat{K})$
$\Phi_K$    limiter function defined on $K$
$\Phi_K^i$    components of limiter function on $K$
$\hat{\phi}_j(\xi,\eta,\zeta)$    polynomial basis functions on $\hat{K}$
$\phi_j$    basis functions on $K$
$\psi_i(\xi,\eta,\zeta)$    trilinear element shape functions
$R_K$    residual in element $K$
$R_K^\xi, R_K^\eta, R_K^\zeta$    indicator functions for grid adaptation in $\xi$, $\eta$ and $\zeta$ directions
$\mathbb{R}^n$    Euclidian n-dimensional space
$\rho$    density
span    linear span
$t$    time
$T$    final time
$\mathcal{T}_h$    triangulation of $\Omega$
$U$    conservative flow variables
$\bar{U}_K$    average of $U$ in element $K$
$U_w$    conservative flow variables specified at $\partial\Omega$
$U_0$    initial conservative flow variables
$U|_K$    $U$ restricted to element $K$
$U_h$    numerical approximation of $U$
$U^{ext(K)}$    $U$ at cell face taken as the limit from the exterior of $K$
$U^{int(K)}$    $U$ at cell face taken as the limit from the interior of $K$
$U_{K,\max}$    maximum $U$ in $K$ and its neighboring cells
$U_{K,\min}$    minimum $U$ in $K$ and its neighboring cells
$U_k^i$    components of $U_h$ at Gauss quadrature points in cell faces of $\hat{K}$
$\hat{U}_m$    components of polynomial expansion of $U$ in $K$
…    limited components of polynomial expansion coefficients $\hat{U}_m$ in $K$
…    limited flow field $U$ in each element
…    vector with limited moments of flow field
$u_i$    Cartesian velocity components
$V$    primitive flow variables
$V^k(K)$    vectors with each component $p_i \in P^k(K)$
…    vectors which belong to space $V^k$
$x$    position vector
$x_j$    components of position vector, $j = \{1,2,3\}$
…    coordinates of corner points of element $K$
…    length of cell in local $\xi$-direction
$\xi, \eta, \zeta$    local coordinates in element $\hat{K}$
$\forall$    for all
$\nabla$    nabla operator
$\subset$    subset
$\in$    element
$\circ$    composite mapping
$\otimes$    tensor product
$^T$    transposed

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

INTRODUCTION

Computational Fluid Dynamics (CFD) is used for increasingly complicated problems. Many advanced applications of CFD, such as Large Eddy Simulation (LES), can only be done with sophisticated grid adaptation algorithms and require significant computer resources. The aim of this paper is to demonstrate a new grid adaptation algorithm for future application to Large Eddy Simulation. With LES the filtered Navier-Stokes equations are solved which represent the part of the turbulent flow field that can be resolved on the grid. The turbulent length scales which can not be resolved have to be modeled with subgrid scale turbulence models. This approach is quite successful in most parts of the flow field, but as already mentioned by Chapman [3], fails in the near wall region which is critical for LES. Chapman proposed to use successively finer grids close to the wall to capture the viscous sublayer. This reduces the need to model the near wall region where the basic assumption of LES, namely the separation of the flow field in large and small scales, is not valid.

Despite the significant progress made in LES since Chapman's paper the proper solution of the near wall flow field is still one of the key elements preventing LES to be applied to more general problems in aerospace, Moin and Jimenez [10]. The use of successively finer grids can only be done efficiently with sophisticated grid adaptation techniques and requires a numerical scheme which is accurate on highly irregular grids. In this paper a new algorithm is presented, using a combination of local grid refinement and the discontinuous Galerkin (DG) finite element method. This method is capable of efficiently resolving local phenomena such as shear layers and shocks and has the potential to be applied to LES of wall bounded turbulent flows by properly resolving the near wall region. Hexahedron cells are used as basic elements because they suffer less from loss of accuracy due to successive refinements than the more commonly used tetrahedron cells and are more suited to viscous flows. This paper, however, will be limited to inviscid flow in order to demonstrate the basic algorithm.

The discontinuous Galerkin method with Runge-Kutta time integration (RKDG) was originally proposed by Cockburn and Shu [4, 6, 5] for hyperbolic conservation laws. They proved that the RKDG method is TVB stable and satisfies a maximum principle for multi-dimensional scalar hyperbolic conservation laws. This work was mainly theoretical and limited to one and two-dimensional flow fields. The extension to three dimensions was recently presented by van der Vegt [14]. The discontinuous Galerkin method uses a local polynomial expansion in each cell which results in a discontinuity at each cell face. This discontinuity can be represented as a Riemann problem which provides a natural way to introduce upwinding into a finite element method. The DG method can therefore be considered as a mixture of an upwind finite volume method and a finite element method.

A key feature of the DG method is that also equations for the moments of the flow field are solved. In this way a completely local higher order accurate spatial discretization can be obtained without the need to use neighboring cells in the discretization. An alternative to obtain the flow field gradients is to use Gauss' identity, but this method requires grid regularity to be accurate. The use of the moment equations is extremely useful in combination with local grid refinement because no problems with hanging nodes occur and the scheme maintains its accuracy on highly irregular grids, which generally occur after several grid refinement steps. In this paper the spatial accuracy is limited to second order and the moments represent the flow field gradients. A disadvantage of using the moment equations is that more memory is needed to store the additional moments of the flow field. For future LES applications in wall bounded flows these disadvantages are, however, more than compensated by the increased computational efficiency of the adapted grid.

The DG method makes it easy to mix different types of elements. As basic elements hexahedrons are used, but whenever necessary due to topological degeneracies, prisms, tetrahedrons and other degenerated hexahedrons are used. The initial coarse grid is obtained from a multi-block structured grid, generated with the NLR ENFLOW system. This grid is transformed into an unstructured grid using a face-based data structure, van der Vegt [14]. This data structure is more suited to anisotropic local grid refinement than the commonly used octree data structure. Anisotropic grid refinement is important because many flow phenomena are locally pseudo two-dimensional, e.g. shocks and shear layers, and can not be efficiently captured with isotropic grid refinement.

The DG method combined with the face based data structure is extremely local in nature and makes it a good candidate for parallel computing. Parallel computers offer the possibility to overcome the physical limits on single processor speed, but require a significant effort to optimize numerical schemes and coding. LES requires significant computer resources and the performance of the DG method on two different types of parallel shared memory computers, namely a two processor NEC SX-3 and a four processor SGI Power Challenge, will be discussed in this paper. The choice for parallel shared memory computers is made initially to limit the effort in modifying codes.

The outline of the paper is as follows. After a brief description of the governing equations, the DG method will be discussed followed by a description of the grid adaptation algorithm. The algorithm will be demonstrated on the flow field around a generic delta wing. Next, several aspects of using parallel shared memory computers will be discussed and performance results will be presented. These data will be used to give an estimate of the computational complexity of a LES of a clean wing. The paper finishes with concluding remarks.

GOVERNING EQUATIONS

The Euler equations for inviscid gas dynamics in conservation form can be expressed in the flow domain $\Omega$ as:

$$\frac{\partial}{\partial t} U(x,t) + \frac{\partial}{\partial x_j} F^j(U) = 0.$$

Here $x$ and $t$ represent the coordinate vector, with components $x_i$, $i = \{1,2,3\}$, in the Cartesian directions, and time, respectively. The Euler equations are supplemented with initial condition $U(x,0) = U_0(x)$ and boundary condition $U(x,t)|_{\partial\Omega} = B(U, U_w)$; where $B$ denotes the boundary operator and $U_w$ the prescribed boundary data. The vectors with conserved flow variables $U$ and fluxes $F^j$, $j = \{1,2,3\}$, are

defined as:

$$U = \begin{pmatrix} \rho \\ \rho u_i \\ \rho E \end{pmatrix}, \qquad F^j = \begin{pmatrix} \rho u_j \\ \rho u_i u_j + p\,\delta_{ij} \\ u_j(\rho E + p) \end{pmatrix},$$

where $\rho$, $p$ and $E$ denote the density, pressure and specific total energy and $u_i$ the velocity in the Cartesian coordinate directions $x_i$, $i = \{1,2,3\}$ and $\delta_{ij}$ the Kronecker delta symbol. The summation convention is used on repeated indices. This set of equations is completed with the equation of state: $p = (\gamma - 1)\rho(E - \tfrac{1}{2}u_i u_i)$, with $\gamma$ the ratio of specific heats.

DISCONTINUOUS GALERKIN APPROXIMATION

The flow domain $\Omega$, which is assumed to be a polyhedron, is covered with a triangulation $\mathcal{T}_h = \{K\}$ of hexahedrons, which are related to the master element $\hat{K}$ through the mapping $F_K$:

$$F_K : x(\xi,\eta,\zeta) = \sum_{i=1}^{8} x_i\,\psi_i(\xi,\eta,\zeta)$$

with $\psi_i(\xi,\eta,\zeta)$ the standard linear finite element shape functions and $x_i$ the coordinates of the vertices of the hexahedron $K$.

Define on the master element $\hat{K} = [-1,1]^3$ the space of polynomials: $P^k(\hat{K}) = \mathrm{span}\{\hat{\phi}_j(\xi,\eta,\zeta),\ j = 0,\ldots,M\}$ and the related space $P^k(K)$ as the space of functions whose images under $F_K$ are functions in $P^k(\hat{K})$: $P^k(K) = \mathrm{span}\{\phi_j(x) = \hat{\phi}_j \circ F_K^{-1},\ j = 0,\ldots,M\}$. In this paper $k = 1$, which yields a second order accurate spatial discretization with polynomials $\hat{\phi}_j \in \{1, \xi, \eta, \zeta\}$ with $M = 3$.

Define $V^1(K) = \{P(K) \to \mathbb{R}^5 \mid p_i \in P^1(K)\}$, then $U(x,t)|_K$ can be approximated by $U_h(x,t) \in V^1(K) \otimes C^1[0,T]$ as:

$$U_h(x,t) = \sum_{m=0}^{3} \hat{U}_m(t)\,\phi_m(x). \qquad (1)$$

The expansion of $U$ is local in each element and there is no continuity across element boundaries, which is a major difference with node based Galerkin finite element methods. The element based expansion has as important benefit that hanging nodes, which frequently appear after local grid refinement, do not give any complications. Degenerated hexahedrons, such as prisms and tetrahedrons, which are necessary to deal with topological degeneracies in the grid, are allowed without further complications because the degenerated surfaces do not contribute to the flux balance.

The discontinuous Galerkin finite element formulation of the Euler equations is given by:

$$\frac{\partial}{\partial t}\int_K \phi_n(x)\,U_h\,d\Omega = -\sum_{e_K}\int_{e_K} \phi_n(x)\,n^T\mathcal{F}(U_h)\,d\Omega - \sum_{b_K}\int_{b_K} \phi_n(x)\,n^T\mathcal{F}(U_h)\,d\Omega + \int_K \nabla\phi_n^T(x)\,\mathcal{F}(U_h)\,d\Omega, \qquad (2)$$

with $\mathcal{F} = F^j$, $j = \{1,2,3\}$, and $e_K \subset \partial K \setminus \partial\Omega$ and $b_K \subset \partial K \cap \partial\Omega$ the faces of element $K$ in the interior and at the boundary of the domain $\Omega$, respectively. The vector $n^T$ represents the transposed unit outward normal vector at $\partial K$.

The flux at the faces $e_K$, namely $n^T\mathcal{F}(U) \equiv \hat{F}(U)$, is not clearly defined, because the flow field $U_h$ is discontinuous at the cell faces. The flux is therefore replaced with a monotone flux function $h(U^{int(K)}, U^{ext(K)})$, which is consistent, $h(U,U) = \hat{F}(U)$. Here $U^{int(K)}$ and $U^{ext(K)}$ denote the value of $U$ at $\partial K$ taken as the limit from the interior and exterior of $K$. More details can be found in Cockburn et al. [5].

The use of the monotone Lipschitz flux $h$ introduces upwinding into the Galerkin method by solving the (approximate) Riemann problem given by $(U^{int(K)}, U^{ext(K)})$. Suitable fluxes are those from Godunov, Roe, Lax-Friedrichs and Osher. In this paper the Osher approximate Riemann solver [11] is used, because of its good shock capturing capabilities, and the possibility to easily modify the Riemann problem to account for boundary conditions. An important additional reason for the use of the Osher scheme is that it gives an exact solution for a steady contact discontinuity, and therefore it has a very low numerical dissipation in boundary layers, [13], which is important for future extension of the algorithm to the Navier-Stokes equations. The Osher approximate Riemann solver is defined as:

$$h(U^{int(K)}, U^{ext(K)}) = \frac{1}{2}\left(\hat{F}(U^{int(K)}) + \hat{F}(U^{ext(K)}) - \int_{\Gamma_U} \left|\frac{\partial \hat{F}}{\partial U}\right| dU\right),$$

where $\Gamma_U$ is a path in phase space between $U^{int(K)}$ and $U^{ext(K)}$. Details of the calculation of this path integral in multi-dimensions can be found in [11]. At the boundary surface the path $\Gamma_U$ must be modified to account for boundary conditions. In this way a Riemann initial-boundary value problem is solved instead of an initial value problem, [11], and a completely unified and consistent treatment of the flux calculations is obtained, both at interior and exterior faces.

The first order accurate discontinuous Galerkin method with an (approximate) Riemann solver yields monotone results, but second and higher order discretizations need a slope limiter to prevent numerical oscillations around discontinuities and in regions with steep gradients. Cockburn et al. [5] derived a local projection limiter on B-triangulations for multi-dimensional scalar conservation laws, which gives a second order accurate scheme and satisfies a maximum principle when combined with a TVD Runge-Kutta time integration method [12]. The extension to quadrilaterals is presented by Bey and Oden [2], but turned out to be very dissipative.

In this paper a different approach is followed. The second order discontinuous Galerkin method strongly resembles a MUSCL upwind scheme, with as main difference the procedure to determine the flow gradient. In the DG-method the gradient is determined by solving equations for the moments $\hat{U}_m$, $m = \{1,2,3\}$, whereas the MUSCL scheme determines the

gradient using data from surrounding cells. The same limiting procedure can, however, be followed. In this paper the multi-dimensional limiter from Barth and Jespersen [1], with the modifications proposed by Venkatakrishnan [15], is used. The limiter from Barth and Jespersen has as benefit that it is a truly multi-dimensional limiter and yields a positive scheme.

The limiter from Barth and Jespersen can, however, seriously degrade convergence to steady state. This was analysed by Venkatakrishnan [15] and the two main causes for this phenomenon are the non-smoothness of the limiter, which uses min- and max-functions, and the fact that the limiter is active in smooth parts of the flow, e.g. in the far field.

The limiter according to Venkatakrishnan [15] is directly applied to the conservative variables, which saves the considerable expense of computing the local characteristic decomposition.

Define for each component $\bar{U}_K^i$ of the cell average $\bar{U}_K = \frac{1}{\mathrm{meas}(K)}\int_K U_h(x)\,d\Omega$:

$$U_{K,\min}^i = \min\Big(\bar{U}_K^i,\ \min_{K' \in N(K)} \bar{U}_{K'}^i\Big), \qquad U_{K,\max}^i = \max\Big(\bar{U}_K^i,\ \max_{K' \in N(K)} \bar{U}_{K'}^i\Big),$$

with $N(K)$ the set of neighboring cells which connect to cell $K$. In order to maintain monotonicity the approximate flow field $U_h$ must satisfy $U_h(x) \in [U_{K,\min}, U_{K,\max}]$, $\forall x \in K$, which is accomplished with the limiter function $\Phi_K$ defined as:

$$\Phi_K^i = \min_k \begin{cases} \phi\!\left(\dfrac{U_{K,\max}^i - \bar{U}_K^i}{U_k^i - \bar{U}_K^i}\right) & \text{if } U_k^i - \bar{U}_K^i > 0 \\[2ex] \phi\!\left(\dfrac{U_{K,\min}^i - \bar{U}_K^i}{U_k^i - \bar{U}_K^i}\right) & \text{if } U_k^i - \bar{U}_K^i < 0 \\[2ex] 1 & \text{if } U_k^i - \bar{U}_K^i = 0 \end{cases}$$

Here $U_k^i$ are the components of $U_h$ at the Gauss quadrature points in $\hat{K}$, used to evaluate the integrals in equation (2). The function $\phi(y)$ replaces $\min(1,y)$ in the original Barth and Jespersen limiter and is defined as:

$$\phi(y) = \frac{y^2 + 2y}{y^2 + y + 2}.$$

Defining $\Delta = U_k^i - \bar{U}_K^i$, $\Delta_+ = U_{K,\max}^i - \bar{U}_K^i$ and $\Delta_- = U_{K,\min}^i - \bar{U}_K^i$ and replacing $\Delta_\pm^2$ with $\Delta_\pm^2 + \epsilon_K^2$ a smoother limiter is obtained:

$$\Phi_K^i = \min_k \begin{cases} \dfrac{\Delta_+^2 + \epsilon_K^2 + 2\Delta\Delta_+}{\Delta_+^2 + 2\Delta^2 + \Delta\Delta_+ + \epsilon_K^2} & \text{if } \Delta > 0 \\[2ex] \dfrac{\Delta_-^2 + \epsilon_K^2 + 2\Delta\Delta_-}{\Delta_-^2 + 2\Delta^2 + \Delta\Delta_- + \epsilon_K^2} & \text{if } \Delta < 0 \\[2ex] 1 & \text{if } \Delta = 0 \end{cases}$$

The coefficient $\epsilon_K$ is set as a power of $C\,\Delta s_K$, with $\Delta s_K$ the minimum distance between the cell face centers of two opposite faces of element $K$. The constant $C$ determines the balance between limiting and no limiting and thereby influences the convergence to steady state. If $C = 0$ the original Barth and Jespersen limiter is obtained. In this paper $C = 1$ is used.

The limiter $\Phi_K$ is applied independently to each component of the flow field: $\tilde{U}_m^i = \Phi_K^i \hat{U}_m^i$, $m = \{1,2,3\}$. This is slightly less robust than using $\Phi_K = \min_i \Phi_K^i$, but gives significantly less numerical dissipation. The coefficients $\hat{U}_m$, $m = \{1,2,3\}$ in equation (1) represent the gradient of the flow field with respect to the local coordinates in $\hat{K}$. This modification of the local gradient would violate conservation of $U$ in $K$, which can be corrected by modifying the coefficient $\hat{U}_0$:

$$\tilde{U}_0 = \bar{U}_K - \frac{1}{\mathrm{meas}(K)} \sum_{m=1}^{3} \tilde{U}_m \int_K \phi_m(x)\,d\Omega.$$

This relation is obtained from the condition $\frac{1}{\mathrm{meas}(K)}\int_K \tilde{U}_h(x)\,d\Omega = \bar{U}_K$. The limited flow field in cell $K$ is then equal to:

$$\tilde{U}_h(x,t) = \sum_{m=0}^{3} \tilde{U}_m(t)\,\phi_m(x).$$

The final discontinuous Galerkin finite element discretization is now obtained by evaluating the integrals over the element $K$ and its boundary $\partial K$ in equation (2). This is done using the transformation $F_K$ between $K$ and the master element $\hat{K}$. The integrals $\int_K \phi_n U_h\,d\Omega$ are calculated analytically, which requires quite some algebra, whereas the other integrals are calculated with Gauss quadrature rules. Cockburn et al. [5] proved that if the quadrature rules for the surface integrals in equation (2) are exact for polynomials of degree $(2k+1)$ and exact for polynomials of degree $2k$ for the volume integrals then the spatial accuracy of the DG method is $k+1$. In order to preserve uniform flow it is necessary to use quadrature rules which are exact for polynomials of order 3. For $k = 1$ the surface integrals are calculated with four point Gauss quadrature rules. The volume integrals require six point Gauss quadrature rules.

The use of four and six point Gauss quadrature rules is, however, unnecessarily expensive. The number of flux calculations in the approximation of the surface integrals can be reduced from four to one using the following approximation, which is second order accurate in the mean:

$$\int_{e_K} \phi_n(x)\,n^T\mathcal{F}(U)\,d\Omega = \int_{\hat{e}_K} \hat{\phi}_n\,n^T\mathcal{F}(U)\,J_e\,d\hat{\Omega} \cong \mathcal{F}(U)|_c \int_{\hat{e}_K} \hat{\phi}_n\,n^T J_e\,d\hat{\Omega}$$

with $\mathcal{F}(U)|_c$ calculated at the cell face center and $J_e$ the Jacobian of the transformation of the cell face $\partial K$ to $\partial\hat{K}$ on the master element $\hat{K}$. The integrals $\int_{\hat{e}_K} \hat{\phi}_n\,n^T J_e\,d\hat{\Omega}$ are precalculated with four point Gauss quadrature rules, which are exact using elements defined with linear shape functions, and therefore free stream consistency is preserved with this approximation. A similar approximation can be made for the volume integral $\int_K \nabla\phi_n^T(x)\,\mathcal{F}(U_h)\,d\Omega$, with $\mathcal{F}(U)$ calculated in the center of $\hat{K}$ and the geometrical part of the volume integral precalculated with a six point Gauss quadrature rule. This formulation requires about four times less computing time than using the more accurate evaluation of the flux integrals and yields similar results. The discretization using four and six Gauss quadrature points for the surface and volume integrals yields, however, a slightly more robust scheme on coarse grids. This is mainly due to the fact that the cross-coupling terms in the moment equations are retained in this case.

For each element $K$ a system of ordinary differential equations is now obtained:

$$[M_K]\,\frac{\partial}{\partial t}\hat{U}_K = R_K$$

with Adaptation Step Cells Grid Points Faces


- a vector with the moments of the flow field in each
element, U,, m = (0,. . . ,3}, and RKthe right-hand side of 0 19152 20790 59594
equation (2). The equations for are integrated in time 1 33094 38277 132038
using the third order TVD Runge-Kutta scheme from Shu [ 121. 2 49088 63357 203400
For steady state calculations convergence is accelerated using 3 73091 104435 307783
local time stepping. 4 124030 197424 538109
5 211578 357752 933616
A significant difference with node based FEM is that the mass 6 322708 592441 1447763
matrix [ M K ]is uncoupled for each element K and can be easily
inverted. Table 1: Number of cells, grid points and faces afrer each
adaptation step
DIRECTIONAL GRID ADAPTATION
The use of increasingly finer grids in LES in the near wall region, as proposed by Chapman [3], and in other regions with strong shear layers or shocks can be most efficiently done using local grid refinement. The grid is locally enriched by subdividing cells, independently in each of the three local grid directions, $\xi$, $\eta$ or $\zeta$ of $\hat{K}$. This anisotropic grid refinement is more efficient in capturing local flow phenomena than isotropic refinement, because many flow features are frequently pseudo two-dimensional. A coarse initial grid is used, which is generated with a multi-block structured grid generator, and transformed into an unstructured hexahedron grid. If necessary degenerated hexahedrons, such as prisms and tetrahedrons, are allowed to deal with topological degeneracies. After calculating the flow field, the grid cells are split in the local $\xi$-direction if:

$$\frac{R_K^\xi}{\max_{\forall K'} R_{K'}^\xi} > \text{tolerance}$$

with the sensor function $R_K^\xi$ for the cell $K$ defined as:

[equation (3) is not legible in the source]

Here $\Delta\xi_K$ is the length of cell $K$ in the local $\xi$-direction, $V = (\rho, u, v, w, p)^T$ the vector with primitive variables and $N^\xi(K)$ the indices of the neighboring cells of cell $K$ in the $\xi$-direction. Equivalent expressions are used for the $\eta$ and $\zeta$ directions. This sensor is based on an equidistribution principle, see for instance Marchant et al. [9]. An important advantage of this sensor is that it prevents regions with discontinuities from constantly dominating the local grid refinement. After several refinements the relative contribution of regions with discontinuities reduces, because $\Delta\xi_K$ in equation 3 becomes progressively smaller.

DATA STRUCTURE
The discontinuous Galerkin method with local grid refinement of hexahedrons requires a significantly different data structure than the frequently used edge based data structure. The edge based data structure is very efficient for unstructured vertex based schemes using tetrahedrons. The discontinuous Galerkin method is a cell based algorithm and the primary calculations are the evaluation of fluxes through cell faces. This can be done efficiently using a face based data structure. A face based data structure also has the important benefit that there are no limitations on the number of cells which can connect to one cell face, which is crucial for local grid refinement. The alternative would be an octree data structure, but this data structure does not combine well with anisotropic grid refinement. In van der Vegt [14] an algorithm is presented to determine all face to cell connections efficiently. The main element in this algorithm is that cell faces are split into smaller subfaces until each face connects only to one cell on each side. There are no limits on the number of neighboring cells and using advanced searching algorithms a very efficient scheme is obtained, which can establish all face to cell connections in $O(N \log_2(N))$ operations with $N$ the number of faces. The fluxes are calculated in one loop over all the faces, which can be fully vectorized using a coloring scheme. The face based data structure does not put any limitations on the number of neighboring cells, but if the number of cells connecting to one face becomes too large then the number of colors significantly increases. This reduces the efficiency on vector and parallel computers and will be a topic of future research. In the grid adaptation process cells are added and deleted, which is done efficiently using AVL-trees; for more details see van der Vegt [14].

DISCUSSION AND RESULTS
The grid adaptation algorithm has been tested on the flow around a generic delta wing. The geometry is a cropped delta wing with a 65-degree sweep angle and a sharp leading edge. A constant airfoil section in the streamwise direction is used (modified NACA 64A005 profile; straight line aft of 75% chord) with 5% relative thickness, no twist and no camber. More information about the geometry and experimental results can be found in Elsenaar et al. [7]. A transonic flow test case is used with angle of attack $\alpha = 20^\circ$ and free stream Mach number $M_\infty = 0.85$. The initial grid consisted of 19152 cells and 20790 grid points. The grid is adapted six times, independently in all three directions, and the final grid consists of 322708 cells and 592441 grid points, see Table 1. During each adaptation step approximately 15% of the cells are deleted, after which the number of cells is increased by between 70% and 90%. The removal of grid cells is important, because initially on the coarse grid the refinement sensor is less accurate and some unnecessary refinement takes place. Local time stepping is used and significantly improves convergence to steady state, see Figure 1. The sharp peaks in the convergence plot are caused by the grid adaptation, except for the first peak, which results from freezing the slope limiter after 750 time steps to improve convergence. Freezing of the slope limiter is not necessary after grid adaptation.

Figure 1: Maximum residual in flow field (horizontal axis: iterations).

Figure 3 shows the pressure field and grid lines on the leeward side of the delta wing. The flow field is dominated by a strong primary vortex which starts at the apex and moves downstream at an angle of 20 degrees to the streamwise direction. Vorticity is generated at the sharp leading edge in a thin vortex sheet and rolls up into the primary vortex. The velocity under this vortex, just above the upper surface, becomes very large and a strong shock develops between the primary vortex and the upper surface, see also Figures 4 and 6. The benefits of anisotropic grid refinement are very clear in Figure 3, where the grid is strongly adapted along the primary vortex in the first 85% of the delta wing, where the flow field is approximately conical. At a chord length of 85%, where the sharp leading edge connects to the tip, the primary vortex and related shock have a sharp kink, see Figures 3 and 6. Two shocks develop in the primary vortex: one normal to the leading edge and connected to the kink in the shock structure under the primary vortex, and another one from the same location on the leading edge and connected more upstream to the shock under the primary vortex. A similar shock structure, although slightly more downstream, was observed by Hoeijmakers et al. [8] using a much finer structured grid. This shock structure has a strong influence on the primary vortex, which completely blows up behind it, see Figure 5, and is very well captured by the grid adaptation. Also visible in Figure 5 is that the grid is adapted to the trailing edge vortex. The primary vortex significantly grows after 85% chord and merges with the tip vortex, see Figure 4. Also visible is the start of roll-up of the wake, which develops into a mushroom type vortex structure. In addition to the shock structures in and around the primary vortex there is also a shock starting at about 75% downstream at the center line and connected to the trailing edge at approximately mid span. A better view of this shock can be obtained in Figure 6, which gives a perspective view of the delta wing and the grid and flow field at approximately 70% chord. Figure 5 clearly shows the strong primary vortex and the shock between the vortex and body. Also visible is the significant refinement in this region and the vortex layer starting at the sharp leading edge.

PARALLELIZATION
The above described algorithm has been implemented in the program Hexadap, which is parallelized on shared memory machines, namely:
• A two processor NEC SX-3/22 with a peak performance of 2 x 2.75 GFlop/s, a main memory unit (MMU) of 1 GByte and 4 GByte Extended Memory Unit (XMU) of which 1.2 GByte can be efficiently used to store run-time data,
• A four processor SGI Power Challenge with a peak performance of 4 x 350 MFlop/s, main memory of 256 MByte and 16 KByte primary and 4 MByte secondary cache.

The parallelization uses microtasking, adding parallelization compiler directives, for both machines, and macrotasking, explicitly assigning tasks to different processors. (Implementation on the SGI Power Challenge is done with the CONCURRENT CALL assertion.) The advantage of microtasking is that the code remains portable. The advantage of macrotasking is that large tasks can be assigned, even if the tasks have no do-loop structure, and memory can be used more efficiently.

The above described algorithm consists of two parts, namely grid adaptation and flow computation. The grid adaptation part, which consists predominantly of scalar operations, requires a domain decomposition for parallelization and is not considered in this paper. The most important component of the flow computation is the calculation of cell face fluxes, which consists of loops over the cell faces. The result is added to the residual in the two cells connected to each cell face. The loops use indirect addressing and in order to vectorize these loops a coloring scheme has been applied.

The initial flow field and the flow fields after three and six adaptations are used to test the parallel performance of the flow solution algorithm, see Table 1. These cases are denoted Small, Medium and Large. The average vector length and number of iterations are presented in Table 2, which shows that the average vector length decreases with problem size. This is caused by the increasing number of colors after grid adaptation. A reduction in the number of colors is possible by limiting the number of neighboring cells connected to each cell face. In order to investigate the dependence of the performance results on the vector length a special case, labeled Long, is also tested, see Table 2.

                  Small   Medium   Large   Long
vector length      8000     2000    1000   120,000
iterations          100      100     300   100
adaptations           0        1       1   0

Table 2

NEC SX-3
The two computationally most intensive parts of the flow solution algorithm are the routines Limit and Flux. Limit applies a slope limiter to ensure monotonicity and Flux computes the fluxes through cell faces. The suffixes 1G and 4G in Tables 3 and 4 refer to the number of Gauss quadrature points used in the evaluation of the flux integrals. The two routines constitute 90% of the total computing time. They have roughly the same structure: a nested loop, first over all colors and then over all faces of one color.

MFlop rates on a single processor NEC SX-3 are reported in Table 3. The rates are based on flop counts and elapsed times. The decrease in overall performance for the Medium and Large problems is caused by the larger number of colors after grid adaptation, which results in a reduced vector length. The case Long does not suffer from this reduction in performance. Table 2 also indicates whether the grid is adapted.
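The coloring of the face loops described above can be sketched as follows. This is a minimal illustration, assuming a simple greedy coloring; the paper does not state which coloring algorithm Hexadap uses, and the names `color_faces`, `accumulate_residual`, `face_cells` and `face_flux` are hypothetical. The only property relied on is the one stated in the text: within one color, all faces connect to cells with different addresses, so the residual updates inside a color are independent.

```python
from collections import defaultdict

def color_faces(face_cells):
    """Greedy coloring: two faces in the same color never share a cell.

    face_cells[f] is the tuple of cell indices connected to face f.
    Returns a list of colors, each a list of face indices.
    """
    cell_colors = defaultdict(set)   # colors already used at each cell
    colors = []
    for f, cells in enumerate(face_cells):
        used = set()
        for c in cells:
            used |= cell_colors[c]
        k = 0
        while k in used:             # smallest color not used by a neighbor
            k += 1
        if k == len(colors):
            colors.append([])
        colors[k].append(f)
        for c in cells:
            cell_colors[c].add(k)
    return colors

def accumulate_residual(face_cells, face_flux, n_cells):
    """Add each face flux to the residual of its two cells, color by color."""
    residual = [0.0] * n_cells
    for faces in color_faces(face_cells):   # outer loop over colors
        for f in faces:                     # no write conflicts inside a color
            left, right = face_cells[f]
            residual[left] -= face_flux[f]
            residual[right] += face_flux[f]
    return residual
```

In this picture the average vector length is the number of faces divided by the number of colors, which is why a cell with many neighboring subfaces (and hence many colors) shortens the vectors.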

Case          Flux   Limit   Total
Small(4G)     (not legible in the source)
Medium(1G)    406    258     (not legible)
Large(1G)     371    241     (not legible)
Long(4G)      484    445     452
Long(1G)      463    318     314

Table 3: Megaflop rates on single processor NEC SX-3 (based on elapsed times)

Two parallelization strategies have been tested. The first strategy executes the loops over the colors in parallel and vectorizes the inner loop over the faces. Part of the inner loop over the faces consists of an update of the residuals at the cell centers. Within one color all faces connect to cells with different cell addresses, but this is not assured between different colors, causing a data dependency. Hence, in the above parallelization strategy, the residual updates have to be performed in a critical section, where only one processor is active at a time. The second strategy divides the loop over the faces within one color over the available processors. The main problem with this approach is that sufficient vector length should remain after loop division.

The MFlop rates and speedup results are presented in Table 4. The timings and speedups are influenced by the use of the external memory unit XMU of the SX-3. The XMU allows for fast access to data which cannot be placed in core memory. Sequentially, the use of the XMU instead of core memory hardly decreases performance. During parallel execution, however, locks applied during I/O seriously deteriorate the performance. If we compensate for the time spent during I/O to the XMU, speedups increase; the corrected speedups are labeled Corr in Table 4. The MFlop rates in Table 4 are based on the corrected speedups.

Machine  Case         Strategy   Flux   Corr   Limit   Total   Corr   MFlop/s
SX-3     Large(1G)    C          1.4    1.8    1.2     1.2     1.3    356
SX-3     Large(1G)    F          1.1    1.4    1.2     1.1     1.2    322
SX-3     Long(4G)     C          1.5    1.5    1.4     1.3     1.4    614
SX-3     Long(4G)     F          1.7    1.7    1.6     1.5     1.6    701
SX-3     Long(1G)     C          1.5    1.8    1.3     1.3     1.4    440
SX-3     Long(1G)     F          1.6    1.9    1.6     1.5     1.6    495
SGI      Small(4G)    F          3.7    -      2.9     2.3     -      94
SGI      Medium(1G)   LL         1.5    -      2.0     1.5     -      37
SGI      Medium(1G)   F          2.9    -      2.3     2.0     -      51

[The SX-3 rows for Small(4G) and Medium(1G), strategies C and F, cannot be assigned to columns reliably. The surviving values, in source order, are: Flux 1.5, 1.6, 1.5, 1.3; further speedup pairs 1.6 1.6, 1.6 1.7, 1.8 1.2, 1.6 1.4, 1.4 1.5, 1.3 1.3, 1.5 1.6, 1.5 1.6; MFlop/s 624, 566, 364, 376.]

Table 4: Speedups relative to single processor performance (based on elapsed times); SX-3 two processors; SGI four processors; C: parallel loop over colors (microtasking); F: parallel loop over faces within one color (macrotasking); LL: low level microtasking

The results for the first parallelization strategy, namely parallel execution of loops over the colors, are obtained using microtasking and are labeled 'C' in Table 4. The speedups are with respect to elapsed times. It is clear from the results that the efficiency of the parallelization is rather low. This has two reasons. First, the critical section consumes 20% of the computing time, and second, the parallel system overhead is about 10%. This large sequential part limits the maximum attainable speedup on more processors to 5.

The second parallelization strategy, namely parallel execution of loops over the faces within one color, does not suffer from a critical section. At first the code was parallelized using microtasking. The program structure is such that the flux computation is split into many different loops in different functional subroutines. Therefore the computational load per loop is low, less than 1.5 msec. It turned out that this load is too low to be efficient on the NEC SX-3: the parallel overhead was as large as, or even larger than, the parallel gain and no speedup was obtained.

Using macrotasking the parallel overhead could be reduced significantly. Instead of parallelizing each loop separately, the work is divided into two tasks in the subroutines Flux and Limit, each task doing the same job as the subroutines, but on only half the loop. This not only reduced the parallel system overhead, but also reduced memory use. In microtasking local data is copied for each processor; in macrotasking the local data can be defined per task, and thus approximately halved with respect to the sequential program. Memory use for the medium sized problem is 498 MByte for the sequential program, 540 MByte for the microtasked program and 515 MByte for the macrotasked program. Speedups for the macrotasked program are presented in Table 4 and labeled 'F'.

The decrease in parallel performance with increased problem size can be attributed to the reduced vector length. This is clearly demonstrated by the results of test case Long, which has an average vector length of 120000 in the loops over the cell faces. This problem reaches the highest parallel performance, with a speed-up of 1.9 in routine Flux. Another factor which significantly reduces the performance of the flow solution algorithm on a NEC SX-3 computer is the limited memory bandwidth. This is especially important for the large number of indirectly addressed loops and a main reason for the big gap between sustained and peak performance. The memory bandwidth limitations are most evident in Limit, where the ratio between computations and load/stores is rather low.

SGI Power Challenge
The SGI Power Challenge has scalar processors and therefore no problems with data dependencies within a processor. The code was therefore parallelized using the second parallelization strategy, namely parallel execution of the loops over the cell faces. Only the Small and Medium problems were tested, since the other problems did not fit in memory.

Two implementations are made, one by parallelizing each loop separately (low-level), and one using the same macrotasking structure as described in the previous section. Parallelization is straightforward using the parallel code of the SX-3. Directives are changed to SGI directives. The macrotasking is accomplished using the CONCURRENT CALL assertion.

Results of speedups and MFlop rates are presented in Table 4. The low-level parallelization is labeled 'LL' and the macrotasking results are labeled 'F'. Since the SGI has no XMU there is no correction for the speedups: the entire program is run in core memory. The speedups for macrotasking are better than for the low level parallelization. The performance in MFlops of the SGI four processor Power Challenge, as listed in Table 4, is between 10% and 17% of the two processor SX-3 performance and not sufficient for large scale computing. The percentage of peak performance is between 3% and 7% on the SGI Power Challenge and between 6% and 13% on the two processor NEC SX-3.

Results of the SGI Power Challenge are rather sensitive to cache misses. A parameter in the flow solution algorithm determines the number of cell faces in the flux calculation processed at one time. Varying this parameter changes the amount of data being processed, and can be used to optimize the cache use of the program. Significant differences can occur, and the optimal value of the parameter depends on the problem at hand (see Figure 2). The speedups of Table 4 are computed using the optimal timing results.

Figure 2: Cache dependency of speedups on the SGI Power Challenge in routine Flux(4G) (solid line: Small; dashed line: Medium; horizontal axis: loop length)

Estimate of the computing time for a LES of a clean wing on a NEC SX-4/16 computer
The parallel performance on the NLR NEC SX-3/22 has been used to estimate the problem size of a Large Eddy Simulation of a clean wing on a 16 processor NEC SX-4, which will be delivered to NLR in 1996. The NLR NEC SX-4/16 is expected to have a peak performance of 32 GFlop/s, a main memory of 4 GByte and 8 GByte XMU. With respect to the SX-3/22 its architecture is more suited for indirect addressing and a single processor speedup of 2 is expected for programs using indirect addressing.

The size of the LES is primarily determined by the available memory. Let $N$ be the number of grid points, and $n$ the number of flow variables. For a Large Eddy Simulation with a one-equation turbulence model we have $n = 6$. The memory use of Hexadap is $8(12n + 40)N + 2 \cdot 10^8$ Byte. With an available memory of 8 GByte and 8 bytes per variable the maximum number of grid points is $N = 9 \cdot 10^6$. Using the estimates given by Chapman [3], this number allows for a LES with sublayer resolution around a clean wing at a Reynolds number of approximately $10^6$.

The computing time for one time step is estimated from the relation:

$$t = \frac{n N}{S_A S_C}\left(\frac{f_S}{r_S} + \frac{1}{S_{16}}\left(1.3 \cdot 1.1 \cdot \frac{f_F}{r_F} + \frac{f_L}{r_L} + \frac{f_R}{r_R}\right)\right)$$

with: $S_A$ a factor to account for grid adaptation, $S_A = 0.9$, and $S_C$ the single processor speedup of the NEC SX-4 compared to the SX-3, $S_C = 2$. The suffixes S, F, L and R refer to the following parts of the algorithm: S, the serial part; F, subroutine Flux(1G); L, subroutine Limit; and R, the remaining part of the flow solution algorithm which is parallelizable. The variables $f$ denote flop counts in the respective parts of the algorithm to advance one flow variable one time step in one grid point. The measured values are: $f_S = 90$, $f_F = 1570$, $f_L = 880$ and $f_R = 180$. The variables $r$ denote the measured flop rates in the respective parts of the algorithm and are equal to: $r_S = r_R = 350 \cdot 10^6$, $r_F = 463 \cdot 10^6$ and $r_L = 350 \cdot 10^6$ flop/s. The flop count in routine Flux is increased by 10% for the viscous contribution and by 30% for a one-equation subgrid model using the Germano approach. The parallel speedup, denoted by $S_{16}$, on a 16 processor NEC SX-4 is estimated as twelve. The computing time required to advance one time step on a grid with $9 \cdot 10^6$ grid points is then approximately 28 seconds.

The time scale of the smallest eddies in the flow field will be approximately 100 times larger than the CFL limit for an explicit scheme. The CFL time step limitation can be removed with an implicit, time accurate temporal discretization using multigrid acceleration. With these assumptions a Large Eddy Simulation of a clean wing at a Reynolds number of $10^6$ on a mesh with $9 \cdot 10^6$ grid points which evolves over 6500 time steps, which should be sufficient to obtain a reasonable statistical sample, would require 50 hours on a 16 processor NEC SX-4.

Conclusions of the parallelization
Provided that the vector length is sufficient, the most efficient parallelization strategy for the present flow solution algorithm is a high level parallelization of loops over faces of one color using macrotasking. Macrotasking reduces parallel system overhead and memory use. Correcting for the XMU, a maximum speedup of 1.9 is reached on a two processor SX-3.

There are three causes for the less than perfect overall performance on the NEC SX-3:
• I/O between Main Memory and XMU in parallel processing takes significantly more time,
• Vector length decreases, and hence single processor speed,
• Parallel system overhead.
Concerning the latter cause, the balance between the two processors is, when corrected for the I/O between MMU and XMU, as predicted by the size of the parallel part of the algorithm. Hence, the computational load is well balanced, and the remaining performance loss can only be explained by parallel system overhead. Since the NEC SX-3 is not primarily suited for parallel use, the relatively high parallel system overhead is not too surprising. It is expected that the NEC SX-4 has significantly less overhead.

Low-level do-loop parallelization on the NEC SX-3 turns out to be sufficient only for loops with a computational load greater than 1.5 msec.

The parallel efficiency on the SGI Power Challenge is similar; the percentage of peak performance is relatively low, even compared with the NEC SX-3. Moreover, the cache sensitivity makes the optimization problem dependent.

The present parallelization on the NEC SX-3 will not be sufficiently efficient on the 16 processor SX-4. The parallel execution of the loops over the cell faces is inefficient since the loop length will be too short to be divided over 16 processors. This problem can be solved by limiting the number of neighboring cells connected to one cell face to at most four, which significantly reduces the number of colors and thereby increases vector length. The parallel execution of the loop over the colors contains a sequential part of 20%, and hence has a maximum speedup of 5. This sequential part can be eliminated using a domain decomposition of the grid, which also has as main benefit that the grid adaptation part can be executed in parallel.

CONCLUDING REMARKS
The discontinuous Galerkin finite element method with local grid enrichment has been demonstrated on the three-dimensional, inviscid flow field around a delta wing at transonic speed. The use of anisotropic grid refinement of hexahedron type cells is effective in capturing the shock structure and primary vortex on the leeward side of the delta wing. The discontinuous Galerkin method works well on highly irregular grids and is therefore a good candidate for Large Eddy Simulations, because it offers the opportunity to capture viscous sublayers with successively finer grids through local grid refinement. An estimate of the required computational resources for such a simulation is presented. The use of a face based data structure works well in combination with local grid refinement and allows efficient vectorization and parallelization of the code. On the NEC SX-3 the possible speedup through parallelization strongly depends on the vector length. A maximum speed-up of 1.9 on the two processor NEC SX-3 is obtained when sufficient vector length is available. A good parallel performance, with a speed-up of 3.7, is obtained on the four processor SGI Power Challenge, but the results are sensitive to cache misses.

From the present results it is estimated that for future LES applications in wall bounded flows, the gain from the increased computational efficiency obtained from highly adapted grids more than compensates for the increased number of operations and memory use. A LES of a clean wing at a Reynolds number of $10^6$ will become feasible on a 16 processor NEC SX-4 in a turnaround time of one weekend. Significant further developments, such as the addition of the viscous contribution and implicit time-accurate temporal discretization using multigrid acceleration (in progress), will, however, be needed to reach this goal.

REFERENCES
[1] Barth, T.J. and Jespersen, D.C. The design and application of upwind schemes on unstructured meshes. AIAA Paper 89-0366, 1989.
[2] Bey, K.S. and Oden, J.T. A Runge-Kutta discontinuous finite element method for high speed flows. AIAA Paper 91-1575-CP, 1991.
[3] Chapman, D.R. Computational aerodynamics development and outlook. AIAA Paper 79-0129, 1979.
[4] Cockburn, B. and Shu, C.W. TVB Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws II: General framework. Math. Comp., 52:411-435, 1989.
[5] Cockburn, B., Hou, S. and Shu, C.W. The Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws IV: The multidimensional case. Math. Comp., 54:545-581, 1990.
[6] Cockburn, B., Lin, S.Y. and Shu, C.W. TVB Runge-Kutta local projection discontinuous Galerkin finite element method for conservation laws III: One-dimensional systems. JCP, 84:90-113, 1989.
[7] Elsenaar, A., Hjelmberg, L., Bütefisch, K.A. and Bannink, W.J. The international vortex flow experiment. AGARD Symposium on Validation of Computational Fluid Dynamics, Lisbon, AGARD CP 437, 1987, also AGARD Advisory Report 303, 1994.
[8] Hoeijmakers, H.W.M., Jacobs, J.M.J.W. and Van Den Berg, J.I. Numerical simulation of vortical flow over a delta wing at subsonic and transonic speed. Presented at 17th ICAS Congress, Stockholm, Sweden, 1990.
[9] Marchant, M.J. and Weatherill, N.P. Adaptivity techniques for compressible inviscid flows. Comp. Meth. in Appl. Mech. and Eng., 106:83-106, 1993.
[10] Moin, P. and Jimenez, J. Large eddy simulation of complex turbulent flows. AIAA Paper 93-3099, 1993.
[11] Osher, S. and Chakravarthy, S. Upwind schemes and boundary conditions with applications to Euler equations in general geometries. JCP, 50:447-481, 1983.
[12] Shu, C.W. and Osher, S. Efficient implementation of essentially non-oscillatory shock-capturing schemes. JCP, 77:439-471, 1988.
[13] Van Der Vegt, J.J.W. Higher-order accurate Osher schemes with application to compressible boundary layer stability. AIAA Paper 93-3051, 1993.
[14] Van Der Vegt, J.J.W. Anisotropic grid refinement using an unstructured discontinuous Galerkin method for the three-dimensional Euler equations of gas dynamics. AIAA Paper 95-1657, 1995.
[15] Venkatakrishnan, V. Convergence to steady state solutions of the Euler equations on unstructured grids with limiters. JCP, 118:120-130, 1995.
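The resource estimates quoted in this paper (maximum speedup of 5 for a 20% sequential part, about $9 \cdot 10^6$ grid points in 8 GByte, about 28 seconds per time step and about 50 hours for 6500 steps) can be checked with a short sketch. The function names are hypothetical. The form of the time-per-step relation is an assumed reading of the source equation, namely $t = \frac{nN}{S_A S_C}\left(\frac{f_S}{r_S} + \frac{1}{S_{16}}\left(1.3 \cdot 1.1 \frac{f_F}{r_F} + \frac{f_L}{r_L} + \frac{f_R}{r_R}\right)\right)$, chosen because it reproduces the quoted numbers; 8 GByte is taken as $8 \cdot 10^9$ bytes.

```python
def amdahl_speedup(serial_fraction, n_proc):
    """Amdahl's law: speedup on n_proc processors for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_proc)

def max_grid_points(memory_bytes, n_vars, fixed_bytes=2.0e8, bytes_per_word=8):
    """Invert the memory model 8(12n + 40)N + 2e8 Byte for N."""
    return (memory_bytes - fixed_bytes) / (bytes_per_word * (12 * n_vars + 40))

def time_per_step(n_vars, n_points, flops, rates, s_par=12.0, s_a=0.9, s_c=2.0):
    """Seconds per time step under the assumed form of the timing relation.

    flops, rates : dicts with keys "S", "F", "L", "R" (flop counts, flop/s)
    s_par        : parallel speedup on 16 processors (estimated as twelve)
    s_a, s_c     : grid adaptation factor and SX-4 / SX-3 single processor speedup
    """
    serial = flops["S"] / rates["S"]
    # Flux flop count raised by 10% (viscous terms) and 30% (subgrid model)
    parallel = (1.3 * 1.1 * flops["F"] / rates["F"]
                + flops["L"] / rates["L"]
                + flops["R"] / rates["R"])
    return n_vars * n_points / (s_a * s_c) * (serial + parallel / s_par)

if __name__ == "__main__":
    flops = {"S": 90.0, "F": 1570.0, "L": 880.0, "R": 180.0}
    rates = {"S": 350.0e6, "F": 463.0e6, "L": 350.0e6, "R": 350.0e6}
    t = time_per_step(6, 9.0e6, flops, rates)
    print("limit speedup for 20% serial part:", amdahl_speedup(0.2, 10**9))
    print("grid points in 8e9 bytes:", max_grid_points(8.0e9, 6))
    print("seconds per step:", t, " hours for 6500 steps:", 6500 * t / 3600.0)
```

With the values quoted in the text this gives roughly 27 to 28 seconds per step and close to 50 hours for 6500 steps, in line with the stated estimates.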
($M_\infty = 0.85$, $\alpha = 20^\circ$)
Figure 5. Total pressure loss and adapted grid in cross-section through primary vortex core.

Parallel algorithms for DNS of compressible flow


Martin Streng, Hans Kuerten, Jan Broeze and Bernard Geurts
Department of Applied Mathematics, University of Twente
P.O.Box 217, 7500 AE Enschede, The Netherlands

Abstract

We indicate that the use of higher order accurate spatial discretization is necessary to obtain sufficiently accurate DNS for the validation of subgrid models in LES. Furthermore, we pay attention to the efficiency of the implementation of these discretizations on several parallel platforms. In order to illustrate this, we consider compressible flow over a flat plate. We give a priori test results for LES of this flow.

1 Introduction

One of the most challenging problems in Computational Fluid Dynamics (CFD) is the accurate and efficient simulation of turbulent flows for relevant industrial applications. The behaviour of these flows is governed by the Navier-Stokes equations. However, because these applications usually involve complex geometries and flow-fields, the computational resources required for directly solving the Navier-Stokes equations are far beyond the resources which will be available in the foreseeable future. In this paper we will focus on turbulent compressible flow-problems in simple geometries. In order to tackle these problems with presently available computers, three different aspects must be considered: the modelling of turbulent flows, the numerical methods used to perform calculations with these models, and the implementation of these methods on suitable computer platforms.

As remarked above, direct solution of the Navier-Stokes equations (DNS) is impossible for relevant industrial applications, due to the high computational requirements. Therefore, one might use instead the Reynolds averaged Navier-Stokes (RaNS) equations in which only the statistically stationary flow is calculated and the effects of turbulence are modelled by a so-called turbulence model. However, this leads in general to quite inaccurate results since the presently available turbulence models are inadequate for more complicated flow phenomena like shock-boundary layer interaction and massive separation. A solution to this problem could be provided by Large Eddy Simulation (LES). In LES only the large eddies are calculated, while the effects of the smaller eddies, which are thought to be universal and not geometry-dependent, are described by a subgrid model.

However, before LES can be used as a tool in flow simulation, the subgrid model has to be systematically validated. This validation is usually carried out by comparing LES results with filtered DNS results for simple geometries and fairly low Reynolds numbers. In Section 2 we present a priori test results for LES of compressible flow over a flat plate for various subgrid models, including eddy-viscosity models, the similarity model and dynamic models. In the future also a posteriori tests will be carried out for this flow, as has been done e.g. by Vreman et al. [1] for the compressible mixing layer.

The numerical methods to perform the DNS are discussed in Section 3. The a priori test results are based on DNS performed using a second-order finite volume spatial discretization. It is indicated that the use of higher order spatial discretizations makes it possible to obtain more accurate DNS results. However, the use of higher order central differencing discretizations, without numerical dissipation, is not without trouble. Besides the occurrence of stability problems, higher order discretizations lead to wide stencils, which, in combination with a domain-decomposition strategy, seriously affects the parallel efficiency of the resulting algorithm.

In Section 4 the parallel efficiency will be illustrated using some implementations of the DNS solver on various parallel platforms, including distributed as well as shared memory systems, and a mixture of these types. Since many parallel platforms use cache-based processors, we consider some
Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
aspects of implementation of the flow-solver on these processors. We show that careful use of cache in the implementation of our type of discretizations can lead to considerable performance gain.

2 Modelling of turbulent flow
The equations describing compressible flow are the well known Navier-Stokes equations, which represent conservation of mass, momentum and energy:

$$\partial_t \rho + \partial_j(\rho u_j) = 0$$
$$\partial_t(\rho u_i) + \partial_j(\rho u_i u_j) + \partial_i p - \partial_j \tau_{ij} = 0 \qquad (1)$$
$$\partial_t e + \partial_j((e + p)u_j) - \partial_j(\tau_{ij} u_i - q_j) = 0$$

Here the symbols $\partial_t$ and $\partial_j$ are abbreviations of the partial differential operators $\partial/\partial t$ and $\partial/\partial x_j$ respectively. The components of the velocity vector are denoted by $u_i$, while $\rho$ is the density and $p$ the pressure which is related to the total energy density $e$ by:

$$p = (\gamma - 1)\{e - \tfrac{1}{2}\rho u_i u_i\} \qquad (2)$$

in which $\gamma$ denotes the adiabatic gas constant. The viscous stress tensor $\tau_{ij}$ is a function of temperature $T$ and velocity vector $u$

$$\tau_{ij}(T, u) = \frac{\mu(T)}{Re}\left(\partial_j u_i + \partial_i u_j - \frac{2}{3}\delta_{ij}\partial_k u_k\right) \qquad (3)$$

where $\mu(T)$ is the dynamic viscosity for which we either use Sutherland's law for air or treat it as a constant. In addition $q_j$ represents the viscous heat flux vector, given by

[equation (4) is not legible in the source]

where $Pr$ is the Prandtl number. Finally, the temperature $T$ is related to the density and the pressure by the ideal gas law

$$T = \gamma M^2 \frac{p}{\rho} \qquad (5)$$

by applying a spatial filter to these equations. A filter operation extracts the large scale part $\bar{f}$ from a quantity $f$:

[equation not legible in the source]

where $\Omega$ is the flow domain and $\Delta$ denotes the filter width of the kernel $G$ which is assumed to be normalized, i.e. the integral of $G$ over $\Omega$ equals 1 independent of $x$. For compressible flow Favre [2] introduced a related filter operation $\tilde{f} = \overline{\rho f}/\bar{\rho}$.

The filtered Navier-Stokes equations contain so-called subgrid-terms, which cannot be expressed in the filtered flow variables, and have to be modelled with subgrid-models. In this paper we will mainly focus on the modelling of the subgrid-terms in the momentum equations, which can be expressed in the turbulent stress tensor, defined as

[equation not legible in the source]

where $\tilde{u}$ is the filtered velocity vector. This turbulent stress tensor has several algebraic properties which can be used in the construction and qualification of subgrid-models [3, 4]. Expressions for the subgrid-terms in the energy equation can be found in ref. [5]. They can be neglected in simulations at low Mach numbers, but have to be modelled at high Mach numbers.

In total six models for the turbulent stress tensor $\tau_{ij}$ as it appears in the subgrid-terms in the momentum equations will be investigated and compared in this paper. The first subgrid-model is the Smagorinsky model

[equation not legible in the source]

where $S^2 = \frac{1}{2} S_{ij}^2$, with $S_{ij}$ the compressible strain rate, based on the Favre-filtered velocity. $C_S$ is the Smagorinsky constant, which we choose equal to 0.17 as suggested in literature. $\Delta$ denotes the filter width, which separates the resolved and subgrid-scales. The major short-coming of the Smagorinsky model is its excessive dissipation in regions where
These governing equations have been made dimen- the flow is laminar [6]. The similarity model, formu-
sionless by introducing a reference length LO,ve- lated by Bardina et al. [7], is based on a similarity
locity U O , density PO, temperature TOand viscos- assumption. Application of the definition of @,j to
ity po. The values of the Reynolds number Re = the filtered variables p and pUi yields the similarity
(pouoLo)/po and the Mach number M = uo/ao, model [7]:
where a0 is a reference value for the speed of sound,
are given separately. (9)
A Direct Numerical Simulation (DNS) is based on
a discretisation of (1) whereas the governing equa- The gradient model is derived with use of Taylor
tions for large eddy simulation (LES) are obtained expansions of the filtered velocity [8]. The lowest

order term in $\Delta$ in this expansion can be proposed as subgrid-model:

[Equation (10) is missing from the source text.]

The similarity and gradient model correlate much better with the turbulent stress tensor than the Smagorinsky model (see [9] and section 2.1). However, while the Smagorinsky model is too dissipative in transitional regions, the similarity and gradient model are not sufficiently dissipative in turbulent regions.
The dynamic procedure overcomes the excessive dissipation of the Smagorinsky model and adds sufficient dissipation to the similarity and gradient models. We consider three dynamic models. The dynamic eddy-viscosity model [3] is obtained when the model constant $C_S$ in the Smagorinsky model is replaced by a coefficient which is dynamically obtained and depends on the local structure of the flow. In order to calculate the dynamic coefficient, $\tau_{ij}^{(1)}$ is substituted in the Germano identity, which is a relation between the turbulent stress tensor for different filter widths [3]. The second dynamic model is the dynamic mixed model, in which a relatively accurate representation of the turbulent stress by the similarity model and a proper dissipation provided by the dynamic eddy-viscosity concept are combined [10]. The dynamic model coefficient is obtained by substitution of the base mixed model, $\tau_{ij}^{(1)} + \tau_{ij}^{(2)}$, in the Germano identity. Another dynamic model is the dynamic Clark model [11]. In this case the base model is the Clark model, $\tau_{ij}^{(1)} + \tau_{ij}^{(3)}$, and the model coefficient $C_S$ is obtained by substitution of this model in the Germano identity.

2.1 Results
We consider flat plate flow at $Re = 1000$ based on the initial displacement thickness $\delta^*$, and the other reference scales are equal to the initial far-field values. We choose $M = 0.5$ and consider a temporal simulation in a cubic domain of size 30. A forcing term corresponding to the compressible similarity solution of the boundary layer equations is added. The mean initial field also equals this similarity solution, to which the dominant 2D mode and a pair of equal and oblique 3D modes are added with amplitude and amplitude-ratios (1/2, 1/4, 1/4) respectively. For validation purposes the linear growth rates of the instabilities were recovered with a relative error well within 1 percent on a grid with $128^3$ cells, uniform in the stream- and spanwise directions and clustered near the isothermal, no-slip wall in the normal direction. A second order accurate finite volume method was used.

Figure 1: Modes of kinetic energy (a) [(1,0): solid, (2,0): dashed, (1,1): dotted, (2,2): dash-dotted] and shape-factor (solid), skin-friction (dashed) versus time t (b)

Results from a DNS on $128^3$ cells are shown in Figure 1. The persisting symmetry in the spanwise direction was exploited in order to reduce the computational effort. The evolution of the amplitude of some modes of the kinetic energy (Fig. 1) clearly displays the initial linear regime with an exponential growth of the instabilities. The corresponding large-scale structures which emerge subsequently interact in the nonlinear regime and give rise to a rapid transition in which many modes become simultaneously important. A broad spectrum is generated and a developed turbulent flow results in which the individual modes display an erratic time-dependence. To represent this scenario in a

different way, the shape-factor and the skin-friction are shown in Figure 1. The resolution is adequate in the linear and transitional stages with a fall off of 10 decades or more in the spectrum of the kinetic energy. However, at the onset of turbulent flow and in the developed stages a fall off of no more than 6-7 decades was observed. Hence, the results in the turbulent regime are expected to be only qualitatively correct and further grid refinement is needed.

Figure 2: Dynamic coefficients: Germano (solid), dynamic mixed (dashed) and dynamic Clark (dash-dotted).

In order to obtain a first impression of the quality of the various subgrid models for this flow we focus on the correlation between $\bar\rho\tau_{12}$ and the corresponding modelled component of the turbulent stress tensor. We use a filter-width equal to four grid-cells and a special filtering near the wall which prevents the filter from extending inside the wall. The models are tested both in the transitional and in the turbulent regime. The similarity and gradient models as well as the dynamic mixed and dynamic Clark models show a high correlation of about 0.9. The Smagorinsky and dynamic eddy-viscosity models show a poor correlation of about 0.3. The eddy-viscosity contribution in the dynamic mixed and dynamic Clark model does not destroy the high correlation. In Figure 2 we compare the dynamic coefficients for the three dynamic models at t = 2700. The coefficients are averaged over the homogeneous directions. We observe that the Germano coefficient is larger than the coefficient associated with the other two dynamic models. Moreover, all coefficients drop to zero in the near-wall region which is appropriate for wall-bounded shear layers.

3 Numerical method
As has been remarked in the previous section, the DNS results in the turbulent regime are expected to be only qualitatively correct, and further grid refinement is needed. However, the number of grid-cells used is already fairly large for presently available computer resources. Instead of refinement, we presently consider the use of higher order discretization methods. The aim is to obtain a more accurate DNS with a moderate number of points. However, this is not without problems. One drawback is that high order methods lead to wide stencils, which decreases the parallel efficiency of the resulting code, as we will see in the next section. Another problem associated with these methods is that the discretisation of the convective and the viscous flux must be carefully constructed in order to avoid instabilities. This is especially present in central differencing methods, and is not only related to the occurrence of $\pi$-modes, but also to adequate damping of aliasing errors.

3.1 Spatial discretization
Consider an orthogonal grid with points $x_{i,j,k}$, which is uniform in x and z direction. We use the following central differencing discretization of the $\partial_x$-operator:

[Equations (11) and (12) are missing from the source text; only the summation limits $n = -d$ in (11) and $n, m = -d$ in (12) survive.]

Here the weights $w^{diff}$ are derivative weights, and $w^{av}$ are average weights. Due to the uniformity in x and z direction they only depend on j. The quantities $\bar f$ represent the average of the function $f$ over a stencil in j-k direction. For the convective flux we use a stencil with $N_c$ points, and the weights $w^{av}$ are constructed such that $\pi$-modes in the j and k direction are filtered out, and moreover that polynomials up to degree $N_c - 1$ are invariant under the averaging. The derivative weights $w^{diff}$ are such that polynomials up to degree $N_c$ are exactly differentiated. The resulting discretization has order $N_c$ on uniform grids. The $\pi$-modes in i-direction are damped by the viscous derivative.
The viscous flux is discretized using repeated differentiation. The inner derivative is calculated on a

staggered grid. Both the inner and the outer derivatives are discretized analogously as in the convective flux, on $N_v$ points, except that now $\pi$-modes are not filtered out. Both derivatives are then of order $N_v - 1$, but due to symmetry, on a uniform grid, the viscous flux is discretized up to order $N_v$.
Due to the nonlinearity in the convective flux, high frequency modes arise from a low-frequency initial state. In physical reality, these are damped by the viscous effects in the fluid. In the numerical simulation, however, two difficulties arise. The first is that both the convective and the viscous flux are calculated inaccurately. In our central differencing discretisations, on relatively coarse grids, a situation may arise in which the numerical viscous terms do not have enough dissipation to damp the numerical convective terms, giving rise to instabilities. The second difficulty is that due to the finite grid-spacing, there is a maximum wavenumber which can be represented on the grid. Modes with a higher wavenumber appear as low-frequency modes on the grid. Therefore, numerically, the effective energy contained in the low-frequency modes can be increased during the onset of turbulence. One remedy could be to take a grid that is sufficiently fine to represent the highest mode which due to physics would emerge in the simulation. Another possibility is to use upwind-biased discretizations of the convective flux, as has been done by Rai and Moin [12]. We have used a discretisation of the viscous flux with a wider stencil than necessary to achieve the desired order of accuracy. In this way we constructed a better approximation of the viscous flux. As an example, we were able to calculate a full transition to turbulence on $96^3$ points using a fourth order method on a $5^3$-points stencil for the convective flux, and repeated application of a fourth order method on $6^3$ points for the viscous flux, resulting in an $11^3$-points stencil, whereas repeated application of a $4^3$ points operator for the viscous flux on this grid failed. At this moment, further investigation is needed to understand this phenomenon more clearly.
The DNS mentioned in the previous section has been calculated at Mach number 0.5. In the future we intend to perform DNS at higher Mach numbers. For that purpose we need to be able to capture shocks. This can be done by switching to upwind discretizations in the presence of a shock, which has been applied successfully to the supersonic compressible mixing layer, cf. ref. [13]. In that application a fourth order central difference operator has been used for the convective term, which was replaced by a third order accurate upwind scheme in the presence of a shock. See Figure 3. In this way it is possible to capture time-dependent shocks which appear spontaneously after the transition to turbulence.

Figure 3: Shock-capturing in 3D turbulent mixing-layer.

3.2 Time integration
For the time integration of the resulting discretized equations we use an explicit 4-stage Runge-Kutta method. We also studied the use of a second-order accurate implicit method. The system of equations resulting from the implicit discretization is solved by means of pseudo-time stepping and accelerated by local pseudo-time stepping and a nonlinear multigrid technique. Since we use central spatial discretizations and no artificial dissipation is added to the equations, the smoothing method is less effective than in the traditional use of multigrid in steady-state calculations. In the laminar regime and in the first stages of turbulence the implicit method provides a speed-up of a factor of 2 relative to the explicit method on a relatively coarse grid ($64^3$). At increased resolution this speed-up is enhanced correspondingly. See [14].
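
The explicit 4-stage Runge-Kutta time integration mentioned above can be sketched as follows. This is a minimal illustration and not the authors' code: the paper does not list its stage coefficients, so the values (1/4, 1/3, 1/2, 1), the function names and the scalar test problem du/dt = -u are all assumptions made for the example. With these assumed coefficients every stage restarts from the state at the beginning of the step, so only two copies of the state are needed; the scheme is fourth-order accurate for linear problems and second-order accurate in general.

```python
import math

# Assumed stage coefficients of a low-storage 4-stage scheme (not from the paper).
ALPHAS = (1.0 / 4.0, 1.0 / 3.0, 1.0 / 2.0, 1.0)


def rk4_low_storage_step(rhs, u, dt):
    """Advance the state list u by one time step of size dt.

    rhs(u) returns the semi-discrete right-hand side du/dt as a list.
    Every stage is built from the state at the start of the step (u0)
    and the right-hand side of the previous stage.
    """
    u0 = list(u)
    stage = list(u)
    for alpha in ALPHAS:
        r = rhs(stage)
        stage = [a + alpha * dt * b for a, b in zip(u0, r)]
    return stage


def decay(u):
    # Hypothetical test problem du/dt = -u, exact solution exp(-t).
    return [-x for x in u]


def integrate(rhs, u, dt, n_steps):
    """Apply n_steps Runge-Kutta steps of size dt."""
    for _ in range(n_steps):
        u = rk4_low_storage_step(rhs, u, dt)
    return u


u_end = integrate(decay, [1.0], 0.01, 100)
error = abs(u_end[0] - math.exp(-1.0))
print("error at t = 1:", error)
```

In a block-structured solver of the kind described in the next section, `rhs` would evaluate the convective and viscous fluxes, and the dummy-layers would be exchanged before each of the four calls to `rhs`.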

4 Parallel implementation of the explicit solver
In this section we consider some implementational aspects of the explicit solver. We use a simple domain-decomposition technique to obtain an implementation on a parallel computer. This is explained in the first subsection. In the next subsection we discuss how the parallel efficiency of the resulting code depends on the spatial discretization. We distinguish between the intrinsic efficiency of an algorithm, and the hardware efficiency. The former is related to the algorithm only, whereas the latter tells us how well a certain algorithm performs on certain hardware. The quantity which is usually called the efficiency is the product of these efficiencies. We show that the intrinsic efficiency of the algorithm decreases as the order of the spatial discretization increases. We illustrate these concepts by some performance results obtained from implementations on 3 different parallel machines, viz. the Cray T3d, the Intel Paragon and the SGI Power Challenge array. Closely related to the concept of efficiency is the scalability. We discuss the scalability in the sense of Amdahl and Gustafson (see e.g. ref. [15]).

4.1 Domain decomposition
Suppose our computational domain consists of $N_x \times N_y \times N_z$ gridpoints. This domain is divided into $B_x \times B_y \times B_z$ blocks. For a distributed memory computer, we assume that each block is allocated on a separate processor. If the total size of the stencil used for the discretisation is $(2d+1)^3$ (recall that we use central differences, cf. (11), (12)), then a point which has a distance less than $d+1$ grid-points from the boundary of a block not coinciding with the boundary of the physical domain, is called an interior boundary point. This definition can easily be extended to other discretisation methods. For the computation of the fluxes for the interior boundary points, some values of the flow-quantities which reside on processors dealing with neighbouring blocks are needed. To store these quantities, each block is dressed with d dummy-layers. In order to retain the second-order accuracy of the time-integration method, at each stage in the Runge-Kutta time-integration, these dummy-layers have to be transferred between the various processors. It may be clear that the amount of communication increases with the size of the stencil.
Not only the amount of communication is affected by the size of the stencil, but also the number of floating point operations increases with increasing stencil-size. To see why, recall the general form of the $\partial_x$-operator, eq. (11)-(12). This derivative is computed as a one-dimensional derivative acting on two-dimensional averages over y and z. For the derivative in an internal boundary point these averages have to be computed for points in the dummy-layers as well. But these averages are also computed by the processors dealing with the neighbouring block in order to contribute to the $\partial_x$ derivative of some points in that block. For a discretization on a stencil with $N_x \times N_y \times N_z$ points, careful counting reveals that the number of floating-point operations for the computation of one derivative is

$$(3N_xN_yN_z + 4dN_\ast N_\ast + 2dN_\ast N_\ast + 4d^2N_\ast)(2d - 1),$$

where each $N_\ast$ stands for one of $N_x$, $N_y$, $N_z$ whose subscript is illegible in the source text. Note that this expression is not symmetric in $N_x$, $N_y$, $N_z$. For the other derivatives the discrete averaging and differentiation operators can be applied in such an order that the same expression is valid. In the case $N_x = N_y = N_z = N$, this reduces to

$$(3N^3 + 6N^2d + 4d^2N)(2d - 1). \tag{13}$$

Now consider e.g. a given partition of the computational domain into $B^3$ equal blocks, each containing $(N/B)^3$ points. Then the total number of floating-point operations to compute a $\partial_x$ for all grid-points is

$$\left(3\left(\frac{N}{B}\right)^3 + 6\left(\frac{N}{B}\right)^2 d + 4d^2\frac{N}{B}\right)(2d - 1)B^3,$$

which is obviously greater than (13).

4.2 Parallel efficiency
To quantify the considerations of the previous paragraph, we define the concept of intrinsic efficiency. Consider a given partition of the computational domain into $B_x \times B_y \times B_z$ blocks. Denote the total number of floating point operations for a given number of timesteps by $f(B_x, B_y, B_z)$. Then the intrinsic efficiency $\sigma_{intr}$ is given by

$$\sigma_{intr}(B_x, B_y, B_z) = \frac{f(1,1,1)}{f(B_x, B_y, B_z)}. \tag{14}$$

Note that, on a shared memory machine, if we use fine-grained parallelism (on do-loop level), we could define $\sigma_{intr} = 1$.
We can estimate the dependence of the intrinsic efficiency on the size of the stencil just by counting the number of floating-point operations for various

block-sizes (by using expressions like (13)). In Figure 4 this has been done for several central differencing discretizations, using equal shapes and sizes for all blocks. From the pictures it can be seen that the efficiency decreases rapidly if the stencil-size grows. Due to the wider stencil, application of higher-order discretizations results in more floating-point operations, but this performance penalty is even more severe on distributed memory systems, where also a decrease of parallel performance occurs. As an example, consider a central differencing second order $\partial_x$ operator on a 3-point stencil as compared to a central differencing fourth order operator $\partial_x$ on a 5-point stencil. To compute the former derivative on a single-cpu machine costs approximately 5/9 ≈ 0.56 times the time to compute the latter, whereas on e.g. a 64 × 64 × 32 grid and 128 processors on a distributed memory machine this ratio is approximately 0.33.

Figure 4: Intrinsic efficiency for various spatial discretizations (plotted against the number of blocks, i.e. processors)

The intrinsic efficiency deals with the parallelizability of a given algorithm, regardless of any machine. In fact it gives the maximum speed-up that can be achieved for the algorithm. In a real implementation the speed-up will be less, due to e.g. the finite bandwidth of the machine. To quantify this, we now define the hardware efficiency $\sigma_{hw}$. Suppose the CPU time to perform a certain number of timesteps on one processor using one block is $T(1,1,1)$. Then, using $B_xB_yB_z$ processors, the CPU time cannot be shorter than

$$\frac{T(1,1,1)}{B_xB_yB_z\,\sigma_{intr}(B_x, B_y, B_z)}.$$

In general, due to the finite communications bandwidth of the machine, the simulation will last longer, say $T(B_x, B_y, B_z)$ seconds. Then the hardware-efficiency $\sigma_{hw}$ is

$$\sigma_{hw} = \frac{T(1,1,1)}{T(B_x, B_y, B_z)\,B_xB_yB_z\,\sigma_{intr}(B_x, B_y, B_z)}. \tag{15}$$

The traditional (total) efficiency $\sigma$ is the product

$$\sigma = \sigma_{hw}\,\sigma_{intr}. \tag{16}$$

Note that, in general, these efficiencies not only depend on the number of blocks in each direction, but also on the number of points per block in each direction, i.e. on the actual shape of the blocks. This is not only due to the ratio of interior boundary points as compared to the interior points of each block, but also because many processors perform better on long inner loops in the code, due to vectorisation or pipelining.
The efficiency $\sigma$ is related to scalability in the sense of Amdahl, meaning that a problem which is solved on one processor in $T_1$ seconds is solved on $P$ processors in $T_1/(P\sigma)$ seconds. We define one notion of efficiency related to scalability in the sense of Gustafson. Suppose we solve a problem with $N$ gridpoints on one processor in $T_1$ seconds, and a problem with $PN$ gridpoints on $P$ processors in $T_P$ seconds. Then the efficiency $\sigma_G$ is

$$\sigma_G = \frac{T_1}{T_P}. \tag{17}$$

These concepts are illustrated in Figure 5. Here we performed 5 timesteps on a 64 × 64 × 32 grid, with a 5 point central differencing discretization of the convective flux, and a repeated application of a four-point central differencing for the viscous flux, resulting in a total stencil containing 7 × 7 × 7 points. Plotted are the intrinsic efficiency and the total efficiency. Because it was not possible to execute the program on 1 or 2 CPUs on the Paragon, the efficiencies are based on the timings for the 4-processor run. We used 2 different distributed memory machines, viz. the Cray T3d and the Intel Paragon. On these machines, explicit message-passing has been employed. The actual CPU-times for the runs are tabulated in Table 1. A dash indicates that it had not been possible to perform the run on the indicated number of processors, either because the processors do not have enough memory (in the case of 1 and 2 processors on the Paragon) or because the indicated number of processors was not available on that machine. The CPU times are dependent on the actual shape of the blocks. Therefore in Table 1 we averaged over some block-divisions which give roughly the same (approximately best) CPU-time. This dependency is illustrated in Table 2 for the case of 8 blocks. All timings are accurate to about 5 %. It can be seen that subdivisions with an equal number of blocks in all directions are optimal. In general, better subdivisions are obtained by using fewer blocks in x-direction. This is partly due to the algorithm, since an asymmetry is introduced by the sequence of averaging-operators in the derivative-calculations, and partly due to software-pipelining in the processors, which is reflected in the megaflop-rates (between parentheses).

# proc.    T3d      Paragon
1          207.5    -
2          109.6    -
4          58.3     90.6
6          42.7     65.2
8          33.4     50.3
12         23.8     36.8
16         17.1     27.4
24         13.5     20.4
32         9.9      15.7
48         7.5      11.7
64         6.0      11.1
96         4.8      7.6
128        3.8      -

Table 1: CPU times in seconds (averaged over several block-divisions).

Figure 5: Efficiency for the T3d (dashed) and the Paragon (dotted). The solid line is the intrinsic efficiency.

B_x × B_y × B_z    T3d          Paragon      Mflop per block
1 × 1 × 8          34.9 (77)    58.0 (47)    335
1 × 8 × 1          34.3 (82)    54.4 (52)    353
8 × 1 × 1          39.2 (77)    67.3 (45)    378
1 × 2 × 4          32.4 (80)    50.9 (52)    323
1 × 4 × 2          31.6 (83)    50.0 (53)    328
2 × 1 × 4          33.0 (79)    52.9 (50)    327
2 × 4 × 1          31.7 (84)    51.5 (53)    335
4 × 2 × 1          32.9 (80)    54.6 (50)    327
4 × 1 × 2          32.9 (82)    55.4 (49)    339
2 × 2 × 2          31.1 (84)    50.3 (53)    325

Table 2: CPU times for various subdivisions into 8 blocks. Between parentheses the Mflop-rates. The last column is the number of millions of floating point-operations to be performed for each block.

From the pictures it can be seen that on the T3d and the Paragon, the machine efficiency is somewhat lower than the algorithmic efficiency. This means that increasing the algorithmic efficiency by e.g. exchanging information between the processors after every calculation of averages will not result in a substantially faster execution of the code. Further, all efficiencies eventually approach zero as the number of processors approaches infinity. It can be shown (using expressions like (13)) that the intrinsic efficiency drops as $B^{-2/3}$, where $B$ is the total number of blocks. However, $\sigma_G$ remains nearly constant, as is shown in Table 3. Here each block contains 32 × 16 × 16 points. From this table it follows that, using this algorithm, doubling the size of the problem and the number of processors results in equal computation times. This can also be shown if in (17) the times $T_P$ and $T_1$ are calculated as ideal, i.e. assuming no communications delays. Then $\sigma_G = 1$.

B_x × B_y × B_z    T3d            Paragon
1 × 2 × 1          16.2 (21.1)    26.9 (12.7)
1 × 4 × 1          16.3 (42.1)    27.0 (25.4)
1 × 4 × 2          16.4 (83.6)    27.3 (50.2)
2 × 4 × 2          16.5 (166)     27.6 (99.4)
2 × 8 × 2          16.5 (332)     27.4 (200)
2 × 8 × 4          16.6 (661)     27.7 (396)
2 × 8 × 6          16.6 (991)     27.8 (592)
4 × 8 × 4          16.6 (1322)    -

Table 3: CPU times and Megaflop-rates (between parentheses) for increasing domain-sizes, illustrating that $\sigma_G$ remains approximately constant.

From the above results it can be concluded that

the T3d and the Paragon show comparable efficiencies for this algorithm, the T3d being about 40 % faster.

Besides the implementation on the T3d and the Paragon, we have made a preliminary implementation on the SGI Power Challenge Array. This machine consists of 4 nodes each comprised of a 16-CPU shared memory parallel machine. We used explicit message-passing between the nodes. On each node, fine-grained parallelism has been employed using the vendor-supplied parallelizing compiler. The combination of fine-grained parallelism and explicit message passing is not entirely trivial. On the one hand, using fine-grained parallelism results in an algorithmic efficiency of 1, since no additional floating-point operations are introduced. Therefore, this form of parallelism seems to be promising at first sight. On the other hand, however, parallelizing a do-loop containing only a few iterations (in the order of magnitude of the number of grid-points in one direction) causes much system-overhead, and seriously affects pipelining efficiency. Moreover, suboptimal speedup can arise due to the cache-coherency mechanism. The use of explicit message-passing has two disadvantages, namely an algorithmic efficiency less than one, and usually a slow data-transfer. The advantage of explicit message-passing as compared to fine-grained parallelism is that parallelization takes place on a (much) higher level, leading to less system overhead.

As an example, consider a problem with 64 × 64 × 32 grid-points (the same as discussed above). With 4 processors on one node working on one block, this yields an execution time of 23 seconds for 5 Runge-Kutta timesteps, whereas on 4 nodes with 4 blocks (1 × 2 × 2) and one processor per node the execution time is 18 seconds. As another example, we compare the subdivision into 1 × 2 × 2 and 2 × 4 × 2 blocks, both running on 4 nodes. In the first case, each node deals with 1 block, and in the second case each node does the computations for 4 blocks, and uses 2 processors for each block. So in that case the distributed memory model is adopted also within each single node. It appears that the latter case has a shorter execution time. It may be clear that some restructuring of the code is necessary in order to obtain reasonable performance. This will be the subject of another paper [16].

4.3 Optimization for cache-machines
In many parallel machines the processors use a hierarchical memory structure, consisting of a small amount of memory with a short access time (the cache) and a large amount of main memory with much longer access time. This long access time is the main reason why the performance of these machines is far below their (often impressive) peak. In the implementation of a numerical algorithm, it is essential to use the cache efficiently. Therefore, the number of load and store operations should be kept to a minimum, and quantities which are loaded from main memory should be reused as much as possible before being stored back. Further, since elements from main memory are loaded into cache in chunks of a few consecutive elements, do-loops should be arranged such that main memory is traversed linearly (as is also necessary for efficient use of traditional vector-processors). Moreover, it will enable software-pipelining on RISC-processors, resulting in substantially faster execution.
To illustrate this, we compare two different ways to calculate the viscous flux. In the first method (method A) the various derivatives of the velocity fields and the temperature are calculated consecutively, and the viscous stress tensor and viscous heat flux are assembled and stored. Then the outer derivatives of the viscous flux are calculated, again consecutively. The resulting code is very well vectorizable and consists of very simple do-loops. In the second method (method B), we use the following observation. In the calculation of the derivatives, some averages can be used to contribute to various derivatives. Moreover, for all derivatives, the averaging weights in one direction are equal. Therefore we calculate all inner derivatives simultaneously, which also has the advantage that e.g. a vector $u_1$ needs to be loaded only once for the calculation of all its derivatives. An analogous fact holds for the weights. Further, the derivatives are not stored, but directly used to assemble the stress tensor and the heat flux. After that, all outer derivatives are calculated simultaneously. This results in about 30% fewer floating point operations, and substantially fewer load and store operations, resulting in better memory-performance. The drawback is the occurrence of (much) more complicated do-loop bodies, which puts a severe demand on the compiler in order to obtain suitable pipelining. It appears that on the T3d and the Paragon there is hardly any performance gain, and the performance is only about 20 % of peak. On one R8000 processor in

the SGI Power Challenge (coupled to 4 MBytes of cache), the CPU-time of method B is half that of method A, with a performance of about 37 % of peak (110 Mflops). More details are to be found in ref. [17].

Acknowledgement
The time for the computations on the T3d was provided by the Stichting Nationale Computerfaciliteiten (National Computing Facilities Foundation, NCF), which is financially supported by the Nederlandse Organisatie van Wetenschappelijk Onderzoek (Netherlands Organization for Scientific Research, NWO). One of the authors (HK) thanks the Institute for Fluid Dynamics at ETH Zurich for its hospitality during his stay there. The use of the Paragon has been made possible by courtesy of B. [surname illegible], ETH Zurich. By courtesy of Silicon Graphics Inc. we were able to use the European Power Challenge Array of the SGI Supercomputer Technologies Centre in Cortaillod. We would like to thank Ruud van der Pas (SGI) for his assistance.

References
[1] B. Vreman, B. Geurts, H. Kuerten, J. Broeze, B. Wasistho and M. Streng, “Dynamic subgrid-scale models for LES of transitional and turbulent compressible flow in 3-D shear layers,” Turbulent Shear Flow (1995).
[2] A. Favre, Physics of Fluids 26, 2851 (1983).
[3] M. Germano, “Turbulence: the filtering approach,” J. Fluid Mech. 238, 325 (1992).
[4] B. Vreman, B. Geurts and H. Kuerten, “Realizability conditions for the turbulent stress tensor in large-eddy simulation,” J. Fluid Mech. 278, 351 (1994).
[5] A.W. Vreman, B.J. Geurts and J.G.M. Kuerten, “Subgrid-modelling in LES of compressible flows,” Direct and Large-Eddy Simulation I, P.R. Voke, L. Kleiser and J.P. Chollet (editors), Kluwer, 133 (1994).
[6] U. Piomelli, T.A. Zang, C.G. Speziale and M.Y. Hussaini, “On the large-eddy simulation of transitional wall-bounded flows,” Phys. Fluids A 2, 257 (1990).
[7] J. Bardina, J.H. Ferziger and W.C. Reynolds, “Improved turbulence models based on LES of homogeneous incompressible turbulent flows,” Department of Mechanical Engineering, Report No. TF-19, Stanford (1984).
[8] R.A. Clark, J.H. Ferziger and W.C. Reynolds, “Evaluation of subgrid-scale models using an accurately simulated turbulent flow,” J. Fluid Mech. 91, 1 (1979).
[9] S. Liu, C. Meneveau and J. Katz, “On the properties of similarity subgrid-scale models as deduced from measurements in a turbulent jet,” J. Fluid Mech. 275, 83 (1994).
[10] B. Vreman, B. Geurts and H. Kuerten, “On the formulation of the dynamic mixed subgrid-scale model,” Phys. Fluids 6, 4057 (1994).
[11] B. Vreman, B. Geurts and H. Kuerten, “Large Eddy Simulation of the temporal mixing layer using the Clark model,” Memorandum No. 1213, University of Twente (1994).
[12] M.M. Rai and P. Moin, “Direct numerical simulation of transition and turbulence in a spatially evolving boundary layer,” J. Comp. Phys. 109, 169 (1993).
[13] B. Vreman, H. Kuerten and B. Geurts, “Shocks in direct numerical simulation of the confined three-dimensional mixing layer,” Physics of Fluids, to appear (1995).
[14] J. Broeze, B. Geurts, H. Kuerten and M. Streng, “Multigrid acceleration of time-accurate DNS of compressible turbulent flow,” Copper Mountain (1995).
[15] E.F. van de Velde, “Concurrent Scientific Computing,” Springer Verlag, New York (1994).
[16] M. Streng and R. van der Pas, “Implementation of a compressible flow solver on the Power Challenge Array,” in preparation.
[17] M. Streng and R. van der Pas, “Some performance considerations for the R8000 Microprocessor,” in preparation.

A straightforward 3D multi-block unsteady Navier-Stokes solver for direct and large-eddy simulations of transitional and turbulent compressible flows

P. Comte, J.H. Silvestrini & E. Lamballais

Turbulence Simulation & Modelling Team
L.E.G.I./Institut de Mécanique de Grenoble *
B.P. 53 X, F38041 Grenoble Cedex, France
Tel: (33) 76 82 51 21, Fax: (33) 76 82 52 71, E-mail: comte@img.fr

1. ABSTRACT

A versatile and effective numerical code for direct and large-eddy simulations of compressible flows is described. It is based on robust explicit finite-difference methods which are second-order accurate in time and fourth-order accurate in space (Gottlieb & Turkel, 1976). An industrial application is presented, with comparison to a more fundamental case, tackled with spectral and compact schemes.

2. INTRODUCTION

Traditionally, the algorithmic concern in CFD has been the (fast) convergence of steady calculations of flows over complex objects. Unsteadiness in the CFD context is generally associated with changes of angle of attack or geometry (in the case of store separation, for example), but scarcely with the Tollmien-Schlichting waves or the hairpin vortices which develop within the boundary layers around these objects. Such events are generally considered as "turbulent fluctuations" and are either ignored or expected to be accounted for through one-point-closure turbulence models. This might yield acceptable prediction of the overall drag over an aircraft, but fails at predicting, for example, the length of the transitional region in a boundary layer subjected to a given level of perturbations. The reason for this is that the physical mechanisms (of transition, turbulence or separation) are not understood at a fundamental level. There is therefore a great need for numerical simulations of transitional, turbulent or separated flows. Considering the computational resources currently available, two strategies are possible:

- Direct Numerical Simulations, in which all turbulent scales are simulated explicitly, in three dimensions of space, down to the Kolmogorov scale $\eta$ (or nearly so). This implies high-order unsteady schemes, small time-steps, very fine 3D grids and, in practice, low Reynolds numbers. We would like to stress that it is not because a given scheme solves the complete Navier-Stokes equations in three dimensions that its solutions automatically deserve the DNS label. If the mesh size in a turbulent region is larger than, say, $10\eta$, we cannot speak of DNS. Some use the expression "pseudo-DNS". The problem in this case is that the amount of dissipation brought about by the grid being too coarse is not controlled.

- Large-Eddy Simulations, a half-way house (Leschziner, 1995) between DNS and one-point closures. It consists of simulating explicitly and in three dimensions all motion larger than a certain cut-off scale, accounting for the contribution of the smaller scales through a simple algebraic model. This presupposes that the large scales are more important than the small ones, which is certainly true for turbulence but is more doubtful for combustion, for example. In any case, from the point of view of algorithmics, the numerical methods used for LES are the same as for DNS, except that the subgrid-scale turbulence model induces non-linearities in the dissipative terms (in addition to those which come from the dependence of molecular viscosity on local temperature).

One controversial question (within the scope of this panel) is the role that numerical dissipation can play in the turbulence-modelling process, either through the nature of the scheme or the mesh size. In our LES, the solutions to the equations solved do contain a certain level of kinetic energy in the smallest resolved scales. This is sometimes criticized on the ground that all numerical methods behave badly in the small scales (even the spectral methods blur the phase information at the highest wavenumber). Validation then has to be performed on physical grounds, through comparison with experimental data, predictions of stability theories or numerical results obtained with different methods. Note that Leonard (1974), who coined the expression Large-Eddy Simulation, proposed a formalism thanks to which no energy would be left in the smallest resolved scales.¹ At the other extreme, some claim that numerical dissipation can play the role of a subgrid-scale turbulence model, and sometimes that of the molecular viscous terms too (approaches referred to as Monotonically Integrated LES, Built-In LES, and so on).

* Institut National Polytechnique de Grenoble (INPG), Université Joseph Fourier (UJF) et Centre National de la Recherche Scientifique (CNRS).

¹ The Navier-Stokes equations are first convolved through a continuous low-pass filter which commutes with the time and space derivatives. The resulting equations are then closed thanks to a subgrid-scale turbulence model. This closed system of equations is eventually discretized onto a grid which is finer than the cut-off scale of the filter, so that the result can be checked to be independent of the mesh size (but of course not of the filter's cut-off scale).

Before giving our point of view, we will briefly recall the very classical numerical methods we use for the simulation of compressible flows which do not develop strong

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

shocks. The subgrid-scale turbulence models that we currently use are briefly presented in section 4, although they will be presented in more detail in Lesieur & Métais (1996). The soundest of these models is then applied to the LES of an incompressible mixing layer performed with spectral-like methods renowned for their low numerical dissipation and dispersion. The same model is finally applied to a more industrial mixing layer simulated with the code described below.

3. NUMERICAL SCHEME

In Cartesian co-ordinates, the compressible LES equations can be cast, after several crude simplifications discussed in Comte et al. (1994), in the conservation-like form

$$\frac{\partial U}{\partial t} + \frac{\partial F_1}{\partial x_1} + \frac{\partial F_2}{\partial x_2} + \frac{\partial F_3}{\partial x_3} = 0 , \qquad (1)$$

with

$$U = {}^T(\rho, \rho u, \rho v, \rho w, \rho e) , \qquad (2)$$

in which $\rho e$ stands for the resolved total energy defined, [...]

[...] (3)

[...] (4)

[...] (5)

the deviatoric part of the resolved strain-rate tensor. Molecular viscosity is prescribed through Sutherland's law

$$\mu(T) = \mu(273.15)\,\sqrt{\frac{T}{273.15}}\;\frac{1 + S/273.15}{1 + S/T} \qquad \forall\, T > 120 , \qquad (6a)$$

with $\mu(273.15) = 1.711 \times 10^{-5}$ Pl and $S = 110.4$, and its extension to temperatures lower than 120 K:

$$\mu(T) = \mu(120)\,T/120 \qquad \forall\, T < 120 , \qquad (6b)$$

$\mu(120)$ being given by eq. (6a). The molecular conductivity $k(T)$ derives from the constant-Prandtl-number assumption $Pr = c_p\,\mu(T)/k(T) = 0.7$. To be closed, this set of equations requires the definition of $\nu_t$ and $\kappa_t$, eddy-viscosity and eddy-diffusivity coefficients provided by the SGS model used. This will be done in the next section.

The adaptation to curvilinear co-ordinates was done by David (1993), following Viviand (1974) (see also the complete development in Fletcher, 1988), keeping the spanwise co-ordinate $x_3$ Cartesian. The chain rule gives

$$\frac{\partial}{\partial x_i} = \frac{\partial \xi_j}{\partial x_i}\,\frac{\partial}{\partial \xi_j} \qquad (7)$$

for any regular co-ordinate transformation $(x_1, x_2, x_3) \rightarrow (\xi_1, \xi_2, \xi_3 = x_3)$. Introducing the Jacobian

$$J = \det\left(\frac{\partial \xi_i}{\partial x_j}\right) , \qquad (8)$$

one can then re-write (1) as

$$\frac{\partial \hat U}{\partial t} + \frac{\partial \hat F}{\partial \xi_1} + \frac{\partial \hat G}{\partial \xi_2} + \frac{\partial \hat H}{\partial \xi_3} = 0 , \qquad (9)$$

with

$$\hat F = \frac{1}{J}\left[\frac{\partial \xi_1}{\partial x_1}\,F_1 + \frac{\partial \xi_1}{\partial x_2}\,F_2\right] , \qquad (10a)$$

$$\hat G = \frac{1}{J}\left[\frac{\partial \xi_2}{\partial x_1}\,F_1 + \frac{\partial \xi_2}{\partial x_2}\,F_2\right] , \qquad (10b)$$

$$\hat H = \frac{1}{J}\,F_3 , \qquad (10c)$$

using the chain rule (7) for the derivatives arising in the fluxes $\hat F$, $\hat G$ and $\hat H$. Vector $U$ is still a function of the Cartesian co-ordinates $x_i$ and time $t$. In the limit of zero viscosity and conductivity (Euler equations without SGS model), the fluxes $F_i$, still defined by (3), would be functions of $U$ only.

For a given 2D geometry, a nearly-orthogonal curvilinear grid $\xi_1(x_1, x_2)$, $\xi_2(x_1, x_2)$ is generated by Ryskin's method, in such a way that each boundary of the domain corresponds either to a line at constant $\xi_1$ or at constant $\xi_2$. This grid is then made 3D by spanwise translation. The system (9) is solved on this grid by means of a (2,4) extension of the fully-explicit MacCormack scheme devised by Gottlieb and Turkel (1976), in the form

[...] (11a)

[...] (11b)

As mentioned in Thompson et al. (1985) and recalled in Fletcher (1988), the metrics $\partial \xi_i/\partial x_j$ arising in the fluxes and Jacobians above have to be discretized in such a way that unwanted cross-terms cancel out, otherwise the scheme is not consistent. First of all, they have to be expressed as analytic functions of the metrics $\partial x_i/\partial \xi_j$ of the inverse transform, in order to eliminate all derivatives with respect to $x_1$ and $x_2$ in (9) and (10). These inverse metrics are discretized in the following manner:

[...] in the corrector step (11b), (12a)

and

[...] in the predictor step (11a), and

$$\left(\tfrac{7}{6}\,[\ldots] - \tfrac{8}{6}\,[\ldots] + \tfrac{1}{6}\,[\ldots]\right)/\Delta\xi_2$$

in the corrector step (11b). (12b)

This is only first-order accurate, which is justified by the fact that the grids we use are not very distorted, except very locally. Therefore $\partial x_i/\partial \xi_j$ remains almost everywhere close to $\delta_{ij}$.

In the same way, the chain rule (7) has to be applied to eliminate all derivatives with respect to $x_1$ and $x_2$ from the fluxes $F_i$. This introduces metrics to be evaluated as said above, together with derivatives of velocity and temperature with respect to $\xi_1$ and $\xi_2$. Consistency then determines the way these derivatives, and also $\partial/\partial \xi_3 \equiv \partial/\partial x_3$, should be discretized.

The boundary conditions are based on a decomposition into characteristics, in the spirit of Thompson (1987, 1990) and Poinsot and Lele (1992). The Riemann invariants of outgoing characteristics are extrapolated, whereas the incoming ones are either prescribed (e.g. at the inflow boundary) or set to zero (non-reflective or open boundary condition). For example, going back to Cartesian co-ordinates for the sake of simplicity, in the case of a boundary perpendicular to the direction $x_1$, the Euler equations are recast in their quasi-linear form

$$\frac{\partial V}{\partial t} + A\,\frac{\partial V}{\partial x_1} = 0 , \quad \text{with } V = {}^T(\rho, \rho u_1, \rho u_2, \rho u_3, p) . \qquad (13)$$

The matrix $A$ is, as usual, diagonalized in the form $A = L^{-1} \Lambda L$. Assuming $L$ to be locally constant and introducing the vector $W = LV$, system (13) decouples into 5 equations of the form

$$\frac{\partial w_i}{\partial t} + \lambda_i\,\frac{\partial w_i}{\partial x_1} = 0 , \qquad (14)$$

to be solved at the boundary point $N$ through the semi-implicit scheme

[...] (15)

For the outgoing characteristics ($\lambda_i > 0$), the values of $W_N^{n+1}$ are obtained from those of $\lambda_i$, $W_N^n$ and $W_{N-1}^{n+1}$, which are supposed to be known. For the incoming characteristics ($\lambda_i < 0$), it is necessary to prescribe $W_{N+1}^{n+1}$, in order to pull out $W_N^{n+1}$. This is done by considering the nature of the boundary condition (adherence, free slip, periodicity, prescribed flow rate, non-reflectivity, inter-block matching...). $V_N^{n+1}$ is finally deduced from $W_N^{n+1}$ assuming simply $L_N^{n+1} = L_N^n$.

3. RAPID OVERVIEW OF OUR SGS MODELS AND THEIR RECENT EVOLUTIONS

Assuming spectra $E(k) \propto k^{-m}$ for all $k$, Métais & Lesieur (1992) proposed models defined in the spectral space, reading in a simplified form

[...] (16a)

and

[...] (16b)

with

$$\kappa_t(k,t) = \nu_t(k,t)/Pr_t \quad \text{with } Pr_t = 0.6 . \qquad (16c)$$

$C_K$ denotes Kolmogorov's constant, and $\nu_t^* = 1$ for $k/k_c < 0.3$. It rises for higher $k/k_c$; a good fit of it is (in the case $m = 5/3$ at least)

$$\nu_t^*(k/k_c) = 1 + 34.5\,\exp[-3.03\,k_c/k] . \qquad (17)$$

Until now, this model has been used with a fixed value $m = 5/3$, giving satisfactory results, not only in the case of isotropic turbulence but also stratified and/or rotating homogeneous turbulence and temporally-growing free shear flows (mixing layers, wakes). For streamwise-and-spanwise periodic wall-bounded flows, the easiest way of accommodating grid refinement at the wall is to work on $xz$ planes, normal to the wall, over which 2D spectra $E_{2D}(k_{2D}, y, t)$ can be computed. Assuming again isotropy with $E(k) \propto k^{-m}$, one can relate $E_{2D}$ to $E$ and express eddy viscosity and conductivity $\nu_t(k_{2D}, y, t)$ and $\kappa_t(k_{2D}, y, t)$ from (16a). One of us (E.L.) did it in the case of a plane turbulent channel flow. With $m = 5/3$, results are qualitatively correct, but the wall shear stress $\tau_w$ is underestimated by about 20% (Fig. 1, top). This is because the model is too dissipative near the wall, where experimental measurements show spectra steeper than $k^{-5/3}$. Much better statistics are obtained with a variable $m(y,t)$ estimated at each timestep from $E_{2D}(k_{2D}, y, t)$ through a least-square fit between $k_c/2$ and $k_c$, the cut-off wavenumber (Fig. 1, bottom).

These results correspond to simulations at $R = U_c h/\nu = 5000$, in which $h$ denotes the channel's half height and $U_c$ the centerline velocity of a laminar Poiseuille flow of same flow rate (usual convention). This should yield $R_\tau = u_\tau h/\nu \approx 200$, which is the case for the top plot of Fig. 1 (instead of $\approx 180$ for the bottom one). Both calculations are performed by means of de-aliased pseudo-spectral methods on $xz$ planes and 6th order compact schemes in the $y$ direction (details will be provided in Lamballais et al., in preparation). The resolution is 64 x 65 x 32, for a domain of size $2\pi h \times 2h \times \pi h$, so that the cut-off wavenumbers along $x$ and $z$ are the same. Extension to non-square

meshes is in progress; in particular, the procedure proposed in Scotti et al. (1993) is being tested.

FIG. 1 - $u_{rms}$ (solid), $v_{rms}$ (dotted) and $w_{rms}$ (dashed) obtained from 2 LES differing only in the determination of $m$: set to 5/3 in the top plot and evaluated from $E_{2D}(k_{2D}, y, t)$, which turned the model off in the viscous sublayer (bottom plot). In both plots, the symbols correspond to LES by Piomelli (1993) with Germano-Lilly's dynamic model, which are very close to experimental measurements.

When spectral methods cannot be used, we strive to determine eddy-viscosities out of a measure of the kinetic energy at the smallest resolved scale $\Delta = \pi/k_c$. One of these local spectra is $F_{2\Delta}(\vec x, t)$, the second-order structure function of the resolved velocity field, evaluated by averaging over the closest neighbours of point $\vec x$, either in all 3 directions of space (6-neighbour formulation) or on planes normal to the wall or mean shear (4-neighbour formulation). In the case of infinite Kolmogorov spectra, energy-conservation arguments (Leslie & Quarini, 1979) yield the structure-function model (Métais & Lesieur, 1992), defined by

$$\nu_t^{SF}(\vec x, t) = 0.105\; C_K^{-3/2}\, \Delta\, \sqrt{F_{2\Delta}(\vec x, t)} ,$$

consistent with the spectral model (16a).

This SF model appears to be slightly less dissipative than the Smagorinsky model with the constant 0.18 given by the same assumptions (infinite Kolmogorov cascade, see e.g. Comte et al., 1994). As it involves velocity increments instead of derivatives, it also has the advantage of being defined independently of the numerical scheme used. It is nevertheless not much better for transition than the Smagorinsky model: low-wavenumber velocity fluctuations corresponding to unstable modes yield values of $\nu_t$ large enough to affect the growth rate of weak instabilities like Tollmien-Schlichting waves. So far, we have found two ways of remedying this:

- apply a high-pass filter onto the resolved velocity field before computing its structure function. With a triply-iterated second-order finite-difference Laplacian filter denoted $\tilde{\ }$, one finds $\tilde E(k)/E(k) \approx 40^3\,(k/k_c)^9$ for all $k$, almost independently of the velocity field and resolution. With the same arguments as for the structure-function model, this yields the filtered structure-function model, defined by (Ducros et al., 1995)

$$\nu_t^{FSF}(\vec x, t) = 0.0014\; C_K^{-3/2}\, \Delta\, \sqrt{\tilde F_{2\Delta}(\vec x, t)} .$$

This model enabled Ducros to perform the LES of a spatially-growing boundary layer (at Mach 0.5) between $Re_x = 3.3 \times 10^5$ and $1.14 \times 10^6$, which widely encompasses the transition region, for a cost of about 80 hours of Cray 2. With the first mesh line at $y^+ \approx 3$ (i.e. with just one point in the viscous sublayer) and only 32 points along $y$, statistics were found to be within 20% agreement with experimental data, as in Fig. 1 top.

- switch the original structure-function model off when the flow is not three-dimensional enough in the small scales (David, 1993). In practice, an average vorticity vector $\bar{\vec\omega}(\vec x, t)$ is computed over $\vec x$ and its (4 or 6) closest neighbours. The structure-function model is applied only if the magnitude of the angle $\alpha = (\vec\omega(\vec x, t), \bar{\vec\omega}(\vec x, t))$ exceeds a certain threshold $\alpha_0$. Simulations² of incompressible isotropic turbulence at resolutions ranging between $32^3$ and $64^3$ gave pdf's of $|\alpha|$ peaking around 20°. Having found the choice of $\alpha_0$ not critical between 10° and 45°, we finally retained $\alpha_0 = 20°$. The model's constant was finally set to 1.56 times that of the SF model, a least-square fit between our test simulations yielding the same average dissipation as the SF model. Dispersion was found small enough to justify this in first approximation, but a lot of work has yet to be done to reduce the arbitrariness in this model. In any case, the most surprising conclusion about the filtered and selective structure-function models (hereafter FSF and SSF, respectively) is that they can be interchanged without much difference in the results (Comte et al., 1994). This comes from the fact that they both considerably shrink the support of $\nu_t$ (with respect to that of the original SF model), and that both supports are almost the same (Fig. 2, middle and bottom plots). In any case, they do not react to $\Lambda$-vortices, whereas the SF model does (Fig. 2 top).

FIG. 2 - From top to bottom: isosurfaces $\nu_t = 2/3\,\nu$ given by the SF, FSF and SSF model, respectively, in the transitional portion of a spatially-growing boundary layer at Mach 0.5 simulated with the FSF model (Ducros and Ducros et al., 1995, or Comte et al., 1994). The same velocity field was used for the three plots (a priori test).

² LES with the original structure-function model.
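The SF and SSF closures described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: it assumes a uniform, triply periodic grid (so that `np.roll` supplies the neighbours), the 6-neighbour formulation, a Kolmogorov constant `c_k = 1.4`, an average vorticity taken over the six neighbours only, and vorticity components supplied by the caller. All function names are hypothetical.

```python
import numpy as np

def neighbour_mean(f):
    """Average of f over the 6 closest neighbours (periodic grid assumed)."""
    return sum(np.roll(f, s, axis=a) for a in range(3) for s in (1, -1)) / 6.0

def structure_function(u, v, w):
    """Second-order structure function, 6-neighbour formulation."""
    f2 = np.zeros(u.shape)
    for a in range(3):
        for s in (1, -1):
            # squared velocity increment towards one neighbour
            f2 += sum((np.roll(c, s, axis=a) - c) ** 2 for c in (u, v, w))
    return f2 / 6.0

def nu_t_sf(u, v, w, delta, c_k=1.4):
    """SF eddy viscosity: 0.105 * C_K^(-3/2) * delta * sqrt(F2)."""
    return 0.105 * c_k ** (-1.5) * delta * np.sqrt(structure_function(u, v, w))

def selective_mask(wx, wy, wz, alpha0_deg=20.0):
    """True where the angle between the local vorticity and the
    neighbour-averaged vorticity exceeds alpha0 (selective criterion)."""
    ax, ay, az = (neighbour_mean(c) for c in (wx, wy, wz))
    dot = wx * ax + wy * ay + wz * az
    norms = np.sqrt(wx**2 + wy**2 + wz**2) * np.sqrt(ax**2 + ay**2 + az**2)
    # where either vector vanishes, treat the angle as zero (assumption)
    cos_alpha = np.where(norms > 0, dot / np.maximum(norms, 1e-300), 1.0)
    alpha = np.degrees(np.arccos(np.clip(cos_alpha, -1.0, 1.0)))
    return alpha > alpha0_deg

def nu_t_ssf(u, v, w, wx, wy, wz, delta, c_k=1.4):
    """SSF eddy viscosity: 1.56 times the SF value where the criterion holds."""
    return 1.56 * nu_t_sf(u, v, w, delta, c_k) * selective_mask(wx, wy, wz)
```

Because the SF model only needs velocity increments, the sketch never differentiates the velocity field, which mirrors the remark above that the model is defined independently of the numerical scheme.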

4. A REFERENCE CASE: THE INCOMPRESSIBLE MIXING LAYER SIMULATED WITH SPECTRAL-LIKE METHODS

The scheme described above is deliberately dissipative, in order to make it robust: as an example, we will show in the next section an application of it in the case of a solid-propellant booster. Let us also mention the flow over a compression ramp at Mach 2.5 or the boundary layer at Mach 4.5 that we briefly presented at the 74th AGARD FDP (see Comte & David, 1995 and Ducros et al. 1993 for more details). The price to pay for this robustness is a certain numerical dissipation (let alone the numerical dispersion), which is difficult to measure. In the absence of really conclusive analytical arguments, one way around is to perform comparisons with results obtained from numerical methods famous for their precision, such as the spectral or collocation methods.

In Comte et al. (1992), we presented a comparison between two pseudo-spectral DNS of incompressible mixing layers at Reynolds number³ 100 differing only by the nature of the initial perturbations. In one case, these were made of a mixture between 2D fluctuations (energy $10^{-4} U^2$) and 3D fluctuations of energy $10^{-5} U^2$. The result was the formation of quasi-2D Kelvin-Helmholtz vortices undergoing pairings and stretching weak hairpin vortices between one another. The spectra measured were in $\approx k^{-3}$ or [...] even after the second pairing, and vorticity remained bounded by its maximal initial value $\omega_i$. In the other case (same 3D fluctuations as before, but of energy $10^{-[\ldots]} U^2$ and without 2D perturbations) helical pairings were observed, with more energy in the small scales (spectra in $k^{-5/3}$), and all components of vorticity reaching about $3\,\omega_i$.

These simulations were repeated in LES without molecular viscosity (Silvestrini, 1993). In both cases, we observed the same large-scale vortex pattern as in DNS, but with more numerous and intense small-scale vortices ($\max \omega \approx 6\,\omega_i$). The difference in the statistics between the two cases was smaller than in DNS, although the case with 3D perturbations only remained more turbulent.

We now present the same kind of comparison in a spatially-growing configuration, for a velocity ratio $\lambda = (U_1 - U_2)/(U_1 + U_2) = 1/2$. Sixth-order accurate compact schemes are used along $x$ with radiative outflow boundary conditions, and pseudo-spectral methods on $yz$ planes assuming periodicity along $z$ (spanwise) and free-slip along $y$ (code written by Gonze, 1993, and recently parallelized on Cray T3D by means of slab-pencil transpositions under PVM). In all cases the computational grid is uniform with cubic meshes. Fig. 3 corresponds to a DNS in a domain of size $L_x = 140\,\delta_i$, $L_y = 28\,\delta_i$, $L_z = 14\,\delta_i$ for a resolution 480 x 96 x 48. The upstream Reynolds number is 100, as in Comte et al. (1992), which corresponds approximately to the maximal value permitted at this resolution. The upstream forcing is a mixture between 2D noise on the plane $x = 0$ and noise in the transverse direction $y$ only, of respective energies $\epsilon_{2D} U^2$ and $\epsilon_{1D} U^2$ with $\epsilon_{2D} = 10^{-[\ldots]}$ and $\epsilon_{1D} = [\ldots]$, that is, 3% in turbulent intensity.

FIG. 3 - Surface $\|\vec\omega\| = 1/3\,\omega_i$, in DNS at Re = 100. Peak vorticity recorded here is $2\,\omega_i$.

Repeated at zero molecular viscosity with the FSF model in its 4-neighbour formulation, transition is obtained farther upstream than before, even with weaker forcing: Fig. 4 is obtained with $\epsilon_{2D} = 10^{-[\ldots]}$ and $\epsilon_{1D} = 10^{-[\ldots]}$, in a shorter domain than before ($L_x = 112\,\delta_i$, with only 384 points to keep the meshes cubic), other dimensions and number of collocation points being unchanged. The threshold is twice as large as before. Vorticity magnitude now peaks at $4\,\omega_i$, which is compatible with the high-Reynolds number experiments of Huang & Ho (1990). This maximum is reached where streamwise vortices wrap around the primary billows.

FIG. 4 - Time evolution of the surface $\|\vec\omega\| = 2/3\,\omega_i$, in a LES at $\nu = 0$, with $\epsilon_{2D} = [\ldots]$ and $\epsilon_{1D} = 10^{-[\ldots]}$.

³ based upon $U = (U_1 - U_2)/2$ and $\delta_i$, half the velocity difference and the initial vorticity thickness, respectively.
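The statement that the meshes stay cubic can be checked directly from the domain sizes and resolutions quoted above. The short Python sketch below is only an illustrative check (the function name is hypothetical); lengths are expressed in units of the initial vorticity thickness $\delta_i$, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def mesh_sizes(lengths, points):
    """Mesh size in each direction, in units of delta_i (exact fractions)."""
    return [Fraction(length, n) for length, n in zip(lengths, points)]

# DNS of Fig. 3: 140 x 28 x 14 delta_i on 480 x 96 x 48 points
dns = mesh_sizes((140, 28, 14), (480, 96, 48))

# LES of Fig. 4: shorter streamwise extent, 112 delta_i on 384 points
les_x = Fraction(112, 384)

print(dns, les_x)  # every entry is 7/24, i.e. about 0.29 delta_i
```

Shortening the domain from 140 to 112 $\delta_i$ while dropping from 480 to 384 points therefore leaves the streamwise mesh size unchanged.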

FIG. 4 - (cont'd) - Note the good behaviour of the outflow boundary conditions.

Investigating sensitivity to the nature of the upstream perturbations would not be pertinent in such a narrow domain⁴. We thus doubled $L_z$ and its corresponding number of collocation points. This should not change things much in the quasi-2D case $(\epsilon_{2D}, \epsilon_{1D}) = (10^{-5}, 10^{-4})$. However, with $(\epsilon_{2D}, \epsilon_{1D}) = (10^{-[\ldots]}, 0)$, helical pairings are observed in the wider domain (Fig. 5). The interested readers are referred to Silvestrini et al. (1995) for more details.

⁴ "Spanwise correlation lengths are of the order of 3 - 5 $\delta_\omega$ ($\delta_\omega$ is the local vorticity thickness). However, the large vortices typically have lengths of order 20 $\delta_\omega$ when the irregularities along the span are ignored" (quoted from Browand & Troutt, 1985).

5. AN INDUSTRIAL APPLICATION: THE VORTEX SHEDDING INSIDE A SOLID ROCKET ENGINE

We are participating in an operation set up by CNES and ONERA concerning the control of the vibrations induced by vortex shedding within the solid-propellant boosters of the future launcher ARIANE V. We show below preliminary simulations performed with the code described above, in a simplified planar test case, with the grid shown below (Fig. 6).

FIG. 6 - Grid of the C1 test case (length L = 0.47 m, radius H = 0.045 m, resolution 318 x 31 points).

The step is made of burning propellant, at a flame temperature of 3387 K and a mass flow rate, normal to the walls, of 21.2 kg/m²/s. Pressure p = 4.66 bar is prescribed at the upstream end. The outlet is a nozzle and the outflow boundary conditions are supersonic. The burnt gases are characterized by the following parameters: $\gamma = 1.14$, $R = 299.53$ J/kg/K, $\mu = 9.[\ldots]$ Pl and $Pr = 1$.

With such values, 2D simulations are not possible without flux limiters or artificial viscosities. With a viscosity 8 times as large, they become possible without such limiters, and Figure 7 shows the resulting vortices, in time evolution. In such a case, the code gives approximately the same results as the second-order MacCormack code SIERRA of ONERA (Lupoglazoff & Vuillot, 1992).

FIG. 7 - Contour maps of entropy at 5 equally spaced instants.

In 3D at the true viscosity and with the filtered structure-function model described above, the advantages of the (2,4) scheme become evident. The following figures correspond to a LES at a spanwise resolution of 90 points equally spaced over the span $L_z = \pi H \approx 0.141$ m, with periodic boundary conditions. The initial condition consists of the 2D flow shown above, taken at a given instant of the steady regime, with low-amplitude white noise (of amplitude $10^{-[\ldots]}$ times the speed of sound at the surface of the propellant) on all the components of $U$. Without this perturbation, the flow would have remained 2D, which proves that the code is not "noisy". After having reached the steady regime, which took 50 hours of Cray 90 at 450 Mflops (corresponding to 8 ms of real time), time series are recorded for 5 ms. Figure 8 shows an animation of an isosurface of the magnitude of the vorticity vector. Streamwise vortices are not only visible in between the large Kelvin-Helmholtz billows, as in the previous section, but also at the wall of the nozzle. These are likely to result from a Dean-Görtler instability of the detached boundary layer, which reattaches in the convergent part of the nozzle (Fig. 9).

FIG. 8 - Streamwise vortices in a quasi-industrial configuration.

FIG. 9 - Maps of the entropy field. The top view shows a cross section of the Görtler vortices, the bottom one the streamwise vortices which connect the KH billows.

The statistics are in global agreement with the experimental data. In particular, we found kinetic energy and pressure spectra which exhibit a fundamental peak around 2500 Hz, and its successive harmonics. More precisely, Figure 10 shows a comparison between the present LES and the 2D calculation just above. In the 3D case, the fundamental frequency is lower (2300 Hz versus 2670 Hz) and the spectra are more developed, in particular at low frequency. This is of crucial importance for the design of the anti-vibration protections of the rocket's control systems, and illustrates the importance of taking three-dimensionality into account, even when the largest vortices are expected to be two-dimensional.

FIG. 10 - Temporal kinetic energy spectra recorded in the middle of the booster. The solid line corresponds to the LES and the dashed line to the 2D DNS.

6. CONCLUSION

A progress report of our efforts towards the industrialization of Large-Eddy Simulations has been presented. In particular, it is shown that such simple algorithms as 5-point extensions of fully-explicit MacCormack schemes can be very effective, and compete with spectral methods as far as the description of fine vortical structures is concerned. The importance of longitudinal Görtler-type vortices
has been shown in the case of the flow within a simplified booster of Ariane V, in addition to the more dramatic case of HERMES' body flap which was presented orally. The academic simulation of incompressible mixing layers has proved the sensitivity of LES to the nature of the disturbances superimposed onto the basic flow, showing that LES could be a good tool for receptivity studies in aircraft and aerospace research. The next step of our developments in this direction will deal with the adaptation of our subgrid-scale turbulence models to complex geometries, following the footsteps of the Center for Turbulence Research in Stanford (see e.g. Ghosal & Moin, 1995). Finally, our opinion about the role that numerical dissipation should play in the turbulence-modelling process is the following: the role of numerical dissipation should be minimized, unless we have a way of controlling it on physical grounds. Algorithms with non-linear dissipation are available: for example, the PPM scheme of Colella & Woodward (1984) is capable of satisfying the second principle of thermodynamics and the positivity of the thermodynamical variables with an amount of numerical dissipation close to the minimum wherever it is not needed. We disagree with the claim that Euler-PPM calculations are LES (either MILES or BILES), because the physics of turbulence has not been incorporated yet. However, we think that this should be possible. Firstly, in subsonic regions at least, its dissipation should be made as little dependent on the grid orientation as possible. Then, we should try to force this dissipation to equal the value prescribed by a given subgrid-scale turbulence model. Thus, explicit eddy-viscosity models might become redundant one day. However, we think that the molecular viscous terms should be kept in all simulations of wall-bounded flows.

7. ACKNOWLEDGEMENTS

Most of the computational time used for the 3D calculations presented here has been freely allocated by IDRIS, the CNRS computing center. The study of the vortex shedding in the boosters of Ariane V is under the CNES/ONERA contract no. 22.492/DA/Ai.CCi. The discussion about the role of numerical dissipation has greatly benefitted from the lectures given by Prof. Ferziger, from Stanford University, during his stay in Grenoble last year.

8. REFERENCES

COLELLA, P. & WOODWARD, P.R. 1984 The piecewise parabolic method (PPM) for gas-dynamical simulations. J. Comp. Phys., 54, 174-201.
DUCROS, F., COMTE, P. & LESIEUR, M. 1995 Large-eddy simulations of transition to turbulence in a weakly-compressible boundary layer over a flat plate, submitted to J. Fluid Mech.
FLETCHER, C.A.J. 1988 Computational techniques for fluid dynamics 2. Springer series in Computational Physics, pp. 484.
GHOSAL, S., LUND, T.S. & MOIN, P. 1992 A local dynamic model for large-eddy simulation. Annual Research Briefs, 1992, Center for Turbulence Research, Ames Research Center and Stanford University, pp. 3-25.
GHOSAL, S. & MOIN, P. 1995 The basic equations for the large-eddy simulation of turbulent flows in complex geometry. J. Comp. Phys., 118, 24-37.
GONZE, M.A. 1993 Simulation numérique des sillages en transition à la turbulence. Thèse INPG.
GOTTLIEB, D. & TURKEL, E. 1976 Dissipative two-four methods for time-dependent problems. Math. Comp., 30 (136), 703-723.
HUANG, L. & HO, C. 1990 Small-scale transition in a plane mixing layer. J. Fluid Mech., 210, 475-500.
LEONARD, A. 1974 Energy cascade in large-eddy simulations of turbulent flows. Adv. Geophys., 18A, 237-248.
LELE, S. 1990 Compact finite difference schemes with spectral-like resolution. J. Comp. Phys., 103, 16-42.
LESCHZINER, M. 1995 SMAI-CNRS course on Numerical Methods for Turbulence, [...], France, June 7-9, 1995.
LESIEUR, M. & MÉTAIS, O. 1996 New trends in large-eddy simulations of turbulence. Ann. Rev. Fluid Mech., 28 (to appear).
LESLIE, D.C. & QUARINI, G.L. 1979 The application of turbulence theory to the formulation of subgrid modelling procedures. J. Fluid Mech., 91, pp. 65-91.
LUPOGLAZOFF, N. & VUILLOT, F. 1992 Numerical simulation of vortex shedding phenomenon in 2D test case solid rocket motors. AIAA Paper 92-0776, 30th AIAA Aerospace Sciences Meeting, Reno, USA.
MOIN, P. & KIM, J. 1985 The structure of the vorticity field in turbulent channel flow. Part 1. Analysis of instantaneous fields and statistical correlations. J. Fluid Mech., 155, pp. 441-464.
POINSOT, T.J. & LELE, S.K. 1992 Boundary conditions for direct simulations of compressible viscous reacting flows. J. Comp. Phys., 101, 104-129.
SCOTTI, A., MENEVEAU, C. & LILLY, D.K. 1993 Generalized Smagorinsky model for anisotropic grids. Phys. Fluids A, 5 (9), pp. 2306-2308.
SILVEIRA-NETO, A., GRAND, D., MÉTAIS, O. & LESIEUR, M. 1993 A numerical investigation of the coherent vortices in turbulence behind a backward-facing step. J. Fluid Mech., 256, pp. 1-25.
COMTE, P. & DAVID, E. 1995 Large-Eddy Simulation of Görtler vortices in a Curved Compression Ramp. Proc. of the Second French Russian workshop on experimenta-
SILVESTRINI, J. 1993 Sensibilité aux conditions initiales des structures cohérentes dans une couche de mélange tem-
tion, modelization, computation in flow turbulence and porellc. DEA de I'INPG.
eombustion, INRIA/GAMNI-SMAI, ed.: B.N. Chetve-
mehkin. A. DesideFi. Yur A. Kuenetsov, J. Puiaux & B.
THOMPSON, J.F., W m r , Z.U.A & MASTIN1985 Nsmrri-
c d gnd gmerahon, foundations and nppheations, North-
StouI3et. John Wiley & 801111 Publishers (souspnesa).
Holland, Amsterdam.
COMTE,P. , Ducnos. F.. SILVESTRINI, J., DAVID,E., LAM-
BALLAIS, E., M~TAIS, 0. & LESIEUR, M., 1994, "Simula- THOMPSON, K.W. 1987 Time-dependant boundary conditions
tion des gran& &bell- d'6coulanents transitionnels". for hyperbolic systems, J. Comp. Phrs. 68, 1-24.
AGARD 74th Fluid D&cs Panel Meeting and Sym- THOMPSON, K.W. 1990 Time-dependant boundary conditions
p d u m on "Application of direct and large eddy simuls- for hyperbolic syst- U.J. Comp. Phys. 89, 439-461.
tion to tramitionand turbulence", Chania, Crete, Grke,
SILVESTRINI, J.H.. COMTE, P. & LESIEUR, M., 1995, DNS and
i a z i AM. 14.1-14.12. LES of incompressiblemixing layers developing spatially
DAVIO.E. 1993 Modfhrahon der ieorlcmcnta comprrssibles Pmercdinos of l'urbalent Shenr F l o w IO, Pennsylvania
et hyperaonipcs :unc rppmchc msfotionnoire. ThbseINPG
DUCROS F. 1995 Simtllationr nrmfnqrer dindra e t dca grmdes VIVUND, H. 1974 Fomea conservatives des &pations de Ls
Cehcllcr de eerche. Itmitm compmartblst. Thhc INPG. dynamigue des gar,, Rich. Acror., 1. 65-68.

APPLICATIONS OF LATTICE BOLTZMANN METHODS TO FLUID DYNAMICS

S. A. Orszag
Fluid Dynamics Research Center, Forrestal Campus,
Princeton University, NJ 08544-0710, USA

Y. H. Qian
Department of Applied Physics,
Columbia University, New York, NY 10027, USA

S. Succi
IBM European Center for Scientific and Engineering Computing,
171 P.le Giulio Pastore, Roma, I-00144, Italy

SUMMARY

In this paper, we present recent developments in the theory and application of
lattice Boltzmann techniques and related lattice BGK models. Lattice based
methods allow the study of complicated systems with simple, efficiently
computable physical models. Here we will report some progress with these
methods and give an overview of their basic ingredients. Applications to
various types of turbulent flows are described.

1 INTRODUCTION

The Lattice Boltzmann Equation (LBE) is a direct method to solve the
Navier-Stokes equations on a digital computer. LBE is rooted in boolean lattice
gas techniques, a sort of "minimal" molecular dynamics scheme based on the
observation that the large-scale dynamics of fluid flow is largely independent
of the details of the underlying microdynamics. This suggests that in order to
numerically integrate the differential equations describing the motion of a
fluid, it may prove convenient to use a population of microvariables
("particles") whose microdynamics can be freely adjusted to match the
Navier-Stokes equations on a macroscopic scale [5].

The LBE method takes this approach one step forward towards the macroscopic
world, from the molecular to the kinetic level, by replacing the boolean
microdynamical variables with their corresponding floating point expectation
values. This move, while preserving the locality in space and time of the
evolution rules, which are key to the amenability to parallel computing, offers
three main advantages: a better amenability to present-day computing
architectures (increasingly faster on the floating point side); a wider degree
of latitude in choosing the details of the evolution rule; a reduction of the
separation in scale between the micro-world and the macro-world (i.e. the
averaging operation on a suitable region of the microdynamical lattice needed
in boolean simulations to remove statistical noise is no longer necessary).

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

The Lattice Boltzmann equation was introduced in the late 80's to cope with the
two major drawbacks of the Lattice Gas Cellular Automata (LGCA) technique:
statistical noise and exponential complexity of the evolution rule with the
number of degrees of freedom per lattice site.

Ever since, the method has gone from strength to strength up to the point where
it can be put on a par with most advanced computational fluid dynamics (CFD)
techniques for a large variety of problems, ranging from fully-developed
homogeneous incompressible turbulence, to multiphase flows in porous media.

Besides its amenability to parallel computing, the method is appreciated for
the ease of implementation of grossly irregular geometries as well as for the
flexibility of the evolution rule, which allows complex physics to be modelled
by minor modifications of the basic collisional scheme.

Despite these brilliant features, LBE has not yet penetrated the CFD
engineering community, the primary hurdle being its inability to deal with
non-uniform, irregular mesh distributions.

This problem has been partially alleviated in the recent past by importing
finite-volume techniques within LBE so as to produce a finite-volume LBE
capable of dealing with non-uniform (structured) grids.

This paper is organized as follows: first we present a cursory view of the LGCA
and LBE techniques respectively. Subsequently we describe two applications of
LBE to the area of fluid turbulence: three-dimensional Rayleigh-Benard
convection and three-dimensional channel flow turbulence.

2 Lattice Gas dynamics

The development of the lattice Boltzmann equation (LBE) is intimately related
to lattice gas cellular automata (LGCA). Interest in LGCA originated with the
seminal paper of Frisch, Hasslacher & Pomeau (1986) in which it is shown that a
simple automaton living on a 2D hexagonal lattice can provide, in the limit of
large scale motion, a faithful representation of 2D fluid dynamics [5].

In contrast to the 2D case, no 3D Bravais lattice exists with enough symmetries
to lead to 3D fluid dynamics. A clever way out of this problem was found by
d'Humières, Lallemand & Frisch (1986) who pointed out that a suitable four
dimensional lattice, the face-centered hypercubic (FCHC) lattice, leads to the
proper symmetries. To obtain three (two) dimensional hydrodynamics, periodic
boundary conditions are imposed along the fourth direction and the flow is
projected into 3D (2D) [3].

The path leading from LGCA to the Navier-Stokes equations is based on a
standard procedure of statistical mechanics: (1) to get from the particle level
to the Liouville level, an ergodic assumption is used; (2) to get from the
Liouville level to the Boltzmann kinetic level, the assumptions that collisions
are instantaneous and localized in space are involved; (3) to get from the
Boltzmann level to the Navier-Stokes continuum level, the assumption that the
particle mean-free-path is much smaller than any macroscopic variation length
is made. The formal procedure to achieve the hydrodynamic description of LGCA
is based on a multiscale formalism using the Knudsen number

as a small parameter.

The main advantages of LGCA are as follows:

• Round-off error-freedom
• Regular data structures, ideal for vector processing
• Local interaction model, ideal for parallel processing
• Ease of implementation of extremely irregular geometries and boundary
  conditions

The price to be paid for these advantages reflects in the following
disadvantages:

• Statistical noise
• Exponential complexity of the collision operator with increasing number of
  states/site
• Relatively high viscosity and therefore low effective Reynolds numbers

The issue of statistical noise is a common feature of all particle models;
substantial space/time averaging is required to extract reasonably smooth
hydrodynamic signals out of the LGCA microdynamics. The issue of exponential
complexity is also typical of finite-state algorithms.

3 Lattice Boltzmann dynamics

Lattice Boltzmann techniques provide a way out of both of these problems. With
the assumption of molecular chaos, it is possible to write the following
kinetic equation:

$$N_i(\vec{x} + \vec{c}_i, t + 1) - N_i(\vec{x}, t) = \Delta_i(N), \qquad i = 1, \dots, b \qquad (1)$$

Here $N_i(\vec{x}, t)$ is the ensemble averaged number density of particles of
type $i$ lying at the lattice point $\vec{x}$ at time $t$ and propagating along
the direction identified by the discrete speed $\vec{c}_i$. Also, $\Delta_i(N)$
is obtained from the boolean collision term by simply replacing the stochastic
boolean population $n_i$ with the ensemble averaged population $N_i$. The
problem of noise in equation (1) is absent because $N_i$ is a real variable and
no average at all is needed to recover the macroscopic fields. McNamara &
Zanetti (1988) proposed to use Eq. (1) directly for hydrodynamic simulations
with the $\Delta_i$ arising from the corresponding boolean models. In
particular, they studied the model defined by the FHP-III rules by simulating
the decay of shear and sound waves of finite wavelength [10]. The comparison
between the numerical values and the Chapman-Enskog multiscale predictions
shows that the hydrodynamic value is accurate to better than 5% even for a
lattice as small as 4. Also the behavior of sound waves is satisfactory.

The McNamara-Zanetti approach, while fixing the problem of statistical noise,
is still left with the intractable complexity of the collision operator because
all b-body interactions included in the boolean collision term are still
present. This makes their approach unviable in more than two dimensions.

Higuera & Jimenez (1989) [6] noticed that the Lattice Boltzmann equation can be
further simplified without losing any generality in terms of hydrodynamic
fidelity. The reason is that macrodynamic equations in LGCA formally arise in
the double limit of small Knudsen numbers and small

Mach numbers. It is then convenient to consider the expansion of the collision
term on the right side of (1) corresponding to these conditions. To do this,
let us write $N_i$ as

$$N_i = N_i^{eq}(\rho, \vec{u}) + N_i^{neq}(\nabla\rho, \nabla\vec{u}), \qquad (2)$$

and further decompose $N_i^{eq}$ as

$$N_i^{eq} = N_i^{(0)} + N_i^{(1)} + N_i^{(2)} + O(M^3) \qquad (3)$$

where the upper index refers to the order in the Mach number $M$. This
expansion permits to express the collision operator in terms of a simple 2-body
scattering matrix

$$\Delta_i(N) \simeq A_{ij}\,(N_j - N_j^{eq}), \qquad (4)$$

where $A_{ij} = \partial\Delta_i / \partial N_j$, the derivatives being
calculated in the state of zero velocity $N_i = d = \rho/b$. The element
$A_{ij}$ controls the scattering rate between directions $i$ and $j$, and
$N_i^{eq}$ is the local maxwellian equilibrium expanded to second order in the
local flow field.

Despite its apparent linearity, the expression (4) accounts for second order
terms in the expansion of the collision operator.

The Higuera-Jimenez LBE marks an important breakthrough as it opens the way to
practical three-dimensional simulations of fluid flows; as a matter of fact it
turns a $2^b$ complex problem (where $b$ is the number of bits at each lattice
site) into a $b^2$ complex one! The quasilinear LBE introduced by Higuera &
Jimenez is still in a one-to-one correspondence with its underlying LGCA
microdynamics. This sets a relatively strict upper bound to the Reynolds number
attainable since the LBE viscosity is exactly the same that results from the
corresponding LGCA. Given the fact that one is ultimately interested just in
the large-scale, hydrodynamic features of the flow, at this point, this appears
as an unnecessary restriction.

One is therefore naturally led to regard the LBE as a self-standing model of
the Navier-Stokes equations, regardless of any underlying LGCA dynamics
(Higuera, Succi, Benzi (1989)) [7].

The starting point in the definition of the 'self-standing' lattice Boltzmann
equation is again the linearized kinetic equation (4). The change in
perspective is however substantial: the choice of the quantities $A_{ij}$ and
$N_i^{eq}$ in (4) is no longer dictated by an underlying boolean microdynamics
but is rather adjusted to the macroscopic equations to be reproduced. With this
broader view, the attention is shifted on the scattering matrix and notably on
its leading non-zero eigenvalue, the one controlling the viscosity of the LBE
flow. This eigenvalue can be tuned at the outset so as to achieve the desired
flow viscosity in a fairly handy fashion.

4 Lattice BGK models

In a similar vein, Bhatnagar, Gross & Krook (1954) used a relaxation
approximation to model the effect of complicated collisions [2]. The basic
formulation of lattice BGK models can then be described as a simplified
Boltzmann equation starting from time evolution equation as

where $\omega$ is a relaxation parameter (collision frequency in kinetic
theory). The key point here is the choice of the equilibrium state $N_i^{eq}$
so that it leads to the exact Navier-Stokes equation at hydrodynamic space and
time scales. The right choice is

where $c_s$ is the speed of sound, and $t_p$ are weights depending on the
square amplitude of the velocity $p$ (since particles are either at rest or
move one grid site per timestep, $p$ is an index from 0-2 in 2D, 0-3 in 3D,
which labels particles at rest, in motion along or in motion diagonal to the
grid). Requirements of isotropy and Galilean invariance impose constraints on
the weights $t_p$ which are model dependent (Qian and Orszag (1993) [11]).

A two-scale analysis in time leads to the effective hydrodynamic equations at
second order of the Knudsen number (the ratio of mean free path to
characteristic length):

$$\partial_t \rho + \partial_\alpha(\rho u_\alpha) = 0$$

$$\partial_t(\rho u_\alpha) + \partial_\beta(\rho u_\alpha u_\beta) = -\partial_\alpha(c_s^2 \rho) + \partial_\beta\left[\rho\nu\,(\partial_\beta u_\alpha + \partial_\alpha u_\beta)\right] \qquad (6)$$

where $c_s$ is the sound speed and $\nu$, the shear viscosity, is given by

$$\nu = \frac{c_s^2}{2}\left(\frac{2}{\omega} - 1\right)$$

Also, the incorporation of an eddy viscosity model is quite straightforwardly
accomplished through the introduction of a space- and time-dependent relaxation
parameter $\omega$. The Smagorinsky formula for the eddy viscosity, for
example, becomes simply

where $|S|$ is the amplitude of the strain tensor and $C_1$ is a constant.

A nice property of LBE is that the strain tensor $S_{\alpha\beta}$ is available
locally as an appropriate linear combination of the particle populations $N_i$.
Other eddy viscosity models may be implemented in a similar way. The inclusion
of standard wall conditions for the eddy viscosity is equally straightforward.

From a numerical point of view the LBE is basically an explicit
finite-difference scheme working at the edge of the Courant-Friedrichs-Lewy
condition $c\,\Delta t = \Delta x$ and bearing a significant resemblance with
the Dufort-Frankel scheme. It is characterized by a favorable
computation/communication ratio which is key to its amenability to parallel
implementations across virtually the whole spectrum of present-day parallel
computers. This favorable ratio is achieved at the expense of some extra memory
and CPU overhead (the number of discrete populations exceeds the number of
significant hydrodynamic fields) as compared to standard explicit CFD schemes.

5 Applications

Many applications of lattice BGK methods to diverse fluid flows have been and
are being made; for a recent review see (Qian, Succi and Orszag, 1995 [12]).

Here we shall cursorily review two recent applications: three-dimensional
Rayleigh-Benard thermal convection and three-dimensional channel flow
turbulence.
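The relaxation scheme of Section 4 can be sketched in a few lines. The example
below is a minimal illustration, not the authors' code: it assumes the
nine-velocity square-lattice (D2Q9) model with the weights 4/9, 1/9, 1/36 and
c_s^2 = 1/3 commonly quoted for lattice BGK, a second-order equilibrium of the
usual form, a relaxation update N_i <- N_i - omega (N_i - N_i^eq) followed by
streaming with periodic boundaries. The names `equilibrium`, `moments`,
`bgk_step` and `viscosity` are hypothetical. Only the viscosity formula is
taken directly from the text.

```python
import numpy as np

# Assumed D2Q9 lattice: rest particle, 4 axis directions, 4 diagonals.
C = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
T = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)  # assumed weights t_p
CS2 = 1.0 / 3.0                                      # assumed c_s^2


def equilibrium(rho, u):
    """Second-order equilibrium; rho is (nx, ny), u is (nx, ny, 2)."""
    cu = np.einsum('id,xyd->ixy', C, u) / CS2   # c_i . u / c_s^2
    uu = np.einsum('xyd,xyd->xy', u, u) / CS2   # u . u / c_s^2
    return T[:, None, None] * rho * (1.0 + cu + 0.5 * cu**2 - 0.5 * uu)


def moments(N):
    """Density and velocity recovered as sums over the populations."""
    rho = N.sum(axis=0)
    u = np.einsum('ixy,id->xyd', N, C) / rho[..., None]
    return rho, u


def bgk_step(N, omega):
    """One collision (relaxation towards equilibrium) plus streaming step."""
    rho, u = moments(N)
    N = N - omega * (N - equilibrium(rho, u))
    for i, (cx, cy) in enumerate(C):
        # Periodic shift of population i by one site along c_i.
        N[i] = np.roll(N[i], shift=(cx, cy), axis=(0, 1))
    return N


def viscosity(omega):
    """Shear viscosity nu = (c_s^2 / 2) (2 / omega - 1), as in the text."""
    return 0.5 * CS2 * (2.0 / omega - 1.0)
```

Because the equilibrium carries the same density and momentum as the
populations it replaces, the collision step conserves both exactly; the
relaxation parameter only sets the viscosity, which is the property the
eddy-viscosity extension relies on when omega is made space- and
time-dependent.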

5.1 Three-dimensional thermal convection

Recently, the LBE formalism has been extended in such a way as to handle
thermal convection by including the dynamics of a temperature field within the
fluid flow, Massaioli, Succi, Benzi, (1993) [8]. The thermal LBE code has been
extensively exploited to gain new insights into a number of issues related to
thermal turbulence, such as the shape of the probability distribution function
of velocity and temperature fluctuations and the related implications on the
scaling properties of thermal turbulence.

Perhaps, the most valuable outcome of these simulations is a clue on the nature
of turbulent flows which goes now by the name of "extended Self Similarity"
(ESS). ESS represents a kind of generalized scale invariance which apparently
holds also in the limit of low Reynolds numbers, i.e. when dissipation still
plays a non-negligible role on the flow dynamics (Benzi, Ciliberto, Massaioli,
Tripiccione and Succi, 1993) [4].

The basic statement of ESS is that scaling properties of a turbulent flow are
most conveniently highlighted by inspecting the structure functions one versus
another rather than as a function of the space separation $r$, as suggested by
the common practice.

In particular the scaling exponents $\zeta_p$ can be derived by measuring the
$p$-th order structure function $S_p(r)$ in terms of $S_3(r)$ according to the
following relation:

(7)

where $S_p(r)$ is defined as:

angular brackets denoting ensemble-averaging.

According to the standard K41 Kolmogorov theory, in the scaling regime
(Reynolds number going to infinity) the 3rd order structure function $S_3(r)$
becomes a linear function of the space separation $r$, which is why scaling is
commonly probed by log-plotting the structure functions $S_p$ versus $r$. The
problem with this procedure is that the Reynolds numbers achievable by direct
simulation of the Navier-Stokes equations on present-day computers are not high
enough to attain a fully-developed scaling regime. The result is that a
clearcut measurement of the scaling exponents is hampered by statistical
inaccuracies.

The point of ESS is that eq. (7) holds even for moderately low Reynolds number
for which the Kolmogorov relation $S_3(r) \simeq r$ does not apply, whence the
denomination of "extended" self-similarity.

The practical implication is that scaling exponents can then be reliably
measured out of moderate-Reynolds number simulations, well within reach of
present-day computational capabilities ($128^3$ grid points being perfectly
adequate).

The validity of the ESS assumption is currently being explored for a variety of
different flows, including Rayleigh-Benard turbulence, magnetohydrodynamics and
others.

In the specific instance of Rayleigh-Benard convection, ESS has permitted to
gather a wide body

of numerical evidence in favor of 'buoyancy-driven' Bolgiano scaling, i.e.
energy spectrum scaling like $E(k) \sim k^{-11/5}$, as opposed to 'non-thermal'
Kolmogorov decay $E(k) \simeq k^{-5/3}$ [8].

5.2 Three-dimensional turbulent channel flow

As mentioned in the introduction, the LBE has been recently merged with the
finite volume method to produce a variant of LBE (FVLBE hereafter) able to deal
with non-uniform grids.

The idea is to take the differential form of LB dynamics and apply a
finite-volume procedure based upon integration of eq. (1) on each cell of a
control grid of (almost) arbitrary shape.

By straightforward use of Gauss theorem, we obtain

where $F_{i,c}$ is the mean population of the macro-cell $c$, $\Phi_{i,c}$ the
corresponding flux across the boundaries of $c$, and the remaining term is the
rate of change of $F_{i,c}$ due to collisions occurring within the cell $c$.
Clearly, the actual computation of surface fluxes involves an interpolation
technique. For the case in point, piece-wise linear interpolation is used so
that locality is preserved to a good extent in the numerical scheme.

This scheme has been validated for the case of three dimensional turbulent
channel flow simulation on a moderate resolution grid (64 x 64 x 128) spanning
a physical channel of height $H = 192$, length $L_x = 960$, and width
$L_y = 512$, i.e. pretty close to the one examined by Moin and coworkers [9].

The idea is to reproduce the well-known logarithmic law-of-the-wall of the mean
flow profile:

where $\kappa = 0.4$ is the Von Karman constant, $v_*$ a typical turbulent
velocity, $d$ is a calibration constant, and $\delta = \nu / v_*$ is the
thickness of the "viscous sublayer".

The average velocity profiles drawn from the numerical simulation are checked
against the above expressions to produce best fit values of $\nu^n$, $v_*^n$,
$d^n$, where the superscript $n$ denotes 'numerical simulation' (see Figure 1).

The actual values of $\nu^n$, $v_*^n$, $d^n$ are derived from the slope of the
linear plot $\bar{u}$ vs $z$ ($v_*^2/\nu$), the slope of the plot $\bar{u}$ vs
$\log(z)$ ($v_*/\kappa$) and the value of $\log(\bar{u})$ at $z = 1$
($(v_*/\kappa)\,\log(v_*/\nu) + d\,v_*$).

The main outcome of these simulations is that turbulence is supported during
the entire life span of the simulation, that is $2.4 \times 10^5$ time steps,
corresponding to about 90 longitudinal transit times. This is due to the fact
that the channel is long enough to support streamwise rolls feeding cross
channel turbulence.

Data samples have been collected every 53 steps in the interval
[100,000, 240,000], thus yielding about 2600 profiles for statistical data
analysis.

The numerical best-fit values deduced from the simulation are as follows:
$\nu^n = 0.013 \pm 0.002$, $v_*^n = 0.013 \pm 0.001$, $d^n = 6.5 \pm 0.7$.

First, we remark that the measured viscosity is about twice higher than the
theoretical input

value $\nu = 0.05$. This is attributed to localized peaks of numerical
viscosity occurring where the lattice pitch is changing due to the mesh
non-uniformity (a sharp 1-2-1 mesh distribution along $z$ has been adopted).

Second, we note that $d^n$ is within the error bars provided by the literature,
although somewhat on the upper side. Finally, since turbulence is sustained for
a significant time-span, wall stress-tensor statistics is also available for
the purpose of internal consistency checks. This yields:

(9)

in a pretty good match with the values deduced by the velocity profiles
(Figure 2).

To sum up, these moderate resolution runs suggest that the FVLBE scheme
provides results well within the error bars of current CFD at quite a
comparable computational cost (10 μs per grid-point per step on an IBM RS 6000
mod. 580 workstation).

Further work is needed to judge upon its competitiveness on a more quantitative
ground.

6 Conclusions

In summary, Lattice Boltzmann methods provide a complementary numerical
approach to traditional numerical methods for complex nonlinear systems.
Benchmark problems have validated the approach as a flexible and efficient
numerical method. Fruitful applications are being made to multiphase flow
simulations, subgrid modeling of turbulence and non-uniform lattice
applications.

References

[1] G. Amati, S. Succi and R. Benzi, Fluid Dyn. Res., in press

[2] P. Bhatnagar, E.P. Gross, and M.K. Krook, Phys. Rev., 94:511, 1954.

[3] D. d'Humières, P. Lallemand, and U. Frisch, Europhys. Lett., 2:291, 1986.

[4] R. Benzi, S. Ciliberto, F. Baudet, F. Massaioli, S. Succi and
R. Tripiccione, Phys. Rev. E, R29, 48, n.1, 1993

[5] U. Frisch, B. Hasslacher, and Y. Pomeau, Phys. Rev. Lett., 56:1505, 1986.

[6] F.J. Higuera and J. Jimenez, Europhys. Lett., 9(7):663-668, 1989.

[7] F. Higuera, S. Succi, R. Benzi, Europhys. Lett., 9-4, 1989.

[8] F. Massaioli, R. Benzi, S. Succi and R. Tripiccione, Eur. J. Mech.
B/Fluids, 14, n.1, 67-74, 1995

[9] P. Moin, J. Jimenez, J. Fluid Mech., 225, 213, 1991

[10] G. McNamara and G. Zanetti, Phys. Rev. Lett., 61:2332, 1988.

[11] Y. H. Qian and S. Orszag, Europhys. Lett., 21-3, 1993.

[12] Y. H. Qian, S. Succi and S. Orszag, Annual Review of Comput. Physics,
vol. 5, 1995, in press

TRANSITION IN THE CASE OF LOW FREE STREAM TURBULENCE

V.T.Grinchenko and V.S. Chelyshkov

Institute of Hydromechanics, NAS of Ukraine,


8/4, Zhelyabov st., Kiev, 252057, Ukraine

INTRODUCTION

It is well known that two types of transition are possible in the boundary
layer: natural and 'bypass' transition (see the review of A.M. Savill [14]).
The first type is observed in the artificial case of low free stream
turbulence; 'bypass' transition usually takes place in real technical
equipment: aircraft, turbine engines, etc. Theoretical investigation of both
types of transition raises such difficult questions as the problem of model
construction and the problems of accurate and efficient space and time
resolution.
Known models can be divided into two groups: semi-empirical models (for
instance, the Savill-Launder-Younis model [15] (1995)) and models based on a
reduction of the initial-boundary value problem for the Navier-Stokes equations
(the addition of an artificial body force term adopted by Laurien E. &
Kleiser L. [11] (1989); the Parabolised Stability Equations model designed by
Bertolotti F.P., Herbert Th. & Spalart P.R. [1] (1992); the 'fringe' model
suggested by P.R. Spalart [17] (1993)). We now describe one model of the second
type, namely the Slow and Fast disturbances interaction Model (SFM) designed by
V.S. Chelyshkov [6] (1993). The model is based on the assumption that an
interaction of slow and fast disturbances in the longitudinal coordinate is
possible in such weakly non-parallel flows as boundary layers with and without
pressure gradient, jets and wakes. This idea was developed in recent years in
the papers [4, 5, 6, 8] (see also the review by V.T. Grinchenko &
V.S. Chelyshkov [9]). The approach is valid for 3-D flows, but for simplicity
we shall consider the 2-D boundary layer near a semi-infinite flat plate.

THE MODEL OF DISTURBANCE INTERACTION

It is known that two scales of flow in the longitudinal coordinate (slow and
fast) are possible near a flat plate. Blasius flow is a slow (weakly
non-parallel) flow. Two-dimensional perturbations of Blasius flow are divided
into two types: slow undamped perturbations, which control the boundary layer
thickness [12], and fast non-stationary perturbations [16]. Both types of
perturbations must depend on the slow longitudinal coordinate, but experimental
and theoretical investigations show that we can neglect this dependence for

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

the latter [16]. The self-interaction of fast disturbances produces both fast
and slow disturbances. The slow ones contribute to the weakly non-parallel
component of the flow. The SFM is therefore constructed as follows. Let $l$ be
the distance from the leading edge to a fixed point on a flat plate,
$U_\infty$ the velocity of the oncoming flow, $\rho$ the fluid density and
$\nu$ the kinematic viscosity coefficient. Cartesian coordinates $(x', y')$ are
introduced to describe the 2D non-stationary flow, which depends on time $t'$.
The origin of these coordinates coincides with the leading edge, and the
$x'$-axis is directed along the plate. The velocity vector components in this
coordinate system are denoted $u'$, $v'$, and $p'$ is the pressure. We choose
the non-dimensional variables using the formulae:

$$x' = l X_0, \quad y' = \delta^* y, \quad t' = l U_\infty^{-1} T, \quad u' = U_\infty u, \quad v' = U_\infty \lambda v,$$

$$p' = \rho U_\infty^2 p, \quad \delta^* = \kappa \sqrt{\nu l / U_\infty}, \quad \kappa = 1.72078766, \quad \lambda = \delta^* / l.$$
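The constant κ = 1.72078766 in the scaling above coincides with the classical
displacement-thickness coefficient of the Blasius boundary layer. As a hedged
cross-check (a sketch, not part of the authors' method), it can be recovered by
integrating the Blasius equation f''' + f f''/2 = 0 with f(0) = f'(0) = 0 and
f'(∞) = 1 and evaluating the limit of η - f(η). The shooting interval, the
truncation at η = 10, the step count and the function names below are all
assumptions made for illustration.

```python
def blasius_rhs(state):
    """Right-hand side of f''' = -f f''/2 written as a first-order system."""
    f, fp, fpp = state
    return (fp, fpp, -0.5 * f * fpp)


def integrate(fpp0, eta_max=10.0, n=1000):
    """Classical RK4 from eta = 0 to eta_max with f(0) = f'(0) = 0, f''(0) = fpp0."""
    h = eta_max / n
    s = (0.0, 0.0, fpp0)
    for _ in range(n):
        k1 = blasius_rhs(s)
        k2 = blasius_rhs(tuple(si + 0.5 * h * ki for si, ki in zip(s, k1)))
        k3 = blasius_rhs(tuple(si + 0.5 * h * ki for si, ki in zip(s, k2)))
        k4 = blasius_rhs(tuple(si + h * ki for si, ki in zip(s, k3)))
        s = tuple(si + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s


def displacement_constant(eta_max=10.0):
    """Bisect on f''(0) until f'(eta_max) = 1, then return eta_max - f(eta_max)."""
    lo, hi = 0.3, 0.4  # assumed bracket for the wall shear f''(0)
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if integrate(mid, eta_max)[1] < 1.0:
            lo = mid
        else:
            hi = mid
    f, _, _ = integrate(0.5 * (lo + hi), eta_max)
    return eta_max - f
```

With this interpretation, δ* is the displacement thickness of the undisturbed
boundary layer at the station x' = l, so y is measured in displacement
thicknesses.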
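Then the velocity vector field and the pressure

$$F = \{u, v, p\}(X_0, y, T)$$

are described by the Navier-Stokes and continuity equations

$$\partial_T u + u\,\partial_{X_0} u + v\,\partial_y u = -\partial_{X_0} p + \frac{1}{\kappa^2}\left(\partial_{yy} + \lambda^2 \partial_{X_0 X_0}\right) u,$$

$$\lambda^2\left(\partial_T v + u\,\partial_{X_0} v + v\,\partial_y v\right) = -\partial_y p + \frac{\lambda^2}{\kappa^2}\left(\partial_{yy} + \lambda^2 \partial_{X_0 X_0}\right) v, \qquad (1)$$

$$\partial_{X_0} u + \partial_y v = 0.$$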
Equations (1) need suitable initial-value and boundary conditions in the flow domain,
which is not defined for the present. Boundary conditions

are set on a flat plate and far from the wall. Poisson equation for pressure

is the result of equations (1).


The parameter $\lambda$ is small far from the leading edge, and the Cartesian
coordinates $(X_0, y)$ are stretched in the transverse direction. When
$\lambda \to 0$ the Blasius solution

$$F \approx F^B, \quad F^B = \{u^B, v^B, p^B\}(X_0, y), \quad \mathbf{u}^B = \{u^B, v^B\},$$

satisfies the Prandtl equations and the boundary conditions

Thus the physical condition that $v$ decays as $y \to \infty$ is replaced by a
boundedness condition. We define

$$X_0 = 1 + X, \quad X = \lambda x, \quad T = \lambda t, \quad Re = \kappa^2 / \lambda. \qquad (5)$$
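The definitions above imply that Re is the Reynolds number based on the
displacement thickness: from δ* = κ sqrt(νl/U∞) and λ = δ*/l it follows that
U∞ δ*/ν = κ²/λ. The short sketch below checks this arithmetic for one set of
dimensional inputs; the function name `sfm_parameters` and the sample values
are hypothetical and serve only as an illustration.

```python
import math

KAPPA = 1.72078766  # constant quoted in the text


def sfm_parameters(u_inf, nu, length):
    """Return (delta_star, lam, reynolds) for the scaling defined in the text."""
    delta_star = KAPPA * math.sqrt(nu * length / u_inf)  # transverse length scale
    lam = delta_star / length                            # small parameter lambda
    reynolds = KAPPA**2 / lam                            # Re = kappa^2 / lambda
    return delta_star, lam, reynolds


# Sample (assumed) values: 10 m/s air flow, station 0.5 m from the leading edge.
delta_star, lam, reynolds = sfm_parameters(10.0, 1.5e-5, 0.5)
reynolds_direct = 10.0 * delta_star / 1.5e-5  # U_inf * delta_star / nu
```

Both expressions for the Reynolds number agree, and λ decreases like the
inverse square root of the distance from the leading edge, which is the sense
in which λ is small far downstream.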

The disturbed flow field is considered in the half-band

$$V = \{\, |x| \le \pi/\alpha, \; 0 \le y < \infty \,\},$$

where $\alpha \sim O(1)$ is a parameter. Now we define the solution $F$ of the
problem under consideration in the following way:

$$F = F^B + F^s + F^f,$$

$$F^s = \{u^s, v^s, p^s\}(X_0, y, T), \quad \mathbf{u}^s = \{u^s, v^s\}, \qquad (6)$$

$$F^f = \{u^f, v^f/\lambda, p^f\}(x, y, t), \quad \mathbf{u}^f = \{u^f, v^f\}.$$

Here $F^s$ and $F^f$ are the vector fields describing slow and fast
disturbances. We introduce the x-average in $V$

$$\bar{F} = \frac{\alpha}{2\pi} \int_{-\pi/\alpha}^{\pi/\alpha} F \, dx$$

and shall suppose that $\bar{F^f} = 0$. Substituting (6) into the initial
problem (1) - (3) and discarding, as in the description of laminar flow, terms
of order $O(\lambda^2)$, we obtain a system of equations and boundary
conditions. We add to the nonlinear equations, and subtract from them, the
x-average of the convective terms that contain fast disturbances. Now we can
conveniently separate all terms of each equation into two parts. We then split
these two parts and equate each of them to zero. The following problem is
obtained:

atus + -((ax,U
IC2
Re
B US + ( U B + uS)ax0uS+ a,u B S + (?IB + VS)a,uS)-
2,

1
- -ay,us
Re
+ N,(UB, us, U f ) = 0, (7)

&,US + ayvs = 0,

- a,,pS = Np(UB,us, uf),

&U’ + 8,Vf = 0,

-APf = Np(uB,u s,U f) - Np(UB,US,Uf),

,f Iy=o= f lv=o-- 0 ,

N,(uB, us, U’) (UB + uS)d,uf + -(dxoUB


Re
K‘
+ &,U
S)U f +

Np(UB, us, U’) = 4-(dx,uB


K2

Re
+ dxou )a,uf+

+2(a,uB + dyUS)a.Vf + 2((axUf)2 + a , u f a v f ) .

In our opinion the SFM (7) - (14) describes near-wall flow in both cases of low
and high free stream turbulence. The equations do not contain the second
y-derivative of $v^s$. That is why the physical condition that $v^s$ decays far
from the wall is replaced here, as in (4), by a boundedness condition, and the
solution of problem (7) - (14) will not be uniformly applicable. The
relationship
$$\partial_t v^s \sim O(\lambda)$$
is the condition of the model validity. This relationship cannot be established
a priori, but seems to be acceptable due to the weak dependence of $F^s$ on
time. The natural conditions of disturbance decay far from the wall, as in (2),
have to be imposed on the "fast part" of the flow field. Replacing one of the
Navier-Stokes equations by the Poisson equation allows us to construct time
discretization schemes without the need for a fractional step. This also makes
it possible to extend the solution algorithm to the 3D problem.
The values of the velocity vector components are unknown at the boundaries
orthogonal to the wall. We cannot introduce periodicity conditions at these
boundaries because the flow is weakly non-parallel. Following the idea of
boundary layer coherent structures [2], we shall suppose that the flow is close
to periodic in the longitudinal direction and

To remove the slight arbitrariness in these boundary conditions we shall
construct solutions that depend on both longitudinal coordinates in a special
way.

SPACE AND TIME DISCRETIZATION

Direct methods are applied for discretization of the problem (7)-(14). The known forms
of the dependence of the perturbations on the longitudinal coordinate are used to choose the
trial functions:

j=O j=O

PS = P Y Y , t ) , 77 = Y / G , (15)
{Uf,V f , P%, Y, t ) = b k , Vk, P k } ( Y , t ) exP(iakz)- (16)
Ikl<K,k#O

In (15) vo = 0, and power indexes vj ( j > 0) are selected on the basis of vorticity
exponential damping far from the wall, such as v1 = 1, v2 = 1.887, v3 = 2.867, v4 =
3 . 8 , . . . . Now we can expand first two terms in (6) into Taylor series in X, substitute the
result t o (7)-(14) and throw away addends of the order of o(Xz) in (11). Using (15),
(16) and expanding variable Xz into Fourier series in (11) we can separate longitudinal
coordinate by projection equations under consideration into two systems of test functions:
X k , IC = 0 , 1 , . . . , and exp(iamz), m # 0.
Sequences Xo-uJ and X k are not orthogonal to each other in the interval of their
changing. This leads to numerical difficulties for the slow part of the solution when N is
large, because a matrix of Hilbert type has to be inverted.
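
The difficulty with Hilbert-type matrices can be illustrated with a small sketch. It is not the paper's matrix: it assumes the textbook case, the Gram matrix of the monomials 1, x, ..., x^(n-1) on [0, 1], whose entries are 1/(i+j+1). The function names `hilbert` and `determinant` are chosen here for illustration. Exact rational arithmetic shows how quickly the determinant collapses towards zero as the number of non-orthogonal functions grows.

```python
from fractions import Fraction


def hilbert(n):
    """Gram matrix of 1, x, ..., x^(n-1) on [0, 1]: H[i][j] = 1/(i+j+1)."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination in rational arithmetic."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a non-zero pivot in this column (row swap flips the sign).
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, float(determinant(hilbert(n))))
```

The rapidly vanishing determinant is one way to see why inverting such a matrix in floating-point arithmetic becomes unreliable for large N.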
The next stage of approximation is the representation of the solution in the coordinate
orthogonal to the wall on the interval $[0, \infty)$. Far from the wall, the velocity and pressure
field coefficients of the fast disturbances behave asymptotically as $\exp(-\alpha k y)$ for
near-wall modes, where k > 0 is the Fourier harmonic number. Therefore, for the class of
problems at issue it is convenient to approximate the solution in the coordinate y by
exponential polynomials (EP) orthogonal on the semi-axis with unit weight. Some computational
and/or algorithmic advantage can present EP &(y) = ezp( -ky)PA:y)(1 - 2 e z p ( - y ) )
obtained by orthogonalization of exponential sequence in inverse order, starting from
some number n [3]. Here P$’”’ are Jacobi polynomials. These polynomials are used for
solution representation in the coordinate orthogonal to the wall, and the sequence
$\mathcal{E}_{n,k}$ is supplemented by unity to approximate the vertical velocity component of the
slow disturbances. The final projection into phase space is carried out by the Bubnov-Galerkin
method, which allows one to use the 'boundary functions' [13] to satisfy the boundary
conditions at the wall. For precise numerical integration, Gauss quadrature formulae derived
from the properties of EP are applied, so 3n/2 points are used in the algorithm.
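
As a rough illustration of orthogonalizing an exponential sequence in inverse order, the sketch below applies plain Gram-Schmidt to exp(-ny), exp(-(n-1)y), ..., exp(-y) with the inner product taken as the integral over the semi-axis with unit weight, so that the inner product of exp(-ky) and exp(-ly) equals 1/(k+l). This is an assumption-laden stand-in, not the closed form of [3]: normalization is arbitrary, and the names `inner` and `exponential_polynomials` are invented here.

```python
from fractions import Fraction


def inner(a, b):
    """<f, g> = integral over [0, inf) of f*g, for f = sum a[k] exp(-k y)."""
    return sum(ak * bl / (k + l) for k, ak in a.items() for l, bl in b.items())


def exponential_polynomials(n):
    """Orthogonalize exp(-n y), exp(-(n-1) y), ..., exp(-y) in that order.

    Returns {k: coeffs}; coeffs maps an exponent m (k <= m <= n) to the
    coefficient of exp(-m y) in the k-th orthogonal function.
    """
    basis = {}
    for k in range(n, 0, -1):
        e = {k: Fraction(1)}
        # Remove the components along the functions already constructed.
        for j in range(n, k, -1):
            proj = inner({k: Fraction(1)}, basis[j]) / inner(basis[j], basis[j])
            for m, c in basis[j].items():
                e[m] = e.get(m, Fraction(0)) - proj * c
        basis[k] = e
    return basis


if __name__ == "__main__":
    for k, coeffs in sorted(exponential_polynomials(3).items()):
        print(k, {m: str(c) for m, c in sorted(coeffs.items())})
```

Each resulting function has the form exp(-ky) times a polynomial of degree n-k in exp(-y), which is the structure described in the text above.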
The described way of spatial approximation results in a triangular matrix as the discrete
analog of the Laplacian, which allows one to employ explicit schemes in time. A variant of
the Runge-Kutta method was therefore adopted for time integration.
The subsequent stage of discretization is described in detail in [4]. The collocation method
is preferable for 3-D flow modelling. A variant of the collocation method, namely the combined
direct method, is suggested in [7] for near-wall flow simulation.

NUMERICAL RESULTS OF NATURAL TRANSITION SIMULATION

The level of flow vorticity far from the wall y = 0 and the inflow boundary conditions define
the influence of free stream turbulence in the D-domain. Indeed, recent experiments [19] show
that high free stream vorticity ahead of a flat plate changes the Blasius profile and excites
fast oscillations near the nose of the plate. So both slow perturbations that do not damp in
time and fast disturbances develop due to the change of the inflow conditions at the boundary
of the D-domain.
We shall consider here the simpler case of exponentially small free stream turbulence.
In this case we shall suppose that the influence of the time-undamped slow perturbations is
small for natural transition and that the slow part of the disturbed flow is one-dimensional in
the boundary layer coordinates $(\eta, x_0)$, so N = 0 in (15). We omit the terms of order
$O(\lambda)$ in equations (11), so periodicity conditions are valid for the fast disturbances at the
boundaries orthogonal to the wall. Such simplifications lead to an initial-boundary value
problem that has no functional arbitrariness in space. We shall also suppose that
modes of the continuous spectrum are not excited, and our algorithm is constructed in such
a way that the disturbed flow damps far from the wall in accordance with the asymptotics of
near-wall modes.
The physical parameters Re = 520 and $\alpha = 0.308$ set the 2-D flow domain. The simulation
parameters are K = 7, n = 32. These parameter values yield a dynamic system that
has 409 degrees of freedom. The simulation was performed for the interval $0 \le t \le 20000$.
Initial values of the amplitudes were determined from the solution of the Orr-Sommerfeld
eigenvalue problem. The values correspond to the initiation of a Tollmien-Schlichting wave
with phase velocity equal to 0.396. The picture of disturbance development is divided into
two parts. At first (t less than 10000) a travelling wave regime with increasing
amplitude arises. When the oscillation energy reaches some value, the single-wave regime
is reconstructed and a regime close to oscillations with many frequencies is excited.


Let $\tau_w$ be the disturbed skin friction and

For steady flow regime tlie amplitucIes 7w at R: = o a.nd 72


are shown in Fig. 1 and Fig. 3.
The power spectrum of $\tau_w$ is shown in Fig. 2, where the integers are the numbers of the x-Fourier

[Fig. 1 and Fig. 2]

harmonics whose oscillation frequencies correspond to the peaks. One can see
that apart from the main travelling wave, which has phase velocity equal to 0.566, other
oscillations exist. Among these oscillations the largest energy belongs to the oscillation with
convective speed equal to $0.809\,U_\infty$, which is excited by the second x-Fourier harmonic.
It is of interest that each space scale has its own number of oscillation frequencies. It also
appears that the near-wall travelling wave phase velocity practically coincides with the
near-wall propagation velocity of perturbations in a channel [10]. In contrast with the result
of work [13] we have found that the phase velocities of pressure and friction are equal to
each other near the wall. The decay rate of the skin friction x-Fourier harmonics with
$\alpha k$ is shown in Fig. 4 for simulation time t = 20000. One can see that the decay is rather

[Fig. 3 and Fig. 4]

rapid, so the seventh harmonic is about 200 times smaller than the first.
Direct numerical simulation experience leads to the conclusion that the non-dimensional
time necessary to obtain a fully developed flow is usually very long. Curiously,
the corresponding physical time is rather short. Let $\tau$ be the dimensional time, so that

$$\tau = \nu \, Re \, t / U_\infty^2 .$$

If $U_\infty$ = 1 m/s and the fluid is water, then the physical simulation time is 10.4 s in
the case examined. This time differs greatly from the computer time necessary
for 2-D modelling. Simulation of a 3-D boundary layer is a more difficult problem, and a
statistically steady solution has not been obtained up to now in this case (see, for
instance, [18]).
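
The quoted physical time can be checked with a few lines of arithmetic. The sketch assumes the conversion reads tau = nu * Re * t / U_inf^2 and takes a round kinematic viscosity of 1.0e-6 m^2/s for water; neither the viscosity value nor the function name `physical_time` comes from the paper. With Re = 520 and t = 20000 this reproduces the 10.4 s stated above.

```python
def physical_time(t_nondim, reynolds, nu, u_inf):
    """Dimensional time tau = nu * Re * t / U_inf**2.

    Result is in seconds when nu is in m^2/s and u_inf in m/s.
    """
    return nu * reynolds * t_nondim / u_inf ** 2


# Assumed round value for the kinematic viscosity of water near room temperature.
NU_WATER = 1.0e-6  # m^2/s

tau = physical_time(t_nondim=20000, reynolds=520, nu=NU_WATER, u_inf=1.0)
print(f"physical simulation time = {tau:.1f} s")
```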

CONCLUSION

1. A new mathematical model based on the Navier-Stokes equations has been developed.
The model can be effective for the quantitative description of a class of weakly
non-homogeneous flows. The model was tested on the flow stability problem
near a flat plate.

2. To verify our model approach and discretization algorithms we have carried out
long-time DNS of disturbed Blasius flow for various but moderate numbers of degrees of
freedom.
3. We have found that a balance has to be observed between the numbers of functions taken
in the orthogonal directions. If K is the number of Fourier harmonics taken in the
longitudinal direction and n is the number of exponential polynomials taken in the direction
orthogonal to the wall, then n = n(K) is required for successful execution of our algorithms.
It is essential to notice that increasing K requires increasing n.
4. Our experience of near-wall flow modelling leads to the conclusion that numerical
solution breakdown, the so-called 'turbulence arising', does not correspond to real
physical phenomena in the boundary layer.
5. In our opinion we have found a statistically steady state of the flow near a flat plate.
This flow is a time-organized structure with a background of quasi-periodic oscillations
with incommensurable frequencies.

References
1. Bertolotti F.P., Herbert Th. & Spalart P.R. Linear and non-linear stability of the
Blasius boundary layer// J. Fluid Mech., 1992, 242, 441.
2. Cantwell B.J. Organized motion in turbulent flows// Ann. Rev. Fluid Mech., 1981,
13, 457.
3. Chelyshkov V.S. Sequences of orthogonal on semi-axis exponential polynomials//
Doklady AN UkSSR, 1987, Ser. A, No 1, 14 (in Russian).
4. Chelyshkov V.S. Numerical simulation of self-excited oscillations in the boundary
layer over a flat plate. 2. Two-dimensional non-linear problem. Kiev, 1990, 20 p.
(Prepr. AN UkSSR. Institut Matematiki.- 90.24) (in Russian).
5. Chelyshkov V.S. Numerical simulation of 2-D secondary flows in the boundary
layer// Doklady AN UkSSR, 1990, Ser. A, No 11, 43 (in Russian).
6. Chelyshkov V.S. The model of near-wall flow// NAS of Ukraine. Institute of
Hydromechanics. Annual report for 1993. Kiev, 1994, 54.
7. Chelyshkov V.S. Variant of direct method in the theory of hydrodynamic
stability// Gidromekhanika, 1994, No 68, 105 (in Russian).
8. Chelyshkov V.S. Wave regime in the boundary layer// NAS of Ukraine. Institute
of Hydromechanics. Annual report for 1994. Kiev, 1995, 41.
9. Grinchenko V.T., Chelyshkov V.S. Direct numerical simulation of boundary layer
transition. In: Near-Wall Turbulent Flows, Ed. So, Speziale and Launder. Elsevier,
1993, 889.
10. Kim J., Hussain F. Propagation velocity of perturbations in turbulent channel
flow// Phys. Fluids A, 1993, 5, Pt. 3, 695.
11. Laurien E. & Kleiser L. Numerical simulation of boundary-layer transition and
transition control// J. Fluid Mech., 1989, 199, 403.
12. Libby P., Fox H. Some perturbation solutions in boundary layer theory. Part 1.
The momentum equation// J. Fluid Mech., 1963, 17, No 3, 433.
13. Orszag S.A. Numerical simulation of incompressible flows within simple
boundaries. 1. Galerkin (spectral) representations// Stud. Appl. Math., 1971, 50, 4, 293.
14. Savill A.M. Some Recent Progress in the Turbulence Modelling of Bypass
Transition// In: Near-Wall Turbulent Flows. Ed. So, Speziale and Launder. Elsevier,
1993, 889.
15. Savill A.M. The Savill-Launder-Younis (SLY) RST Intermittency Model for
Predicting Transition// In: ERCOFTAC Bulletin, 1995, No 24, 37.
16. Schlichting H. Boundary Layer Theory, 7th ed., New York, McGraw-Hill, 1979.
17. Spalart P.R. & Watmuff J.H. Experimental and numerical study of a turbulent
boundary layer with pressure gradients// J. Fluid Mech., 1993, 249, 337.
18. Spalart P.R., Crouch J.D., Ng L.L. Numerical study of realistic perturbations in
three-dimensional boundary layer. In: AGARD Symp. on application of direct and
large-eddy simulation to transition and turbulence. (1994, April 18-21) Chania,
Crete, Greece.
19. Westin K.J.A., Boiko A.V., Klingmann B.G.B., Kozlov V.V., Alfredsson P.H.
Experiments in a boundary layer subjected to free stream turbulence. Part 1.
Boundary layer structure and receptivity// J. Fluid Mech., 1994, 281, 193.

Structured Adaptive Sub-Block Refinement for 3D Flows

K. Becker
Department EF 11
Daimler-Benz Aerospace Airbus GmbH
D-28183 Bremen

S. Rill
Hochschule Bremen
Hunefeldstr 1-5
D-28199 Bremen
Germany

1 SUMMARY
Structured sub-block refinement is a means to refine a mesh at certain areas within the flow
region, in order to enhance the local resolution of the flow equations or flow solution without
going to costly global mesh refinement. By the use of appropriate sensors, the regions of
refinement can be defined during the running flow solving process so that the adaptation
becomes automatic. And the use of structured refinement, i.e. refinement by block-like areas,
does only require minor changes to the overall multi-grid iteration scheme. Strategies for the
selection of sub-blocks and first results for 2D and 3D Euler- and Navier-Stokes test cases are
given. The drawbacks and the potential of the method are discussed.

2 LIST OF SYMBOLS
A            continuous Navier-Stokes operator
u            continuous solution to Navier-Stokes system
f            right hand side of Navier-Stokes system
I, II        transfer operators between meshes
τ            local truncation error
I, J, K      indices of points in computational space
Subscripts and superscripts
h            discrete form referring to mesh h.
2h           discrete form referring to mesh 2h.
2h over h    from mesh h to mesh 2h, or mesh h relative to mesh 2h
~            approximation to .

3 INTRODUCTION
The process of discretization of the flow equations causes differences between the continuous
solution of the Navier-Stokes system of differential equations and the solution of the system
put onto the computer. This error is called local truncation error, and it plays the major role
concerning solution deficiencies. Discretization errors, their magnitude and distribution about
the flow region, are influenced by geometrical mesh properties as well as properties of the flow
solution. Both types have to be encountered when selecting appropriate sensors that shall drive
adaptative flow solving algorithms.
All types or combinations of sensors result in single pointwise quantities which have to be
scanned. A certain threshold determines whether a point or local region is already o.k. with
respect to the expected error or whether it is a candidate for mesh adaptation.
In principle, mesh adaptation distinguishes between mesh enrichment and mesh movement. Mesh
movement tries to improve the solution by shifting the existing mesh points to more appropriate
positions. Mesh enrichment means to refine the mesh which leads to an increased number of mesh
points. Eventually, a coarsening of the mesh is also possible in regions where the quality
measure is already good. Both approaches have their specific problems, best may be to combine
them.
Within this paper, we try to describe adaptive mesh enrichment strategies within a structured
multi-block context. The principle structure of the flow solver shall not be affected by the
local refinement. This means that refinement zones have to be of structured type, i.e. they
must be regular mesh blocks. We use a concept of sub-blocks which has been developed within the
Euromesh project of BRITE/EURAM and the ECARP project of IMT Area 3 of CEC research. This
concept allows to treat sub-blocks as additional levels of refinement in the usual multi-grid
sequence of the MELINA flow solver (Fig. 1) [RilBec92].
Structured mesh enrichment of the form described above has its drawbacks when tracing features
of the flow which run diagonally through the mesh. Unless there are limiters for the size of the
sub-blocks, quite large refinement zones must be expected. So, we are aware that this special
type of mesh enrichment will not be the ultimate but a first and practicable solution to mesh
adaptation.

4 SUB-BLOCK APPROACH
The idea of structured sub-block refinement is to simply patch locally refined mesh blocks onto
the existing mesh and connect the additional fine sub-blocks with the original mesh via a
multigrid technique. Thereby, a sub-block has to lie completely in a grid block of the existing
mesh, which includes touching the block boundary. But a grid block may have various sub-blocks
and a sub-block may have several sub-blocks itself (see Fig. 2).
The sub-block approach can be viewed as a compromise between structured and unstructured
meshes, combining the benefit of high computational efficiency on structured

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

[Figure 1 labels: 1st Local Sub-Block Level; Global Fine Level; Global Medium Level; Global Coarse Level; Cycles]
Figure 1: Multigrid sequence with local refinement.

meshes and of clustering grid points in a "quasi unstructured" way by scattering sub-blocks and
even further refined blocks in regions of discretization errors. It is envisaged to use this
method for solution adaptive mesh refinement if the regions of sub-block refinement are
determined automatically during the iteration by suitable sensor functions.

4.1 Surface and Interior Point Definition
When a sub-block is created, between each two mesh points on a coarse grid line an intermediate
fine grid point has to be introduced.
On any component's surface, this new point has to lie on the surface. This means that the new
point has to be constructed using the original surface definition. However, this causes severe
problems if the surfaces are defined by external CAD means, for example. Therefore, most often
special interpolation procedures are used which create local surface approximations from the
existing coarse mesh points. The single approaches differ by the quality of surface
representation. For aerodynamics, the criteria of absolute distances to the real CAD surface and
waviness of the interpolated surface play the major role. For the moment, we don't want to
stress this problem: we simply use Coons' local patches.
The definition of interior fine mesh points is not that constrained. As long as Euler meshes are
considered, those mesh points can be constructed using simple trilinear interpolation of the
coarse cells in the field.
For the very dense Navier-Stokes meshes, in the vicinity of a curved surface intersections of
field mesh lines with the true boundary are very likely to occur with trilinear interpolation.
Therefore the filling algorithm has been changed to the use of Coons' representation for each
mesh plane parallel to the surface, not only the surface planes. This guarantees smooth
behaviour of the mesh in the whole sub-block, especially in the boundary layer mesh.
Additionally, it avoids any intersection of mesh lines or planes with fixed surfaces. Because
this approach is that robust, fast and easy, we adopted it also for the Euler meshes.

4.2 Communication between Sub-blocks and Coarse Blocks
In general, sub-blocks cover only part of the computational domain. Boundary conditions on their
outer block boundaries must be defined such that there is no algorithmic influence on the
overall flow solution. Within the multigrid context, flow variables are interpolated from the
coarse mesh. If the sub-block boundary touches the coarse block boundary, the same boundary
condition is applied. Wall, symmetry or similar conditions are thus treated correctly. Special
things have to be done if the sub-block boundary lies inside the coarse block. Boundary values
of the sub-block cannot be set as fixed Dirichlet type conditions because this conflicts with
the mixed type nature of the flow equations. The interpolated values serve only as initial guess
and the values are updated using the original flow equations themselves on the fine mesh.
Therefore at least one row of guard cells has to be created around the sub-block which contains
the flux integral information needed for the application of the cell vertex discretization at
the real sub-block boundary. This procedure is quite the same as is applied between two adjacent
blocks of the original non-refined mesh. In addition to this, conservativeness has to be ensured
across the sub-block boundaries. In our code, this is achieved by replacing the flux integrals
along coarse cell faces at the sub-block boundary location: the coarse mesh integral is replaced
by the sum of the participating fine mesh integrals.
This type of communication between sub-blocks and blocks is managed with the help of the face
group concept. Each block has at least one face group. This group of six faces consists of the
minimum/maximum index planes (boundaries of the computational domain) of the block. For each
sub-block that is added to the coarse block, a new face group is defined. It contains those

[Figure 2 labels: Sub-Block 1, Sub-Block 2]
Figure 2: Sub-blocks within a mesh block - schematic view.

segments of coarse mesh planes that coincide with the block boundaries of the respective local
sub-block. So this face group is the hull of the sub-block inside the coarse block. The
respective topological description data are used to drive the communication of flow variables
and other relevant data within the flow solver.
If two sub-blocks within a coarse block or across the boundaries of two coarse blocks are
adjacent to each other, then communication should be allowed directly between those sub-blocks.
The simplest way is to transfer data from a sub-block to the respective face group of the coarse
block and from there to the neighbouring sub-block. However, this path of communication contains
interpolation errors and should thus be replaced by the immediate transfer of data from one
sub-block to its neighbour. Within the topological description data, this problem could be
easily solved because sub-blocks are treated in the same way as usual blocks.
If a new sub-block is constructed, the topological data are updated automatically. Boundary
conditions and connections to adjacent sub-blocks are detected and included in the description.
This makes the fully adaptive incorporation of new sub-blocks into an existing multi-block mesh
relatively easy once the respective coarse mesh face group boundaries are known. One major
technical difficulty is the generality of sub-block to sub-block connections. Up to now, two
sub-blocks of a coarse block are only allowed to touch each other if it is with one full face.
Touching only with part of a face would require new segmentation of the respective faces and can
easily result in very complex face segmentations. On the other hand, the above restriction
hinders an effective treatment of diagonal refinement. For the moment the drawback of full face
touching has to be overcome by resizing respective sub-blocks. However, part-of-face touching is
under development.
Several topologically different sub-block configurations have been tested. Because the sensor
evaluator may suggest quite general addition of sub-blocks, it might be necessary to have such
arrangements run quite robust. For example, if we have a four block finest mesh, in a first
adaptation step sub-blocks might be suggested only for three blocks. This leads to different
finest levels on different blocks within the multi-grid cycles. Additionally, consecutive
sub-blocking during subsequent adaptation loops has to be allowed which means that
sub-sub-sub-...blocks can occur. Such and similar conditions have been investigated concerning
the convergence behaviour and the quality of solution, especially at the junction of refined and
non-refined regions. No specific problem has been detected with the Euler flow solver. However,
with the Navier-Stokes solver it turned out that the implementation of the turbulence model has
a great impact. In practice, the Baldwin-Lomax model used requires wall distance information.
This information is very difficult to obtain in general multi-block meshes if it is not
evaluated in a preprocessing step.

5 SENSOR EVALUATION
The evaluation of any sensor field always means scanning the field for a pre-specified range of
values that are considered to indicate deficiencies of solution accuracy. We can distinguish
between sensors that depend on the flow solution itself and sensors that are defined by purely
geometrical quantities. Mathematical analysis of discretization leads to certain guidelines
concerning the mesh. One of those rules is that one should use smooth and orthogonal meshes.
Measures of those quantities can thus be used to determine "bad" regions within an existing
grid. On the other hand, the flow itself shall drive the mesh in order to properly resolve
special features like shocks, stagnation regions, boundary layers or shear layers. The analysis
of respective sensors leads to suggestions for enhanced grid density regions.

5.1 Flow Independent Sensors
Within the BRITE/EURAM Euromesh project, a palette of geometrical quality measures has been
developed. Now we use these measures for a priori qualification of meshes,
Figure 3: Local truncation error estimate for a NACA0012 Euler case - computational space and
physical space - left: τ(continuity equation), right: τ(2nd momentum equation).

mainly. Within the DA mesh generation system INGRID, the following 3D measures are implemented:
orthogonality, skewness, aspect ratio and expansion rate for 3 index directions. None of them
leads to an absolute criterion for mesh quality. Orthogonality, for example, cannot be achieved
in the whole mesh if there appear angles other than 90 degrees on the surface. Respecting those
angles is necessary for high quality surface representation, but clearly violates the principle
of orthogonality. Similar statements can be made for the other quantities. Nevertheless, those
quantities should be taken into account when creating base meshes.

5.2 Flow Dependent Sensors
The first and most likely reason for deficiencies in solution accuracy is a too high level of
local truncation error. This error describes in principle how well the nonlinear operators of
the Navier-Stokes equations are approximated by the discrete differentiation and integration
rules on a specific mesh. It must be kept in mind that there is no locality in the relation
between this error and the global truncation error of the solution itself, i.e. the solution
error can occur at quite different locations than the local truncation error [Klim95]. This is
especially due to the transport character of the equations.
Truncation error estimates can be extracted directly from the multi-grid cycles: Specific
differences between medium and coarse mesh residuals in a three level computation yield an
estimate of the local truncation error τ [Bra77]. This estimate for all equations of the Euler
or Navier-Stokes system is used to define the locally refined (fine) mesh level.
In detail, if the continuous equation

$$A u = f \qquad (1)$$

is discretised on a mesh with typical mesh size h

$$A_h u_h = f_h \qquad (2)$$

where $u_h$ is the discrete solution, then the local truncation error $\tau_h$ is defined by

$$\tau_h = A_h u - A u \qquad (3)$$

If we further add and subtract the discrete operator $A_h$ applied to an approximation
$\tilde{u}_h$ of the discrete solution $u_h$,

$$A_h u_h = f_h - A_h \tilde{u}_h + A_h \tilde{u}_h, \qquad (4)$$

and represent this equation on the next coarser grid with mesh size 2h, then we end up with the
multigrid coarse grid correction equation

$$A_{2h} u_{2h} = \mathit{II}_h^{2h} (f_h - A_h \tilde{u}_h) + A_{2h} I_h^{2h} \tilde{u}_h, \qquad (5)$$

which contains the local truncation error estimate on mesh 2h relative to mesh h:

$$\tau_h^{2h} = A_{2h} I_h^{2h} \tilde{u}_h - \mathit{II}_h^{2h} A_h \tilde{u}_h. \qquad (6)$$

Under the assumptions $A_h \approx A$ and $\tilde{u}_h \approx u$ this yields

$$\tau_h^{2h} \approx A_{2h} u - A u, \qquad (7)$$

which is the local truncation error $\tau_{2h}$ on mesh 2h. Fig. 3 gives an impression on the
distribution and the levels of local truncation error for a 2D transonic test case.
During the studies it has been found very useful to have presentations of the estimate in
physical as well as in
Figure 4: Suggestions for new sub-blocks - left: RADIUS=5, right: RADIUS=2.

computational domain. Interestingly, the errors for the single equations seem to be
complementary to each other. Near the nose of the airfoil, τ(continuity equation) suggests
refinement in other parts of the flow field than τ(momentum equations), for example. For our
investigations, the L1-norm over all equations was taken to drive the refinement. More detailed
studies can be found in [Lau95].

5.3 Sub-block Definition
In the context of structured sub-block refinement, a strategy has to be developed by which the
location and extension of local sub-blocks can be determined. The evaluation of any sensor
defines a set of "bad" points or cells that appear as clouds in the index space of each
structured block. On the one hand, the sub-blocks have to cover those clouds. On the other hand,
the size of the sub-blocks corresponds to the numerical effort and thus has to be as small as
possible. The strategy to define reasonable sub-blocks is as follows:

- Find a first bad point (I,J,K).
- Set IMIN=IMAX=I-index of bad point; same with J and K.
- Trace the surroundings (IMIN-RADIUS, IMAX+RADIUS; ...) of the current (IMIN,IMAX; JMIN,JMAX;
  KMIN,KMAX) area for more bad points.
- If any more bad point has been identified, enlarge the respective MIN/MAX values and restart
  search.
- If no more bad point can be found, define the sub-block from the current MIN/MAX values.

The user-given tolerance value RADIUS has a strong influence on the size of the sub-blocks. It
also defines the minimum distance between two sub-blocks within one block. In order to avoid
that very large sub-blocks are suggested which more look like a global mesh refinement the
maximum size of sub-blocks must be bounded. Also, it may happen that many small sub-blocks are
created if any singular bad point is taken into account. This can be hindered by a minimum bound
for the number of points within a sub-block.
Fig. 4 shows the index cube representation of suggestions for sub-blocks within coarse block. In
the first case, a RADIUS of 5 was chosen whereas in the second case the RADIUS value was 3,
resulting in one more sub-block of smaller size.

5.4 Adaptation Cycle
Mesh enrichment via sub-blocks should run automatically within the flow solution process.
However, for the development of such a method it is reasonable to combine the single elements of
code in a more loose form. The adaptation cycle has been split into 4 steps:

- Start calculation on a reasonably fine mesh and store the results (mesh, flow solution, local
  truncation error),
- Run the sub-block suggestion code and store MIN/MAX indices for each coarse block,
- Generate the enriched mesh which contains the previous mesh and the new sub-blocks,
- Restart the flow solver using interpolated values as starting solution for the new sub-blocks.
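
The sub-block search strategy of Section 5.3 (take a bad point, grow an index box by RADIUS until no further bad point is captured, then start a new box) might be sketched as follows. This is a minimal illustration and not the authors' suggestion code: the function name `suggest_sub_blocks` and the data layout are assumptions, and the bounds on maximum sub-block size and minimum point count mentioned in the text are left out.

```python
def suggest_sub_blocks(bad_points, radius):
    """Group bad points (I, J, K) into index boxes.

    Returns a list of boxes ((imin, imax), (jmin, jmax), (kmin, kmax)).
    """
    remaining = set(bad_points)
    boxes = []
    while remaining:
        seed = min(remaining)          # "find a first bad point"
        lo, hi = list(seed), list(seed)
        remaining.discard(seed)
        grown = True
        while grown:                   # restart the search after each enlargement
            grown = False
            for p in sorted(remaining):
                # Is p inside the current box enlarged by `radius` in every index?
                if all(lo[d] - radius <= p[d] <= hi[d] + radius for d in range(3)):
                    for d in range(3):
                        lo[d] = min(lo[d], p[d])
                        hi[d] = max(hi[d], p[d])
                    remaining.discard(p)
                    grown = True
        boxes.append(tuple(zip(lo, hi)))
    return boxes


if __name__ == "__main__":
    cloud = [(1, 1, 1), (2, 3, 1), (10, 10, 1)]
    print(suggest_sub_blocks(cloud, radius=2))   # two separate boxes
    print(suggest_sub_blocks(cloud, radius=10))  # one merged box
```

In this sketch a smaller RADIUS tends to produce more, smaller boxes, in line with the behaviour the text describes for Fig. 4.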

[Figure 5 plot: pressure coefficient cp versus x/c]
Figure 5: Pressure distribution RAE2822 for meshes (N1),..,(N4) - comparison with experiment.

This cycle can be run until the maximum number of refinement levels has been reached. It is
assumed that each time only the relatively finest level can be refined.

6 NUMERICAL RESULTS
The sub-block concept described above has been implemented in 3D. However, for cost reasons and
for first validation purposes it is reasonable to begin with 2D Euler and Navier-Stokes flows.
The basis of 3D Euler and Navier-Stokes investigations on local refinement and adaptation was a
wing/body combination.

The RAE2822 test case 9 has often been used for validation purposes. Always problems with the
suction peak on the upper wing nose have been reported as it is the case with the present
results. Current computations have been made for fully turbulent flow.
If (N1) is assumed to be a mesh of usual fineness, the (N1) result should be the target for
adaptation. Results produced with the coarser mesh (N2) obviously show up high level numerical
errors. If (N2) is refined locally as described above, which is (N3), the result is already very
close to the target (N1). However, the computing time is only about 40 p.c. of the (N1)
computation, as can be seen from Fig. 6. Additional local refinement for (N1), which is (N4),
yields again a solution which is more close to the experiment both near the nose and for the
pressure gradient in front of the shock. For cost reasons, a target computation for a globally
refined (N1) mesh has not been performed. The experiment has been used, instead. In the (N4)
case, convergence of the lift coefficient is reached at only minor additional expense compared
to (N1).
Fig. 6 shows the convergence behaviour of the method for the different meshes. The residuals for
all cases drop down with CPU time very quickly. There are no specific observations in the case
of embedded sub-block being present. However, the current implementation of the Baldwin Lomax
turbulence model in the MELINA flow solver may cause problems if the sub-block cuts the mesh
within the boundary layer. If such a sub-block does not extend down to the wall surface, then
the wall distance needed for the turbulence model has not the right values and may thus lead to
bad results or even non-convergence of the overall algorithm. This state of the flow solver
hinders automatic adaptation in any complex case at the moment.

6.2 3D Wing/Body Test Case
The application of the sensor analysis implemented in the ADAPTOR code [LauMau95] to the F4
wing/body Navier-Stokes test case showed up nice properties. As can be seen from Figs. 7, 8,
with the current base mesh the τ-error is concentrated in the vicinity of the configuration.
6.1 2D Test Cases It clearly detects
Local mesh refinement has been tested in 2D, first: the body nose region as being not properly re-
RAE2822 test case 9 with a free stream Mach number solved,
of 0.734, angle of attack of 2.54 degrees and Reynolds
number of 6.5 million. The Navier-Stokes calculation the wing nose region .a spurious entropy produc-
should serve as a preliminary test to show the effect and tion region,
effectiveness of local refinement. Refinement was done
by hand using sub-blocks which covered the whole np- the shock region as being insufficiently resolved for
per surface including the supersonic region and extended steep gradients,
slightly on the lower surface near the nose of the air-
e the sonic line as being sensitive to numerical errors,
foil. Fig. 5 shows the resulting pressure distributions for
different meshes and the experimental values. We have the trailing edge and wake region as being sensi-
chosen four different meshes as there were tive because of rapidly changing flow including free
shear layers and
( N l ) standard fine C-mesh with 241 x 77 mesh points,
about 30 points normal to the wall in the boundary the boundary layer near the wall where pre-
layer, adaptation of the mesh to the boundary layer pro-
tiler is only possible up to a certain extent.
(N2)mesh ( N l ) coarsened once by omitting every sec-
ond point, with 131 x 39 mesh points, This makes us hope that automatic recognition of defi-
ciencies in discretization is possible, and adaptation will
(N3) mesh (N?) with sut-blocks and reduce the overall local trnncntion error.
(N4) mesh ( N l ) with sub-blocks
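For the 2D meshes of Section 6.1, the point count left after one coarsening step can be checked with a two-line sketch. The helper name `coarsen_once` is hypothetical, and the sketch assumes that both end points of each mesh line are kept, so that omitting every second point of the 241 x 77 mesh (N1) leaves 121 x 39 points.

```python
def coarsen_once(ni, nj):
    """Points left per direction after omitting every second point,
    keeping both end points (assumes odd point counts)."""
    return (ni - 1) // 2 + 1, (nj - 1) // 2 + 1


print(coarsen_once(241, 77))  # (121, 39)
```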
Figure 6: Convergence behaviour for meshes (N1),...,(N4) - residual and lift coefficient against CPU time.

Figure 7: F4 wing/body configuration - continuity equation truncation error estimate for surface and symmetry plane.

Figure 8: F4 wing/body configuration - truncation error estimate of x-momentum equation for spanwise mesh plane.

Figure 9: F4 wing/body configuration - Mach contours and sub-blocks at spanwise mesh plane.

Figure 10: F4 wing/body configuration - cp against x/c at 63 p.c. (left) and 52 p.c. (right) wing span; -.- fine reference; ..... non-refined; - - - with embedding.

The ADAPTOR run with the medium grid truncation error estimate of all 5 flow equations leads to multiple sub-blocks. It mainly suggests a block in the vicinity of the surface along the whole span of the wing, starting somewhere below and behind the wing and extending over the upper surface again behind the trailing edge. A second block covers the off-surface region around the wing nose and extends about the supersonic region. Parts of the sub-blocks can be seen in Fig. 9, where a spanwise cut with local Mach contours is shown. The pressure distributions at two mid-wing cuts show quite a good improvement compared to the coarse mesh solution (Fig. 10). The suction peak as well as the pressure roof top gradient and the shock position are in good agreement with the fine mesh reference solution. And the locally refined mesh has only about 40 p.c. of the points of the global fine mesh.

7 CONCLUSIONS
Mesh enrichment based on a structured sub-block approach has been considered as an effective way to improve the numerical solution of flow equations. Tools have been defined and strategic provisions have been made to test this approach under industrial constraints. Up to now, the main procedures have been set up. Results for locally refined meshes have been calculated for 2D and 3D Euler and Navier-Stokes test cases. The next step will be the full integration of the adaptation into the flow solver and the validation and improvement of the overall process.
Because of the problems with the turbulence model implementation in Navier-Stokes we will first try to sort out the automatic adaptation problems, sensor analysis, etc. on the basis of the Euler equations. More general sub-block-to-sub-block connections are under development, which allow a more cost-effective resolution of diagonal flow features.
A lot of tests have been run with the ADAPTOR code, and a lot of changes of the evaluating strategy have been necessary in order to find a reasonable suggestion for sub-blocks. The cost saving of more than 50 p.c. which we have achieved with the current examples is already quite good under industrial conditions. Ongoing work will be concentrated on making adaptation fully automatic, robust, and efficient.

8 ACKNOWLEDGEMENTS
The basis of this work has been partly conducted as BRITE/EURAM area 5, CEC funded, applied research. Recent results have been obtained within the IMT area 3, CEC funded, ECARP project. We are grateful for this support.

References
[Bec93] Becker, K., Aumann, P.: "The Interactive Grid Generation System INGRID - Version 5.0", DA-report, December 1993.
[Bra77] Brandt, A.: "Multi-Level Adaptive Solutions to Boundary Value Problems", Mathematics of Computation, Vol. 31, No. 138, pp. 333-390, April 1977.
[Klim95] Klimetzek, F.: "Fehlerestimatoren für Strömungsberechnungsverfahren, Teil I (Literaturauswertung, 1993, 93-038) und II (Anwendung und Beurteilung, 1994, 94-091)", Technical Report, Daimler-Benz AG.
[Lau95] Lauke, Th.: "Adaption von Rechennetzen zur Steigerung der Genauigkeit von 3D-Strömungssimulationen", Diplomarbeit, Technical University of Berlin, June 1995.
[LauMau95] Lauke, Th., Mauch, H.: "ADAPTOR User's Manual", Daimler-Benz Aerospace Airbus, June 1995.
[RilBec93] Rill, S., Becker, K.: "MELINA - A Multi-Block, Multi-Grid 3D Euler Code with Local Sub-Block Technique for Local Mesh Refinement", ICAS Paper 92-4.3.R, ICAS Conf., Beijing, Sept. 1992.

Multiblock Structured Grid Algorithms for Euler Solvers
in a Parallel Computing Framework

Stefano Sibilla
Aermacchi S.p.A.
Dipartimento di Aerodinamica
Via Foresio, 1 21040 Venegono Superiore (VA) Italy

and

Marcello Vitaletti
IBM Semea S.p.A.
E.C.S.E.C.
Piazza G. Pastore, 6 00144 Roma Italy

SUMMARY

Specific algorithms have been developed for the numerical solution of the Euler equations on multiblock structured grids of general topology; these algorithms involve determination of convective and dissipative fluxes, residual collection from fine grid levels during multigrid cycles and time step evaluation. They must be properly integrated with residual and flow variable averaging when the internal boundary condition is introduced.
The influence of block subdivision on the bow-shock in front of a blunt-nosed body is analysed with different multiblock algorithms; a structured and a locally unstructured topology are also compared.
Results show that no additional error is introduced in multiblock solutions if internal block boundary conditions are applied at each stage and edge/corner boundary cell contributions to flow quantities are properly taken into account.

LIST OF SYMBOLS

a          speed of sound
Cd         drag coefficient
CFL        Courant number
Cp_st      stagnation pressure coefficient
D          dissipative flux
E          specific energy
H          specific enthalpy
p          pressure
Q          convective flux
q          flow quantity vector
R          residual
S          cell face area vector
u, v, w    velocity components
V          cell volume
V_CV       control volume
x, y, z    Cartesian coordinates
γ          specific heat ratio
Δt         time step
ε          numerical viscosity coefficient
Λ          spectral radius
ν          pressure sensor
ξ, η, ζ    curvilinear coordinates
ρ          density

1. INTRODUCTION

Multiblock methods consist in the decomposition of complex computational domains into simpler subdomains, which can be more easily handled in the management of the simulation and in the subdivision of the computational task on different processors.
Structured grid blocks can be generated in these subdomains, in order to combine the efficiency and simplicity of CFD algorithms developed for single-block structured grids with the geometric flexibility needed to describe topologically complex regions.
The main difficulty in multiblock methods lies in the correct treatment of block interfaces, which are located in the flow region and represent a numerical boundary condition with no reference to the physical problem: their presence can introduce errors in the solution which can either prevent complete convergence to the exact solution or impose constraints on the grid generation.
IBM has developed a parallel multiblock framework called PARAGRID [1,2] which supports a suitable data structure for the management of data communication between adjacent blocks. The computation is performed in parallel mode at block level, thus allowing exploitation of

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
workstation clusters and/or multi-processor systems.
A structured multiblock Euler solver had been previously implemented within this framework [3]; its results were generally good as far as the overall solution quality and global aerodynamic coefficient evaluation were concerned. Problems were nevertheless detected in the convergence rate and in the solution quality at the interfaces between adjacent blocks; moreover, only "structured" block topologies were solved consistently with the original structured algorithm: this means, for example, that only internal edges shared by four blocks or corners shared by eight blocks were allowed; for all other block topologies, approximate corrections were introduced.
Some solution algorithms were therefore modified in order to account for the presence of locally unstructured topologies at block boundaries; these algorithms were designed for application in a parallel environment, minimizing the number of data exchanges between adjacent blocks, and therefore the communications between computational nodes.

2. NUMERICAL SCHEME

2.1. Finite volume formulation

The three-dimensional Euler equations

$$\frac{\partial q}{\partial t} + \frac{\partial f(q)}{\partial x} + \frac{\partial g(q)}{\partial y} + \frac{\partial h(q)}{\partial z} = 0 \tag{1}$$

where

$$q = \begin{pmatrix} \rho \\ \rho u \\ \rho v \\ \rho w \\ \rho E \end{pmatrix}, \quad
f = \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u v \\ \rho u w \\ \rho u H \end{pmatrix}, \quad
g = \begin{pmatrix} \rho v \\ \rho u v \\ \rho v^2 + p \\ \rho v w \\ \rho v H \end{pmatrix}, \quad
h = \begin{pmatrix} \rho w \\ \rho u w \\ \rho v w \\ \rho w^2 + p \\ \rho w H \end{pmatrix} \tag{2}$$

are written in integral form

$$\frac{\partial}{\partial t} \int_{V} q \, dV + \oint_{\partial V} \left( f\, n_x + g\, n_y + h\, n_z \right) dS = 0 \tag{3}$$

and are solved through a cell-vertex finite volume space discretization [4]: flow quantity values located at cell corners represent average values of flow quantities in the control volume made of all the cells (e.g. 8 for an internal node of a structured grid) sharing that node.
Convective fluxes through the control volume surface, which are represented by the second term in the left hand side of (3), are computed as the sum of the contributions of all the cell faces which form the control volume surface itself; face values are taken as the average of the values at the corners of the face.
Such a scheme is equivalent to a second-order accurate central difference on a Cartesian grid; such discretization leads to odd-even decoupling, allowing numerical oscillations, and provides no intrinsic numerical dissipation to damp these oscillations and other non-linear instabilities. A dissipative term, based on first- and third-order differences of the flux variables and scaled on the local spectral radii of the flux Jacobians, is introduced in the form of an added flux term [5].
For a control volume centered on the grid point i,j,k equation (3) takes the semi-discretized form


In equation (4) Q_{i,j,k} is the discretized convective flux


where NF_{i,j,k} is the number of cell faces forming the surface of the control volume centered on node i,j,k and having area vector S; the form of the dissipative flux D_{i,j,k} is discussed in section 4.
Equation (4) is solved in time by a 5-stage Runge-Kutta scheme [6] whose coefficients are chosen in order to allow high stability limits (CFL = 4 on the linear convection equation) and large margins for numerical dissipation. Stability limits can be increased by two or three times if residuals are smoothed by application of a suitable implicit operator at the end of each intermediate Runge-Kutta stage. Finally, a multigrid method for the reduction of low frequency errors [4] accelerates convergence to steady state.

2.2. Domain decomposition

The computational domain is divided into smaller hexahedral structured blocks; each of the faces of a block is either part of a physical boundary or it is an interface to an adjacent block. An enlarged computational block is built, adding to the original core a two-layer halo extending in all the blocks sharing a boundary face, edge or corner with the original subdomain. Updated flow values are available in the halo regions when data exchange is performed.
Although made up of structured parts, the enlarged subdomain can show locally unstructured regions at core edges or corners (figure 1).
Equation (5) for the determination of the convective flux depends only on the determination of the number NF of cell faces that form the control volume surface; it can be applied straightforwardly to structured as well as to unstructured topologies. If updated values are available in all the enlarged computational blocks, identical values of Q_K are computed for the different replicas of node K.

2.3. Local time step computation

The local time step must be computed from available data in each structured and unstructured control volume in a minimum number of computational steps in order to reduce the number of data exchanges. At the end of each step, updated flowfield quantities are available only in the core region of the block. Cell spectral radii are computed as the sum of contributions in each grid-coordinate direction: for the ξ-direction one obtains


and similar expressions for the η- and ζ-directions, which sum up into the local spectral radius

$$\Lambda = \Lambda_\xi + \Lambda_\eta + \Lambda_\zeta . \tag{7}$$

To minimize data exchange needs in the computation of spectral radii at block interfaces, the product of cell contributions (6) and of cell volumes


is computed in each core region, and


is derived instead of (7). Data exchange of Π values is performed at this point: having built Π as a cell quantity instead of a nodal one, no averaging step is required and computational overhead is minimum.
Being the time step in the control volume relative to a node


where NC is the number of cells which build up the control volume CV, a convenient modified time step is obtained from (9):


where the average spectral radius in the control volume


has been introduced, and Π values are available in the whole extended domain.
Modified time step (11) is directly introduced into the time-discretized form of (4)


from which updated values of flow quantities are obtained.
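The idea of a local time step built from directional spectral radii can be illustrated with a short Python sketch. It does not reproduce equations (6) to (11) of the paper: it assumes the commonly used form |u.S| + a|S| for each directional contribution and the plain cell-based estimate CFL V / Λ, without the volume-weighted nodal averaging described above. The function names and argument layout are hypothetical.

```python
import math


def directional_radius(vel, area, a):
    """Assumed directional contribution |u . S| + a |S| for one
    grid-coordinate direction with mean face area vector `area`."""
    un = sum(v * s for v, s in zip(vel, area))
    return abs(un) + a * math.sqrt(sum(s * s for s in area))


def local_time_step(cfl, volume, vel, a, s_xi, s_eta, s_zeta):
    """Cell time step CFL * V / (Lambda_xi + Lambda_eta + Lambda_zeta)."""
    lam = sum(directional_radius(vel, s, a) for s in (s_xi, s_eta, s_zeta))
    return cfl * volume / lam


# Unit cube, unit velocity along x, unit speed of sound:
# Lambda = (1 + 1) + 1 + 1 = 4, so CFL = 4 gives a time step of 1.
dt = local_time_step(4.0, 1.0, (1.0, 0.0, 0.0), 1.0,
                     (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
```

Because every quantity in the sketch is a cell quantity, it can be evaluated in the core region of a block and exchanged once, which is the motivation given in the text for working with cell products rather than nodal values.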
2.4. Multigrid residual driving

In multigrid methods, the residual of the numerical solution of (4) on the fine grid is used to "drive" the residual evaluation on the coarser ones, i.e. coarse grid steps are used to determine corrections to fine grid residuals rather than completely new residual values.
A simple algorithm has been used to collect fine grid nodal residuals into coarse grid nodal "driver residuals" with small computational effort and validity on structured as well as unstructured topological entities.
A fine grid cell residual is computed as


where n_{i,j,k} is the number of cells sharing fine grid node i,j,k (figure 2-a). After cell data exchange, coarse grid nodal "driver residual" values are obtained by summing the contributions of the fine grid cells which share the coarse grid node:


Fine grid cell values (14) contribute to one coarse grid node only and the algorithm guarantees correct evaluation of driver residuals on unstructured nodes automatically (figure 2-b).

3. BLOCK INTERFACE CONDITION

3.1. Data exchange strategies

Contiguous blocks share nodes on boundary faces and/or edges and/or vertices, while cells belong to a single block only; in a cell-vertex formulation, where flow variables are defined at nodes, different values may be computed in replicas of the same boundary node owned by different blocks. The PARAGRID framework ensures that the same average value is assigned to all such replicas of a boundary node at the end of each block update step, when data exchange between blocks is performed.
Three different implementations of the multigrid algorithm have been studied. In the first implementation the block update step includes a full multigrid cycle, with frozen halo data. In the second implementation the block update step includes the five-stage Runge-Kutta cycle on a grid level and the restriction/prolongation of solution and residuals to the successive grid level. In the third implementation the block update step only includes a single Runge-Kutta stage on the current grid level.
The first strategy has minimum memory requirements and maximum parallel efficiency but leads to an inconsistency in the computation of the flow field at internal boundaries. In this case, the averaging process is applied to the flow quantities associated with all replicas of a boundary node.
In principle, the third strategy ensures identity of values assigned to different replicas of a boundary node at the price of larger memory requirements and overheads due to the more frequent exchange of halo data. In practice, small discrepancies in boundary node replicas still occur, due to the implicit nature of the residual smoothing phase which is confined to work at the block level. With this choice the averaging process is applied to the residuals rather than the flow quantities.
The second choice represents a compromise between the previous two: halo flow values are still frozen during the time integration, but more frequent data exchange between blocks strongly reduces the generation of interface errors; on the other hand, the solution is faster and requires less memory than the exact solution.

3.2. Numerical experiments

Numerical experiments show that numerical errors introduced by the first interface condition reduce local stability margins and put severe restrictions on the block subdivision of the grid: block interfaces falling in the middle of strong gradient regions can often lead to divergence of the computation.
A simple geometry, consisting of a cylindrical body ending with a spherical cap of unit radius, has been chosen to investigate the limits of the examined strategies.
Figure 3 shows different topologies used in the analysis of the blunt-nosed body at a Mach number of 2 and zero incidence. The block interface in grid "A" intentionally crosses the bow shock close to the symmetry axis, where shock intensity is higher; in grid "B" the division surface has been moved upstream.
Single-block solutions are compared with multiblock solutions obtained by application of different interface treatments; all computations have been run for 100 multigrid steps with 3-level W-cycle, after 50+50 initialization steps on two coarser grid levels. They have all been performed in single precision.
Single-block computations (figure 4-a) show a bow-shock located in front of the nose, at a
distance of approx. 0.35 nose radii; maximum pressure coefficient at stagnation is Cp_st = 1.63 and drag coefficient is Cd = 0.7756 with reference to the cross section area.
Figure 4-b shows the flow pattern resulting from all the converged computations on the 4-block grid "A": the block interface passes at (x=-1.35, y=0), i.e. where the bow-shock intensity is maximum, but the position of the bow shock is identical, Cp_st = 1.63 and Cd = 0.7756.
Flow variable exchange and averaging at the end of the multigrid cycle (figure 5) leads to divergence on the 4-block grid "A" at a CFL number of 8 and forces either a reduction of the CFL number to 6 or the adoption of the modified grid "B".
Figure 6 shows that, at CFL=8, data exchange at the end of each Runge-Kutta cycle leads to convergence in 50 % more steps than exchange at each intermediate stage. Figure 7 shows that single and multiblock computations are equivalent in the latter case, and that the interface boundary condition becomes completely transparent to the computation.
A simulation of the transonic vortical flow around a wing-body-canard sharp leading edge configuration, at a Mach number of 0.85 and an incidence of 10°, has been obtained from a multiblock computation with data exchange at each Runge-Kutta stage, and compared with single-block results [7].
Pressure plots in the cross flow (figure 8-a) and on the wing surface (figure 8-b) at 0.6 wing chords show that block decomposition has slight influence on position or intensity of the vortices, although block interfaces cross both the wing leading edge and the canard vortex.

4. NUMERICAL DISSIPATION

The dissipative flux D_{i,j,k} in equation (4) is based on a background term, dependent on the third order difference of the flow variables scaled on the local spectral radii of the flux Jacobians. A sensor based on the local pressure gradient switches on a first order difference term in the presence of flow discontinuities.
On a structured grid the dissipative flux is


and each mixed first- and third-order difference term [4,6] based on the local curvilinear coordinate system is


scaled on local spectral radii components (6); numerical viscosity coefficients are based on the pressure sensor


and take the form


The above formulation cannot be consistently applied to the nodes of block edges and corners where the block topology is locally unstructured: the dissipation computed for different replicas of such boundary nodes on the basis of equation (16) would span different sets of neighbouring nodes, thus leading to an inconsistency.
An unstructured formulation derived from the work of Mavriplis [8] has been tested to evaluate improvements in the analysis of flows in these regions.
An approximation to the Laplacian at the boundary node K=(i,j,k) is constructed as

$$\nabla^2 q_K = \sum_{J=1}^{n} \left( q_J - q_K \right) \tag{20}$$

where the summation in (20) is performed over all the n nodes connected by a cell edge to node K.
In this case the dissipative flux becomes the sum of a Laplacian and a biharmonic operator


where Λ is the spectral radius in the control volume relative to node K=(i,j,k), the pressure sensor is

$$\nu_K = \frac{\left| \sum_{J=1}^{n} \left( p_J - p_K \right) \right|}{\sum_{J=1}^{n} \left( p_J + p_K \right)} \tag{22}$$

and the nodal numerical viscosity coefficients, as in (19), are


Coefficients k(2) and k(4) can be set in both formulations to obtain desired properties of convergence and damping. Optimal convergence was obtained in this case with values k(2) = 1 and k(4) = 1/32 in the structured formulation, k(2) = 1/2 and k(4) = 15/1024 in the unstructured formulation.
A different grid around the blunt-nosed body has been generated: a 7-block grid showing an unstructured edge, shared by five neighbouring blocks, in the vicinity of the bow shock wave. Figure 9 shows the block decomposition and the flow pattern at Mach 2 and zero incidence; a slight deflection of the shock wave is present at the unstructured edge, but it should be ascribed to the unsuitable cell distribution in the zone.
Computations have been carried out with both dissipation schemes; plots of the logarithm of density residual in the unstructured edge nodes are compared in figure 10, showing that errors due to inconsistent computation of dissipative fluxes (17) prevent complete convergence, even if variable averaging is performed at each Runge-Kutta stage; the unstructured formulation (21) leads to complete, although slower, convergence.

5. TIME AND MEMORY REQUIREMENTS

CPU time requirements have been measured by serial runs of the blunt-nosed body test case on an IBM RISC 6000 550. These measures, together with data relative to memory occupation, are obviously dependent on the code FLO67P-2 [9] and on the PARAGRID framework: they are presented here mostly as a qualitative comparison between the multiblock algorithms previously discussed.
Table 1 shows CPU times and RAM occupation needed for the various proposed strategies; times are expressed as CPU seconds per node per iteration, memory occupation is expressed as Mbytes per thousand nodes.
Exchange at multigrid level needs 20 % less time than exchange at each Runge-Kutta stage; this partly compensates for the reduction in CFL number. On the other hand, it needs less than 50 % of the memory. Memory occupation can in all cases be reduced by 0.08 Mbyte/knode if metric coefficients are recomputed at the beginning of each block update step, at the price of higher time requirements.

6. CONCLUSIONS

The determination of the most convenient multiblock solution strategy among the examined algorithms is not immediate.
Numerical experiments of sections 3.2 and 4 show that, if data exchange is performed at each intermediate stage, the interface condition has no impact on stability limits and convergence rate; other conditions generate a reduction of convergence rate and, in the case of exchange only at the end of the multigrid cycle, of stability limits. On the other hand, approximate solutions at block interfaces yield a reduction in time and mostly in memory needs.
Exchanging halo data at Runge-Kutta level is a compromise solution which retains the stability bounds of the exact description with reduced time and memory requirements at the price of a slower convergence.

REFERENCES

1. Poggi F., Dellagiacoma F., Paoletti S., Vitaletti M., "Multidomain Computations of Compressible Flows in a Parallel Scheduling Environment", in Parallel Computational Fluid Dynamics '92, ed. R.B. Pelz, A. Ecer and J. Hauser, pp. 111-122, North-Holland, 1992.
2. Paoletti S., Poggi F., Vitaletti M., "Paragrid - a Parallel Multiblock Environment for Computational Fluid Dynamics", proc. of VECPAR '93 - 1st International Meeting on Vector and Parallel Processing, Porto, 1993.
3. Dellagiacoma F., Vitaletti M., Jameson A., Martinelli L., Sibilla S., Visintini L., "Flo67p: a Multi-block Version of Flo67 Running within Paragrid", in Parallel Computational Fluid Dynamics: New Trends and Advances, ed. A. Ecer et al., pp. 199-206, North-Holland, 1995.
4. Jameson A., "A Vertex Based Multigrid Algorithm for Three Dimensional Compressible Flow Calculations", ASME Symposium on Numerical Methods for Compressible Flows, Anaheim, 1986.
5. Jameson A., "Transonic Flow Calculations", Princeton University MAE Report 1651, 1984.
6. Jameson A., Schmidt W., Turkel E., "Numerical Solutions of the Euler Equations by Finite Volume Methods Using Runge-Kutta Time-Stepping Schemes", AIAA Paper 81-1259, 1981.
7. Malfa E., "IEPG - TA15 Results of Aermacchi Euler Code Around the Wing-Body-Canard WB(C)-1 Configuration in Transonic Flow Condition", Aermacchi Report 275-TA15-021, 1991.
8. Mavriplis D.J., "Accurate Multigrid Solution of the Euler Equations on Unstructured and Adaptive Meshes", AIAA Journal, Vol. 28, No. 2, February 1990, pp. 213-221.
9. Sibilla S., Vitaletti M., "Cell-Vertex Multigrid Solvers in the PARAGRID Framework", proc. of Parallel Computational Fluid Dynamics '95, Pasadena, 1995.

Table 1 Time and memory requirements for the examined multiblock algorithms.

Figure 1 Example of unstructured local topology in a multiblock structured grid: edge shared by five blocks.

Figure 2 Algorithm for multigrid "driver residual" computation: a) contribution of fine grid nodes to fine grid cell values (14); b) contribution of fine grid cell values to coarse grid nodal "driver residuals". (Legend: fine grid node, coarse grid node, fine grid cell, block core.)

Figure 3 Blunt nosed body flow at Mach 2: Flow features and grid topologies. (Panels: 1-block, 4-blocks, 4-blocks.)
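The two-step collection of "driver residuals" described in Section 2.4 and illustrated in Figure 2 can be sketched in Python for a single 2D structured block. This is an illustration under stated assumptions, not the FLO67P code: the function name `driver_residuals`, the 2D setting, the nested-list storage and the even number of fine cells per direction are all choices made here, and the exact form of equations (14) and (15) is inferred from the prose.

```python
def driver_residuals(res):
    """res[i][j]: nodal residuals on a fine 2D structured grid with an even
    number of cells per direction. Returns nodal driver residuals on the
    coarse grid formed by every second fine node."""
    ni, nj = len(res), len(res[0])      # number of fine nodes
    ci, cj = ni - 1, nj - 1             # number of fine cells

    def share(i, n):
        # Cells sharing a node along one direction: 1 at a boundary, else 2.
        return 1 if i in (0, n - 1) else 2

    # Step 1: each fine cell collects residual / (cells sharing the node)
    # from its four corner nodes.
    cell = [[sum(res[i + a][j + b] / (share(i + a, ni) * share(j + b, nj))
                 for a in (0, 1) for b in (0, 1))
             for j in range(cj)]
            for i in range(ci)]

    # Step 2: each fine cell is added to the single coarse node it touches.
    drv = [[0.0] * (cj // 2 + 1) for _ in range(ci // 2 + 1)]
    for i in range(ci):
        for j in range(cj):
            drv[(i + 1) // 2][(j + 1) // 2] += cell[i][j]
    return drv
```

Since each nodal residual is split among the cells sharing the node and each cell feeds exactly one coarse node, the sum of the driver residuals equals the sum of the fine nodal residuals. On a locally unstructured node only the count of sharing cells changes, which is why the cell-based collection carries over without special treatment.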

Figure 4 Iso-pressure plot of the flow around a blunt nosed body (Mach=2, α=0): a) single-block computation; b) multiblock computation on grid "A".

Figure 5 Convergence history of maximum density residual for multiblock solution of the flow around a blunt nosed body (Mach=2, α=0): data exchange performed at the end of each multigrid cycle.

Figure 6 Convergence history of maximum density residual for multiblock solution of the flow around a blunt nosed body (Mach=2, α=0): behaviour of different strategies for data exchange between grid blocks at CFL=8.
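The averaging of boundary node replicas that accompanies each data exchange (Section 3.1) can be illustrated with a small Python sketch. The data layout is an assumption made for illustration: PARAGRID's actual data structures are not described in the text, and the function name `average_replicas` is hypothetical. The same routine could be applied to flow quantities or to residuals, depending on the exchange strategy.

```python
from collections import defaultdict


def average_replicas(replicas):
    """replicas: iterable of (node_id, block_id, value), one entry per copy
    of a boundary node held by a block. Returns a dict mapping
    (block_id, node_id) to the average over all copies of that node."""
    groups = defaultdict(list)
    for node, block, value in replicas:
        groups[node].append((block, value))
    averaged = {}
    for node, items in groups.items():
        mean = sum(value for _, value in items) / len(items)
        for block, _ in items:
            averaged[(block, node)] = mean
    return averaged


# Node 7 lies on an edge shared by five blocks (as in Figure 1);
# node 9 is interior to block 0 and is left unchanged.
out = average_replicas([(7, 0, 1.0), (7, 1, 2.0), (7, 2, 3.0),
                        (7, 3, 2.0), (7, 4, 2.0), (9, 0, 4.0)])
```

After the call every block holds the same value for node 7, which is the identity of replicas that the exchange step is meant to enforce.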
Figure 7 Convergence history of maximum density residual for multiblock solution of the flow around a blunt nosed body (Mach=2, α=0): comparison between single-block and multiblock computations at CFL=6.

Figure 8 Flow around a wing-body-canard configuration (Mach=0.85, α=10°): a) iso-pressure plot on wing and canard; b) iso-pressure plot at 0.6 wing chords from a single-block solution [7]; c) iso-pressure plot at 0.6 wing chords from a solution on a 32-block decomposition of the single-block grid; d) pressure coefficient on wing surface at 0.6 wing chords.

Figure 9 Iso-pressure plot of the flow around a blunt nosed body (Mach=2, α=0): multiblock solution on a 7-block grid with locally unstructured topology.

Figure 10 Convergence history of density residual on the locally unstructured edge of the 7-block grid around a blunt nosed body for solution of the flow at Mach=2 and α=0: behaviour of different artificial dissipation models. (Curves: structured formulation, unstructured formulation.)
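The building blocks of the unstructured dissipation of Section 4 can be sketched in Python on a generic node-and-edge graph. The sketch assumes the edge-based undivided Laplacian in the style of Mavriplis [8], with the biharmonic term taken as a Laplacian of the Laplacian and the pressure sensor normalised by the sum of neighbouring pressures. The scaling by the spectral radius and by the numerical viscosity coefficients is deliberately left out, and all function names are hypothetical.

```python
def laplacian(values, neighbours):
    """Undivided Laplacian: for each node K, the sum over the nodes J
    connected to K by a cell edge of (q_J - q_K)."""
    return [sum(values[j] - values[k] for j in neighbours[k])
            for k in range(len(values))]


def biharmonic(values, neighbours):
    """Biharmonic operator built as the Laplacian of the Laplacian."""
    return laplacian(laplacian(values, neighbours), neighbours)


def pressure_sensor(p, neighbours):
    """Normalised pressure Laplacian, used to switch on first-order
    dissipation near discontinuities (assumes positive pressures)."""
    return [abs(sum(p[j] - p[k] for j in neighbours[k])) /
            sum(p[j] + p[k] for j in neighbours[k])
            for k in range(len(p))]


# Five nodes on a line; node k is connected to its immediate neighbours.
nbrs = [[1], [0, 2], [1, 3], [2, 4], [3]]
sensor = pressure_sensor([1.0, 1.0, 2.0, 1.0, 1.0], nbrs)
```

Because the operators only use the list of edge-connected neighbours, they give the same result for every replica of a node on an edge shared by any number of blocks, provided each replica sees the full neighbour set.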
29- I

AMELIORATIONS RECENTES DU CODE DE CALCUL


D’ECOULEMENTS COMPRESSIBLES FLU3M

L.Cambier,D.Darracq’,M.Gazaix,Ph.Guillen,Ch.Jouetl L.Le Toullec


ONERA, B.P 72, 92322 Chiitillon Cedex, France.

¹ Doctoral student under a CIFRE agreement with SNECMA.

Abstract
We present three developments that have been introduced in the FLU3M code.
A numerical method for solving the unsteady Euler equations on time-varying rigid grids is studied first; it uses the van Leer scheme together with an implicit algorithm that is second-order accurate in time.
A two-dimensional nozzle and an afterbody have been computed with the Jones-Launder k-ε model, whose implementation in the code is described for one-species and two-species gases.
Finally, a new implicit algorithm is presented: the DDLU factorization reduces both CPU time and memory cost compared with the ADI factorization.

Résumé
Three developments carried out in the FLU3M code are presented. A method for solving the unsteady Euler equations for rigid-body motions is studied first; it uses the van Leer scheme together with an implicit approach of second order in time, which reduces computation times.
A two-dimensional nozzle and an afterbody have been computed with the Jones-Launder k-ε turbulence model, whose implementation in the code is described for single-species and two-species flows. Comparisons with experiment are made.
A new implicit solution algorithm has then been studied; the DDLU factorization yields gains in computation time and memory compared with an ADI factorization.

1. Introduction

Since 1987, a multidomain, multispecies aerodynamic computation code (FLU3M) has been under development in the Theoretical Aerodynamics Division 1 of ONERA.
In 1989, the main numerical choices and the software structure of the code were published at the international seminar in Boston [1]. Perfect-gas and equilibrium real-gas Euler computations were presented there on multidomain configurations such as the Hermès shuttle; since the flows were supersonic, space-marching techniques were used. The capability to compute two-species gases was illustrated by a hot-jet computation.
Since then, FLU3M has served as the basis for numerous developments, both in physical modeling and in numerical techniques that improve the accuracy and speed of the computations.
Thus, the Navier-Stokes equations in the laminar regime are now solved numerically. The code has been tested on several validation cases: for example, a 3D hypersonic ramp presented at the Antibes Workshop [2], and an ogive-cylinder configuration with vortical flow [3]. For hypersonic flows, a new Mollier diagram has been studied [4]; in addition to the thermodynamic properties of equilibrium air, it provides the viscosities and conductivities needed for Navier-Stokes computations.
New spatial discretization options have been explored, in particular Chimera grid techniques. Complex computations (missile separation) can thus be carried out more easily [5].
We present here in more detail three lines of development. These developments, carried out within a single code, are made easier by the high modularity of the code and by the clarity of its tree structure.
The first line of development is related to the study of aeroelastic phenomena. The implementation of the unsteady Euler equations on a moving mesh is presented, together with several second-order-in-time approaches that reduce computational cost.
As part of the work on turbulence models, we describe the introduction of a two-equation transport model of k-ε type for a single-species perfect gas or a two-species gas.
Finally, a new algorithm for solving the implicit system is presented. We study the DDLU factorization, which reduces memory and computation time compared with an ADI factorization.

2. Unsteady computations on a moving mesh

As part of aeroelasticity studies for Ariane-type launchers, an unsteady Euler numerical method has been developed in FLU3M. The unsteady equations are formulated first.

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
The various numerical choices are then described, followed by the computation of a pitching NACA airfoil and that of a 3D sphere-cone oscillating about its center of gravity.

2.1 Unsteady equations
Consider an airfoil in pitching motion, fitted with a mesh $\Omega(t)$. In this study the domain $\Omega(t)$, assumed non-deformable, moves with respect to an absolute frame $R_0$.
On $\Omega(t)$, the Euler equations are written in conservation-law form:

$$\frac{d}{dt}\int_{\Omega(t)} W \, d\Omega + \oint_{\partial\Omega(t)} F(W,\vec{n}) \, dS = 0$$

with $W = (\rho, \rho\vec{V}, \rho E)$ the conservative variables ($\rho$: density, $\vec{V}$: absolute velocity, $E$: total energy) and

$$F(W,\vec{n}) = \begin{pmatrix} \rho\,(\vec{v}_r\cdot\vec{n}) \\ \rho\vec{V}\,(\vec{v}_r\cdot\vec{n}) + p\,\vec{n} \\ \rho E\,(\vec{v}_r\cdot\vec{n}) + p\,\vec{V}\cdot\vec{n} \end{pmatrix}$$

Here $\vec{v}_e$ is the entrainment (mesh) velocity and $\vec{v}_r = \vec{V} - \vec{v}_e$ the relative velocity. Unlike other approaches that use relative velocities, the computational variables are the absolute variables, that is, the absolute velocities expressed in the absolute frame $R_0$. This is a classical approach which, compared with the fixed-mesh equations, requires a modification of the numerical fluxes, which now involve the entrainment velocity $\vec{v}_e$, as well as the computation of time-varying metrics.
To discretize the fluxes we use upwind methods; Vinokur gives a detailed analysis of them in [7].

2.2 Flux discretization
It can be verified that the flux Jacobian has the expression:

[matrix not legible in this copy]

with $v_n = \vec{V}\cdot\vec{n}$ and $v_{r_n} = \vec{v}_r\cdot\vec{n}$. The eigenvalues of the flux Jacobian are:

$$\lambda = \vec{v}_r\cdot\vec{n} \ \text{(multiplicity 3)}, \qquad \lambda = \vec{v}_r\cdot\vec{n} + c \ \text{(multiplicity 1)}, \qquad \lambda = \vec{v}_r\cdot\vec{n} - c \ \text{(multiplicity 1)} \quad (3)$$

The eigenvectors of the flux Jacobian are exactly the same as for the fixed-mesh equations.
To discretize the fluxes, the van Leer splitting can be extended to the unsteady Euler equations. Van Leer splits the flux in the form $f = f^+ + f^-$, where $f^+$ (resp. $f^-$) has positive (resp. negative) eigenvalues.
Introducing the relative normal Mach number $M_{r_n} = \vec{v}_r\cdot\vec{n}/c$, we have:
if $M_{r_n} > 1$, $f^+ = f$;
if $M_{r_n} < -1$, $f^- = f$.
For $|M_{r_n}| < 1$, $f^+$ has the expression:

[equation not legible in this copy]

2.3 Unsteady metrics
During the motion of the airfoil the mesh moves, so the normals to the cell interfaces must be computed at every instant. By assumption the mesh does not deform, so the volumes do not change. They are computed once and for all at the initial time $t_0$.
The average of $\vec{n}$ over a time step $\Delta t$ must be evaluated, which is done by considering the time $t_{n+1/2}$. We then have $\vec{n}(t_{n+1/2}) = R(t_{n+1/2})\,\vec{n}(t_0)$, where $R$ is the rotation matrix of the motion taken at time level $n+\tfrac12$. The flux computation also requires the entrainment velocity at the cell interfaces at time $t_{n+1/2}$. Knowing the coordinates at $t = 0$ of the center of an interface, the entrainment velocity is given by:

$$\vec{v}_e(t_{n+1/2}) = \overline{\overline{\Omega}}(t_{n+1/2}) \, \overrightarrow{GA}(t_{n+1/2})$$
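As a hedged illustration of the unsteady metric evaluation of Section 2.3, the sketch below rotates an interface normal and computes the entrainment velocity at the mid-step time t_{n+1/2} for a planar rigid pitching motion. The harmonic pitch law, its default amplitude and frequency, the pivot `g` and all function names are assumptions made for the example; they are not taken from FLU3M.

```python
import numpy as np

def rotation(theta):
    """2D rotation matrix R(theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def pitch_angle(t, theta0=np.radians(1.0), omega=2.0 * np.pi * 34.4):
    """Hypothetical harmonic pitch law theta(t) = theta0 * sin(omega * t)."""
    return theta0 * np.sin(omega * t)

def pitch_rate(t, theta0=np.radians(1.0), omega=2.0 * np.pi * 34.4):
    """Time derivative of the hypothetical pitch law."""
    return theta0 * omega * np.cos(omega * t)

def midstep_metrics(n0, a0, g, t_n, dt):
    """Normal and entrainment velocity of an interface at t_{n+1/2}.

    n0: interface normal at t = 0, a0: interface centre at t = 0,
    g: fixed pivot of the rigid rotation (assumed for this sketch).
    """
    t_half = t_n + 0.5 * dt
    rot = rotation(pitch_angle(t_half))
    normal = rot @ n0                     # n(t_{n+1/2}) = R(t_{n+1/2}) n(t_0)
    ga = rot @ (a0 - g)                   # vector GA at t_{n+1/2}
    w = pitch_rate(t_half)
    v_e = w * np.array([-ga[1], ga[0]])   # planar form of the rotation-rate tensor applied to GA
    return normal, v_e, ga

n0 = np.array([0.0, 1.0])
a0 = np.array([1.0, 0.0])
g = np.array([0.25, 0.0])
normal, v_e, ga = midstep_metrics(n0, a0, g, t_n=0.0, dt=1.0e-4)
```

Because the mesh is rigid, the normal keeps unit length and the entrainment velocity stays perpendicular to GA; only the rotation angle and rate need to be re-evaluated at each time step.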

Here $\overline{\overline{\Omega}}$ is the rotation-rate tensor of the solid and $A$ a point of the solid.
This completes the data required for an unsteady computation on a rigid mesh.

2.4 Unsteady boundary conditions
Consider the example of an airfoil fitted with a C-mesh, moving with respect to $R_0$. A uniform flow is imposed upstream of the airfoil.
Two boundary conditions must be considered: on the one hand, at the far-field boundaries, to impose the flow while avoiding wave reflections; on the other hand, at the wall, where a slip condition must be imposed.

1. Wall boundary condition.
From the slip condition, $\vec{v}_r\cdot\vec{n} = 0$, where $\vec{v}_r$ is the relative velocity. The wall flux becomes:

$$F(W,\vec{n}) = \begin{pmatrix} 0 \\ p\,\vec{n} \\ p\,(\vec{v}_e\cdot\vec{n}) \end{pmatrix}$$

where $p$ is the pressure at the interface, computed by extrapolation and optionally corrected by a characteristic relation.

2. Inflow-outflow boundary condition.
We adopt the formulation proposed by Collercandy [9], extended to the moving-mesh equations.
Five characteristics with slopes $\lambda \in \{v_{r_n},\ v_{r_n} + c,\ v_{r_n} - c\}$ reach the interface at time level $n+1$. Depending on the sign of the slope $\lambda$, the associated characteristic variable is computed from an exterior or an interior state.
More precisely, the eigenvalues are computed from a mean state $W_m = \tfrac12\,(W_{interior} + W_{exterior})$, which gives the eigenvalues and their signs. The characteristic variables associated with these eigenvalues are then computed with the interior and exterior states.
If $\lambda$ is negative, the exterior characteristic variable is selected; otherwise the interior characteristic variable is taken.

2.5 Time discretization
The time accuracy of the schemes used in unsteady computations is an important point. Several first- and second-order schemes in time are described and their stability properties are briefly recalled. The equation to be discretized in time is:

$$\frac{dW}{dt} = g(W) \quad (5)$$

2.5.1 First-order accurate schemes
The explicit scheme $W^{n+1} = W^n + \Delta t\, g^n$ is first-order accurate. It is stable under the condition cfl ≤ 1, which in practice imposes very small time steps. Indeed, to compute on a fairly fine mesh a NACA64A010 airfoil oscillating in a flapping motion at a frequency of 100 Hz, 40 000 explicit time steps are needed to complete one cycle.

The implicit scheme $W^{n+1} = W^n + \Delta t\, g^{n+1}$ is first-order accurate. The function $g^{n+1}$ is linearized as $g^{n+1} = g^n + \Delta t\, A^n (W^{n+1} - W^n)$, which leads to the scheme $(I - (\Delta t)^2 A^n)(W^{n+1} - W^n) = g^n \Delta t$. This scheme is unconditionally stable. To obtain the same accuracy as an explicit computation on quantities such as lift and moment, cfl values of the order of 100 can be used for oscillating airfoils.

2.5.2 Second-order accurate schemes
Implicit Runge-Kutta scheme of Iannelli-Baker
This is a Runge-Kutta scheme with two implicit stages [10].

$$(I - \alpha(\Delta t)^2 A^n)\,\Delta W_1 = g(W^n)\,\Delta t \quad (6)$$
$$(I - \alpha(\Delta t)^2 A^n)\,\Delta W_2 = g(W^n + \beta\,\Delta W_1)\,\Delta t \quad (7)$$
$$W^{n+1} = W^n + \gamma_1\,\Delta W_1 + \gamma_2\,\Delta W_2 \quad (8)$$

with

$$\alpha = \frac{2-\sqrt{2}}{2}, \qquad \beta = 2\,(3\sqrt{2} - 4), \quad (9)$$
$$\gamma_1 = \frac{6-\sqrt{2}}{8}, \qquad \gamma_2 = \frac{2+\sqrt{2}}{8} \quad (10)$$

This scheme is unconditionally stable, with cfl values much larger than those used with a first-order implicit scheme, of the order of 400, again for results of accuracy equivalent to that of an explicit computation.

2.6 Validation cases
2.6.1 NACA64A010 airfoil
The airfoil chosen is a NACA64A010, with the following experimental conditions (Fig. 1):

M∞: 0.796
pressure: 203321 Pa
frequency: 34.4 Hz

The motion law of the airfoil is given by:

[equation not legible in this copy]
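The time-integration schemes of Section 2.5 can be compared on a scalar model problem. The sketch below is only an illustration under stated assumptions: it takes dW/dt = λW as a stand-in for equation (5), uses the ordinary Jacobian J = ∂g/∂W in the implicit operators (I − ΔtJ) and (I − αΔtJ), that is, it assumes Δt Aⁿ in the text plays the role of J, and takes γ₂ = (2+√2)/8 so that γ₁ + γ₂ = 1. The names `LAM`, `step_*` and `observed_order` are hypothetical.

```python
import math

# Coefficients of the two-stage implicit Runge-Kutta scheme, equations (9)-(10).
ALPHA = (2.0 - math.sqrt(2.0)) / 2.0
BETA = 2.0 * (3.0 * math.sqrt(2.0) - 4.0)
GAMMA1 = (6.0 - math.sqrt(2.0)) / 8.0
GAMMA2 = (2.0 + math.sqrt(2.0)) / 8.0   # assumed value, gives GAMMA1 + GAMMA2 = 1

LAM = -2.0  # model problem dW/dt = LAM * W (assumption for this sketch)

def g(w):
    return LAM * w

def jac(w):
    return LAM

def step_explicit(w, dt):
    """W^{n+1} = W^n + dt * g^n."""
    return w + dt * g(w)

def step_implicit(w, dt):
    """Linearized implicit scheme: (1 - dt * J) dW = dt * g^n."""
    return w + dt * g(w) / (1.0 - dt * jac(w))

def step_iannelli_baker(w, dt):
    """Two implicit stages sharing the same operator (1 - ALPHA * dt * J)."""
    denom = 1.0 - ALPHA * dt * jac(w)
    dw1 = dt * g(w) / denom
    dw2 = dt * g(w + BETA * dw1) / denom
    return w + GAMMA1 * dw1 + GAMMA2 * dw2

def error_at_t1(step, n_steps):
    """Error at t = 1 against the exact solution exp(LAM)."""
    w, dt = 1.0, 1.0 / n_steps
    for _ in range(n_steps):
        w = step(w, dt)
    return abs(w - math.exp(LAM))

def observed_order(step, n_steps=50):
    """Order estimated from the error ratio when the time step is halved."""
    return math.log2(error_at_t1(step, n_steps) / error_at_t1(step, 2 * n_steps))

orders = {
    "explicit": observed_order(step_explicit),
    "implicit": observed_order(step_implicit),
    "iannelli_baker": observed_order(step_iannelli_baker),
}
```

On this model problem the first two schemes show an observed order close to one and the two-stage scheme close to two. Both stages reuse the same implicit operator, which is the property the text relies on when it states that the second-order scheme costs about the same as the first-order one.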

The airfoil was defined in an AGARD report [11] and has been computed by many authors [8].
For each unsteady test, the evolution of the lift and of the moment is presented. The computations with the different time approaches are compared with experiment and with a reference explicit computation at cfl = 0.8.
Figures 2 and 3 show the computations with the first-order and second-order approaches in time, for a cfl of 400. For the moment, the first-order approach does not give the same result as the reference explicit computation, whereas the second-order approach gives an identical result.
At cfl = 100, for the first-order method, the computational cost is divided by 17 compared with an explicit method at cfl = 1 (400 time steps per cycle against 40 000 for the explicit method). The cost of the second-order implicit Runge-Kutta method is practically identical to that of the first-order method, since the implicit matrix is the same in both stages.
In conclusion, the second-order implicit method in time is the most suitable for unsteady computations.

2.6.2 Aérospatiale sphere-cone
A 3D sphere-cone body has also been computed (Fig. 4); this body oscillates about its center of gravity G. The free-stream Mach number is 7. The maximum pitch angle is $\alpha_0 = 1°$. The motion law is given by:

$$\alpha(t) = \alpha_0 \sin(k\,\tau)$$

with $k = 0.386$. An unsteady computation at cfl = 100 was carried out with the implicit Runge-Kutta method (Fig. 4). A comparison was made with another computation performed by Aérospatiale [?]. A difference of 8% is observed on $C_{m_\alpha}$, whereas $C_{m_{\dot\alpha}} + C_{m_q}$ is identical in the two computations (Fig. 5).

3. Turbulence modeling

3.1 Two-equation transport model
The (k, ε) turbulence model of Jones-Launder [13] is implemented in the FLU3M code. The two transport equations for $\rho k$ and $\rho\varepsilon$ read:

$$\partial_t(\rho k) + \mathrm{div}(\rho k \vec{V}) = \tau_R : \nabla\vec{V} + \mathrm{div}\!\left(\left(\mu + \frac{\mu_t}{\sigma_k}\right)\nabla k\right) - \rho\varepsilon + D_k \quad (11)$$

$$\partial_t(\rho\varepsilon) + \mathrm{div}(\rho\varepsilon \vec{V}) = C_{\varepsilon_1}\frac{\varepsilon}{k}\,\tau_R : \nabla\vec{V} + \mathrm{div}\!\left(\left(\mu + \frac{\mu_t}{\sigma_\varepsilon}\right)\nabla\varepsilon\right) - C_{\varepsilon_2} f_2\,\rho\frac{\varepsilon^2}{k} + D_\varepsilon \quad (12)$$

The turbulent viscosity coefficient $\mu_t$ has the expression:

$$\mu_t = C_\mu f_\mu \frac{\rho k^2}{\varepsilon} \quad (13)$$

The following choice was made for the model coefficients: $C_{\varepsilon_1} = 1.57$, $C_{\varepsilon_2} = 2.$, $\sigma_k = 1.$, $\sigma_\varepsilon = 1.3$, $C_\mu = 0.09$. The terms $D_k$ and $D_\varepsilon$ are additional terms tied to the low-Reynolds formulation, intended to represent the damping of turbulence near walls. In the paper of Jones and Launder [13], these terms are expressed in a boundary-layer frame. For the solution of the Navier-Stokes equations, we have considered the following expressions for $D_k$ and $D_\varepsilon$:

$$D_k = -2\mu\,\|\nabla\sqrt{k}\|^2$$

[expression of $D_\varepsilon$ not legible in this copy]

The quantities $f_2$ and $f_\mu$ are also tied to the damping of turbulence near walls. They are functions of the turbulent Reynolds number $Re_t$:

$$Re_t = \frac{\rho k^2}{\mu\varepsilon}, \qquad f_2 = 1 - 0.3\exp(-Re_t^2), \qquad f_\mu = \exp\!\left(-\frac{2.5}{1 + Re_t/50}\right)$$

This choice of damping functions involves neither the distance to the wall nor the wall friction, so the turbulence model can be programmed independently of the application considered. This is an important advantage for a code such as FLU3M that handles complex multidomain applications.
The (k, ε) turbulence model just described can be combined in FLU3M with either a single-species or a two-species formulation. The latter describes the compressible turbulent flow of a non-reacting mixture of two species, each species being assumed to be a perfect gas with constant specific heats. The equations are discretized in the same way for the "single-species (k, ε)" system and for the "two-species (k, ε)" system.
The convective fluxes are discretized with extensions of Roe's Riemann solver to systems of coupled equations. The eigenvalues of the Jacobian matrix are $\lambda_1 = U$ (multiplicity $n_{eq} - 2$), $\lambda_2 = U + c$ (multiplicity 1) and $\lambda_3 = U - c$ (multiplicity 1), where $n_{eq}$ is the total number of equations (7 for a single species and 8 for two species). The quantity $c$ is a modified speed of sound given by:

$$c^2 = \gamma\left(\frac{p}{\rho} + \frac{2}{3}k\right)$$

The specific-heat ratio $\gamma$ is assumed constant in the single-species formulation, whereas in the two-species formulation it depends on the partial densities.
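The low-Reynolds damping functions and the closure quantities of Section 3.1 translate directly into code. The sketch below is a minimal illustration using the coefficient C_μ = 0.09 and the expressions for Re_t, f_μ, f_2, μ_t and the modified sound speed quoted above; the function names and argument order are assumptions for the example, not FLU3M routines.

```python
import math

C_MU = 0.09  # model coefficient quoted in the text

def turbulent_reynolds(rho, k, eps, mu):
    """Re_t = rho * k^2 / (mu * eps)."""
    return rho * k * k / (mu * eps)

def f_mu(re_t):
    """Damping function of the eddy viscosity."""
    return math.exp(-2.5 / (1.0 + re_t / 50.0))

def f_2(re_t):
    """Damping function of the dissipation destruction term."""
    return 1.0 - 0.3 * math.exp(-re_t * re_t)

def eddy_viscosity(rho, k, eps, mu):
    """mu_t = C_mu * f_mu * rho * k^2 / eps, equation (13)."""
    re_t = turbulent_reynolds(rho, k, eps, mu)
    return C_MU * f_mu(re_t) * rho * k * k / eps

def modified_sound_speed(gamma, p, rho, k):
    """c = sqrt(gamma * (p / rho + 2 k / 3))."""
    return math.sqrt(gamma * (p / rho + 2.0 * k / 3.0))

# Illustrative state (values chosen arbitrarily for the example).
mu_t = eddy_viscosity(rho=1.2, k=1.0, eps=100.0, mu=1.8e-5)
c_mod = modified_sound_speed(gamma=1.4, p=101325.0, rho=1.2, k=1.0)
```

Both damping functions depend only on local quantities (ρ, k, ε, μ), which is why no wall distance or wall friction has to be passed to these routines.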

In the two-species formulation, $\gamma$ is obtained from the partial densities $\rho_1$, $\rho_2$ and the specific heats of the two species as follows:

$$\gamma = \frac{\rho_1 c_{p_1} + \rho_2 c_{p_2}}{\rho_1 c_{v_1} + \rho_2 c_{v_2}}$$

The Roe numerical flux reads:

$$F_{i+1/2} = \frac12\left(F(W_i) + F(W_{i+1})\right) - \frac12\, P^R\,|\Lambda^R|\,(P^R)^{-1}\,(W_{i+1} - W_i)$$

where $\Lambda^R$ is the diagonal matrix of eigenvalues, and $P^R$ and $(P^R)^{-1}$ are the transformation matrices. The superscript R indicates that the quantities are computed with Roe averages.
Second-order accuracy in space is obtained with the MUSCL approach applied to the primitive variables. The viscous fluxes are evaluated with a centered space discretization. In the two-species formulation, diffusion between the species is accounted for through a Lewis number $Le$ and a turbulent Lewis number $Le_t$. Convergence is accelerated with an implicit phase and local time stepping. The implicit phase relies on a linearization of the van Leer fluxes for the convective fluxes, a linearization similar to that of Coakley for the viscous fluxes, a simplified linearization of the negative part of the source terms, and an ADI inversion of the implicit matrix.
As examples, we present results obtained with the "single-species (k, ε)" formulation, first on a shock-wave/boundary-layer interaction in a two-dimensional channel, then on an axisymmetric afterbody configuration.
The first configuration corresponds to an experiment [14] carried out at ONERA in a symmetric nozzle. The Mach number upstream of the interaction is 1.45. Figure 6, which shows the computed iso-Mach lines, displays the classical λ-shock structure in the interaction region and the marked thickening of the boundary layer resulting from the interaction with the shock. Figure 7 compares the wall pressure distribution with experiment. The pressure plateau obtained by the computation in the interaction region is smaller than in the experiment, which corresponds to a slight underestimation of the size of the separated region. On this point the result is comparable to results obtained previously on the same configuration with other codes implementing the (k, ε) model.
The second configuration is an axisymmetric afterbody fitted with a nozzle. The conditions of the external flow are as follows: Mach number 4.18, stagnation temperature and pressure equal to 325 K and 10 bar respectively. The stagnation pressure of the jet is higher, equal to 42.3 bar. The Reynolds number based on the critical quantities of the jet and on the base radius is 1.15 × 10⁷. The computational domain is divided into three subdomains: a subdomain D1 for the external flow, a subdomain D3 for the nozzle exit and the jet, and an intermediate subdomain D2 containing the base region. The total number of mesh points is 13,879. Strong refinements are introduced near the walls. For example, the cell size near the external wall of the afterbody is 10⁻⁴ R. On the upstream boundary of the external subdomain D1, profiles taken from the experimental data are imposed for the velocity and the turbulent quantities, whereas on the upstream boundary of subdomain D3 the imposed profiles come from a preliminary computation of the flow in the nozzle.
Figure 8, which shows the solution as iso-Mach lines, displays the classical barrel shape of the jet as well as the shock wave located in the jet. A comparison with experiment [15] is shown in figures 9 and 10, as profiles of turbulent kinetic energy and Pitot pressure in two sections located downstream of the base at distances of 0.59R and 6R. The experimental points were obtained by laser velocimetry and with a Pitot tube. Although the experimental data for the turbulent kinetic energy cover only the outer part of the flow, the agreement appears satisfactory.

4. DDLU factorization

4.1 Description of the algorithm
The von Neumann linear stability analysis of the ADI approximate factorization reveals an unconditional instability in 3D (cf. Ying [21]). Even if the nonlinear terms play a stabilizing role, as the codes using such an approach tend to show, the triple ADI factorization remains penalized by a large operation count and above all by a severe cfl restriction due to the factorization error in Δt³. A DDLU-type factorization has therefore been developed to improve the efficiency of the implicit algorithm. The first DDLU decomposition methods for the implicit matrix were proposed simultaneously by Jameson and Turkel [17] and by Steger and Warming [20] in 1981. Whereas alternating-direction techniques replace the implicit operator by an operator factorized along the mesh directions, the LU method decomposes it into two triangular matrices, upper and lower. Jameson and Turkel show that such a system is well conditioned if the matrices are diagonally dominant. They therefore proposed a decomposition of the Jacobian matrices of the implicit matrix that augments the diagonal.
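To make the idea of a diagonal-augmenting decomposition concrete, the sketch below uses the spectral-radius shift commonly associated with the LU schemes of Jameson, Turkel and Yoon, A± = ½(A ± κ ρ(A) I) with κ ≥ 1. This specific form is an assumption for illustration, since the paper's own formula is not legible in this copy; the symmetric test matrix is only a stand-in for a flux Jacobian with real eigenvalues, and `augmented_split` is a hypothetical name.

```python
import numpy as np

def augmented_split(a, kappa=1.0):
    """Split A = A_plus + A_minus using a spectral-radius shift (kappa >= 1).

    A_plus has non-negative eigenvalues, A_minus non-positive ones,
    and A_plus - A_minus = kappa * rho(A) * I is a scalar matrix.
    """
    rho = np.max(np.abs(np.linalg.eigvals(a)))
    shift = kappa * rho * np.eye(a.shape[0])
    return 0.5 * (a + shift), 0.5 * (a - shift), rho

rng = np.random.default_rng(0)
m = rng.standard_normal((5, 5))
a = m + m.T  # symmetric stand-in with real eigenvalues (assumption)
a_plus, a_minus, rho = augmented_split(a, kappa=1.05)
```

With this splitting the difference A⁺ − A⁻ is a multiple of the identity, so a diagonal built from such differences is scalar rather than block-structured and needs no block inversion.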

In the original scheme, the implicit system is written in the form:

[equation not legible in this copy]

For an upwind discretization, this gives:

[equation not legible in this copy]

where $\hat{A}^+$, $\hat{A}^-$, $\hat{B}^+$, $\hat{B}^-$, $\hat{C}^+$, $\hat{C}^-$ are the flux Jacobian matrices and $\xi$, $\eta$, $\zeta$ the curvilinear coordinates. This scheme is seldom used in this form.
Jameson and Turkel [17] having shown that diagonal dominance ensures good conditioning of the factors L and U, Jameson and Yoon [18] developed a variant (called DDLU here, by analogy with DDADI) that reinforces the diagonal for the Euler equations; it was followed by Yoon and Jameson [22] for the Navier-Stokes equations, Shuen and Yoon [19] for reacting flows, and finally Darracq (1995) [16] [remainder of the sentence not legible in this copy]. This gives:

$$L\,D^{-1}\,U\,\Delta Q = -\Delta t\,\mathcal{R} \quad (24)$$

with:

$$L = I + \Delta t\,(\delta_\xi^- \hat{A}^+ + \delta_\eta^- \hat{B}^+ + \delta_\zeta^- \hat{C}^+ - \hat{A}^- - \hat{B}^- - \hat{C}^-)$$
$$D = I + \Delta t\,(\hat{A}^+ - \hat{A}^- + \hat{B}^+ - \hat{B}^- + \hat{C}^+ - \hat{C}^-)$$
$$U = I + \Delta t\,(\delta_\xi^+ \hat{A}^- + \delta_\eta^+ \hat{B}^- + \delta_\zeta^+ \hat{C}^- + \hat{A}^+ + \hat{B}^+ + \hat{C}^+) \quad (25)$$

The Jacobian matrices $\hat{A}^+$, $\hat{B}^+$ and $\hat{C}^+$ (respectively $\hat{A}^-$, $\hat{B}^-$ and $\hat{C}^-$) are built so that they have only positive (respectively negative) eigenvalues, that is:

[equation not legible in this copy]

with $\Lambda$ the diagonal matrix of the eigenvalues $\lambda_{k,s}$. The function $\gamma$ describes the character of the upwinding.
In the decomposition of Jameson and Turkel [17], the aim is to increase diagonal dominance:

$$\hat{A}^\pm = \frac12\left(\hat{A} \pm \kappa\,\max(|\lambda_\xi|)\,I\right), \qquad \hat{B}^\pm = \frac12\left(\hat{B} \pm \kappa\,\max(|\lambda_\eta|)\,I\right), \qquad \hat{C}^\pm = \frac12\left(\hat{C} \pm \kappa\,\max(|\lambda_\zeta|)\,I\right)$$

with $\kappa \geq 1$. Relations (26) and (26) allow [remainder of the sentence not legible in this copy].
The diagonal D has a block structure with decomposition (28) but becomes scalar when (29) is used:

$$D = I + \kappa\,\Delta t\,\left[\max(|\lambda_\xi|) + \max(|\lambda_\eta|) + \max(|\lambda_\zeta|)\right] I \quad (31)$$

Note that this reduction of the block diagonal to a scalar diagonal holds for the DDLU, DDADI and even ADI factorizations. By contrast, the basic LU factorization (23), because of its asymmetric nature, cannot benefit from this diagonalization. The Jacobian matrices at the interfaces are evaluated from the Roe average in order to remain consistent with the explicit scheme.

Oblique plane sweeping
Sweeping the computational domain along the diagonal directions (i+j+k constant), in increasing order (operator L) and then in decreasing order (operator U), allows the inversion of the matrices L and U to be fully vectorized by avoiding non-vectorizable recurrences. The recurrence between points of the ADI factorization becomes a recurrence between planes in the DDLU factorization.
In addition, around the current point the diagonal sweep only involves points that were updated at the previous step. They can therefore be added to the right-hand side: there is no block to invert.

Fig. 12: Oblique sweeping in 3D

For comparison, an SSOR-type algorithm has been implemented with the decomposition choice (29), which leads to the scalar diagonal (31).
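The oblique-plane sweep described above can be sketched on a scalar model of the lower operator L. In the hedged example below, the scalars `d` and `a` are placeholders standing in for the diagonal D and for the off-diagonal Jacobian contributions, and the function names are hypothetical. The point of the sketch is only the ordering: all lower neighbours of a point on plane m = i + j + k lie on plane m − 1, so a whole plane can be updated in one vector operation.

```python
import numpy as np

def plane_indices(shape):
    """Group grid indices by oblique plane m = i + j + k, in increasing m."""
    i, j, k = np.indices(shape)
    m = (i + j + k).ravel()
    order = np.argsort(m, kind="stable")
    i, j, k, m = i.ravel()[order], j.ravel()[order], k.ravel()[order], m[order]
    starts = np.searchsorted(m, np.arange(m[-1] + 2))
    return [(i[s:e], j[s:e], k[s:e]) for s, e in zip(starts[:-1], starts[1:])]

def lower_sweep_planes(b, d, a):
    """Solve d*x[i,j,k] + a*(x[i-1,j,k] + x[i,j-1,k] + x[i,j,k-1]) = b plane by plane."""
    # One layer of zero padding holds the values outside the grid.
    x = np.zeros(tuple(n + 1 for n in b.shape))
    for i, j, k in plane_indices(b.shape):
        # Lower neighbours belong to the previous plane, already updated.
        nb = x[i, j + 1, k + 1] + x[i + 1, j, k + 1] + x[i + 1, j + 1, k]
        x[i + 1, j + 1, k + 1] = (b[i, j, k] - a * nb) / d
    return x[1:, 1:, 1:]

def lower_sweep_pointwise(b, d, a):
    """Reference solution with the usual point-by-point recurrence."""
    ni, nj, nk = b.shape
    x = np.zeros_like(b)
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                nb = 0.0
                if i > 0:
                    nb += x[i - 1, j, k]
                if j > 0:
                    nb += x[i, j - 1, k]
                if k > 0:
                    nb += x[i, j, k - 1]
                x[i, j, k] = (b[i, j, k] - a * nb) / d
    return x

rng = np.random.default_rng(0)
b = rng.standard_normal((4, 3, 5))
x_planes = lower_sweep_planes(b, d=2.0, a=-0.3)
x_points = lower_sweep_pointwise(b, d=2.0, a=-0.3)
```

The backward sweep for U follows the same pattern with the planes visited in decreasing order and the upper neighbours taken from plane m + 1.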

The SSOR algorithm is a symmetric successive over-relaxation iterative approach with oblique sweeping:

[equation not legible in this copy]

4.2 Application
We present the application of the DDLU algorithm, in its inviscid version, to the test case of a lenticular fuselage with a boattail. The Mach number is 4.5 and the angle of attack is 10°. The mesh consists of 42 × 27 × 44 points. Figure 11 shows the solution obtained with the DDLU scheme. The solution given by the ADI scheme is identical. Figure 13 shows the convergence history of the implicit density residuals. The CFL number is ramped up to 500 for the DDLU and SSOR formulations and up to 100 for the ADI formulation. The convergence rate of the DDLU implicit scheme is better than that of the ADI implicit scheme. The SSOR algorithm converges faster than the DDLU algorithm, but the number of inner iterations, about a dozen, increases the computation times, which become comparable to those of the ADI scheme.

Fig. 13: Comparison of the convergence rates of the algorithms (plot title: FIACRE, Mach 4.5)

Table 1 gives the computation time per point and per iteration, and the number of 3D arrays needed to store the implicit matrix, for the ADI and DDLU decompositions.
The DDLU version is 2.3 times faster than the ADI version and, in addition, the LU scheme requires nearly 1.5 times less memory to store the implicit matrices. Indeed, only a single vector $D_{ijk}$ is stored at each point, whereas the ADI factorization requires memory at each point for three blocks $D_{\xi,ijk}$, $D_{\eta,ijk}$ and $D_{\zeta,ijk}$. Moreover, the ADI factorization uses temporary arrays during the inversion.

Algorithm        CPU time (μs)   Memory
ADI Euler 3D     49              225
DDLU Euler 3D    21              155

Table 1: Comparison of computation times

5. Conclusion

Three recent developments in the FLU3M code have been presented.
The unsteady Euler equations (on a moving, non-deformable mesh) have been discretized with a second-order-in-time implicit Runge-Kutta scheme combined with van Leer fluxes. The 2D and 3D validation cases presented show the accuracy and speed of the method, which is as accurate as an explicit computation but 70 times faster.
The implementation of the Jones-Launder k-ε model, including for a two-species gas, has then been described. We use the Roe solver to solve the complete system of Navier-Stokes equations coupled with the transport equations for k and ε. Although satisfactory results have been obtained, in particular on an afterbody case, studies on the application of the model remain to be carried out; in particular, the initial field of the variables k and ε must be determined to start the computation, and relaminarization phenomena may appear.
Finally, the DDLU implicit decomposition algorithm reduces the memory and computation-time costs of each iteration. This algorithm can be extended to the equations with the k-ε model.

Acknowledgments: The developments on the k-ε model, as well as those on unsteady flows, were supported by Aérospatiale Espace et Défense and by CNES. The work on the DDLU algorithm was carried out as part of the thesis of D. Darracq, CIFRE trainee ONERA-SNECMA.

References

[1] Ph. Guillen, M. Dormieux, Design of a 3D Multi-Domain Euler Code, International Seminar on Supercomputing, Boston, Oct. 3-5, 1989.
[2] C. Jouet, M. Borrel, Navier-Stokes Computation over a Three-Dimensional Ramp, Proceedings of Hypersonic Workshop of Antibes, April 1991.
[3] C. Jouet, P. D'Espiney, 3D Laminar and 2D Turbulent Computations with the Navier-Stokes Solver FLU3M, 8th Int. Conf. on Numerical Methods in Laminar and Turbulent Flow, Swansea, July 1993.
[4] M. Gazaix, Hypersonic Inviscid and Viscous Flow Computations with a New Optimized Thermodynamic Equilibrium Model, AIAA 93-0893, Reno, Jan. 1993.
[5] J.P. Gillyboeuf, P. Mansuy, S. Pavsic, Two New Chimera Methods: Application to Missile Separation, AIAA paper, Reno, Jan. 1995.
[6] C. Jouet, V. Cheret, Méthode numérique appliquée à des profils oscillants, 31ème colloque AAAF d'aérodynamique appliquée, March 1995.
[7] M. Vinokur, An Analysis of Finite-Difference and Finite-Volume Formulations of Conservation Laws, Journal of Computational Physics, Vol. 81, no. 1, March 1989.
[8] J. Sidès, Computation of Unsteady Transonic Flows with an Implicit Numerical Method for Solving the Euler Equations, La Recherche Aérospatiale, no. 2, February 1985.
[9] R. Collercandy, An Improved Approach for the Computation of Transonic/Supersonic Flows with Applications to Aerospace Configurations, AIAA paper 92-2613 CP, Jan. 1992.
[10] G.S. Iannelli, A.J. Baker, A Stiffly Stable Implicit Runge-Kutta Algorithm for CFD Applications, AIAA paper 88-416, Jan. 1988.
[11] R.J. Zwaan, NACA 64A006 oscillating flap, AGARD Report no. 702, Compendium of unsteady aerodynamic measurements, August 1982.
[12] F. Coron, F. Ruffino, Prévision de l'aérodynamique instationnaire autour de lanceurs pour l'étude de l'aéroélasticité, 31ème Colloque AAAF d'Aérodynamique Appliquée, March 1995.
[13] W.P. Jones, B.E. Launder, The Prediction of Laminarization with a Two-Equation Model of Turbulence, Int. J. of Heat and Mass Transfer, vol. 15, pp. 301-314 (1972).
[14] J. Délery, Experimental Investigation of Turbulence Properties in Transonic Shock/Boundary Layer Interaction, AIAA Journal, vol. 21, pp. 180-185, Feb. 1983.
[15] P. Reijasse, Modélisation de l'écoulement supersonique autour de l'arrière-corps du lanceur ARIANE 5: Expériences de validation de code sur des configurations d'arrière-corps axisymétriques, ONERA RF No 1/2536AY (1991).
[16] D. Darracq, Etude numérique des interactions choc-choc et choc-couche limite turbulente en régime hypersonique, Thèse de Doctorat de l'Université de Poitiers, France (1995).
[17] A. Jameson, E. Turkel, Implicit Schemes and LU Decomposition, Mathematics of Computation, Vol. 37, no. 156, pp. 385-397 (1981).
[18] A. Jameson, S. Yoon, LU Implicit Schemes with Multiple Grids for the Euler Equations, AIAA Paper 86-0105 (1986).
[19] J.S. Shuen, S. Yoon, A Numerical Study of Chemically Reacting Flows Using an LU-SSOR Scheme, AIAA Paper 88-0436 (1988).
[20] J.L. Steger, R.F. Warming, Flux Vector Splitting of the Inviscid Gas Dynamics Equations with Application to Finite-Difference Methods, Jour. Comp. Phys., Vol. 40, pp. 263-293 (1981).
[21] S. Ying, Three-Dimensional Implicit Approximately Factorized Schemes for Equations in Gas Dynamics, Ph.D. Thesis, Stanford University, Stanford, CA (1986).
[22] S. Yoon, A. Jameson, A LU-SSOR Scheme for the Euler and Navier-Stokes Equations, AIAA Paper 87-600 (1987).

FIG. 6 - Délery nozzle: iso-Mach lines

FIG. 8 - Afterbody: iso-Mach lines, ΔM = 0.2

FIG. 1 - First-order-in-time implicit scheme (cfl = 100)

FIG. 2 - Second-order-in-time implicit Runge-Kutta scheme (cfl = 400)

FIG. 4 - Iso-Mach lines, ΔM = 0.5

[Figure label: DDLU 3D computation]

FIG. 5 - Moment about the center of gravity G

The Computation of Aircraft Store Trajectories using Hybrid (structured/unstructured) Grids.

D.J.Jones, F.Fortin, D.Hawken, G.F.Syms and Y.Sun,
Institute for Aerospace Research,
National Research Council,
Montreal Rd.,
Ottawa K1A 0R6, CANADA
E-mail: [email protected]

Summary
With the high costs associated with flight and wind tunnel testing, the computation of aircraft store trajectories is becoming more important to the military establishment. In Canada, the Department of National Defence (DND) requested IAR to acquire/develop the necessary tools to carry out the prediction of stores on release from aircraft - particularly the DND's CF-18 aircraft. After debate whether to use structured Chimera schemes or unstructured schemes, IAR decided to use the latter techniques as there was already a development program underway in that field of research. IAR had already demonstrated that hybrid (structured/unstructured) grids had produced successful results and decided to pursue this approach for the unsteady 3-D computations. To this end, a study has been made in the 2-D case of a 'store' moving from the parent 'body'. Grid generation is underway for the full CF-18 aircraft using a commercial code and several simpler cases have been gridded and computations made in a steady 3-D environment.

1. Introduction
Accurate prediction of the trajectory of a store released from an aircraft is critical in assessing whether the store can be released safely. The trajectory of stores released in aircraft flowfields has always been difficult to predict. A typical wind-tunnel/flight test program intended to ensure that the store will release properly is lengthy and costly. It may involve 20 flight tests, one or two wind tunnel entries, and extend over a period of several years. In the event of an improper trajectory, pylon and/or attachment point modifications may have to be made resulting in more flight and wind-tunnel testing (Ref 1).

However, with the advancement of computational fluid dynamics (CFD) techniques, a much faster prediction of carriage and trajectory data is believed to be possible. In particular, a sufficiently reliable computed flowfield data could reduce the test matrix and supplement the measured data such that the additional testing could be reduced or eliminated. Further, it is anticipated that computed flowfields could serve as a diagnostic aid in deciding among possible solutions to design problems. Both multiblock structured overlapping (Chimera - see for example [2,3]), and unstructured grid methods (for example [4] and [5]) have been used to solve multi-component and moving-body systems.

Both Chimera and unstructured methods have their advantages and disadvantages and after careful consideration IAR decided to take the unstructured grid route for its main thrust at tackling the problem, one of the main reasons being availability of codes. Also with multiblock techniques the grid cells tend to stay small and very stretched in some areas remote from the aircraft making the method less efficient. Several commercial codes were at first considered as possible contenders for predicting store release. Most of them were rejected after an initial survey and only the two codes RAMPANT [6] and FASTRAN [7] finally were on the 'short list'. Several test cases were run on the short listed codes.

After in-house evaluation of these codes it was found that neither was fully satisfactory and attention was turned to acquiring only a suitable grid generation program. Thus IAR

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
30-2

evaluated the packages I-DEAS [8], FLITE3D [9] (also contains a solver) and ICEM [10] and eventually came to the conclusion that ICEM was the best package. IAR then decided to develop its own 3-D Euler solver from the existing 2-D solver which turned out to be not too time consuming. Validation of this 3-D code, called FJ3SOLV, is covered in the paper below and it will be seen that promising results are obtained. Eventually the aim is to develop this 3-D solver into a fully time dependent code with moving store and grid but in the meantime a study in 2-D has been underway and is reported in the last section below. This 2-D study will be beneficial in the 3-D development as various efficiencies will be transferred to the 3-D code.

In the first section some background material on the unstructured grid developments at IAR is reviewed. Then the 3-D grid generation and solver code acquisitions/developments and their validations are described. Finally the unsteady 2-D code development is covered.

2. 2-D Steady State Code Developments
A fully unstructured Delaunay grid generation code was developed several years ago and is reported in Refs 11 and 12. It uses the standard Delaunay triangulation technique [13] with new points added continually at the centroids of existing triangles until a criterion of required grid density is fulfilled.

The Euler equations are solved using a cell centred finite volume technique with explicit artificial viscosity as in Ref 14. Standard acceleration techniques such as local time stepping, enthalpy damping and implicit residual smoothing are used. Solutions of the Euler equations using these grids were obtained for several airfoils and showed good accuracy compared to standard AGARD test cases [15]. On advancing to Navier-Stokes solutions and trying to get cells very close to the surface within the boundary layer it was found that the grids became of poor quality even for wall function type of grids, for example Fig 1a. Thus it was decided that a more satisfactory grid could be obtained by using structured layers of grid near the surface followed by an unstructured grid outside these layers. The structured grid layers were generated using advancing normals with some averaging to avoid clashing of the normals, in some cases, as they advanced from the surface. An example of such grids is shown in Fig 1b for the RAE 2822 airfoil. Having generated these grids for Navier-Stokes computations, the same grids were then used for Euler results. These solutions also appeared to be very accurate as shown in Fig 2a for the RAE 2822 airfoil for a medium grid of 60 points on each of the upper and lower surfaces. Similarly a Navier-Stokes solution is shown in Fig 2b and further results were presented at the ETMA workshop in Ref 16.

Thus this hybrid approach was one which was desirable to use for stores release since accurate solutions had been obtained in the 2-D steady version of the code as mentioned above. It will be shown later that the 2-D unsteady version demonstrates good results. Although it is hoped to eventually use a 3-D hybrid grid generator, none has yet been acquired.

3. Review of 3-D Commercial Unstructured Codes.
Initially it was planned to acquire a commercial code for both the grid generation and the 3-D solver. Several possible codes were rejected after a preliminary evaluation made by calling users of these codes. Codes that only had a grid generation capability were rejected as we wanted the whole package including the solver and post processing. Finally the codes RAMPANT and FASTRAN were selected for in-house evaluation and a report on these codes has been made in Ref 17. In summary, it was found that FASTRAN could not deliver good Navier-Stokes solutions for RAE 2822 and that RAMPANT, although reasonable results were obtained in some cases, was not robust and was very slow even in the Euler mode of operation. Later the FLITE3D codes [9] were evaluated but these were found to be lacking in terms of pre- and post-processing and user support.
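The structured layers mentioned in Section 2 are built by marching points off the surface along averaged normals. The short Python sketch below illustrates one way this could be done for a closed 2-D surface. It is not the IAR code: the function names, the neighbour-averaging used for smoothing, and the geometric growth of the layer height are all assumptions made for illustration.

```python
import math

def vertex_normals(points):
    """Outward unit normals at the vertices of a closed, counter-clockwise polygon."""
    n = len(points)
    normals = []
    for i in range(n):
        x0, y0 = points[i - 1]
        x1, y1 = points[(i + 1) % n]
        # Tangent from the previous to the next vertex; rotating it by -90 degrees
        # gives the outward direction for counter-clockwise ordering.
        tx, ty = x1 - x0, y1 - y0
        nx, ny = ty, -tx
        length = math.hypot(nx, ny)
        normals.append((nx / length, ny / length))
    return normals

def smooth_normals(normals, passes=2):
    """Average each normal with its two neighbours (assumed 1-2-1 weights)."""
    n = len(normals)
    for _ in range(passes):
        averaged = []
        for i in range(n):
            ax = normals[i - 1][0] + 2.0 * normals[i][0] + normals[(i + 1) % n][0]
            ay = normals[i - 1][1] + 2.0 * normals[i][1] + normals[(i + 1) % n][1]
            length = math.hypot(ax, ay)
            averaged.append((ax / length, ay / length))
        normals = averaged
    return normals

def structured_layers(points, first_height, growth, n_layers):
    """Return n_layers rings of points marched off the surface along smoothed normals."""
    normals = smooth_normals(vertex_normals(points))
    layers = []
    distance, height = 0.0, first_height
    for _ in range(n_layers):
        distance += height
        layers.append([(x + distance * nx, y + distance * ny)
                       for (x, y), (nx, ny) in zip(points, normals)])
        height *= growth  # assumed geometric stretching away from the wall
    return layers
```

For a convex surface the averaging changes little; it matters near concave corners, where unsmoothed normals from neighbouring points would cross as the layers advance.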

Having reached the point of not finding a complete package for grid generation and solver, we then had to decide whether to develop our own codes. The idea of developing a 3-D grid generator was not relished whereas the code to carry out an unstructured grid Euler solver (and later Navier-Stokes) appeared to be quite feasible. Thus IAR, with a view to first acquiring a grid generation package, contacted vendors and evaluated several unstructured grid generators including I-DEAS [8] (mainly used for structural analysis) and GFEM [18]. These codes were eventually rejected as they were either very cumbersome to use or were very slow in generating fairly simple grids.

Finally, the code ICEM [10] was evaluated and was found to be quite promising. A copy of this software was also obtained for evaluation. It has a good user interface, preprocessing and postprocessing. Its CAD software can build complicated wire frame surface models efficiently, and can take point data in PLOT3D format, and IGES files from other CAD systems. This grid generation package supports the point, line and volume sources for density control and can generate 3D unstructured meshes efficiently. The Octree method is used, which refines the grid by subdividing the tetrahedron into eight smaller tetrahedra until a satisfactory grid density has been reached. Some examples of these grids are shown later.

The idea of generating structured layers of tetrahedra near the surface will be pursued with the vendor ICEM, or IAR may develop its own capability using advancing normals as was done for 2-D. In the meantime it will be used solely in its unstructured form which may be acceptable for Euler solutions.

4. Development of a 3-D Euler Solver and Validation
Rather than trying to acquire a commercial 3-D solver, IAR decided it was more suitable to develop a 3-D Euler solver from our existing 2-D solver especially since the code would eventually have to be made into an unsteady version. To make the existing 2-D code, FJSOLV, into a 3-D version FJ3SOLV was relatively easy as in the 2-D code the logic is set up so that it is driven by edges of a triangle with the flux across the edge being computed once and added/subtracted to the total flux balance for the triangles on each side of the edge. The same principle was used for 3-D but now the edge is a 'face' of a tetrahedron. In the far field there is no vortex correction as in 2-D and the Riemann invariants alone are used.

To validate the new Euler code FJ3SOLV, we first considered the RAE 2822 airfoil spread out as a straight wing between two solid walls. A boundary condition of no normal flow was imposed at the walls so that the flow should be two-dimensional with no variation across the span. The ICEM grid generation package was used to generate a grid and two views are shown on Fig 3a. This grid was then used to generate a solution shown in Fig 3b. Note that the grid was not refined about the shock wave at about 70% chord on the upper surface and so produced a result that was quite smeared out around the shock wave. On refinement of the grid around the shock, shown on Fig 4a, a much improved shock was obtained and good two dimensionality was shown with little spanwise variation in the pressure. The accuracy for the airfoil pressure distribution is demonstrated in Fig 4b where the solution at one spanwise station (FJICEM), obtained by interpolation from nearby values, is compared to a completely 2-D solution. The latter solution was obtained with the 2-D solver FJSOLV with 30 points on each of the upper and lower surfaces (called FJDJ30 on the figure); it was also run without a vortex correction and with roughly the same far field distance as in the 3-D case. The solution using FJ3SOLV took 8 hours on the SGI Power Challenge computer and used about 299,000 grid cells.

Having achieved success with the '3-D' airfoil a more challenging case of some practical interest was next considered. The M-100 wing-body configuration (ref 19) had been used when checking the RAMPANT code. It is a good case as the experimental data is quite reliable and it has been used as a test case by Grumman

for a Navier-Stokes code evaluation [20]. It was also considered to be a more realistic case for the ICEM grid generator as it has to cope with the intersecting surfaces of the wing and body. Grids obtained using ICEM are shown in Fig 5a. Results at various spanwise locations are shown on Fig 5b. Here several grids/codes are compared: FJ3SOLV using the ICEM grid (designated Fjicem on the figure), FJ3SOLV using the grid generated by RAMPANT software (Fjrampant) and lastly, the RAMPANT grid with the RAMPANT solution (Rampant). It can be seen that there is good agreement to the experimental data with the differences being typical of a non-viscous 'Euler' result compared to experiment. Also note that the present results as compared to RAMPANT results give a slight improvement while the computer time is down from 30 hours to 6 hours. The RAMPANT grid had 240,000 tetrahedra while the ICEM grid had 250,000 tetrahedra.

5. Unsteady 2-D Code Development and Validation
The existing steady state 2-D unstructured grid code FJSOLV was developed into an unsteady version so that moving stores could be simulated in 2-D and some of the problems investigated in 2-D before proceeding, at some later date, to a moving body 3-D code. Thus we can investigate such items as using a 'window' around the store to keep the grid fixed there relative to the 'store', moving the grid only within a second 'window' so that not all the grid is moved, grid refinements and grid interpolation.

For moving grids, the geometric conservation law (GCL) must be satisfied in order to be consistent. This law, which establishes the relations for the conservation of surfaces and volumes of the control cells, plays a key role in this flow simulation. If this law is violated, a misrepresentation of the convective velocities is encountered [21]. For domains bounded by moving boundaries, the mesh must follow the computational domain geometries. Usually points initially on the boundaries stay attached to those boundaries at the same relative locations, as is done here. For inside point movement two approaches are used, the first uses spring analogies [22] while the second computes velocities at the nodes by some kind of diffusive process and then evaluates the displacements as the product of velocities and time step. It has been demonstrated that the first approach is not failure proof [23]. The second approach seems more promising and is used here. The velocity of an inside point is computed as the average of the velocities of the surrounding points with the velocities of the boundary points as limit conditions. At the first time step, the inside velocities are initialized to zero. They are then successively updated by a series of Jacobi iterations. This process gives a velocity distribution similar to that obtained by a diffusive operator. For motions of big amplitude, since the velocities of the inside points are smaller than the ones on the boundaries, the faster moving nodes will overlap the slower ones, which will require local remeshing.

The new code was first tested on a standard case [24] for the oscillating NACA0012 airfoil with M=0.755, α0=2.51, α=0.160 and reduced frequency 0.0814. The grid generated for this case contained four layers of structured grid as described earlier. Some very small cells just aft of the structured zone were removed using interactive software [25] to improve grid quality. Results for this standard test case are shown in Figs 6a and 6b where CN, Cm and several Cp plots are presented. The results are consistent compared to other theories which use Euler methods (for example Ref 26) and are in fair agreement with the experimental data. It was quickly realized that the fourth order time marching scheme, as used in [26], was superior to the first order scheme that several authors are still using, both in terms of speed and also smoothness of solution. The CPU time for this case was about 36 hours on the SGI Power Challenge computer; this is quite slow mainly due to some of the cells being very small. This computation was done with a window, similar to that used in [27], around the store located at

a distance of 0.03 (chord=1) from the airfoil. Within this window the grid was fixed relative to the airfoil and movement of the grid was only allowed outside of the window.

Next the code was tested on a NACA0012 falling in free air with M=0.8, α=0 and a downward velocity of 0.08 (relative to unit freestream). The trend in CN with increasing time was compared to the actual steady state result for the equivalent angle of attack. This grid was completely unstructured and the window was now fixed at about 1.4 units from the body. Fig 7a shows the initial grid and also the grid after a plunge of about 1.6 units (for a chord length of 1). This CN development is shown in Fig 7b where it can be seen that the result looks quite accurate as the normal force CN is tending asymptotically to the true steady state value.

The next computation was for a more realistic (store) type of body such as an ogive-cylinder-ogive as shown in Fig 8a with an airfoil/pylon as the parent body. For a freestream Mach number of 0.4, a reduced frequency of 0.8 and a maximum velocity of 0.064, this 'store' was moved down and up, in a cyclic manner, a distance of 0.16 units (based on an airfoil/pylon chord length of 1) to check physical consistency of the results. Figs 8b and 8c show the CN and Cm developments with time for three cycles. The results look quite physical as the lift first increases as it moves downward (seeing an upwash from the fluid), then as the gap increases the lift decreases as the 'channel' effect above the body is becoming less noticeable. When the body returns upward, the lift at first decreases as the body sees a downwash from the fluid but finally increases as the channel effect is stronger. The first window in this case was set at a distance of 0.03 from the body which basically covered only the structured layers of the grid. A second window for fixing the grid was also set around the wing/pylon. This enables good grids to be maintained near the bodies where it is felt to be necessary to achieve an accurate solution. Shown in Fig 8a is the initial grid before the store starts to move, the grid at the bottom of the store's cycle and the grid after one complete cycle. A third window was also used in this case so that the grid was only allowed to move within a distance of 4 units from the centre of the airfoil/pylon. The grid was fixed outside this window allowing for greater efficiency in grid movement. The CPU time for this case on the Power Challenge was about 5 hours.

This is the current status of the unsteady development of the program. Several more tests will be performed to check accuracy and then different schemes for moving the grid and for integrating in time will be studied. Implicit time marching schemes will be investigated so that larger time steps can be taken. These enhancements will be very useful in the future development of the 3-D version of the code.

6. Conclusions
All the pieces are now in place to complete the development of an unsteady calculation applied to the prediction of the store trajectory after release from the aircraft. A suitable 3-D grid generator has been identified in ICEM and a 3-D Euler solver has been developed in-house. To optimize the development of the 3-D unsteady version of the final code a 2-D version has first been developed and presented here. Lessons learned from this development will be incorporated into the 3-D version at a later date. The six degrees of freedom (6DOF) equations defining the motion, given the aerodynamic forces as computed from the Euler code, will be incorporated into the package to provide a complete trajectory. Another future development will be to move from the Euler formulation to a Navier-Stokes one, where structured grid layers near the surface will be especially beneficial.

References
1. Fox, J.H., Donegan, T.L., Jacocks, J.L., and Nichols, R.H. (1991): "Computed Euler Flowfield for a Transonic Aircraft with Stores", Journal of Aircraft, Vol.28, pp.389-396.
2. Lijewski L.E. and Suhs N.E. 'Time-Accurate Computational Fluid Dynamics Approach to Transonic Store Separation Trajectory Prediction'. Journal of Aircraft, Vol 31, No 4, July-Aug 1994.
3. Lijewski, L.E. and Suhs, N.E. (1992): "Chimera-Eagle Store Separation", AIAA 92-4569-CP.
4. Lohner, R. and Baum, J. (1992): "Comparison of Wing/Pylon/Store Experiment with an Euler Finite Element CFD Code", AIAA 92-4573.
5. Parikh P., Pirzadeh S. and Frink N.T. 'Unstructured Grid Solutions to a Wing/Pylon/Store Configuration'. AIAA Journal Vol 31, No 6, Nov 1994.
6. Fluent Inc., Centerra Resource Park, 10 Cavendish Court, Lebanon, NH, USA.
7. CFD Research Corporation, 3325 Triana Blvd., Huntsville, Alabama, USA.
8. I-DEAS, Structural Dynamics Research Corporation, 2000 Eastman Drive, Milford, Ohio, USA.
9. FLITE3D, Computational Dynamics Research, Innovation Centre, University College, Singleton Park, Swansea, UK.
10. ICEM, CFD Engineering, 2600 ETNA Street, Berkeley, California, USA.
11. Jones, D.J., and MacLeod, B. 'Solution of the Euler Equations using Unstructured Grids'. Fourth Canadian Aeronautics and Space Institute Aerodynamics Symposium, Toronto, May 1993.
12. Fortin F. and Jones D.J. 'Solution of Compressible Inviscid and Viscous Flows around Single and Multi-Element Airfoils on Unstructured Meshes'. CFD Society of Canada Conference, June 1994.
13. Weatherill N.P. 'The Delaunay Triangulation in Computational Fluid Dynamics', Computers and Mathematics with Applications, Vol 24, No 5-6, pp 129-150, 1992.
14. Jameson A., Schmidt W. and Turkel E., 'Numerical Solution of the Euler Equations by Finite Volume Methods using Runge-Kutta Time Stepping Schemes'. AIAA Paper 81-1259, 1981.
15. Yoshihara H. (1985), 'Numerical Solution of Two-Dimensional Reference Test Cases,' in Test Cases For Inviscid Flow Field Methods, AGARD AR-211.
16. Fortin F. and Jones D.J. 'Unstructured Grid Solutions using k-ε with Wall Functions'. Proceedings of Workshop on Efficient Turbulence Models for Aerodynamics (Notes in Numerical Fluids), Editors: A. Dervieux, J. P. Dusage, L. J. Johnston, Vieweg, to be published.
17. Fortin F., Hawken D.F., Jones D.J., Syms G.F., 'A Comparison of Two Commercial Euler and Navier-Stokes CFD Codes', CFD Society of Canada Conference, CFD 95, June 1995.
18. GFEM, Electronic Data Systems Corporation, Unigraphics Division, 13736 Riverport Drive, Maryland Heights, MO, USA.
19. Carr M.P., Pallister K.C. (1984), 'Pressure Distributions Measured on Research Wing M100 Mounted on an Axisymmetric Body,' in Experimental Data Base for Computer Program Assessment, AGARD AR-138.
20. Marconi, F., Siclari, M., Carpenter, G., Chow, R., 'Comparison of TLNS3D Computations with Test Data for a Transport Wing/Simple Body Configuration', AIAA Paper 94-2237, 1994.
21. Zhang H., Reggio M., Trepanier J.Y. and Camarero R. 'Discrete Form of the GCL for Moving Meshes and its Implementation in CFD Schemes'. Computers and Fluids, Vol 22, No 1, pp 9-23, 1993.
22. Batina J.T., 'Unsteady Euler Algorithm with Unstructured Dynamic Mesh for Complex-Aircraft Aerodynamic Analysis', AIAA Journal, Vol. 29, No. 3, March 1991, pp. 327-333.
23. Chakravarthy S.R. and Szema K-Y. 'Computational Fluid Dynamics Capability for Internally Carried Store Separation'. Rockwell Intl Corp Report SC-71039-TRI Science Centre, Thousand Oaks, CA.
24. 'AGARD Compendium of Unsteady Aerodynamic Measurements', AGARD-R-702, Data Set 3, 1982.
25. Trepanier J.Y., Yang H., 'ADX: Algorithms for Adaptive Discretization based on Triangular Grids', Technical Report EPMIRT-9313, Ecole Polytechnique de Montreal, 1993.
26. Gaitonde A.L. and Fiddes S.P. 'A three-dimensional moving mesh method for the calculation of unsteady transonic flows'. Aeronautical Journal, April 1995.
27. Singh K.P. et al. 'Dynamic Unstructured Method for Flows Past Multiple Objects in Relative Motion'. AIAA Paper 94-0058, Jan 1994.
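Note on the grid movement of Section 5. The interior node velocities are obtained there by repeated neighbour averaging (Jacobi iterations), with the boundary velocities held fixed, and the displacement is then velocity times time step. The Python sketch below shows that idea on a generic node graph. The adjacency-list layout, the fixed iteration count and the function names are assumptions made for illustration and do not come from the paper.

```python
def smooth_node_velocities(neighbours, boundary_velocity, n_iterations=50):
    """Jacobi iterations: each interior velocity becomes the mean of its neighbours.

    neighbours:        dict node -> list of adjacent nodes
    boundary_velocity: dict node -> (vx, vy) for nodes whose velocity is prescribed
    """
    # Interior velocities start from zero, as at the first time step in the text.
    velocity = {node: boundary_velocity.get(node, (0.0, 0.0)) for node in neighbours}
    for _ in range(n_iterations):
        updated = dict(velocity)
        for node, adjacent in neighbours.items():
            if node in boundary_velocity:
                continue  # prescribed values act as the limit (boundary) conditions
            updated[node] = (
                sum(velocity[a][0] for a in adjacent) / len(adjacent),
                sum(velocity[a][1] for a in adjacent) / len(adjacent),
            )
        velocity = updated  # Jacobi: only old values are used within one sweep
    return velocity

def move_nodes(coordinates, velocity, dt):
    """Displacement = velocity * time step."""
    return {node: (x + velocity[node][0] * dt, y + velocity[node][1] * dt)
            for node, (x, y) in coordinates.items()}
```

On a chain of nodes between a moving body and a fixed window, the converged velocities vary linearly between the two prescribed values, which is the behaviour expected from a diffusive operator.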
Fig 1a. Unstructured Delaunay Grid obtained for RAE 2822. Showing poor quality of grid.

Fig 1b. Hybrid (Structured/Unstructured) Grid for RAE 2822.

Fig 6a. Normal Force Coefficient and Pitching Moment for [...] of the motion.

Fig 6b. Pressure distributions at Various stages of the Motion. α = 1.09, 2.34, 0.52, -1.25, -1.41, -0.54 respectively.

Solution of the Euler- and Navier-Stokes Equations
on Hybrid Grids

Martin Galle
DLR Institute of Design Aerodynamics
Lilienthalplatz 7
38108 Braunschweig
Germany

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

1. SUMMARY
A three dimensional finite volume scheme is presented. The scheme is based on the employment of hybrid grids, containing tetrahedral as well as prismatic cells.

The application of hybrid grids offers the possibility to combine the flexibility of tetrahedral meshes with the accuracy of regular grids. An algorithm to compute an auxiliary grid of control volumes for the entire computational domain was formulated. The dual mesh technique guarantees conservation in the entire flow field even at interfaces between prismatic and tetrahedral domains and enables the employment of an accurate upwind flow solver. Convergence to the steady state can be accelerated by a multigrid algorithm based on the agglomeration of control volumes. The formulation of such an algorithm is presented.

The code is tested on several viscous and inviscid cases for transonic and subsonic flows.

2. INTRODUCTION
The calculation of stationary flow fields around aircraft can be regarded as one of the major tasks of CFD. Due to the progress made in the development of high performance computers, the simulation of flows even around quite complex configurations has become feasible. Therefore CFD methods nowadays have got a considerable impact on the aerodynamic design of airplanes.

One of the first requirements to be met by the applied numerical method is related to the problem turn around time. To make a scheme usable in the design process, it has to fit into industrial time scaling. Including the generation of appropriate computational grids, the solution for a certain problem should be obtained within a few days or less.

The accurate resolution of the flow field in the vicinity of solid walls has a considerable impact on the correct prediction of aerodynamic forces on the configuration. Strong solution gradients normal to the surface occur. A simulation of such flow phenomena requires a high point density in gradient direction. As for efficiency reasons usually a lower point density in the directions tangential to the wall is utilized, high aspect ratio cells are most suited for the flow resolution in those regions.

One class of schemes widely employed in practical use is based on structured grids, consisting of blocks of hexahedral cells [1]. As it is feasible to stretch hexahedral cells in one or two directions without losing grid quality, structured grids are appropriate for the simulation of high Reynolds number flows. The major drawback of structured schemes is related to the generation of suited grids for complex geometries. Grid generation normally is an iterative process [2] that requires a high level of user interaction. Though much effort has been spent in the last years to develop powerful tools, the generation of appropriate structured grids for complex geometries appears to be much more time consuming than the flow simulation.

A possibility to circumvent this bottleneck is the unstructured approach [3,4]. The flow simulation is performed on a grid consisting of tetrahedral cells instead of hexahedra. As tetrahedral cells offer a high flexibility the discretization of complex three dimensional domains can be done almost automatically [5], with less user interaction than required for generating structured grids. The weak point of the unstructured approach is the generation of grids for high Reynolds number flows. The efficient simulation of such flows requires extremely stretched cells. The edges of tetrahedral cells of high aspect ratio are connected under very acute angles. This may cause numerical errors when the fluxes for corner nodes of such cells are evaluated. Hence, convergence and even solution accuracy can be deteriorated.

A compromise between structured and unstructured schemes is the application of hybrid schemes [6]. Hybrid grids consist of regular cells with edges exclusively normal and tangential to the surface of the geometry. In some distance from the surface, where the viscous impact on the flow has almost vanished, tetrahedral cells are employed to discretize the physical space between the regular domains and the outer boundaries. Considering the shape of the cells in the regular part there are several possibilities. The surface discretization with quadrilateral elements leads to hexahedral cells, while a surface triangulation results in prismatic cells. Due to the higher flexibility of triangles compared to quadrilaterals, a higher level of automization can be achieved when using prismatic cells.

The aim of this work is to develop a numerical scheme that offers the possibility to employ pure tetrahedral and prismatic grids as well as hybrid grids consisting of prismatic cells in the vicinity of solid walls and of tetrahedral cells in the rest of the flow domain.

3. GOVERNING EQUATIONS
The Navier Stokes equations for the three dimensional case can be written in conservative form as
\[ \frac{\partial}{\partial t}\int_V \vec{W}\,dV = -\int_{\partial V} \bar{\bar{F}}\cdot\vec{n}\,dS \tag{1} \]
where
\[ \vec{W} = (\rho,\ \rho u,\ \rho v,\ \rho w,\ \rho E)^T \]

is the vector of the conserved quantities. V denotes an arbitrary control volume with the boundary ∂V and the outer normal n.

The flux tensor is composed of the flux vectors in the three coordinate directions:
\[ \bar{\bar{F}} = \vec{F}\cdot\vec{e}_x + \vec{G}\cdot\vec{e}_y + \vec{H}\cdot\vec{e}_z \]
with e_x, e_y and e_z being unit vectors in the coordinate directions.

The flux vectors F, G and H may be divided into their convective and viscous parts as
\[ \vec{F} = \vec{F}^c + \vec{F}^v, \qquad \vec{G} = \vec{G}^c + \vec{G}^v, \qquad \vec{H} = \vec{H}^c + \vec{H}^v \]
with
[The definitions of the convective and viscous flux vectors are not recoverable from the source.]

The normal and tangential stresses depend on the derivatives of the velocity and on the dynamic viscosity μ:
\[ \tau_{xx} = 2\mu\frac{\partial u}{\partial x} - \frac{2}{3}\mu\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right), \quad \tau_{yy} = 2\mu\frac{\partial v}{\partial y} - \frac{2}{3}\mu\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right), \quad \tau_{zz} = 2\mu\frac{\partial w}{\partial z} - \frac{2}{3}\mu\left(\frac{\partial u}{\partial x}+\frac{\partial v}{\partial y}+\frac{\partial w}{\partial z}\right) \]
\[ \tau_{xy} = \mu\left(\frac{\partial u}{\partial y}+\frac{\partial v}{\partial x}\right), \quad \tau_{xz} = \mu\left(\frac{\partial u}{\partial z}+\frac{\partial w}{\partial x}\right), \quad \tau_{yz} = \mu\left(\frac{\partial v}{\partial z}+\frac{\partial w}{\partial y}\right) \]

The viscosity μ can be calculated employing Sutherland's law:
\[ \mu = \mu_\infty\, T^{1.5}\, \frac{1+S_c}{T+S_c} \tag{3} \]
where S_c is a constant depending on the free stream temperature T∞:
\[ S_c = \frac{110.4\,\mathrm{K}}{T_\infty} \]

The pressure is calculated by the equation of state
\[ p = (\kappa-1)\,\rho\left(E - \frac{u^2+v^2+w^2}{2}\right) \tag{4} \]

From equation (1) the temporal change of the conservative variables W can be derived as:
\[ \frac{d}{dt}\vec{W} = -\frac{1}{V}\int_{\partial V} \bar{\bar{F}}\cdot\vec{n}\,dS \tag{5} \]
The change of the flow conditions in a certain control volume V is given by the flux over the control volume boundary ∂V related to the size of V. For a control volume fixed in time and space, equation (5) can be written as:
\[ \frac{d}{dt}\vec{W} = -\frac{1}{V}\,\vec{Q}_F \]
with Q_F representing the fluxes over the boundaries of the control volume. If the boundary is divided into n faces, Q_F is given by
\[ \vec{Q}_F = \sum_{i=1}^{n}\left(\vec{Q}_i^{\,c} + \vec{Q}_i^{\,v}\right) \]
where Q_i^c and Q_i^v denote the inviscid and the viscous flux over the respective face.

4. DATA STRUCTURE
The dual mesh technique is perfectly suited to be utilized in a scheme that is based on hybrid grids. From the initial grid an auxiliary grid of control volumes is generated. For a vertex based scheme, where the flow variables are stored in the nodes of the initial grid, each node is surrounded by a control volume. The boundaries of the control volumes are determined by the midpoints of cells, cell faces and edges of the initial grid. This strategy results in non overlapping auxiliary cells that fill the physical space without gaps. Figure 1 depicts such an auxiliary grid (dashed lines) for an initial hybrid grid (solid line). As it can be seen from the figure, the auxiliary grid is defined even at interfaces between the different cell types. Hence, focusing on the fluxes crossing the boundaries of the control volumes, conservativity can be guaranteed in the entire flow domain.

Fig. 1: Mesh of control volumes for a two dimensional hybrid grid

For each initial cell contributions to the auxiliary grid have to be determined. As this can be done cell by cell without information about neighboring cells, the auxiliary grid can be evaluated within one loop running over all cells. Therefore, the evaluation of the auxiliary grid is quite cheap in terms of computational time.
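The single loop over all cells described above can be illustrated in two dimensions. The Python sketch below accumulates, cell by cell, the area of the control volume around each node of a mixed triangle/quadrilateral grid. It is only a 2-D analogue of the 3-D procedure in the paper, and it assumes that the cell midpoint is the arithmetic mean of the cell corners and that cells are listed counter-clockwise; all function names are illustrative.

```python
def polygon_area(vertices):
    """Signed area by the shoelace formula (positive for counter-clockwise order)."""
    area = 0.0
    for i in range(len(vertices)):
        x0, y0 = vertices[i - 1]
        x1, y1 = vertices[i]
        area += x0 * y1 - x1 * y0
    return 0.5 * area

def midpoint(a, b):
    return (0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]))

def dual_volumes(nodes, cells):
    """Accumulate the control-volume area around each node in one loop over the cells.

    nodes: list of (x, y)
    cells: list of node-index tuples (3 = triangle, 4 = quadrilateral),
           each ordered counter-clockwise
    """
    volume = [0.0] * len(nodes)
    for cell in cells:
        corners = [nodes[i] for i in cell]
        centre = (sum(x for x, _ in corners) / len(corners),
                  sum(y for _, y in corners) / len(corners))
        for k, node in enumerate(cell):
            previous_mid = midpoint(corners[k - 1], corners[k])
            next_mid = midpoint(corners[k], corners[(k + 1) % len(corners)])
            # Sub-cell bounded by the node, the two adjacent edge midpoints
            # and the cell midpoint; no neighbour information is needed.
            volume[node] += polygon_area([corners[k], next_mid, centre, previous_mid])
    return volume
```

Because every cell is split completely among its own nodes, the control volumes fill the domain without gaps or overlaps, including across the interface between the quadrilateral and the triangular region.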

eight triangles. As the size and the orientation of triangles can be described by normal vectors, the sum of the normal vectors describes the size and the orientation of the entire face. The resulting vector is also related to the respective edge.

- Geometrical coordinates of the grid nodes
- Edge to node pointer
- Components of the face vectors

Hence, both the description of the grid and the computation of the fluxes are based on the edges. Information on the initial grid cells is not required anymore. This strategy leads to a very efficient memory allocation of less than 100 real variables per node and a good vectorization of the flow solver.

The preprocessing, including the determination of the control volumes of the auxiliary grid as well as the components of the face vectors, has to be executed before the flow calculation starts.

5. SPATIAL DISCRETIZATION
5.2 Calculation of Convective Terms

The edge based data structure described in section 4 forms the basis for the employment of the accurate AUSM upwind scheme, as presented by Liou and Steffen in [7]. Considering an edge connecting two nodes $P_0$ and $P_1$, as illustrated in figure 3, the inviscid flux over face $F$ can be interpreted as a sum of a Mach number weighted average of the left ($L$) and the right ($R$) state at a face $F$ and a scalar dissipative term:

(7)

The left and right states are

$$\vec\Phi_{L/R} = \left(\rho a,\ \rho a u,\ \rho a v,\ \rho a w,\ \rho a H\right)^T_{L/R}.$$

The speed of sound $a$ can be obtained from the relation

$$a = \sqrt{\kappa\,\frac{p}{\rho}}.$$

$M_F$ denotes the advection Mach number at the cell face:

$$M_F = M_L^+ + M_R^- \qquad (9)$$

where the split Mach numbers are defined as

$$M^+ = \begin{cases} M & \text{if } M \ge 1 \\ \frac{1}{4}(M+1)^2 & \text{if } |M| < 1 \\ 0 & \text{if } M \le -1 \end{cases} \qquad M^- = \begin{cases} 0 & \text{if } M \ge 1 \\ -\frac{1}{4}(M-1)^2 & \text{if } |M| < 1 \\ M & \text{if } M \le -1 \end{cases}$$

Herein $M$ denotes the Mach number of the flow normal to the cell face.

The pressure $p_F$ at the cell face is calculated in a similar way as

$$p_F = p_L^+ + p_R^- \qquad (10)$$

where $p^{\pm}$ denote the split pressures, e.g. $p^+ = p$ if $M \ge 1$.
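The Mach number and pressure splitting of equations (9) and (10) can be sketched as follows. The split Mach numbers follow the definitions given in the text. The subsonic branch of the split pressures is not legible in the source, so the polynomial used here is an assumption (the second order form commonly associated with the Liou and Steffen scheme). All function names are hypothetical.

```python
def mach_plus(M):
    """Split Mach number M+ as defined for equation (9)."""
    if M >= 1.0:
        return M
    if M <= -1.0:
        return 0.0
    return 0.25 * (M + 1.0) ** 2


def mach_minus(M):
    """Split Mach number M- as defined for equation (9)."""
    if M >= 1.0:
        return 0.0
    if M <= -1.0:
        return M
    return -0.25 * (M - 1.0) ** 2


def pressure_plus(M, p):
    """Split pressure p+; the subsonic polynomial is an assumed standard form."""
    if M >= 1.0:
        return p
    if M <= -1.0:
        return 0.0
    return 0.25 * p * (M + 1.0) ** 2 * (2.0 - M)


def pressure_minus(M, p):
    """Split pressure p-; mirror image of pressure_plus (same assumption)."""
    if M >= 1.0:
        return 0.0
    if M <= -1.0:
        return p
    return 0.25 * p * (M - 1.0) ** 2 * (2.0 + M)


def face_mach_and_pressure(M_L, p_L, M_R, p_R):
    """Advection Mach number M_F, eq. (9), and face pressure p_F, eq. (10)."""
    M_F = mach_plus(M_L) + mach_minus(M_R)
    p_F = pressure_plus(M_L, p_L) + pressure_minus(M_R, p_R)
    return M_F, p_F
```

For identical left and right states the splitting is consistent: the face values reduce to the Mach number and pressure of that state.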
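The edge based data structure of section 4 (edge to node pointer plus one face vector per edge) leads to a single loop over the edges for the flux balance. The sketch below illustrates that loop under simplifying assumptions: a scalar flux, a user supplied `flux` callback, and a face vector oriented from the first to the second node of each edge. None of these names come from the paper.

```python
def accumulate_edge_fluxes(n_nodes, edges, face_vectors, flux):
    """Sum the face fluxes into nodal balances with one loop over all edges.

    edges        : list of (i, j) node index pairs (edge to node pointer)
    face_vectors : list of 3-tuples, one per edge, oriented from i to j
    flux         : callable (i, j, s) -> flux through the face (assumed scalar)
    """
    balance = [0.0] * n_nodes
    for (i, j), s in zip(edges, face_vectors):
        f = flux(i, j, s)
        balance[i] += f  # flux leaving the control volume of node i
        balance[j] -= f  # the same flux entering the control volume of node j
    return balance
```

Because every face flux is added to one node and subtracted from the other, the sum of all nodal balances is zero by construction, which is the discrete conservation property of the scheme.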

The coefficient $\Phi_F$ controls the dissipation of the scheme. In the original scheme of Liou, $\Phi_F$ is set to

$$\Phi_F = |M_F|.$$

For small Mach numbers the dissipative character vanishes, since $\Phi_F$ also becomes small. In order to prevent the disappearance of the dissipation for small Mach numbers, $\Phi_F$ is determined as proposed by Kroll and Radespiel in [8].

The values left and right of the face $F$ are taken directly from $P_0$ and $P_1$ for first order calculations. For second order accurate calculations the independent flow variables are linearly reconstructed on the control volumes around $P_0$ and $P_1$. For the control volume of node $P_0$ it reads:

$$u_L = u_0 + \frac{1}{2}\,\nabla u_0\cdot\vec r_{0,1}. \qquad (12)$$

The gradient $\nabla u_0$ of a variable $u$ is obtained by employing a Green-Gauss formula:

$$\nabla u_0 = \frac{1}{\Omega_0}\sum_{i} u_{0,i}\,\vec S_{0,i}$$

where $u_{0,i}$ is the value of $u$ on face $i$, $\Omega_0$ is the volume of the dual cell around $P_0$ and $\vec S_{0,i}$ is the normal vector of the dual mesh face $F$ as shown in figure 4.

Fig. 4: Face of a three dimensional control volume

Near shocks the values on the edges have to be limited to avoid overshoots. The limiting is done by a minimum/maximum clipping as proposed by Barth in [9]. If a reconstructed value at any face of the control volume exceeds the minimum (or maximum) of the values given by node $P_0$ and the surrounding nodes $P_{1..6}$ (figure 4), the gradient $\nabla u_0$ is scaled by a factor $\Phi_0$, so that the reconstructed value becomes equal to the minimum (or maximum) of the nodes $P_{0..6}$:

$$u_L = u_0 + \frac{1}{2}\,\Phi_0\,\nabla u_0\cdot\vec r_{0,1}$$

with

$$\Phi_0 = \min(1, \Phi_1, \ldots, \Phi_6)$$

where

$$\Phi_i = \begin{cases} \dfrac{u^{min}-u_0}{\tilde u_i-u_0} & \text{if } \tilde u_i < u_0 \\ \dfrac{u^{max}-u_0}{\tilde u_i-u_0} & \text{if } \tilde u_i > u_0 \\ 1 & \text{if } \tilde u_i = u_0 \end{cases} \qquad (15)$$

Herein $u^{max}$ and $u^{min}$ denote the maximum/minimum of the values of $u$ at the nodes $P_{0..6}$ and $\tilde u_i$ denotes the reconstructed value at the face of the control volume between $P_0$ and $P_i$. The values at the faces are reconstructed as described in equation (12).

5.3 Calculation of Viscous Terms

The determination of the viscous terms is also performed edge-wise. The obtained fluxes are related to the nodes associated with the respective edge. For an edge connecting the nodes $P_0$ and $P_1$ with the face vector $\vec S_{0,1}$ (figure 3) one obtains:

$$\vec F^{v}_{0,1} = \begin{pmatrix} 0 & 0 & 0 \\ \tau_{xx} & \tau_{xy} & \tau_{xz} \\ \tau_{xy} & \tau_{yy} & \tau_{yz} \\ \tau_{xz} & \tau_{yz} & \tau_{zz} \\ V_x + K\frac{\partial T}{\partial x} & V_y + K\frac{\partial T}{\partial y} & V_z + K\frac{\partial T}{\partial z} \end{pmatrix}\cdot\vec S_{0,1}$$

with $V_x$, $V_y$ and $V_z$ as described in section 5.1.

The derivatives of a flow variable $u$ have already been obtained for the second order discretization as described in section 5.2. They are the components of the gradient vectors $\nabla u = (u_x, u_y, u_z)^T$. The face values are determined by an arithmetic averaging of the respective values in the nodes $P_0$ and $P_1$.

6. TEMPORAL DISCRETIZATION

The temporal variation of the flow quantities can be written in general form for a node $P_0$ as:

$$\frac{d\vec W_0}{dt} = -\vec R_0 \qquad (16)$$

A comparison with equation (6) gives for the residual $\vec R_0$:

$$\vec R_0 = \frac{1}{V_0}\sum_{i}\vec F_{0,i} \qquad (17)$$

The integration in time is performed utilizing an explicit Runge-Kutta scheme, as described by Jameson in [10]:

$$\vec W^{(0)} = \vec W^{n}, \qquad \vec W^{(j)} = \vec W^{(0)} - \alpha_j\,\Delta t\,\vec R(\vec W^{(j-1)}), \qquad \vec W^{n+1} = \vec W^{(3)} \qquad (18)$$

Within the framework of this paper a three step scheme is employed with the coefficients

$$\alpha_1 = 0.15, \quad \alpha_2 = 0.4 \quad \text{and} \quad \alpha_3 = 1.0.$$

For the control volume surrounding node $P_0$ in figure 4 the convective time step $\Delta t_0^c$ and the viscous time step $\Delta t_0^v$ have to be determined. The resulting time step can be written as:

$$\Delta t_0 = CFL\,\frac{\Delta t_0^c\,\Delta t_0^v}{\Delta t_0^c + \Delta t_0^v}$$

with $CFL$ being the Courant number. The convective time step $\Delta t_0^c$ can be calculated as:

$$\Delta t_0^c = \frac{V_0}{\lambda_0^c}$$

where $\lambda_0^c$ denotes the maximum eigenvalue of the flux Jacobian. It can be determined as an integration over the surface of the control volume:

$$\lambda_0^c = \sum_{i=1}^{6}\left(|\vec v_{0,i}\cdot\vec S_{0,i}| + a_{0,i}\,|\vec S_{0,i}|\right) \qquad (20)$$

with $\vec S_{0,i}$ representing the face vector of the control volume face for the $i$th neighbor of $P_0$ and $\vec v_{0,i}$ the face velocity vector.
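The three step Runge-Kutta scheme and the local time step of section 6 can be sketched as below. This is a minimal illustration: the state is a plain list of numbers, `residual` is a user supplied callback returning $\vec R$ with the sign convention $d\vec W/dt = -\vec R$, and all function names are assumptions made for the example.

```python
import math


def convective_eigenvalue(faces):
    """Equation (20): faces is a list of (v, S, a) with face velocity vector v,
    face vector S and speed of sound a on that face."""
    lam = 0.0
    for v, s, a in faces:
        v_dot_s = sum(vi * si for vi, si in zip(v, s))
        s_norm = math.sqrt(sum(si * si for si in s))
        lam += abs(v_dot_s) + a * s_norm
    return lam


def local_time_step(cfl, dt_c, dt_v):
    """Combine the convective and viscous time steps as stated in the text."""
    return cfl * dt_c * dt_v / (dt_c + dt_v)


def runge_kutta_step(w, residual, dt, alphas=(0.15, 0.4, 1.0)):
    """One explicit multi-stage step; every stage restarts from the initial state."""
    w0 = list(w)
    wk = list(w)
    for alpha in alphas:
        r = residual(wk)
        wk = [w0_i - alpha * dt * r_i for w0_i, r_i in zip(w0, r)]
    return wk
```

Applied to the model equation $dw/dt = -w$ with $\Delta t = 0.1$, the three stages give $0.985$, $0.9606$ and finally $0.90394$.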
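The gradient evaluation and the minimum/maximum limiting of section 5.2 can be illustrated for a single node. The sketch assumes that the face value in the Green-Gauss sum is the arithmetic mean of the two node values on the edge (the source does not show this detail), and all names are hypothetical.

```python
def green_gauss_gradient(u0, neighbours, volume):
    """Green-Gauss gradient at P0.

    neighbours : list of (u_i, S_i), S_i being the outward dual face vector
    volume     : volume of the dual cell around P0
    The face value 0.5 * (u0 + u_i) is an assumption of this sketch.
    """
    grad = [0.0, 0.0, 0.0]
    for u_i, s in neighbours:
        u_face = 0.5 * (u0 + u_i)
        for d in range(3):
            grad[d] += u_face * s[d]
    return [g / volume for g in grad]


def limiter_factor(u0, grad, neighbours):
    """Scaling factor of equation (15).

    neighbours : list of (u_i, r_i), r_i being the vector from P0 to P_i
    """
    values = [u0] + [u_i for u_i, _ in neighbours]
    u_min, u_max = min(values), max(values)
    phi = 1.0
    for _, r in neighbours:
        # increment of the reconstructed face value, equation (12)
        delta = 0.5 * sum(g * ri for g, ri in zip(grad, r))
        if delta > 0.0:
            phi = min(phi, (u_max - u0) / delta)
        elif delta < 0.0:
            phi = min(phi, (u_min - u0) / delta)
    return phi
```

For a linear field on a regular stencil the gradient is recovered exactly and the limiter stays at one; at a local extremum the factor drops to zero, so the reconstruction falls back to first order there.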

Following [11] the viscous time step has to be scaled with a factor $K_v = 0.25$. The viscous eigenvalue $\lambda_0^v$ is obtained by employing an integration around the control volume.

7. MULTIGRID ALGORITHM
7.1 Multigrid Strategies for Unstructured Grids

The acceleration of the convergence is necessary for the simulation of high Reynolds number flows. A very powerful tool that can be utilized with an explicit time-stepping scheme is the multigrid method.

Focusing on unstructured grids, there are several approaches for the formulation of a multigrid algorithm. The differences between these approaches lie in the strategy of generating the coarser grids.

One frequently utilized method employs independent grids. A set of successively coarser grids is generated around the respective geometry independently from each other [12, 13]. As the grids are not nested, expensive search algorithms are required to determine the operators needed to transfer the flow variables and the residuals between the different grids. Another drawback is the limitation of the cell size. It should not exceed the size of details of the geometry, since otherwise the correct representation of the surface cannot be guaranteed.

The same holds for a different strategy, the use of telescoping points. In this case certain points are selected to remain in the coarser grids. These points are then reconnected using some triangulation algorithms [14]. The preprocessing described above has to be performed for each level again. Since one has to select existing points of the fine grid to become coarse grid points, the quality of the coarse grids is worse than the quality of independently generated coarse grids. For hybrid grids this approach is not suited, as different algorithms would have to be used to select and reconnect the points in the prismatic and tetrahedral regions.

The agglomeration of control volumes, as described e.g. by Lallemand et al. in [15] or Venkatakrishnan and Mavriplis in [16], can be regarded as a special case of the second approach. Certain points of the fine grid remain in the coarse grid as well, but the control volumes of the fine grid nodes are fused together in order to form the coarse grid control volumes. Since the focus is on the control volumes, the surface representation is guaranteed by definition. For a hybrid scheme using the dual mesh technique this approach is perfectly suited. As in the solution process, the shape of the initial grid cells is not important in the agglomeration procedure either. One problem that may occur is the quality of the coarser grids, as one has to deal with points that exist also in the finest grid.

7.2 Agglomeration Process

The agglomeration process starts with the choice of a start node that will remain in the coarser grid. The control volume of the start node is fused with control volumes of neighboring nodes. After having agglomerated the control volumes, the process starts again with the choice of a new start node. So agglomeration of control volumes is a greedy process that is not expensive in terms of calculation time.

The selection of the start node and the strategy deciding which of the neighboring control volumes are to be fused with the control volume of the start node are the only possibilities to control the quality of the coarse grid. It appears that the best grid quality can be obtained when the agglomeration is marching along coarsening fronts throughout the grid. Furthermore, nodes lying on solid walls should be preferred to remain in the coarse grid. So the highest priority to become the next start node will be given to wall nodes lying on the coarsening front.

A simple and perfectly working strategy is to fuse all control volumes of neighboring nodes that are not agglomerated yet with the control volume of the start node. Anyway, one can think about some more sophisticated algorithms, such as fusing only the $n$ nearest neighbors, with $n \approx 8$, or, as Venkatakrishnan and Mavriplis propose in [17], maximizing the ratio of volume and surface of the coarse grid control volumes, which results in a kind of semi coarsening for Navier-Stokes grids.

Fig. 5: Coarse grid obtained by agglomeration of control volumes

As depicted in figure 5 the coarse grid consists, as the fine grid, of a set of nodes surrounded by control volumes and connected by edges. Each edge is related to one face of the auxiliary grid of control volumes. The only difference to the fine initial grid is that the edges do not have to form any specially shaped cells any more. The information needed to describe the coarser grid is determined directly from the fine auxiliary grid.

Fig. 6: Agglomeration of control volumes

Figure 6 illustrates the agglomeration of several fine control volumes to a coarse control volume around node $P_0$.
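The greedy agglomeration of section 7.2 can be sketched as follows. The sketch implements the simple strategy named in the text (fuse all not yet agglomerated neighbors with the start node) and picks the next start node from the coarsening front, preferring wall nodes. The function name, the data layout and the tie-breaking by node index are assumptions of this example, not details from the paper.

```python
def agglomerate(n_nodes, edges, wall_nodes=frozenset()):
    """Return (coarse_of, seeds): coarse index of every fine node, and the
    start nodes that remain in the coarse grid."""
    neighbours = [set() for _ in range(n_nodes)]
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)

    coarse_of = [-1] * n_nodes      # -1 marks "not agglomerated yet"
    seeds = []
    front = set()                   # candidates on the coarsening front
    unassigned = set(range(n_nodes))

    while unassigned:
        # Prefer nodes on the front; fall back to any remaining node.
        candidates = (front & unassigned) or unassigned
        # Wall nodes get the highest priority, ties broken by node index.
        seed = min(candidates, key=lambda n: (n not in wall_nodes, n))
        group = [seed] + sorted(n for n in neighbours[seed] if coarse_of[n] < 0)
        for n in group:
            coarse_of[n] = len(seeds)
            unassigned.discard(n)
        seeds.append(seed)
        # Free neighbours of the new coarse volume extend the front.
        for n in group:
            front.update(m for m in neighbours[n] if coarse_of[m] < 0)
    return coarse_of, seeds
```

On a chain of six nodes with node 0 on the wall, the procedure marches along the chain and fuses the nodes pairwise, keeping nodes 0, 2 and 4 as coarse grid nodes.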
The size of the new control volume is

$$V_{0,k} = \sum_{i=0}^{4} V_{i,k-1} \qquad (22)$$

with $V_{i,k-1}$ being the size of control volume $i$ in the grid $k-1$.

Fig. 7: Determination of coarse grid face vectors

As presented in figure 7, the normal vector $\vec S_{(0,9),k}$ of the auxiliary grid face related to the edge between node $P_{0,k}$ and node $P_{9,k}$ in the coarse grid is the sum of all face vectors related to edges between children of $P_0$ and $P_9$, in this case:

$$\vec S_{(0,9),k} = \vec S_{(2,11),k-1} + \vec S_{(2,10),k-1} + \vec S_{(4,11),k-1}.$$

The obtained coarse grid has the same properties as the fine grid. Hence, the governing equations can be discretized on the coarse grid employing identical algorithms as on the fine grid. Furthermore, the coarse grid control volumes can be agglomerated again using the same strategies as for the agglomeration of the fine grid control volumes. In this way a set of grids can be created easily based on the fine grid.

7.3 Transfer of Flow Variables

The multigrid iteration starts with the performance of one time step on the finest grid. The time step on a fine grid $k-1$ gives the solution $\vec W_{k-1}$. This solution is transferred to the next coarser grid $k$ with a suited transfer operator. As the physical position of coarse grid nodes is identical to the position of the nodes in the grid $k-1$, the transfer of flow variables from the fine grid to the coarser grid is just an injection of the respective values.

7.4 Evaluation of Restriction Operator

Following [10] the forcing function $\vec P_k$ can be formulated as:

$$\vec P_k = I_{k-1,k}\,\vec R_{k-1}(\vec W_{k-1}) - \vec R_k(\vec W_k)$$

with $\vec R_k(\vec W_k)$ being the residuals obtained according to equation (17) for the solution $\vec W_k$.

The restriction operator $I_{k-1,k}$ depends on the relations between the grids $k-1$ and $k$. As the physical space of the coarse grid control volume around node $P_0$ in figure 6 is identical to the space of the children of $P_0$ in the finer grid, the residuals of all children have to be summed up. The contributions are weighted with respect to the size of the children cells:

$$I_{k-1,k}\,\vec R_{0,k-1} = \frac{1}{V_{ges}}\sum_{i} V_{i,k-1}\,\vec R_{i,k-1} \qquad (25)$$

In this equation $V_{ges}$ denotes the sum of the volumes of the children of $P_{0,k}$. According to equation (22) $V_{ges}$ equals the coarse grid control volume $V_{0,k}$ around $P_{0,k}$, so equation (25) can also be written as:

$$I_{k-1,k}\,\vec R_{0,k-1} = \frac{1}{V_{0,k}}\sum_{i} V_{i,k-1}\,\vec R_{i,k-1}$$

As stated in equation (17) the product of the residual $\vec R$ for any node $P$ and the volume $V$ of the control volume equals the fluxes $\vec F$ crossing the boundary of the control volume. Therefore one can write:

$$I_{k-1,k}\,\vec R_{0,k-1} = \frac{1}{V_{0,k}}\sum_{i}\vec F_{i,k-1}$$

As the fluxes between two fine grid control volumes that are fused together cancel each other, the sum term in this equation denotes the flux over the coarse grid control volume achieved by a fine grid discretization. The forcing function can be interpreted as the difference between the fluxes obtained by a fine grid discretization and the fluxes achieved by a coarse grid discretization for a coarse grid control volume.

The temporal discretization is performed as described in section 6, while the residuals $\vec R_k$ are replaced by the expression $\vec R_k + \vec P_k$. Equation (18) then reads:

$$\vec W_k^{(1)} = \vec W_k^{(0)} - \alpha_1\,\Delta t\left(\vec R_k(\vec W_k^{(0)}) + \vec P_k\right)$$

$$\vdots$$

$$\vec W_k^{(n)} = \vec W_k^{(0)} - \alpha_n\,\Delta t\left(\vec R_k(\vec W_k^{(n-1)}) + \vec P_k\right)$$

The result serves as the basis for the next coarser grid $k+1$.

7.5 Determination of Prolongation Operator

After having determined the solution for all grids, corrections coming from the coarse grids $k+1 \ldots m$ are transferred to the grid $k$ employing suited transfer operators. For this grid one obtains the corrected solution as:

(26)

while for the coarsest grid $m$ the corrected solution $\vec W_m^{(b)}$ is identical to the computed solution.

Figure 8 shows a fine auxiliary grid $k$ and a coarser grid $k+1$. The nodes $P_0$ and $P_{5..10}$ are coarse grid nodes.
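The coarse grid quantities and the transfer operators of sections 7.2 to 7.5 follow directly from a fine-to-coarse index map. The sketch below assumes such a map (`coarse_of`, a hypothetical name) together with scalar residuals and corrections; it covers the coarse volumes of equation (22), the summed face vectors, the volume weighted restriction of equation (25) and the simplest prolongation, in which the correction is taken as constant over a coarse control volume.

```python
def coarse_volumes(fine_volumes, coarse_of, n_coarse):
    """Equation (22): a coarse volume is the sum of its children's volumes."""
    volumes = [0.0] * n_coarse
    for i, v in enumerate(fine_volumes):
        volumes[coarse_of[i]] += v
    return volumes


def coarse_face_vectors(edges, face_vectors, coarse_of):
    """Sum the fine face vectors of all edges joining two different coarse
    volumes; faces between fused volumes are interior and drop out."""
    coarse = {}
    for (i, j), s in zip(edges, face_vectors):
        a, b = coarse_of[i], coarse_of[j]
        if a == b:
            continue
        sign = 1.0
        if a > b:  # store every coarse edge once, oriented from low to high index
            a, b, sign = b, a, -1.0
        acc = coarse.setdefault((a, b), [0.0, 0.0, 0.0])
        for d in range(3):
            acc[d] += sign * s[d]
    return coarse


def restrict_residuals(fine_residuals, fine_volumes, coarse_of, n_coarse):
    """Equation (25): volume weighted sum of the children's residuals."""
    volumes = coarse_volumes(fine_volumes, coarse_of, n_coarse)
    acc = [0.0] * n_coarse
    for i, (r, v) in enumerate(zip(fine_residuals, fine_volumes)):
        acc[coarse_of[i]] += r * v
    return [a / v for a, v in zip(acc, volumes)]


def prolong_constant(coarse_corrections, coarse_of):
    """Every fine node receives the correction of its coarse control volume."""
    return [coarse_corrections[c] for c in coarse_of]
```

The linear reconstruction of the corrections described in section 7.5 would add a gradient term to `prolong_constant`; it is left out here to keep the sketch short.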
For node $P_0$ the correction coming from grid $k+1$ is computed as:

$$\vec\delta_{0,(k+1,k)} = \vec W_{0,k+1}^{(b)} - \vec W_{0,k+1}^{(0)}$$

where $\vec W_{0,k+1}^{(b)}$ is the corrected value for grid $k+1$. Since the position of node $P_0$ is identical on both grids $k$ and $k+1$, one can write:

$$\vec W_{0,k}^{(b)} = \vec W_{0,k+1}^{(b)} \qquad (29)$$

which leads, together with equation (23), to:

$$\vec W_{0,k}^{(b)} = \vec W_{0,k} + \vec\delta_{0,(k+1,k)}.$$

For fine grid nodes which also exist in the coarser grid, the corrections are directly added.

Fig. 8: Evaluation of coarse grid corrections

The node $P_2$ in figure 8 does not exist in the coarser grid $k+1$. The control volume of this node has been agglomerated with the control volume around $P_0$. If the corrections are assumed to be constant over the coarse grid control volume, one can write:

$$\vec\delta_{2,(k+1,k)} = \vec\delta_{0,(k+1,k)}.$$

A higher accuracy can be obtained if the corrections are reconstructed linearly over the coarse grid control volumes. The reconstruction is similar to the reconstruction of flow variables as described in section 5.2. Using the values in the neighboring control volumes, a correction gradient $\nabla\vec\delta_{0,(k+1,k)}$ can be determined for the control volume of $P_{0,k+1}$:

(30)

with $\vec S_{(0,i),k+1}$ representing the normal vector related to the edge between node $P_0$ and $P_i$. The correction in node $P_2$ can then be obtained as:

$$\vec\delta_{2,(k+1,k)} = \vec\delta_{0,(k+1,k)} + \nabla\vec\delta_{0,(k+1,k)}\cdot\vec r_{0,2}$$

where $\vec r_{0,2}$ denotes the vector from $P_0$ to $P_2$.

When the solution on the finest grid has also been corrected, the next iteration $n$ starts with:

$$\vec W^{(0)}(n) = \vec W^{(b)}(n-1)$$

8. NUMERICAL RESULTS
8.1 Laminar Flow over a Flat Plate

The simulation of a laminar flow over a flat plate was used to validate the formulation of the viscous term evaluation. Figure 9 depicts boundary planes of the three dimensional hybrid grid. The grid consists of three layers with 60x40 points each. The points are connected to form prismatic cells in the lower part of the grid and tetrahedral cells in the upper part.

The flow is coming from the left side parallel to the boundary planes. In the rear part of the lower plane a no slip plate is located. As can be seen from the pressure distribution on the boundary planes in figure 9, the beginning of the plate is characterized by a flow stagnation.

Fig. 9: Hybrid grid for flat plate and isobars

In order to create a tetrahedral grid the prismatic cells are subdivided. Hence, the point distribution is identical in both grids. Figure 10 presents the tetrahedral grid and the respective solution.

Fig. 10: Tetrahedral grid for flat plate and isobars

In figure 11 the convergence history is presented. For the calculation on the hybrid grid a convergence of 2.5 orders of magnitude is obtained after about 1600 iterations. The calculation was performed in the single grid mode in order to make it comparable to the results obtained on a pure tetrahedral grid. As one can see from figure 11 the convergence is worse for the tetrahedral case. This may be due to the disturbance of the solution caused by the diagonal edges near the wall. Furthermore, because of the higher number of edges more computational work is required for
each time step than in the hybrid case.

Fig. 11: Convergence history for flat plate (curves: hybrid grid and tetrahedral grid; abscissa: iterations)

Figure 12 depicts a comparison at different points between the Blasius solution and the computed solution for a subsonic ($Ma_\infty = 0.5$) laminar flow with a Reynolds number of 5000. The solution is almost identical for both the tetrahedral and the hybrid case.

Fig. 12: Comparison of the computed solution with analytic solution

8.2 Inviscid Flow around ONERA M6 Wing

A three dimensional test case for the scheme is the inviscid flow around an ONERA M6 wing. The prismatic part of the grid, as can be seen from figure 13, has an O-topology. It consists of seven layers of prismatic cells. The triangular cell faces are located on the wing surface, while on the symmetry plane quadrilateral faces are visible. Outside the prismatic region the space is discretized with tetrahedral cells.

Fig. 13: Hybrid grid around ONERA M6 wing

Figure 14 depicts the solution obtained on the hybrid grid for an incidence of 3.06° and a Mach number of 0.84. The characteristic λ-shock that is visible on the upper wing surface is captured within two or three cells. No oscillations occur at the interface between the prismatic and the tetrahedral domains.

Fig. 14: Isobars of transonic flow around ONERA M6 wing

On the same grid also a subsonic flow with the freestream Mach number of $Ma_\infty = 0.5$ and an incidence of $\alpha = 3.0°$ is simulated. In figure 15 the respective solution is presented.

Fig. 15: Isobars of subsonic flow around ONERA M6 wing

The effect of the convergence acceleration of the multigrid algorithm is presented in figure 16. For the multigrid calculation convergence of five orders of magnitude is obtained after about 890 seconds of computational time, while on the single grid the solution has converged less than two orders of magnitude in 1500 seconds.
Fig. 16: Convergence history for the simulation of subsonic flow around ONERA M6 wing (abscissa: computational time in seconds)

9. CONCLUSIONS

A finite volume scheme based on hybrid grids is presented. The employed grids consist of prismatic cells near body surfaces and tetrahedral cells connecting the prismatic domains and the outer boundaries. The use of prismatic cells offers the possibility to resolve viscous dominated flows such as boundary layers efficiently and accurately by applying high aspect ratio cells in the respective areas. Due to the use of tetrahedral cells, grids become quite flexible and the generation of grids, even for complex configurations, is eased considerably compared to structured approaches.

In the preprocessing an auxiliary mesh of control volumes is computed from the initial grid. The auxiliary mesh covers the entire computational domain and can be used in both the tetrahedral and the prismatic domains. In the flow solver part of the scheme an edge based data structure is utilized, so the cell structure given by the initial grid becomes unnecessary. The feasibility of employing hybrid grids even for three dimensional flow calculations is presented.

The multigrid algorithm based on the agglomeration of control volumes is a natural extension of the dual mesh technique. It fits perfectly to the edge based data structure and results in a small memory requirement.

The calculation of inviscid fluxes is demonstrated to be efficient and accurate. Shocks are captured nicely by the employed upwind flow solver. Also the formulation to calculate the viscous fluxes has proved its accuracy. At the interface between the prismatic and the tetrahedral region in some cases wiggles in the flow solution occur. Those wiggles will be subject to more detailed investigations.

Though the multigrid formulation works nicely for the case presented here, it still has to be improved, as the gain for cases with high aspect ratio cells is less than one would expect. In order to enable the simulation of viscous flows also around three dimensional geometries the next step will be the implementation of a suited turbulence model. With the improvement of the multigrid algorithm even the simulation of high Reynolds number flows is expected to become feasible for complex geometries, like flapped wings or complete aircraft configurations.

Finally a grid adaption algorithm, either based on local refinement by cell division or on global remeshing, will be formulated.

10. ACKNOWLEDGMENT

The three dimensional hybrid grids that are used in this paper have been generated by J.W. van der Burg and J.E.J. Maseland of NLR within the framework of the NLR/DLR Cooperation CFD for complete Aircraft.

11. BIBLIOGRAPHY

[1] J.W. Slooff and W. Schmidt, editors. Computational Aerodynamics Based on the Euler Equations. AGARD-AG-325, 1994.
[2] A. Ronzheimer, O. Brodersen, R. Rudnik, A. Findling, and C.-C. Rossow. A new interactive tool for the management of grid generation processes around arbitrary configurations. In N.P. Weatherill, P.R. Eiseman, J. Hauser, and J.F. Thompson, editors, Numerical Grid Generation in Computational Fluid Dynamics and Related Fields, pages 441-452, 1994.
[3] A. Jameson, T.J. Baker, and N.P. Weatherill. Calculation of inviscid transonic flow over a complete aircraft. AIAA-86-0103, 1986.
[4] T.J. Barth. A 3-D upwind solver for unstructured meshes. AIAA-91-1548-CP, 1991.
[5] C. Gumbert, R. Löhner, and P. Parikh. A package for unstructured grid generation and finite element flow solvers. AIAA-89-2175, 1989.
[6] T. Minyard and Y. Kallinderis. A parallel Navier-Stokes method and grid adapter with hybrid prismatic/tetrahedral grids. AIAA-95-0222, 1995.
[7] M.S. Liou and C.J. Steffen. A new flux splitting scheme. Journal of Computational Physics, 107(1):23-39, 1993.
[8] N. Kroll and R. Radespiel. An improved flux vector split discretisation scheme for viscous flows. DLR-FB 93-53, 1993.
[9] T.J. Barth and D.C. Jespersen. The design and application of upwind schemes on unstructured meshes. AIAA-89-0366, 1989.
[10] A. Jameson. Transonic flow calculations. MAE Report 1651, Princeton University, Princeton, New Jersey, 1983.
[11] D.J. Mavriplis. Accurate multigrid solution of the Euler equations on unstructured and adaptive meshes. AIAA-88-3706-CP, 1988.
[12] D.J. Mavriplis and A. Jameson. Multigrid solution of the two-dimensional Euler equations on unstructured triangular meshes. AIAA-87-0353, 1987.
[13] R.H. Bailey. A multigrid algorithm for the solution of the Navier-Stokes equations on unstructured meshes for laminar flows. DLR-IB 129-92/10, 1992.
[14] K. Riemslagh and E. Dick. Multistage Jacobi relaxation in multigrid methods for steady Euler equations. Submitted November 1993 to International Journal of Computational Fluid Dynamics.
[15] M.H. Lallemand, H. Steve, and A. Dervieux. Unstructured multigridding by volume agglomeration: current status. Computers and Fluids, 21:397-433, 1992.
[16] V. Venkatakrishnan and D.J. Mavriplis. Agglomeration multigrid for the three-dimensional Euler equations. AIAA-94-0069, 1994.
[17] V. Venkatakrishnan. A perspective on unstructured grid flow solvers. ICASE report 95-3, 1995.

SIMULATION OF THE RELATIVE MOTION OF BODIES SUBJECTED TO AN UNSTEADY FLOW BY AN OVERLAPPING GRID METHOD

P. Brenner
Aérospatiale Espace & Défense
BP2 78133 Les Mureaux CEDEX, France

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

We present a method suited to the numerical simulation of rocket stage separations in the presence of aerodynamic loads.
A conservative overlapping grid technique is used to simulate the displacement of the various moving parts due to aerodynamic and propulsive forces.
The compressible multi-gas Euler equations are solved by means of an unstructured Finite Volume discretisation (tetrahedra, prisms and hexahedra). The flow variables are located at the centre of gravity of each cell. The numerical scheme in space is upwind and second order accurate on the fluxes. It is based on Godunov's algorithm.
The adaptive time integration method that is implemented makes it possible to simulate strongly unsteady flows with strong moving shocks while limiting the computational cost, since separations can last several seconds.
The choice of algorithms gives the code robustness and accuracy even though the spatio-temporal discretisation is irregular (since the time step is different in each cell, and the grids are unstructured and overlap).

ABSTRACT

A computational method for the simulation of rocket stage separations under aerodynamic and propulsive loads is presented.
To simulate the motion of bodies, a conservative overlapping grid technique is used.
The flow solver is based on a cell centred Finite Volume formulation on unstructured grids (made of tetrahedra, prisms and hexahedra).
The Euler equations with mixing gases are solved through a second order upwind scheme using the Godunov algorithm to compute the numerical fluxes.
To integrate the equations in time, a temporal adaptive algorithm is used since the real duration of the simulated phenomena is long. It saves computer time and leads to accurate simulations of unsteady phenomena like acoustic waves and shock displacements.
Despite the irregular spatio-temporal discretisation (since time steps are different in each cell, since meshes are unstructured...), the algorithms used combine accuracy with robustness.

INTRODUCTION

Methods for computing unsteady flows around moving bodies are of definite interest for simulating the separation of empty rocket stages.
Indeed, test facilities capable of such simulations are very difficult to set up:
- the external flow is strongly supersonic, so the size of the wind tunnel test section will be small,
- the engines continue to eject gases whose static pressure is high compared with the external pressure, which leads to large jet expansions capable of blocking the test section,
- the propulsive gases are thermodynamically very different from air, so this difference in composition must also be simulated,
- finally, the motion of the bodies must be slaved to the forces exerted in real time, which requires a complex system able to measure these loads correctly and then interpret them to modify the position of the moving bodies.
This last point seems very restrictive because the size of the test section is limited. The system in question must therefore not be too bulky, and its response time must be very short because the simulated operating time is proportional to the scale used (that is, a separation lasting one second in reality will last five hundredths of a second in a test facility at one twentieth scale).
When the phenomena studied are truly unsteady, it therefore seems more feasible to use an essentially numerical approach, as we have done with the FLUSEPA code (refs. 7 and 8).
The choice of formulations, numerical schemes and algorithms follows from the constraints encountered in this type of simulation:
1- the geometry of the stages may be complex but, above all, the geometry of the inter-stage is always complex (presence of equipment, separation systems...),

2- the flow varies very abruptly from high supersonic to low subsonic, and the pressure ratio encountered across strong shocks can reach several thousand,
3- the relative motion of the different bodies is completely arbitrary (complex rotation, large translation...),
4- the flow, like the position of the stages, can evolve rapidly, and acoustic or shock propagation phenomena are often predominant in the evolution of the aerodynamic field.

The first constraint led us to use an unstructured formulation, which offers great flexibility from an ergonomic point of view for the definition of the grids.
The second led us to choose a Godunov-type finite volume method (of second order), which is very robust and accurate. Note that in the formulation used, the flow variables are located at the centre of gravity of the cells.
The third rules out any technique using grid deformation. Indeed, when the motions are large and arbitrary, a deformation can lead to a local modification of the control cells so strong that the twisting can turn the cells over until negative volumes are obtained.
As for methods using adaptive remeshing, they seemed to us too cumbersome and restrictive from an unsteady point of view when one wishes to ensure the conservativity of a system (or the entropy increase, for example).
We therefore opted for a conservative overlapping grid technique. To take into account the relative motion of the grids associated with the moving bodies, we use a mixed Euler-Lagrange (A-L-E) flux formulation, which simplifies the method and above all ensures the conservativity of the system. Indeed, although the grids are rigid, this technique makes it possible to work in a single reference frame, unlike purely Eulerian formulations, which require one reference frame per grid and then the introduction of inertial forces that are treated as source terms detrimental to global conservativity.
Note that for a given flux computation method, A-L-E modifies the algorithm only in a minor way, since it suffices to take into account the velocity of the faces in the upwinding. Finally, the interface between grids is treated like the faces of any cell, without any change of reference frame.
As for the last constraint, it led us to rule out implicit integration methods, which damp a large part of the acoustics and, at best, smear the propagating shocks. We therefore developed an explicit integration method that nevertheless takes into account the specific character of the local flow: when the phenomena are fast in a cell, the number of iterations on that cell is large; otherwise, it is small. It is an adaptive time integration method that can be regarded as a consistent and conservative local time stepping technique.
In the description that follows, we will emphasise the problems of conservativity, consistency and stability that determined the choice of the algorithms used.

NUMERICAL METHOD

The finite volume (F.V.) method relies on solving the equations in integral form, that is, a balance of the conservative quantities is made over a control element. Note that, strictly speaking, the balance must hold whatever the control element considered within the domain studied.

1- GENERAL EQUATIONS

The Euler equations in integral form for compressible multi-gas three-dimensional flows, when the control element is moving, can be written in the following form:

where $\delta\Omega$ is the boundary surrounding the control element $\Omega$, $\rho$ is the density, $V$ is the fluid velocity in the Galilean reference frame of the computation, $U$ is the velocity of the boundary $\delta\Omega$, $E_t$ is the specific total energy, $C_v$ the specific heat at constant volume and $N$ the number of moles per unit mass.
The first equation relates to the variation of the control volume. Since the grids are rigid, it is only useful at the interface between grids, as we will show later.
The two equations on $C_v$ and $N$ are taken into account to simulate the mixing of gases assumed to be perfect and thermodynamically perfect. This very simple model is correct only if the medium is non-reactive.

2- AERODYNAMIC SOLVER

Although it is difficult to decouple the spatial discretisation from the temporal discretisation, we will proceed in this way in order to highlight the path we followed.
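The integral equations of section 1 are not legible in the source. As a hedged reference only, the standard arbitrary Lagrangian-Eulerian form of the multi-gas Euler equations for a moving control element, written with the symbols defined in that section, would read as follows. The outward unit normal $\vec n$, the surface element $d\sigma$ and the exact arrangement of the terms are assumptions of this sketch and may differ from the author's formulation.

```latex
\begin{aligned}
\frac{d}{dt}\int_{\Omega} d\Omega \;-\; \oint_{\delta\Omega} \vec U\cdot\vec n\, d\sigma &= 0 \\
\frac{d}{dt}\int_{\Omega} \rho\, d\Omega \;+\; \oint_{\delta\Omega} \rho\,(\vec V-\vec U)\cdot\vec n\, d\sigma &= 0 \\
\frac{d}{dt}\int_{\Omega} \rho\vec V\, d\Omega \;+\; \oint_{\delta\Omega} \left[\rho\vec V\,(\vec V-\vec U)\cdot\vec n + P\,\vec n\right] d\sigma &= 0 \\
\frac{d}{dt}\int_{\Omega} \rho E_t\, d\Omega \;+\; \oint_{\delta\Omega} \left[\rho E_t\,(\vec V-\vec U)\cdot\vec n + P\,\vec V\cdot\vec n\right] d\sigma &= 0 \\
\frac{d}{dt}\int_{\Omega} \rho C_v\, d\Omega \;+\; \oint_{\delta\Omega} \rho C_v\,(\vec V-\vec U)\cdot\vec n\, d\sigma &= 0 \\
\frac{d}{dt}\int_{\Omega} \rho N\, d\Omega \;+\; \oint_{\delta\Omega} \rho N\,(\vec V-\vec U)\cdot\vec n\, d\sigma &= 0
\end{aligned}
```

The first line is the volume (geometric conservation) equation mentioned in the text, and the last two lines transport $C_v$ and $N$ with the fluid, which is how a non-reactive mixture of perfect gases is usually modelled. For a rigid grid moving as a solid body the first equation is satisfied identically, consistent with the remark that it only matters at the interface between grids.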

2.1- Spatial discretisation

Let us approximate the boundary $\partial\Omega_i$ of element $i$ by a polyhedron with $N$ plane faces, denoted $S_{ij}$, oriented from $i$ towards its neighbours $j$.
The most natural discretisation is due to Godunov (Ref. 1): states are taken piecewise constant on each cell, and the numerical flux on each interface is then obtained by solving the Riemann problem so posed.
Several techniques for solving it approximately have been proposed.
In our opinion, the problems they raise, both in lack of reliability (Roe flux, Osher flux...) and in large numerical viscosity (Van Leer flux...), make them unattractive compared with Godunov's exact method (Ref. 2). Although it has the reputation of being expensive because it is iterative, our experience shows that, at a comparable level of optimisation on a vector computer (CRAY), the Godunov algorithm is at most 20% more expensive than Osher's.
For all these reasons, we opted for the Godunov flux.

Note that we ruled out so-called centred methods, because they require an artificial-viscosity term with tunable parameters and are therefore generally not usable as a "black box".

Let us examine the accuracy of finite-volume (F.V.) approximations:

Let (2) $\vec G$ be a continuous, differentiable flux [its components are not legible in the source],

let (4) $\vec F(M) = \vec G(M) + O(h^n)$ be an approximation of the exact flux to order $n$ at $M$, $h$ being a characteristic dimension of $\Omega$, whose volume is $\omega$.

As $\vec F$ is a smooth function, the accuracy of the discretisation can be studied at the centre of gravity $G$ of $\Omega$ when fluxes of order $n$ are used.
After calculation:

(5) $\frac{1}{\omega}\oint_{\partial\Omega}\vec F\cdot d\vec\sigma = \operatorname{div}(\vec G)_G + O(h^2) + O(h^{n-1})$

Note that, under certain approximation conditions and for particular geometric configurations, the second truncation term gains one order of accuracy, and that away from the point $G$ the first term drops to order 1.

We therefore see that the scheme we use, of order 1 for the fluxes, is in general of order zero and hence inconsistent in the finite-difference sense on arbitrary meshes, unstructured ones in particular.

To ensure the consistency of the scheme, the states on either side of the interfaces $S_{ij}$ must therefore be computed to at least order 2.
To this end we use the M.U.S.C.L. approach (Ref. 3): the variables $\rho$, $\rho\vec V$, $P$, $C_v$ and $N$ are reconstructed linearly on each cell. The states needed to solve the Riemann problem are then second order, so the computed fluxes are indeed of order 2.
Our integration scheme uses one point per face; to preserve the order of accuracy, this point must therefore lie at the centre of gravity of the face.
For the gradients, we use a least-squares method whose stencil consists of the principal neighbours (that is, those sharing a face with the element considered). In this way we obtain "centred" gradients that are at least first-order accurate (on regular or irregular meshes), so the reconstruction is indeed second order.

For greater flexibility we use unstructured meshes made of tetrahedra, prisms, pyramids and hexahedra: the last type of element gives, when the meshes are regular, second-order accuracy in the "important" zones, the other types serving as "filler". Consequently, when quadrilateral faces (of hexahedra, for example) are no longer planar, two integration points per face would be needed to ensure consistency. We plan to make this modification to the scheme in the short term.

Gradient limitation

The preceding spatial approximation is consistent; unfortunately, its time integration raises some difficulties:
1 - for an explicit Euler integration scheme, one-dimensional Fourier analysis (of the linearised transport equation) shows that the method is unstable for long wavelengths (Ref. 4).

2 - whatever the integration scheme, the method is strongly oscillatory near zones of discontinuity (shocks, large variations in cell size...).

The simplest solution to the first problem is to use a more elaborate time scheme (explicit Heun scheme, implicit Euler...).

For the second difficulty, on the other hand, the solution we use is related to that of Van Leer, which consists in limiting the slopes as follows:

Consider, for simplicity, the case of a multidimensional linear convection equation, at the (constant) velocity $\vec C$, for the scalar variable $a$:

(6) $\frac{\partial a}{\partial t} + \vec C\cdot\overrightarrow{\mathrm{grad}}(a) = 0$

which is integrated in time on cell $i$ by a one-step method, after a finite-volume discretisation, in the form:

[equation not legible in the source]

where $\Delta t$ is the time elapsed between instants $n$ and $n+1$, and $\tilde a_{ij}$ is the value of $a$ that determines the flux at the centre of gravity of face $S_{ij}$ between $n$ and $n+1$.

We wish to build a locally bounded-variation scheme, that is, one satisfying the following constraint:

[equation not legible in the source]

Note that such a scheme is positive; after some manipulation, the following sufficient but not necessary conditions are deduced:

[conditions (9a), (9b) and (9c) are not legible in the source]

(9d) $\mathrm{CFL} = \frac{1}{2}\,\frac{\Delta t}{\omega_i}\sum_j \left|\vec C\cdot\vec S_{ij}\right|$

This formulation of the CFL number agrees with the usual one-dimensional formulation and with that of Godunov in the multidimensional case for meshes of regular parallelepipeds.

With these conditions we easily recover the results established in one dimension for regular meshes when integrating with explicit Euler:
- the first-order Godunov method is stable at CFL equal to 1,
- the minmod limiter is stable at CFL equal to 2/3,
- the first Van Leer limiter, which corresponds to condition (9c), is stable at CFL equal to 1/2...

Although the preceding reasoning stems from the linearisation of a scalar equation, the main value of this set of constraints is that:
1- it can be used in several dimensions,
2- it applies to many time schemes,
3- it is not tied to the type of reconstruction,
4- it is local, so it allows the stability of local-time-step methods to be studied.

In our code, we use the limiter corresponding to CFL equal to 1/2, limiting the gradient globally (for each variable) by multiplying it by a coefficient that allows (9a), (9b) and (9c) to be satisfied.

Note in passing that the use of such a limiter can sometimes (rarely if the mesh is regular) make the accuracy of the fluxes drop to order 1. For example, for the limiter corresponding to CFL equal to 1/2, order 2 is effective only when the maximal polyhedron whose vertices are the centres of the neighbours $j$ of $i$ contains both the centres of gravity of the faces $S_{ij}$ and their mirror images with respect to the centre of gravity $G$ of the element $i$ considered.
The following two-dimensional case illustrates this condition: the maximal polygon is the triangle (1,2,3). The centre of gravity of face $S_{G1}$ and its mirror image, point (a), are both contained in this polygon; on the other hand, although the centre of gravity of $S_{G3}$ satisfies the condition, its mirror image, point (c), does not, and for $S_{G2}$ the opposite holds: only the mirror image (b) lies in the triangle (1,2,3). This mesh configuration can therefore make the numerical scheme inconsistent because of the global limitation.
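The one-dimensional behaviour described above can be checked on a small experiment. The sketch below is a 1D analogue only: it uses the classical minmod slope rather than the paper's multidimensional gradient scaling (whose conditions (9a) to (9c) are not reproduced here), and all function names are illustrative assumptions. It advects a square pulse with a MUSCL reconstruction and explicit Euler time stepping at CFL = 1/2, below the 2/3 limit quoted for minmod.

```python
import numpy as np

def minmod(a, b):
    """Argument of smaller magnitude when the signs agree, otherwise zero."""
    return np.where(a * b > 0.0, np.where(np.abs(a) < np.abs(b), a, b), 0.0)

def muscl_euler_step(a, cfl):
    """One explicit Euler step of a_t + c a_x = 0 (c > 0) on a periodic mesh."""
    d_left = a - np.roll(a, 1)        # a_i - a_{i-1}
    d_right = np.roll(a, -1) - a      # a_{i+1} - a_i
    slope = minmod(d_left, d_right)   # limited slope times the cell width
    face = a + 0.5 * slope            # upwind state at face i+1/2
    return a - cfl * (face - np.roll(face, 1))

def advect(a0, cfl, n_steps):
    """Apply n_steps limited steps to the initial field a0."""
    a = a0.copy()
    for _ in range(n_steps):
        a = muscl_euler_step(a, cfl)
    return a

cells = np.arange(100)
a0 = np.where((cells >= 20) & (cells < 50), 1.0, 0.0)  # square pulse
a = advect(a0, cfl=0.5, n_steps=80)
```

In this setting each update can be written as a convex combination of $a_i$ and $a_{i-1}$ as long as the CFL number does not exceed 2/3, so the pulse stays within its initial bounds while the total of $a$ is conserved by the flux form.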

We shall not prove it, but this condition is sufficient without always being necessary: it depends on the aerodynamic field studied.

Finally, the limitation we use does not guarantee monotonicity of the scheme in one dimension. When studying, for example, an expansion into vacuum, this feature can become penalising: it is then difficult to keep reasonable CFL numbers without dropping to order 1. To overcome this problem, we developed a "monotonising" procedure that takes, among $\tilde a_{ij}$ and $\tilde a_{ji}$, the value closest to $a_i$ (hence the value closest to first order), and we proceed likewise on side $j$. The reconstruction thereby becomes monotone and, moreover, on a regular mesh this correction is only third order for the fluxes, so the overall result remains second order when the global limiter is not active.

2.2- Temporal discretisation

We stressed earlier that the unlimited second-order scheme is unstable when integrated in time by an explicit Euler method. To remedy this drawback, and to increase the stability of the limited scheme (hence to work with a CFL number above 1/2), we implemented a second-order explicit integration scheme.
Indeed, linear stability analysis by Fourier analysis shows that the classical second-order integration schemes are stable for a CFL number equal to 1 when a second-order upwind spatial discretisation with centred slope is used.
To obtain a relatively cheap scheme, the computational cost of the various algorithms involved in the spatial discretisation must be taken into account. Approximately, solving the Riemann problem represents 15% of the total cost, computing the gradients 30%, the global limitation about 30% and the local limitation about 5%, the remainder being hard to itemise.
We therefore see that recomputing the gradients and relimiting them globally should be avoided if possible. On the other hand, it is acceptable to redo a local limitation and recompute the fluxes.
It must also be borne in mind that the equations to be solved are not linear: all linearised schemes are equivalent under Fourier analysis. The scheme with the best nonlinear behaviour must be used.

We therefore chose the Heun scheme but with a single gradient computation. As for the global limitation, a simplified procedure allows it to be merged with the local limitation without loss of efficiency.
We recall that the Heun scheme consists in computing the fluxes (and the gradients) at instant $n$; with this approximation, the state at instant $n+1$ is computed, and hence the fluxes at instant $n+1$. The variation between instants $n$ and $n+1$ is then the half-sum of the fluxes computed at $n$ and $n+1$.
This scheme is very stable for nonlinear phenomena.

Adaptive time integration

The Heun scheme has many numerical qualities; unfortunately, like any explicit scheme, it is practically unusable when the duration to be simulated is long.
Analysing the phenomena involved in a stage separation, one sees immediately that they are quasi-steady over almost the whole computational field. Only a few regions are swept by fundamentally unsteady flows. It is therefore worthwhile to use small time steps in these zones, whereas large time steps suffice elsewhere.
We therefore developed a local time-step technique that is conservative, consistent and stable: adaptive time integration (Refs. 5 and 8).
In each cell, we work with the time step as close as possible to the maximum admissible explicit time step.
Let $\Delta t_{\min}$ be the smallest time step over the whole domain. To simplify the management of the different time classes, the time steps are ordered in powers of 2, proportionally to $\Delta t_{\min}$.
That is, if the admissible time step in cell $i$ is $Dt_i$, it is replaced by:

(10) $\Delta t_i = \Delta t_{\min}\,2^{L_i}$

where $L_i$ is the time level of cell $i$ such that:

(11) $\Delta t_{\min}\,2^{L_i} \le Dt_i < \Delta t_{\min}\,2^{L_i+1}$

Between two cells, we adopt the principle that the interface takes the higher time level.
For the method to be conservative, the flux integrals on either side of the interface $S_{ij}$ must be identical. It therefore suffices to define the flux on $S_{ij}$ uniquely at every instant. Then, if for example $L_i = L_j + 1$, two iterations are performed in cell $j$ for a single one in $i$. Thus in $j$ the flux integral is:

(12) $\int_t^{t+\Delta t}\vec F\cdot\vec S_{ji}\,dt + \int_{t+\Delta t}^{t+2\Delta t}\vec F\cdot\vec S_{ji}\,dt = \int_t^{t+2\Delta t}\vec F\cdot\vec S_{ji}\,dt$

and in $i$:

[equation not legible in the source]

which proves the conservativity of the system.
Let the problem to be solved be:

[equation not legible in the source]

which, for smooth solutions, is equivalent to

(15) $\frac{\partial a}{\partial t} + \operatorname{div}(\vec G) = 0$

Now suppose an approximation of the flux such that:

(16) $\vec F(g,t) = \vec G(g,t) + O(\Delta t^m) + O(h^n) + O(\Delta t^p\,h)$

where $g$ denotes the centre of gravity of the faces $S_{ij}$.
Then, integrating this flux over the boundary of element $i$ and assuming that the space-time truncation errors are independent for all faces, the accuracy of the approximation can be studied at time $T_m$ (midpoint of the two time-integration bounds) and at $G$ (centre of gravity of $\Omega$):

(17) $\frac{\partial a}{\partial t} + \operatorname{div}(\vec F)_{G,T_m} = O(h^{n-1}) + O\!\left(\frac{\Delta t^m}{h}\right) + O(\Delta t^p) + O(h^2) + O(\Delta t^2)$

For the scheme to remain consistent, $m$ must be greater than 1, so the scheme must be at least second order in time for the fluxes (and if it is only second order, a condition of the type $\Delta t/h$ bounded is necessary).
The term in $\Delta t^p$ corresponds to the temporal approximation of the gradients: in the Heun scheme we use, the gradients are computed only once, at the beginning of each iteration, so $p$ equals 1 and the scheme is globally first order in time (a result already obtained since $m$ equals 2).
As for the condition $\Delta t/h$ bounded, it is automatically satisfied by the CFL condition.

The stability analysis of adaptive time schemes is relatively difficult since Fourier analysis is no longer usable. One can, however, study the diffusivity of the scheme for the simplest equation on a regular mesh:

(18) $\frac{\partial a}{\partial t} + C\,\frac{\partial a}{\partial x} = 0$

If the scheme is diffusive, it has a chance of being stable; otherwise it is unstable.

For a Heun scheme that is first order in space, the condition for positive numerical diffusion is simple: it suffices that the largest time step satisfies the CFL condition. Unfortunately, for a scheme of order 2 in space, the condition depends on the method used to manage the set of cells.
An easier approach is to use the constraints that make the scheme bounded-variation.
We begin by limiting the time step over the neighbourhood such that

(19) $L_i = \min_j(L_j)$

then we reduce the time-step jump between cells such that:

(20) $|L_i - L_j| \le 1,\ \forall (i,j)$

This last operation requires $L_{\max} - 1$ iterations, $L_{\max}$ being the maximum time level.
We thus obtain the following type of configuration in one dimension:

[Figure: time-step distribution over cells i, i+1, i+2, i+3, i+4 (horizontal axis: cell number; vertical axis graduated from 0 to 16)]

Thus cell $i+4$ performs one iteration for sixteen iterations of cell $i$.
A simple argument shows that the scheme remains locally of bounded variation when the gradients are limited at the beginning of the iteration and at its end [the remainder of this sentence is not legible in the source]. Moreover, if the condition $L_i = \min(L_j)$ is no longer satisfied, the scheme is no longer bounded-variation.
In practice, we observed that the method is stable for a CFL number equal to 1, whereas the limiter guarantees stability only for a CFL number of 1/2.
Reducing the time-level jump is a priori unnecessary from the stability standpoint, but this procedure greatly simplifies the management of the cells (in particular during the gradient computation).

Regarding the efficiency of the method (defined as the ratio of the cost of the computation with a global time step to the cost with adaptive time stepping), it can be evaluated simply when the distribution function of the time steps is known. In practice this function depends on the mesh and on local phenomena. It is therefore variable. The table below gives the bounds of this efficiency together with a mean value and a value obtained

experimentally (the mean value corresponds to an equiprobable distribution of the time steps).

[Table not legible in the source]

Note that these values are closely tied to the algorithms used, and that unstructured formulations are well suited to this type of technique.

3- MESH OVERLAPPING

The basic idea is very simple (Refs. 6 and 7): consider that a mesh M1 behaves like a mask that moves and partially covers another mesh M2. The interface between meshes is created naturally: it is the intersection surface formed by hollowing out of the second mesh M2 the volume occupied by the mask M1.
The cells of M1 therefore undergo no modification. In M2, on the other hand, there are three types of cells (cf. figure 1):
1- fully covered cells,
2- partially covered cells,
3- fully uncovered cells.
The cells of the second category are therefore modified, since part of their faces is covered and new faces corresponding to the outer boundary of the mask are created. These new faces form the interface between the cells of M2 (cut) and the cells of M1 (unmodified).

3.1- Computation of the intersections

To simplify the geometric problem we assumed that all element faces are planar. Quadrilateral faces are therefore treated as two triangular faces. We are thus left with a single type of facet: the triangle.
The intersection algorithm reduces to two steps:
1- determine the degree of coverage of each face of the mesh by the mask,
2- determine the part of the outer boundary of the mask that closes each of the cut cells of M2.
The two steps are in fact algorithmically identical when reformulated as follows:
1- For each face of M2, determine the part contained in each of the cells of M1. The sum of these parts gives the coverage of the face.
2- For each face forming the boundary of M1, determine the part contained in each of the cut cells (a cell being considered cut if one of its faces is partially or totally covered and if at least one of its faces is not totally covered).
We therefore see that the same basic algorithm is involved: determining the area of a triangle contained in a polyhedron.
To solve this problem, the simplest way is to work in the plane of the triangular face. The polygonal trace of the polyhedron in this plane is determined (a simple matter since the polyhedron is made of triangles), then the part common to the triangle and this polygon. This last operation requires only knowledge of the oriented segments that make up the trace. Any algorithm that determines the chaining of the segments must be avoided: it is unnecessary and excessively expensive.
Computing the covered area is not enough; its centre of gravity must also be determined. The centre of gravity of the cut interfaces of M2 and that of the pieces of the boundary of M1 that close the cut cells are then determined.
The volumes and centres of gravity of the cut cells of M2 are then computed using the following formulae (Green):

(21) $\omega_i = \frac{1}{3}\sum_j \overrightarrow{OM_{ij}}\cdot\vec S_{ij}$

(22) $\overrightarrow{OG} = \frac{1}{4\,\omega_i}\sum_j \overrightarrow{Og_{ij}}\,\left(\overrightarrow{OM_{ij}}\cdot\vec S_{ij}\right)$

where:
- $M_{ij}$ is any point of the planar face $S_{ij}$ oriented towards the outside of $i$,
- $j$ is either a "natural" neighbour (hence another cell of M2) or a covering one (hence a cell of M1 that is a neighbour through the M1-M2 interface),
- $G$ is the centre of gravity of $i$ and $g_{ij}$ the centre of gravity of $S_{ij}$.

3.2- Optimisation of the number of operations

For the method to be usable, the intersection computation time must be at most of the same order of magnitude as the computation time of one iteration of the aerodynamic solver.
Let $N$ be the number of cells; the number of facets on the boundary of M1 is then of order $N^{2/3}$ and the number of cut cells is of the same order.
To determine the boundary area that closes each cut cell, about $N^{2/3}$ operations per cell are needed.
For all cells, of the order of $N^{4/3}$ operations are therefore needed. The aerodynamic solver requires

of the order of $N$ operations per iteration. The intersection computation must therefore be optimised.
The solution we adopted is to determine, on a Cartesian grid (i,j,k) containing $N$ cells, where the various facets forming the boundary of M1 lie. Then, for each cut cell, its position in the grid is determined, and hence which facets can close it. Preconditioning the facets requires about $N$ operations, and the computation per cut cell is of the order of one operation, that is, about $N^{2/3}$ operations for all cells.
The overall cost is $N$ operations (the preconditioning is more expensive than the intersection computation!) and hence compatible with the aerodynamics.

3.3- Mesh priority

Using a mask and a masked mesh lacks flexibility. Indeed, the cells of the masked mesh are, for example, better suited to computing a boundary layer around the body attached to that mesh than the cells of the mask.
It is therefore useful to define priority zones that the mask cannot cover but which, on the contrary, cover the mask.
Figure 2 shows the result of such a strategy. Its implementation poses no particular problem.

3.4- Boundary flux computation and assembly

Fluxes at the boundary are computed in the same way as fluxes between two cells of the same mesh: since we work with unstructured meshes, topology matters little, so an interface between cells is always treated in the same way, whether or not the cells belong to the same mesh.
As for the first equation of system (1), governing the volume variation, it makes it possible, when the motions are slow, to avoid recomputing the intersections after every aerodynamic iteration:
- for each cut cell $i$, the volume increment $\Delta\omega_i$ due to the relative motion of the meshes is evaluated,
- the maximum relative increment over all these cells is determined,

(23) $\Delta I_{\max} = \max_i\left(\Delta\omega_i / \omega_i^0\right)$

where $\omega_i^0$ denotes the initial volume,
- when $\Delta I_{\max}$ is small, the volume balance is used to account for the displacements without updating the characteristics of the interfaces,
- when $\Delta I_{\max}$ is large (or when the sum of the $\Delta I_{\max}$ computed since the last update of the intersections is large), all the intersections are recomputed (the volume balance is thus satisfied exactly without using $\Delta\omega$).
This technique divides the cost of the intersection computations by a very large factor (of the order of one hundred).

Finally, the problem of heavily covered cells remains.
When the covering of a cell by the "mask" leads to very small volumes, the CFL condition (9d) becomes too penalising since the time step must tend to zero. The solution is to merge these cells with sufficiently uncovered neighbouring cells. The set formed by a sufficiently uncovered cell and its associates then constitutes a "macro-cell" whose volume is large enough no longer to penalise the time step.
We use the coverage ratio as the criterion: a cell must be merged when more than 70% of its volume is covered. It is merged with the neighbour that shares the largest interface with it and is more than 30% uncovered by volume.
When merging is not directly possible, an iterative procedure is used: the cell is then merged through a cell that is itself merged (...).
When merging is not possible at all, the offending cells are removed from the computation.

APPLICATIONS

We present here computations illustrating the capabilities of the method.
Figures 3 and 4 show the type of application handled with the FLUSEPA code. These are three-dimensional simulations. The external Mach number lies between 5 and 6.
For the missile stage separation at incidence (figure 3), the simulated period is about 150 ms and the computation takes 12 hours with a global time step. Note that this type of simulation requires computing both the external flow and the inter-stage flow, since they interact very strongly. Moreover, pressures in the inter-stage can become high (when the passage area to the outside is small). The flow in the nozzle can consequently be strongly separated: it must therefore also be computed.
For the booster jettison (figure 4), the simulated period is about 1.1 seconds. The meshes contain about 100,000 cells and the computation takes about 40 hours with adaptive time stepping (on a Cray YMP). The gain in time over a global-time-step scheme is about a factor of 20. To underline the robustness of the method, we note that the phenomena encountered in this simulation are strongly unsteady (acoustics...) and that in

zones of strong shocks the pressure ratios are of the order of 4,000.

Finally, note that validation studies (two-dimensional as well as three-dimensional) including comparisons with experimental measurements have been carried out successfully.
To keep this paper short, we do not present them here.

CONCLUSIONS

The theoretical work we have carried out establishes both the consistency and the linear stability of the method on multidimensional meshes that are irregular in space and in time. Numerical experiments have demonstrated the good behaviour of the schemes when solving nonlinear systems.
As for the accuracy of the method, it is of order 2 in space and in time on regular structured hexahedral meshes and of order 1 at least elsewhere.
Finally, we note that the code has considerable potential for development since, for example, the mesh could be adapted by deformation (given our A-L-E formulation), by enrichment (we work with unstructured meshes) or through overlapping with a mesh locally adapted to the flow...

REFERENCES

(1) S.K. GODOUNOV, A. ZABRODINE, M. IVANOV, A. KRAÏKO, G. PROKOPOV
"Résolution numérique des problèmes multidimensionnels de la dynamique des gaz"
Editions MIR - Moscou

(2) M. POLLET
"Comparison of transport schemes for Navier-Stokes equations. Application to rocket propulsion"
7th International Conference on Numerical Methods in Laminar and Turbulent Flow, Stanford USA, 1991.

(3) B. VAN LEER
"Towards the Ultimate Conservative Difference Scheme V. A Second Order Sequel to Godunov's Method"
J. Comput. Phys. 32, 1979.

(4) F. GODFROY, P. JACQUEMIN and F. JOUVE
"Three dimensional simulation of unsteady and inviscid flows using a second order finite volume method. Application to flows inside solid propellant motors"
Computing methods in applied sciences and engineering, Edition Glowinski, INRIA

(5) W.L. KLEB and J.T. BATINA
"Temporal Adaptive Euler/Navier-Stokes Algorithm Involving Unstructured Dynamic Meshes"
AIAA Journal, Vol. 30, No. 8, 1992

(6) S.L. HANCOCK
"Finite difference equations for PISCES 2DELK, a coupled Euler-Lagrange continuum mechanics computer program"
Physics International Technical Memo - TCAM 76-2, 1976

(7) P. BRENNER
"Three-Dimensional Aerodynamics with Moving Bodies Applied to Solid Propellant"
AIAA paper 91-2304, 1991.

(8) P. BRENNER
"Numerical Simulation of Three-Dimensional and Unsteady Aerodynamics About Bodies in Relative Motion Applied to TSTO Separation"
AIAA paper 93-5142, 1993.


Figure 1: mesh overlapping

Figure 2: local overlap priority



Figure 3: separation at incidence by direct ignition

Figure 4: booster jettison by separation rockets



Efficient Numerical Simulation of Complex 3D Flows with Large Contrast

R. Radespiel, J.M.A. Longo, S. Brück*, D. Schwamborn*


DLR Institute of Design Aerodynamics
Braunschweig, Germany
*DLR Institute of Fluid Mechanics
Bunsenstrasse 10
37073 Gottingen, Germany

1. SUMMARY

Recent progress in flux vector splitting is reviewed with the aim of obtaining high resolution and robustness for hypersonic reacting flow simulations. The numerical behavior of promising AUSM and CUSP discretization variants is reported and compared. These schemes can be combined with explicit multistage time stepping and multigrid. Large chemical source terms introduce stiffness into the system of equations, which is removed by point implicit treatment. The present results demonstrate that efficient 3D simulations of viscous reacting flows with large contrast generated by strong shocks are now feasible.

2. INTRODUCTION

Accurate computations of 3D complex flow fields will play a key role in the aerothermal design of high speed vehicles such as reentry configurations. Not only can flow simulations shorten design cycles and save cost, but they also reduce uncertainty margins in heat loads and aerodynamic forces. A prominent example is the US-Orbiter vehicle, whose thermal protection system is heavy due to heat transfer uncertainties. Moreover, the space shuttle experienced an unexpected hypersonic pitch up which had not been predicted by conventional cold hypersonic wind tunnel testing.

The extensive use of 3D flow simulations for complex configurations within the aerodynamic design cycles has been precluded until recently for several reasons. Among these are long computation times of the codes simulating viscous flow or nonequilibrium chemistry. Also, many codes are not sufficiently robust in flow regions with strong shocks and flow expansions. Other codes are robust but they fail to resolve contact discontinuities such as boundary layers.

As a result of various attempts to solve complex hypersonic reacting flows numerically we can formulate some key requirements for the underlying solution algorithm. These are
- Capturing of strong shocks without oscillations of the dependent flow variables
- Robustness in regions of strong flow expansion
- Capturing of grid-aligned slip lines without numerical smearing
- Provision of an adaptive dissipative term in order to achieve sufficient numerical damping under adverse grid or flow conditions

The first requirement addresses the ability of the scheme to resolve complex 3D shock interactions with a limited number of grid points. Moreover, oscillations at shocks may prevent convergence of the overall method to the desired steady state solutions. The second point relates to the failure of various prominent discretization schemes when applied to rapid flow expansions into near vacuum conditions. Additionally, the scheme should resolve viscous shear layers with minimum numerical smearing in order to keep the number of grid points reasonably small. The fourth requirement results from the experience that one can usually not avoid adverse grid situations in 3D, particularly high values of cell aspect ratio. However, the available convergence acceleration techniques such as residual smoothing and multigrid rely on the damping of transient high frequency modes in the solution, for which controlled artificial dissipation is necessary.

3. SHOCK CAPTURING, HIGH-ORDER SCHEME AND ADAPTIVE DISSIPATION

Progress in flux vector splitting has recently demonstrated that the aforementioned requirements can be fulfilled without characteristic decomposition of the inviscid flux and the corresponding matrix operations. The present paper covers two promising approaches in this direction which were initiated by Liou [1,2] and Jameson [3,4]. Other related pieces of work on the subject are found in Refs. [5-7]. The principal idea of flux vector splitting is shown by application to the 1D Euler equation

(1) $\frac{\partial w}{\partial t} + \frac{\partial F}{\partial x} = 0, \quad w = (\rho,\ \rho u,\ \rho E)^T, \quad F = (\rho u,\ \rho u^2 + p,\ \rho u H)^T$

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms"
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.
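As a point of reference for the flux-splitting discussion, the flux of the 1D Euler system can be evaluated in a few lines. This is a minimal sketch under stated assumptions: a calorically perfect gas with a constant ratio of specific heats `gamma` (1.4 here, an illustrative value), conservative variables ordered as (density, momentum, total energy per unit volume), and a hypothetical function name `euler_flux`.

```python
import numpy as np

def euler_flux(w, gamma=1.4):
    """Flux F(w) of the 1D Euler equations for w = (rho, rho*u, rho*E)."""
    rho, mom, ener = w
    u = mom / rho                                   # velocity
    p = (gamma - 1.0) * (ener - 0.5 * rho * u * u)  # perfect-gas pressure
    h_total = (ener + p) / rho                      # total enthalpy H
    return np.array([mom, mom * u + p, mom * h_total])

# Gas at rest with rho = 1 and p = 1: only the pressure term survives.
w_rest = np.array([1.0, 0.0, 1.0 / (1.4 - 1.0)])
f_rest = euler_flux(w_rest)
```

Splitting schemes of the AUSM or CUSP type start from this flux and differ in how the convective part and the pressure part are upwinded at a cell face.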

Assume that the computational domain is discretized in intervals with the centers denoted by i-1, i, i+1, ... and the cell faces by i-1/2, i+1/2, .... Then, the numerical flux at the interface i+1/2 can be approximated according to Liou [1,2]

[equation (2) not legible in the source]

where L and R denote the left and right states at the cell face, $M_{1/2}$ is the upwind weighted Mach number at the cell face and $p^+$, $p^-$ are Mach number weighted contributions of the left and the right pressure values. Upwind weighting is accomplished by proper polynomials of the local Mach number [2], by which the scheme is made purely upwind for supersonic flow whereas central differencing is obtained for $M \to 0$. This scheme is called AUSM.

The alternate approach followed by Jameson [3] is to take a central average and subtract a diffusive term

(3) $F_{i+1/2} = \frac{1}{2}(F_L + F_R) - d, \quad d = \frac{1}{2}\,\alpha\,c\,(w_R - w_L) + \frac{1}{2}\,\beta\,(F_R - F_L)$

with $w = (\rho,\ \rho u,\ \rho H)^T$. Now, the diffusion coefficients $\alpha$ and $\beta$ must be chosen such that upwinding is obtained in the supersonic range and $d \to 0$ for $M \to 0$. This scheme is called HCUSP.

Equilibrium of the state (L) is obtained if the flux in between (L) and (R) is obtained by full upwinding. Also, the state (R) is in equilibrium if an upwind flux is used for the interface. These requirements can be fulfilled by defining the speed of sound at the shock

(4) [equation not legible in the source]

for upwinding the flux, where $c^*$ is the critical speed of sound. Hence, the state (R) is made supersonic and it is fully cancelled by the flux formulation (2). Generalization of equ. (4) for arbitrary flow direction and speed is given in Ref. [2]. This scheme is called AUSM+.

Another way to shock resolution with AUSM was obtained by the observation that the highly dissipative flux of Van Leer [9] differs from AUSM by a dissipative term

[equation not legible in the source; it applies for 0 < M < 1]

in subsonic flow and it is identical in the supersonic region [5]. Smoothly captured shocks may then be obtained by defining

$F_{Hybrid} = (1 - \omega)\,F_{Van\,Leer} + \omega\,F_{AUSM}$

and $\omega$ depends on the second difference of the pressure in order to detect shocks. This scheme is called hybrid AUSM.

With HCUSP, on the other hand, the shock structure is analyzed according to Refs. [3,4] for a shock with a single interior zone shown in Fig. 2. Again, the states (L) and (R) satisfy the jump conditions and (L) is super-
Both approaches use scalar dissipation functions so that sonic. Equilibrium of the shock is obtained if fluxes
the computational expense of the overall method is pro-
fL/A = f, andfA/R = f R
portional to N where N is the number of flow equations
to be computed. Note, that there is particular motivation Then the flux balance for points L, A, R is zero. The
to use the equation (3) rather than (2) if the discretiza- condition at the entrance to the shock is fulfilled if the
tion is combined with explicit multistage time stepping. flow is supersonic at (VA). The condition at the shock
Then, very effective hybrid multistage schemes are at exit leads to a Hugoniot equation for a moving shock
hand [8] for which the dissipation terms are only evalu-
ated at m out of totally n stages.
The conceptual differences between both flux vector
split schemes show up for the problem of resolving a This equation can be solved by Roe linearization [lo]
stationary shock wave. Fig. 1 sketches the situation en- and yields the relation
countered in the analysis of AUSM. It is assumed that uc = ( 1 + p ) ( c - U ) f o r O < M < l . (6)
the states (L)and (R) fulfill the jump conditions. between the dissipation coefficients.
Jameson [3] has used equ. (6) in GT?- to derive a suit-
'The extension to multidimensionson structured grids able form of dissipating coefficients a and p, i.e.
is standard and may be found in Ref. [5]
$$\alpha^* c = |M|\,c - \beta u, \qquad \beta = \max(0,\ 2M - 1) \quad \text{for } 0 \le M \le 1,$$

by which central differencing is again approached for M → 0. We note that equ. (6) is not respected for M < 0.5, so that one can expect problems with shock capturing if the Mach number at interface (A/R) is less than 0.5.

The spatial accuracy of the flux vector split schemes depends on the determination of the left and right states at the cell interfaces. For a first-order scheme the flow quantities at the left and right states are given by their values at the neighboring mesh points, i.e. i and i+1, respectively. In the present work higher order accuracy is obtained with the MUSCL approach by which the flow quantities are extrapolated to yield left and right states. The extrapolation function is designed such that the accuracy is limited to first order at discontinuities in order to guarantee shock capturing without spurious oscillations. Unfortunately, we find that the two flux vector split approaches described above should not be combined with the same extrapolation functions.

The AUSM scheme works well with the van Albada limiter function

$$u_L = u_i + \frac{1}{2}\,\frac{(\Delta_+^2 + \epsilon)\,\Delta_- + (\Delta_-^2 + \epsilon)\,\Delta_+}{\Delta_+^2 + \Delta_-^2 + 2\epsilon} \qquad (7)$$

where $\Delta_+ = u_{i+1} - u_i$, $\Delta_- = u_i - u_{i-1}$.

We extrapolate the primitive variables and the total enthalpy using equ. (7). Extrapolation of the latter quantity is needed in the energy flux in order to allow steady state solutions with constant energy. Also, the parameter ε is made large if the contravariant velocity is smaller than a certain fraction of the speed of sound. Doing this, clipping within boundary layers and false interpolation values of the contravariant velocity components are avoided. Typical results of limiter applications for high Reynolds number viscous flows are shown in Fig. 3. The pressure contours at the rear part of the RAE 2822 airfoil at transonic flow conditions show oscillations near the edge of the turbulent boundary layer if limiting of the Cartesian velocity components is applied in the traditional manner. These oscillations disappear if the limiting operator is switched off for small values of the Mach number in the contravariant coordinate direction. More technical details of the limiter can be found in Ref. [5].

Unfortunately, we find that the application of the van Albada limiter with the HCUSP scheme yields some preshock oscillations. Hence, we use the limiting function described in [4] for the HCUSP scheme. That function has also been extended to avoid limiting in low Mach number regions, by which the accuracy of the results is generally improved.

Finally, it is necessary to add controlled artificial dissipation in flow regions where the damping characteristics of the basic scheme are too poor to allow proper convergence to the steady state solution. Fig. 4 shows the situation of a cell with high aspect ratio in two dimensions. In this case transient modes in the direction of the short cell edge will be well damped by an explicit time integration method whereas modes along the long side of the cell remain almost undamped. This problem can be solved by a modification of the advection function (AUSM scheme),

$$F_c = \frac{1}{2}\left[\, M_{L/R}\,(\text{sum advection}) - \Phi\,(\text{diff advection}) \,\right]$$

where Φ is a function [5] of the spectral radii $\lambda_i$, $\lambda_j$ in the coordinate directions i and j so that

$$\Phi = |M_{L/R}| \quad \text{for } \lambda_i \gg \lambda_j, \qquad \Phi = \delta \quad \text{for } \lambda_i \ll \lambda_j.$$

Typical values of δ used in the present work are δ = 1/4. This adaptive dissipation formulation makes sure that boundary layers are not numerically smeared but there is sufficient damping of modes in the direction of long cell sides. A similar formulation has been implemented into the HCUSP scheme.

The capabilities of the present discretization schemes for perfect gas flows with shocks and shear layers are assessed by computations of transonic and hypersonic two-dimensional flows. Fig. 5 compares distributions of pressure coefficient, total pressure loss and grid convergence of the aerodynamic coefficients for transonic inviscid flow over the NACA 0012 airfoil. AUSM+ and HCUSP yield comparable shock resolution whereas the hybrid AUSM appears to be more dissipative at the shock. The HCUSP scheme generates more entropy at the leading edge, and lift and drag values converge somewhat slower with grid density as compared to AUSM. On the other hand, HCUSP is more rapid with respect to the residual convergence as compared to AUSM. Typical convergence rates of the multigrid method described below are 0.90 for HCUSP and 0.94 for AUSM.

The resolution of very strong shocks and hypersonic shear layers is shown in Fig. 6. The Mach number contours obtained for inviscid flow around a blunted wedge demonstrate almost perfect shock capturing within one cell for AUSM+ and HCUSP whereas hybrid AUSM needs one interior point for this case. The resolution of the thermal boundary layer which is displayed on the right part of Fig. 6 is similar for HCUSP and AUSM. We note that both schemes give much better results compared to a conventional central-difference scheme
with a single scalar viscosity (not shown here).

4. OPERATOR SPLITTING AND IMPLICIT TREATMENT OF THE CHEMICAL SOURCE TERMS

For flows with nonequilibrium chemistry additional conservation equations with chemical source terms occur, which render the system of equations stiff if the time scale of the chemical reactions is significantly smaller than the fluid mechanics time scale. A simplified form of the conservation equation is given by

$$\frac{\partial W}{\partial t} = -F + S, \qquad S = (0, 0, 0, 0, 0, S_1, \ldots, S_n)^T, \qquad F = \text{discr. flux}.$$

The full set of equations used for reacting flow is given in Refs. [11, 12]. In order to overcome the time step limitations due to small chemical time scales we employ implicit discretization of the source terms,

$$\frac{W^{n+1} - W^n}{\Delta t} = -F^n + S^{n+1}.$$

Using a linearization of the source term at time level (n) one obtains a point-implicit update of the solution vector W for the time level (n+1),

$$\left[\, I - \Delta t \left.\frac{\partial S}{\partial W}\right|^n \,\right] (W^{n+1} - W^n) = \Delta t\,(-F^n + S^n). \qquad (8)$$

The Jacobian matrix has no entries in the first five rows because these equations have no source terms. Hence, the update of equ. (8) can be broken up into a fully explicit update for the flow variables $(\rho, \rho u, \rho v, \rho w, \rho E)^T$ followed by a point-implicit update for the species densities $(\rho_1, \ldots, \rho_n)^T$ which involves the solution of a linear system of dimension (n-1) for each grid point.

The evaluations of the explicit source vector, $S^n$, the elements of the source Jacobian, $\partial S/\partial W|^n$, and the solution of the linear system usually take much more computer time than the remaining elements of the solution algorithm. For multistage time stepping schemes the linearization of the source vector around the old time level is appropriate [13] and hence, the derivatives $\partial S/\partial W$ can be held constant during all stages.

5. MULTIGRID METHOD FOR HIGH SPEED FLOWS

Explicit multistage time-stepping schemes are used for advancing the solution in time. Choosing the number of stages and the stage coefficients allows an optimization of the high-frequency damping properties of the scheme at relatively high Courant numbers. Hence, these schemes can be combined with multigrid algorithms in order to accelerate convergence to steady state, according to Ref. [8].

Coarse meshes for the multigrid are obtained by eliminating alternate points in each coordinate direction. Both the solution and the residuals are restricted from fine to coarse meshes. A forcing function is constructed so that the solution on a coarse mesh is driven by residuals collected on the next finer mesh. The corrections obtained on the coarse mesh are interpolated back to the fine mesh. This multigrid scheme is now widely used in the CFD community and it works quite well for a wide range of subsonic and transonic flow problems.

However, a number of modest modifications of the original multigrid scheme are necessary for high Mach number flows with strong shocks and strong variations of viscosity and conductivity coefficients. We employ a special set of Runge-Kutta coefficients which are optimized for damping with upwind discretization and residual smoothing [14]. Courant numbers of about 5 are used in the present work, which is about twice the explicit stability limit. Strong variations of viscosity and conductivity occur in hypersonic viscous flows. Typical time scales of the viscous diffusion process may be much smaller than the convection time scale, which puts a severe restriction on the time step if purely explicit time integration is sought. This problem may be circumvented by locally adjusting the coefficient of the implicit residual smoothing scheme [15], such that the original time step based upon the inviscid flux vector is recovered,

$$\Delta t = \mathrm{CFL}\,\frac{V}{|\lambda_\xi| + |\lambda_\eta| + |\lambda_\zeta|}$$

where $\lambda_\xi$ denotes the spectral radius of the inviscid flux Jacobian in the ξ-coordinate direction and V is the cell volume. At strong shocks, large Courant numbers obtained with the help of residual smoothing will result in solution divergence. Therefore an adaptive time step is employed such that the Courant number is reduced to about 1 at strong shocks [14].

The multigrid scheme involves restriction and prolongation operators which are both modified for hypersonic flows. At strong shocks the restriction of residuals from fine to coarse meshes is damped by using the second difference of the pressure as a sensor in order to reduce the coarse-mesh corrections in that region.

We have also observed that the coarse meshes can promote upstream propagation of transient modes if central interpolation is used for prolongation of the corrections. This problem is resolved by using an upwind biased interpolation of the corrections where the Mach number in the contravariant coordinate direction is used to define
the bias [16].
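The local time step and its reduction at strong shocks can be sketched as follows. This is only an illustration under stated assumptions: the normalized pressure sensor, the linear blending between the two Courant numbers and the gain `k` are hypothetical choices, since the text only states that the Courant number is about 5 in smooth flow and is reduced to about 1 at strong shocks.

```python
import numpy as np


def pressure_sensor(p):
    """Normalized second difference of the pressure along one grid line."""
    nu = np.zeros_like(p)
    nu[1:-1] = np.abs(p[2:] - 2.0 * p[1:-1] + p[:-2]) / (
        p[2:] + 2.0 * p[1:-1] + p[:-2]
    )
    return nu


def local_time_step(volume, lam_xi, lam_eta, lam_zeta, p,
                    cfl_max=5.0, cfl_shock=1.0, k=10.0):
    """dt = CFL * V / (|lam_xi| + |lam_eta| + |lam_zeta|) with adaptive CFL.

    The blending below is an assumption: 0 in smooth flow, 1 at strong shocks.
    """
    blend = np.minimum(1.0, k * pressure_sensor(p))
    cfl = cfl_max - (cfl_max - cfl_shock) * blend
    return cfl * volume / (np.abs(lam_xi) + np.abs(lam_eta) + np.abs(lam_zeta))


# Six cells along a grid line with a pressure jump between cells 2 and 3.
p = np.array([1.0, 1.0, 1.0, 10.0, 10.0, 10.0])
ones = np.ones(6)
dt = local_time_step(ones, ones, ones, 0.5 * ones, p)
```

In this example the cells next to the pressure jump run at the reduced Courant number while the cells in uniform flow keep the full value.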


6. NUMERICAL RESULTS FOR HYPERSONIC REACTING FLOWS

Comparisons between the different flux vector splitting variants have been presented for 2D calorically perfect gas flows within Chapter 3. Our experiences for reacting flows and for complex 3D flows are based upon the hybrid AUSM scheme, until now. The hybrid AUSM spatial discretization described in Chapter 3, the implicit source treatment given in Chapter 4, and the multigrid elements of Chapter 5 are implemented into the 3D DLR multiblock code CEVCATS [17, 5]. The implementation of the thermodynamic model, the chemical reactions and the viscous terms is general such that any chemistry model can be employed without modifications in the source code. Moreover, the code runs effectively on vector computers by vectorization over all grid points within a block of the computational domain. Inner loops containing the number of species or the number of reactions are unrolled by compiler directives [12]. Hence, we have obtained a computational speed of about 1500 MFLOP/s on a single processor of the NEC-SX3 computer. This corresponds to 50 μs computing time for the update of a single grid point by one multigrid cycle, assuming a reacting gas mixture of five species with 17 chemical reactions. The use of point-implicit operators and multigrid for reacting flows was also investigated with the help of a quasi 1D code for nozzle flows which contains the algorithmic elements presented in the previous chapters.

The capabilities of the multigrid method for reacting flows with large contrast are investigated by applications for one-, two- and three-dimensional flows. At first, we have chosen inviscid transonic reacting flow in a diverging nozzle in order to demonstrate the effects of point-implicit time stepping and multigrid acceleration separately. Fig. 7 displays the distributions of temperature and the concentrations of the three species present in the flow. The dissociation reaction rate coefficients for oxygen have been chosen such that strong reactions take place for temperatures above 1000 K. Hence, the flow simulation represents shock induced dissociation at reentry flow conditions. The dissociation time scale is small enough so that explicit time stepping alone does not yield a converged flow solution within several thousand time steps. With point-implicit time stepping the code converges slowly to the steady state. Convergence is noticeably accelerated by application of 4-level multigrid, by which a convergence rate per multigrid cycle of about 0.95 is realized.

The second application is the viscous reacting flow over a 2D cylinder which is displayed in Fig. 8. This case has

the vehicle. The computations were done on a mesh with 2.6 million grid points. Additionally, local grid refinement was investigated in the separation region around the deflected body flap. The numerical solutions

ACKNOWLEDGEMENT

The authors would like to thank R.C. Swanson for his cooperation on the 1D flow simulations. The communications with N. Kroll on the spatial and time discretization schemes are gratefully acknowledged.

7. REFERENCES

1. Liou, M.S. and Steffen, C.J.: A New Flux Splitting Scheme. J. Comput. Phys. 107, 23 (1993).
2. Liou, M.S.: A Continuing Search for a Near-Perfect Numerical Flux Scheme, Part I: AUSM+. NASA TM 106524, 1994.
3. Jameson, A.: Positive Schemes and Shock Modelling for Compressible Flows. Lecture at 8th Finite Elements in Fluid Conference, Barcelona, 1993. To appear in Int. J. of Num. Methods in Eng.
4. Tatsumi, S.; Martinelli, L.; Jameson, A.: A New High Resolution Scheme for Compressible Viscous Flow with Shocks. AIAA Paper 95-0466, 1995.
5. Radespiel, R. and Kroll, N.: Accurate Flux Vector Splitting for Shocks and Shear Layers. J. Comput. Phys., Vol. 121, No. 1, (1995).
6. Wada, Y. and Liou, M.S.: A Flux Splitting Scheme with High Resolution and Robustness for Discontinuities. AIAA Paper 94-0083, 1994.
7. Coquel, F. and Liou, M.S.: Hybrid Upwind Splitting (HUS) by a Field-by-Field Decomposition. NASA TM 106843, 1995.
8. Jameson, A.: Multigrid Algorithm for Compressible Flow Calculations. MAE Report 1743, Princeton University, October 1985. Text of Lecture given at 2nd European Conference on Multigrid Methods, Cologne.
9. Van Leer, B.: Flux-Vector Splitting for the Euler Equations. Lecture Notes in Physics 170, pp. 507-512, (1982).
10. Roe, P.L.: Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes. J. Comp. Phys. 43, pp. 357-372, (1981).
11. Brenner, G.: Numerische Simulation von Wechselwirkungen zwischen Stößen und Grenzschichten in chemisch reagierenden Hyperschallströmungen. DLR-FB 94-04, (1994).
12. Brück, S.; Radespiel, R.; Schwamborn, D.: Extension of the Euler/Navier-Stokes Code CEVCATS to Inviscid Nonequilibrium Flows. DLR-Internal Report 223-95 A03, (1995).
13. Kroll, N.: Private communication, (1994).
14. Radespiel, R. and Swanson, R.C.: Progress with Multigrid Schemes for Hypersonic Flow Problems. J. Comput. Phys., Vol. 116, (1995).
15. Radespiel, R. and Kroll, N.: Multigrid Schemes with Semicoarsening for Accurate Computations of Hypersonic Viscous Flows. DGLR Report 90-6, (1991).
16. Blazek, J.: Verfahren zur Beschleunigung der Lösung der Euler- und Navier-Stokes-Gleichungen bei stationären Über- und Hyperschallströmungen. DLR-FB 94-35, (1994).
17. Atkins, H.: A Multiple-Block Multigrid Method for the Solution of the Three-Dimensional Navier-Stokes Equations. DLR-FB 90-45, (1990).
18. Weilmuenster, K.J.; Gnoffo, P.A.; Greene, F.A.: Navier-Stokes Simulations of the Shuttle Orbiter Aerodynamic Characteristics with Emphasis on Pitch Trim and Body Flap. AIAA Paper 93-2814, (1993).

Fig. 1 AUSM+ concept of exact shock capturing (states L and R satisfy jump conditions; L is supersonic)

Fig. 2 CUSP shock concept of single interior point (states L and R satisfy jump conditions; L is supersonic)

Fig. 3 Effect of false interpolation by traditional limiters for AUSM (panels: Mach number independent limiter; Mach number dependent limiter; α = 2.79°, Re = 6.5E6)

Fig. 4 Cell with high aspect ratio in two dimensions ($\lambda_i$, $\lambda_j$: the spectral radii of the inviscid flux Jacobians in i and j; $\lambda_i \gg \lambda_j$)
Fig. 5 Pressure, total pressure loss and grid convergence of total force for transonic inviscid airfoil flow (pressure coefficient Cp plotted against x/c; grid convergence plotted against 1/N, N = no. cells)
Fig. 6 Mach contours and wall heat flux for hypersonic flow over blunted wedge (M∞ = 10, α = 0°; left: Mach contours, inviscid flow; right: heat flux St plotted against x, viscous flow, grids 32x24 and 64x48)
Fig. 9 Convergence histories for flow of dissociating air over 2D cylinder (residual plotted against multigrid cycles; multigrid with 3 levels compared with single grid)

Fig. 10 Structure of 3D flow over HALIS entry shape at wind tunnel conditions, M∞ = 10 (2 630 596 points; labelled features include 1 - bow shock, 4 - 2nd reattachment, 5 - 2nd separation)

Fig. 11 Grid convergence study for HALIS and comparison with wind tunnel data in ONERA S4MA
Fig. 12 Grid convergence for heat flux along windward side of HALIS forebody at flight trajectory point (M∞ = 24, H = 72 km, α = 40°, nonequilibrium chemistry; Stanton number)

Fig. 13 Pressure distributions along windward symmetry line of HALIS (Cp plotted against x/L; M∞ = 10, Re = 0.61E6, fine and medium mesh; M∞ = 24, fine mesh, nonequilibrium chemistry)

Fig. 14 Convergence histories for HALIS forebody computations (residual plotted against cycles; M∞ = 10, perfect gas, S4MA wind tunnel conditions; M∞ = 24, H = 72 km, nonequilibrium chemistry; coarse and medium meshes)
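The convergence rates per multigrid cycle quoted in the text (0.90, 0.94, 0.95) can be read off a residual history as the geometric-mean reduction factor per cycle. The following is a minimal sketch of that bookkeeping; the function names are hypothetical and the residual history is synthetic, not data from the paper.

```python
import math


def mean_convergence_rate(residuals):
    """Geometric-mean residual reduction factor per cycle."""
    if len(residuals) < 2:
        raise ValueError("need at least two residual values")
    n_cycles = len(residuals) - 1
    return (residuals[-1] / residuals[0]) ** (1.0 / n_cycles)


def cycles_for_reduction(rate, orders):
    """Cycles needed to reduce the residual by `orders` orders of magnitude."""
    return math.ceil(orders * math.log(10.0) / -math.log(rate))


# Synthetic history: the residual drops by a factor 0.95 in every cycle.
history = [0.95 ** k for k in range(101)]
rate = mean_convergence_rate(history)
cycles = cycles_for_reduction(0.95, 4)
```

At a rate of 0.95, a residual reduction of four orders of magnitude takes about 180 cycles, while a rate of 0.90 needs about 88.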
HYBRID UPWIND METHODS FOR THE SIMULATION OF FLOWS IN THERMAL AND CHEMICAL NONEQUILIBRIUM
(METHODES DE DECENTREMENT HYBRIDES POUR LA SIMULATION D'ECOULEMENTS EN DESEQUILIBRE THERMIQUE ET CHIMIQUE)

Frédéric COQUEL*†, Véronique JOLY† and Claude MARMIGNON†

* Laboratoire d'Analyse Numérique, CNRS URA 189, Université Paris VI, 75252 Paris Cedex 05
† Division de l'Aérodynamique théorique 1, ONERA, B.P. 72, 92322 Châtillon Cedex, France.

Abstract

The purpose of this paper is to present a hybrid upwind splitting method fully adapted to viscous chemical and thermal nonequilibrium flows. Such flows are the site of strong viscous-inviscid interactions and are dominated by real gas effects due to dissociation and internal mode excitation. Furthermore, the hypervelocities along the reentry trajectory induce a large degree of thermo-chemical nonequilibrium. ONERA has developed a code for simulating such flows: the code CELHYO. Detailed works concerning the physical modeling having already been presented in previous papers [4], [12], emphasis is put here on the numerical method, and particularly on the extension of hybrid upwind splitting methods to nonequilibrium flows. The hybrid upwinding is achieved by combining the basically distinct Flux Vector and Flux Difference Splitting approaches while retaining their own interesting features. The hybrid method implemented in the code CELHYO has been obtained by hybridizing the Osher approach with the van Leer scheme. In order to illustrate the numerical method, internal and external flow configurations are presented.

Résumé

We present numerical methods suited to the prediction of viscous flows in thermodynamic and chemical nonequilibrium. This paper is concerned in particular with the development of upwind schemes well suited to the approximation of the inviscid fluxes in the context of high-enthalpy viscous flow problems. Initially, the treatment of the inviscid fluxes relied on Roe's method. Because this method is delicate to implement for viscous flows, a new approach to upwinding is proposed [8]. It combines the two classical approaches, the flux splitting approach and the Godunov-type approach. We outline its main steps and describe in the present paper only one particular hybridization technique, based on the Osher method and the van Leer method. To illustrate the numerical method, computational results for external and internal flow configurations are presented.

1. Introduction

We are interested in solving the system governing gas flows in thermal and chemical nonequilibrium. Such flows occur during the atmospheric reentry of a hypersonic body or vehicle. At these speeds, the flow reaches very high temperatures near the vehicle. These temperatures are high enough to induce complex real gas effects such as the dissociation of air, vibrational relaxation and possibly ionization. ONERA has developed a code that simulates such flows numerically: the code CELHYO. Since detailed studies of the physical modeling have already been the subject of several papers [4], [12], we concentrate here on the numerical work, and more particularly on the extension of hybrid schemes to nonequilibrium flows. The motivation for using such schemes is to bring the code to the level of robustness, but also of accuracy, required for the simulation of more complex flows, for example ionized flows.

Initially, the treatment of the inviscid fluxes relied on Roe's method. Because this method is delicate to implement for viscous flows, a new approach to upwinding is proposed [8]. It combines the two classical approaches, the flux splitting approach and the Godunov-type approach. The flux splitting approach leads to simple approximations that prove to be very
robust in practice but suffer from the main drawback of ignoring the structure of the relaxed solution, in particular the linear waves (contact discontinuities) it contains. Respecting the linear waves is crucial for viscous flows, and violating this requirement makes flux splitting methods unsuitable for simulating such flows. The Godunov-type approach satisfies this requirement at the cost of a more complex approximation, but this gives it a decisive advantage over the flux splitting approach for viscous problems. However, these methods may exhibit various stability failures when capturing the nonlinear waves (shock and expansion waves).

The hybridization approach to upwinding (HUS methods, "Hybrid Upwind Splitting") combines the two approaches mentioned above so as to retain only the properties deemed appropriate for the simulation of high-enthalpy viscous flows. In the code, the upwind scheme that has been implemented is the one resulting from the hybridization of the Osher scheme with the van Leer scheme. The Osher method and a van Leer type method have also been implemented in the code.

2. Modeling and balance equations

In this study, we consider an ideal mixture of perfect gases made up of ns species, of which nm are molecular species. In the case of air, the five main species N2, O2, NO, N and O are taken into account. The translational and rotational modes and the electronic mode are always assumed to be in equilibrium and are therefore characterized by a single temperature T, whereas the vibrational modes may depart from equilibrium. We assume that, among the nm molecular species, nv of them (nv ≤ nm) have their vibrational modes out of equilibrium (N2, O2 and possibly NO for air). We are interested in the two-dimensional evolution of this mixture, which is governed by the following system of conservation laws:

$$\partial_t u + \mathrm{div}\,\big(f(u) - D(u)\,\mathrm{grad}\,u\big) = \Omega. \qquad (1)$$

Here f denotes the inviscid fluxes. The dissipative phenomena are modeled by the tensor D. The source term Ω accounts for the nonequilibrium phenomena. In what follows, $\mathcal{U}$, an open subset of $\mathbb{R}^p$ with p = ns + nv + 3, denotes the state space. The unknown $u : \mathbb{R}^+ \times \mathbb{R}^2 \to \mathcal{U}$ is given by:

(2)

where E is the total energy of the mixture per unit mass and $v = (u_1, u_2)$ is the mass-averaged velocity of the mixture. $e_{v,\beta}$ denotes the vibrational energy per unit mass of the molecular species β.

The gas pressure is given by Dalton's law:

(3)

where R denotes the universal gas constant and $M_\alpha$ is the atomic mass of species α.

This system is supplemented by a general thermodynamic closure relation such that the mixture pressure satisfies:

(4)

where $\kappa_{tr} = \gamma_{tr} - 1$. $e_\alpha$ and $h_\alpha^0$ denote respectively the energy of the internal modes in equilibrium with the translational temperature and the heat of formation of species α per unit mass.

The detailed expressions of the source terms and of the tensor of dissipative phenomena can be found in previous papers [4], [12]. We only recall that the chemistry model chosen is that of Gardiner [9]. It involves 17 reactions, comprising fifteen dissociation reactions and two exchange reactions. The equations for the vibrational energies may include Translation-Vibration (T-V), Vibration-Vibration (V-V) or Vibration-Dissociation (V-D) energy exchanges. The T-V energy exchange rate is modeled by a Landau-Teller model, the relaxation times between species being given by the semi-empirical law of Millikan and White [14].

The viscous stress tensor uses the model of Armaly and Sutton [2] for the mixture viscosity, the viscosity of each species being determined by Blottner's relation [3]. The species diffusion velocity obeys Fick's law with a binary diffusion coefficient. The mixture and vibrational heat fluxes are assumed to follow Fourier laws. The detailed expressions of the thermal conductivity coefficients are given in [4].

3. Numerical method

The solutions of the convective-dissipative system (1) are approximated by an implicit finite volume method
written for curvilinear meshes. This method is described below, first in its semi-discrete formulation in space and then in its implicit formulation in time. Only the numerical methods dealing with the approximation of the extracted Euler system are discussed here. The discretization of the second-order operator representing the dissipative phenomena relies on a centered method already presented in a previous AGARD contribution [12]. The methods associated with the Euler system and implemented in the code CELHYO make use of various upwinding techniques. They belong respectively to upwinding by approximate solution of the Riemann problem, upwinding by flux splitting and, finally, an original technique referred to as upwinding by field-by-field hybridization. This last technique results from research carried out in collaboration between ONERA and NASA [8]. We describe below three of the upwind schemes used in the code: the Osher-Solomon approximate Riemann solver, the van Leer flux splitting and, finally, the HUS scheme resulting from the hybridization of the two preceding methods. The code CELHYO also offers other schemes, in particular those of Roe, Godunov and Colella-Glaz, which are not reported here.

3.1 Two-dimensional finite volume methods

The numerical approximation of the unknown u of the system is obtained with a finite volume method whose continuous-in-time formulation reads, omitting the source terms and the dissipative phenomena, for a cell K with boundary ∂K:

$$|K|\,\frac{d u_K}{dt} + \sum_{e \subset \partial K} |e|\; f(u_K, u_{K_e}; n_{K,e}) = 0 \qquad (5)$$

where $u_K$ (respectively $u_{K_e}$) denotes the constant value of the approximate solution on cell K (respectively $K_e$). By definition, the neighboring cell $K_e$ shares the edge e; $n_{K,e}$ is the unit normal to e pointing outward from element K. The mapping $f : \mathcal{U} \times \mathcal{U} \times \mathbb{R}^2 \to \mathbb{R}^p$ denotes a two-dimensional numerical flux subject to the usual conservation and consistency conditions. By virtue of the rotational invariance of the Euler equations, the evaluation of the two-dimensional numerical fluxes is deduced from the evaluation of a numerical flux consistent with a one-dimensional Riemann problem in which the exact flux reads:

(6)

Consider a numerical flux $f(u_L, u_R; i_1)$ consistent with the exact flux $f_{i_1}$. Introducing the rotation that maps the basis vector $i_1$ onto the normal $n_{K,e}$, we define the rotation operator $T_{K,e}$, which yields a two-dimensional numerical flux by setting:

$$f(u_K, u_{K_e}; n_{K,e}) = T_{K,e}^{-1}\, f(T_{K,e}\,u_K,\ T_{K,e}\,u_{K_e};\ i_1). \qquad (7)$$

In what follows, with a slight abuse of notation, we write f instead of $f_{i_1}$ in order to lighten the notation.

3.2 Main properties of the system

Under general thermodynamic assumptions, the system is hyperbolic. The associated Jacobian matrix, denoted $\nabla f(u)$, is diagonalizable and has p eigenvalues $\lambda_k$, 1 ≤ k ≤ p, of which two are simple eigenvalues, $\lambda_1 = u_1 - c$ and $\lambda_p = u_1 + c$, and one is a multiple eigenvalue, $\lambda_k = u_1$, 2 ≤ k ≤ p - 1. Here, c denotes the speed of sound.

In the case of general thermodynamics, the Riemann invariants associated with the genuinely nonlinear 1- and p-fields are not known explicitly. They are given by the following differential equations:

$$dY_\alpha = 0, \qquad 1 \le \alpha \le ns - 1 \qquad (8)$$

$$d e_{v,\beta} = 0, \qquad 1 \le \beta \le nv \qquad (9)$$

(10)

(11)

When the thermodynamic closure relation is not a linear function of the translational temperature, γ depends on the temperature and equations (10) and (11) cannot easily be integrated. In this work, we propose to integrate relations (10) and (11) approximately by neglecting the temperature dependence of γ, which yields the following invariants:

$$(Y_\alpha)_{1 \le \alpha \le ns-1}, \qquad u_1 \pm \frac{2c}{\gamma - 1}, \qquad u_2, \qquad (e_{v,\beta})_{1 \le \beta \le nv}, \qquad \frac{p}{\rho^\gamma}. \qquad (12)$$

For the linearly degenerate k-fields, the invariants are $u_1$ and p.

3.3 Osher-Solomon method

This method is based on the approximate solution
du problkme de Riemann obtenue en assimilant chaque
! = ( (paVl)l<cz<nr, P I 2 + P I P ~ W , onde simple a une onde de ditente-compression. Elle
est ainsi difinie par la construction d’un chemin dans
PHUl, (Pgev;gVl)l<S<nv)1 (6)
l’espace des Ctats reliant UL B UR et obtenu en suiv-
avec 71 le premier vecteur de la base canonique deR2. ant dans le plan vitesse-pression les parties admissible
and non-admissible parts of the expansion curves issued from these two states. The expansion curves are traversed in the natural order $(u - c,\ u,\ u + c)$. Such a path, denoted $\Phi(u_L, u_R)$ in what follows, exists under general thermodynamic conditions as long as there is no cavitation. It is made of two genuinely nonlinear (VNL) sub-paths, denoted $\Phi_1$ and $\Phi_3$, and one linearly degenerate (LD) sub-path, $\Phi_2$, associated with the contact discontinuity. Once built, this path completely defines the numerical flux of the Osher-Solomon scheme:

$$f^{OS}(u_L, u_R) = \frac{1}{2}\big(f(u_L) + f(u_R)\big) - \frac{1}{2}\int_{\Phi(u_L,u_R)} |\nabla f(u)|\, du. \qquad (13)$$

We refer to [5] for the detailed expression of this flux. We focus here on the algorithm used to build the path $\Phi(u_L, u_R)$ for the gas mixture of interest. We refer to Abgrall et al. [1] for another procedure.

Let $u_L^*$ and $u_R^*$ denote the states separated by the contact discontinuity travelling at velocity $u^*$. These states are obtained by solving the following problem, which expresses the conservation of the Riemann invariants and the continuity of pressure and velocity across the contact discontinuity. Of relations (14)-(16), only the following is legible in the source:

$$u^* + \frac{2}{\gamma_L - 1}\, c_L^* = u_L + \frac{2}{\gamma_L - 1}\, c_L. \qquad (16)$$

It is easy to see that solving this problem reduces to finding $u^*$, the solution of the equation (illegible in the source) that expresses the continuity of pressure and velocity at the contact discontinuity. Once this problem is solved, the other quantities follow. (The expressions given at this point are illegible in the source.)

The map $u \mapsto p_R^*(u) - p_L^*(u)$ is strictly increasing and has at most one root, denoted $u^*$. To compute this root, it is useful to introduce the two particular velocities $u_L^0$ and $u_R^0$ (their definition is illegible in the source) and to note [5] that $u^*$ can be written as a convex combination of them. There is therefore a real number $z^* \in [0, 1]$ such that

$$u^* = z^* u_L^0 + (1 - z^*)\, u_R^0. \qquad (20)$$

Note that $u_L^0 - u_R^0 > 0$ except precisely when cavitation occurs. Using (20), the search for the root of equation (19) can be restated as follows: find the real number $z^* \in [0, 1]$ that solves equation (21) (illegible in the source).

When $\gamma_L \ne \gamma_R$, this equation has no explicit root, and a Newton-type iterative procedure is needed. To improve its convergence rate, we propose to replace problem (21) by the following equivalent problem, which is very well conditioned: find the real number $z^* \in [0, 1]$ that solves $g(z) = 0$, where $g$ is given by (22) (illegible in the source).

To initialize the Newton algorithm, we define two values $z_1$ and $z_2$ by (23) (illegible in the source). It can be checked that the value closest to the root $z^*$ is given by

$$z_{init} = \begin{cases} \max(z_1, z_2) & \text{if the quantity tested (illegible in the source) is} < 1, \\ \min(z_1, z_2) & \text{otherwise,} \end{cases} \qquad (24)$$

and this value is used as the initial guess.
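The root search described above can be illustrated with a minimal Python sketch. It is not the CELHYO algorithm: since equations (21)-(23) are not legible, everything specific here is an assumption. Each side is treated as a perfect gas with its own constant $\gamma$ (the frozen-$\gamma$ approximation of Section 3.2), the two particular velocities are taken as $u_L^0 = u_L + 2c_L/(\gamma_L - 1)$ and $u_R^0 = u_R - 2c_R/(\gamma_R - 1)$, the residual `g` is the logarithm of the pressure ratio $p_L^*/p_R^*$, and the starting value is whichever of two closed-form candidates `z1`, `z2` (the exact roots when both sides share one exponent) has the smaller residual. The function name `contact_state` is hypothetical.

```python
import math

def contact_state(pL, cL, uL, gL, pR, cR, uR, gR, tol=1e-10, max_iter=20):
    """Contact velocity and pressure for a two-expansion-wave path.

    Assumptions (not taken from the paper): constant gamma on each side,
    log-form residual g(z), and the initial-guess selection rule below.
    Returns (z, u_star, pL_star, pR_star).
    """
    # Velocities at which the left / right expansion would reach vacuum.
    u0L = uL + 2.0 * cL / (gL - 1.0)
    u0R = uR - 2.0 * cR / (gR - 1.0)
    du = u0L - u0R
    if du <= 0.0:
        raise ValueError("cavitation: no admissible contact state")

    aL = 2.0 * gL / (gL - 1.0)          # isentropic exponent p ~ c**a
    aR = 2.0 * gR / (gR - 1.0)
    kL = 0.5 * (gL - 1.0) * du / cL     # c_L*/c_L = kL * (1 - z)
    kR = 0.5 * (gR - 1.0) * du / cR     # c_R*/c_R = kR * z

    def g(z):
        # ln(p_L*) - ln(p_R*); strictly decreasing on (0, 1)
        return (math.log(pL) + aL * math.log(kL * (1.0 - z))
                - math.log(pR) - aR * math.log(kR * z))

    def dg(z):
        return -aL / (1.0 - z) - aR / z

    # Closed-form roots obtained by using a single exponent on both sides.
    z1 = 1.0 / (1.0 + (kR / kL) * (pR / pL) ** (1.0 / aL))
    z2 = 1.0 / (1.0 + (kR / kL) * (pR / pL) ** (1.0 / aR))
    z = min((z1, z2), key=lambda t: abs(g(t)))

    for _ in range(max_iter):
        gz = g(z)
        if abs(gz) < tol:               # relative pressure mismatch small
            break
        z_new = z - gz / dg(z)          # Newton step
        if z_new <= 0.0:                # keep the iterate inside (0, 1)
            z_new = 0.5 * z
        elif z_new >= 1.0:
            z_new = 0.5 * (z + 1.0)
        z = z_new

    u_star = z * u0L + (1.0 - z) * u0R
    pL_star = pL * (kL * (1.0 - z)) ** aL
    pR_star = pR * (kR * z) ** aR
    return z, u_star, pL_star, pR_star

# Shock-tube style data with the same gamma on both sides.
print(contact_state(1.0, math.sqrt(1.4), 0.0, 1.4,
                    0.1, math.sqrt(1.12), 0.0, 1.4))
```

With equal $\gamma$ on both sides the two candidates coincide with the root, so no Newton step is taken; the data above give a contact pressure of about 0.307. With different values of $\gamma$ the root lies between the two candidates and a few Newton steps are enough in this sketch.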


Note that in the case $\gamma_L = \gamma_R$, $z_{init} = z_1 = z_2$ coincides with the root $z^*$ of the equation considered. The algorithm (22)-(24) generally converges in at most 3 iterations for a stopping test on the relative pressure error (the tolerance value is illegible in the source).

3.4 Van Leer flux splitting method

The extension of the van Leer method to the multi-species, multi-temperature Euler equations has been the subject of several studies (see in particular [11]). The proposed extensions lead to a one-parameter family of schemes, the free parameter governing the splitting of the energy flux. A particular splitting must be chosen so that the Jacobian matrices of the split fluxes, $\pm\nabla f^{\pm}(u)$, have only real non-negative eigenvalues. These studies show, however, that such a property is difficult to guarantee for every state $u$ when the thermodynamics is nonlinear in $T$. In the CELHYO code we have favoured the member of this family that preserves the constancy of the total enthalpy across a steady shock. This scheme, briefly described below, has proved entirely satisfactory in practical applications.

Introducing the Mach number $M = u/c$, the split fluxes reduce to $f^+(u) = f(u)$, $f^-(u) = 0$ when $M \ge 1$ and, symmetrically, to $f^+(u) = 0$, $f^-(u) = f(u)$ when $M \le -1$. For $|M| \le 1$, these fluxes are defined by expressions that are illegible in the source.

Although far removed from the Osher-Solomon upwind method, the van Leer scheme can nevertheless be given an analogous path formulation. It can be checked that it can be written as

$$f^{VL}(u_L, u_R) = \frac{1}{2}\big(f(u_L) + f(u_R)\big) - \frac{1}{2}\int_{\Phi} \big(\nabla f^+(u) - \nabla f^-(u)\big)\, du,$$

and this for any path $\Phi$ connecting $u_L$ and $u_R$ in state space. This property is the basis of the hybridization of the Osher-Solomon and van Leer methods [7], which we extend below to gas mixtures in chemical and thermal nonequilibrium.

3.5 Field-by-field hybrid upwinding method

Hybrid upwinding was motivated by an analysis of the respective strengths and weaknesses of the Osher-Solomon and van Leer schemes. The van Leer method is very robust in capturing nonlinear waves (shocks and expansions), but it is very inaccurate in resolving linear waves (contact discontinuities). This lack of accuracy makes it unsuitable for the viscous flow equations considered here. Conversely, the Osher-Solomon method resolves steady contact discontinuities exactly by construction, but it lacks robustness in capturing strong nonlinear waves. The strengths and weaknesses of the two methods are therefore disjoint and complementary.

The hybridization technique takes advantage of this complementarity, with the aim of combining the robustness of the van Leer method for nonlinear waves with the accuracy of the Osher-Solomon scheme for linear waves. Each of the three sub-paths of the Osher-Solomon path is thus treated either with the van Leer method or with the Osher method, depending on whether the sub-path is nonlinear or not. In what follows, we write $VNL(\Phi) = \Phi_1 \cup \Phi_3$ and $LD(\Phi) = \Phi_2$. The numerical flux resulting from the hybridization then reads

$$f^{HUS}(u_L, u_R) = \frac{1}{2}\big(f(u_L) + f(u_R)\big) - \frac{1}{2}\left( \int_{LD(\Phi)} |\nabla f(u)|\, du + \int_{VNL(\Phi)} \big(\nabla f^+(u) - \nabla f^-(u)\big)\, du \right).$$

Note that, by construction, the hybrid flux coincides with the Osher flux when only a contact discontinuity is present and, conversely, reduces to the van Leer flux when only nonlinear waves appear in the approximate Osher-Solomon wave decomposition. With the notation of Section 3.3, the preceding relation can be written explicitly
as

$$f^{HUS}(u_L, u_R) = f^{VL}(u_L, u_R) + \begin{cases} -\big(f^-(u_R^*) - f^-(u_L^*)\big) & \text{if } u^* > 0, \\ +\big(f^+(u_R^*) - f^+(u_L^*)\big) & \text{otherwise.} \end{cases} \qquad (32)$$

We stress the real simplicity of the hybrid flux compared with the original Osher method. In particular, sonic points are not involved. Moreover, the single test entering the formulation of the hybrid flux can be handled automatically by using the symmetry properties of the van Leer flux with respect to the Mach number. We thus have, for the species mass, energy and vibrational energy components,

$$f^{HUS}_{X} = f^{VL}_{X} + \big(f^{+}_{X}(-|M_R^*|) - f^{+}_{X}(-|M_L^*|)\big), \qquad X \in \{\rho_\alpha,\ \rho E,\ \rho_\beta e_{v,\beta}\}. \qquad (33), (35), (36)$$

Relation (34), for the momentum component, and relation (37) are illegible in the source.

Note that, like the Osher-Solomon scheme, the hybrid flux does not satisfy a maximum principle on the mass fractions and on the specific vibrational energies. The correction proposed by Larrouturou [10] can be applied to it without degrading either robustness or accuracy. Finally, the hybridization technique can be given a much more general framework than the one presented here [8].

3.6 Explicit second-order method

The finite volume method is extended to second-order accuracy in space with the classical MUSCL method of van Leer, which relies at each time step on a piecewise-linear reconstruction of the approximate solution. The extension of the MUSCL method that we use for a gas mixture preserves the conservation of the elemental species and also ensures the positivity of the mass fractions and of the specific vibrational energies under certain CFL-type conditions in the case of an explicit scheme [13]. On the curvilinear meshes considered here, the method is applied one curvilinear direction at a time. The reconstruction is applied to a set of variables (the list is illegible in the source), and is switched off for the mass fractions. This strategy guarantees the conservation of the elemental species, which would otherwise generally be lost because of the nonlinearities inherent in the reconstruction procedure. The limiter used is either the function proposed by van Albada or the minmod function.

4. Implicit method

The implicit scheme is obtained by linearizing the numerical fluxes and the source terms. The implicit treatment of the inviscid flux terms is based on the Flux Vector Splitting method of van Leer, whatever explicit flux is used. The resulting robustness is, a priori, only weakly sensitive to the nature of the explicit scheme. The source term is treated in a centred manner. Cross-derivative terms are neglected when linearizing the viscous fluxes. Because a good implicit treatment of the boundary conditions determines how well convergence to the steady state is accelerated, particular attention was paid to it.

The implicit operator thus obtained is linear and is solved by an iterative method. Such a method has the advantage of being much less sensitive to the choice of time step than approximate factorization methods. It also avoids the sometimes inadequate splitting of the Jacobian matrix of the source terms. The iterative method used is based on a residual minimization strategy such as the GMRES method. Because the iterative method converges better when the system is well conditioned, an ILU factorization is used as a preconditioner.

5. Numerical results

To illustrate the ability of the method to compute nonequilibrium flows in various configurations, we computed external flows around a hyperboloid-flare configuration and internal flows in a nozzle of the ONERA F4 high-enthalpy wind tunnel.

5.1 Flow in a nozzle of the F4 wind tunnel

The ONERA F4 wind tunnel can be fitted with four different nozzles, each corresponding to a different flow regime. The conditions of the computations presented here correspond to test case number 1 of the "Fourth European High Velocity Database Workshop" (held on 24-25 November 1994 in Noordwijk). The geometry is that
of nozzle No. 2. Its total length is 3.42 m and the throat radius is 0.005 m. The reservoir conditions correspond to a reduced total enthalpy of 260 and a pressure of 430 bar. The wall temperature is 300 K. The wall is assumed fully catalytic up to 0.5 m downstream of the throat, and non-catalytic beyond.

The computational domain was divided into eight parts. In the first domain, the flow in the convergent section and in the region near the throat is computed. These results are then used to determine the solution in the following zones of the hypersonic flow. Each domain contains 89x85 points.

A laminar computation and a turbulent computation, denoted (1) and (2) respectively, were carried out. For the turbulent case, the turbulence model is the algebraic Baldwin-Lomax model and the transition point is located 0.5 m downstream of the throat. The results presented were obtained after 6000 iterations in the first domain. For the other domains, about 600 to 200 iterations were needed, depending on the domain. The Courant number can reach a value of 500 (global time step) in the last domains of the divergent section. In all cases, the maximum residuals decrease by at least 10 orders of magnitude.

Figure 1 shows the complete mesh used for the nozzle. The results of the laminar and turbulent computations are presented in figures 2 to 6. The Mach number field is shown in figure 2 for the laminar flow; it exhibits a wave that disturbs the flow near the nozzle axis. This wave originates at an inflection point of the geometry. The next figure shows the temperature distributions along the axis for the laminar and turbulent cases; taking turbulence into account has no noticeable effect on these distributions. The transverse Mach number distributions at the nozzle exit for the laminar and turbulent cases are shown in figure 4. The Mach number is quite uniform in the core of the flow.

5.2 External flow computations

Two series of computations were carried out around a hyperboloid-flare configuration. This geometry was proposed for test case number 4 of the Workshop. The total length of the model is 0.1114 m and the angle between the flare and the axis is 43.6 degrees. The first computation corresponds to the flow conditions in nozzle No. 2 of the F4 wind tunnel, with a reduced total enthalpy of 122 and a stagnation pressure of 441 bar (that is, the conditions of test case number 4):
$T_\infty = 187$ K; $T_{v,N_2} = 4078$ K; $T_{v,O_2} = 2485$ K;
$\rho_\infty = 1.557 \times 10^{-3}$ kg/m³; $u_\infty = 3934$ m/s; $T_w = 300$ K;
$c_{N_2} = 0.7254$; $c_{O_2} = 0.1354$; $c_{NO} = 0.0895$; $c_N = 10^{-n}$ (exponent illegible in the source); $c_O = 0.0497$.
The wall is assumed fully catalytic.

The second computation corresponds to flight conditions on a geometry scaled from the previous one by a factor of 1.4:
$T_\infty = 268$ K;
$\rho_\infty = 2.608 \times 10^{-3}$ kg/m³; $u_\infty = 5083$ m/s; $T_w = 1000$ K;
$p_\infty = 201.5$ Pa.
The wall is again assumed fully catalytic.

The same mesh is used for both computations, which take the difference in scale into account. It contains 401x110 points in total. Three sub-domains were used to reduce the computing time and the memory required. The domains overlap over four points. These domains (nose, intermediate region and flare region) contain 80x110, 123x110 and 206x110 points respectively. For the nose and the intermediate region, the explicit quadratic residuals decrease by 8 orders of magnitude after 2000 iterations. The Courant number reaches 10 for the nose and 70 for the second zone. In the flare region, the residuals decrease by 5 orders of magnitude after 20000 iterations with a Courant number of 10. The residuals do not reach a plateau and keep decreasing when the computations are continued. This slow convergence is due to the presence of a large recirculation zone in the flare region.

The results are presented in figures 5 to 17. Figures 5 to 9 show contours of Mach number and pressure for the two computations. In the flare region, the separation zone is well defined in both computations (figures 5 to 7). Figure 6 shows an enlargement of this zone for the flight case. Viscous effects are significant because of the small nose radius. Slight oscillations are visible on the pressure contours; they correspond to mesh jumps in the shock region. The pressure reaches a maximum of 22432 Pa for the wind tunnel case and 63073 Pa for the flight case. The flow is relatively frozen behind the shock, as shown by the temperature profiles obtained for the first computation (figure 10). The shock stand-off distance is 3.7 m in this case. The following figures show wall quantities for the two computations.
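As a quick consistency check on the free-stream conditions quoted above, the static pressure can be recomputed from the density and temperature with the ideal-gas law for a mixture, $p = \rho\, R_u \sum_i (c_i / M_i)\, T$. The short Python sketch below does this for both cases. The molar masses are standard values, atomic nitrogen is neglected in the wind tunnel composition, and the flight case assumes undissociated air with mass fractions 0.767 (N2) and 0.233 (O2), which the paper does not state; all names are illustrative.

```python
R_UNIVERSAL = 8.314462618  # universal gas constant, J/(mol K)

MOLAR_MASS = {             # kg/mol, standard values
    "N2": 28.0134e-3,
    "O2": 31.9988e-3,
    "NO": 30.0061e-3,
    "O": 15.9994e-3,
}

def mixture_gas_constant(mass_fractions):
    """Specific gas constant of the mixture, R = R_u * sum(c_i / M_i)."""
    return R_UNIVERSAL * sum(c / MOLAR_MASS[s] for s, c in mass_fractions.items())

def static_pressure(rho, temperature, mass_fractions):
    """Ideal-gas static pressure in Pa."""
    return rho * mixture_gas_constant(mass_fractions) * temperature

# Wind tunnel case: composition as quoted in the text (atomic N neglected).
TUNNEL = {"N2": 0.7254, "O2": 0.1354, "NO": 0.0895, "O": 0.0497}
# Flight case: undissociated air, an assumption made for this check only.
AIR = {"N2": 0.767, "O2": 0.233}

p_tunnel = static_pressure(1.557e-3, 187.0, TUNNEL)
p_flight = static_pressure(2.608e-3, 268.0, AIR)

print(f"wind tunnel case: p_inf = {p_tunnel:.1f} Pa")
print(f"flight case:      p_inf = {p_flight:.1f} Pa (quoted: 201.5 Pa)")
```

Under these assumptions the flight-case pressure agrees with the quoted 201.5 Pa to well within one percent, which supports the reading of the density as $2.608 \times 10^{-3}$ kg/m³.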
The Stanton number along the body is shown in figure 11 for the wind tunnel computation. The maximum value (0.284) corresponds to a heat flux of $1.43 \times 10^7$ W/m². The next four figures show, in the flare region, the Stanton numbers and the friction coefficients for the two computations. The separation zone is about $1.1 \times 10^{-n}$ m long (exponent illegible in the source) for the wind tunnel case and 2 m for the flight case. A secondary vortex can be observed in the flight case on the hinge line with the flare. Finally, we show the convergence curves in the intermediate region and in the flare region (figures 16 and 17).

6. Conclusion

We have presented the numerical methods used in the CELHYO code for computing flows in thermal and chemical nonequilibrium. The code offers the schemes of Roe, Osher and van Leer, and their hybridization. Emphasis was placed on the hybrid upwinding technique. This technique combines the two classical approaches so as to retain only the properties judged favourable for the numerical simulation of flows in which strong nonlinear and linear phenomena coexist. This is the case in particular for the viscous high-enthalpy flows presented here. The implicit operator is built on the van Leer fluxes and is inverted by a GMRES-type iterative method. The code can compute a variety of two-dimensional and axisymmetric high-enthalpy flow configurations with accurate numerical schemes and good convergence.

Acknowledgements

This work was carried out in part with the financial support of DRET and ESA/CNES.

References

[1] Abgrall R., Fezoui L., Talandier J.: Extension of Osher's Riemann Solver to Chemical and Vibrational Nonequilibrium Gas Flows, INRIA Report 1221, May 1990.
[2] Armaly B. F., Sutton K.: Viscosity of Multicomponent Partially Ionised Gas Mixtures, AIAA-80-1495, AIAA 15th Thermophysics Conference, 1980.
[3] Blottner F. G., Johnson M., Ellis M.: Chemically Reacting Viscous Flow Program for Multicomponent Gas Mixtures, Sandia Laboratories Report 87115, 1971.
[4] Coquel F., Flament C., Joly V., Marmignon C.: Viscous Nonequilibrium Flow Calculations, Computing Hypersonic Flows, Volume 3, ed. Bertin J.J., Périaux J., Ballmann J., Birkhäuser, Boston, 1993.
[5] Coquel F., Joly V.: Développement d'un Code de Calcul d'Ecoulements Hypersoniques en Déséquilibre Chimique et Vibrationnel, Rapport interne ONERA non publié, Juillet 1991.
[6] Coquel F., Joly V.: Développement d'un Code de Calcul d'Ecoulements Hypersoniques en Déséquilibre Chimique et Vibrationnel, Rapport interne ONERA non publié, Octobre 1992.
[7] Coquel F., Liou M.S.: Field by Field Hybrid Upwind Splitting Methods, AIAA-93-3302-CP, 1993.
[8] Coquel F., Liou M.S.: Hybrid Upwind Splitting (HUS) by a Field by Field Decomposition, NASA TM 106843, ICOMP-95-2, 1995.
[9] Gardiner W.C. Jr.: Combustion Chemistry, Springer Verlag, 1984.
[10] Larrouturou B.: On Upwind Approximations of Multi-dimensional Multi-species Flows, Computational Methods in Applied Sciences, Ch. Hirsch (Ed.), Elsevier, 1992.
[11] Liou M.S., Van Leer B. and Shuen J.S.: Splitting of Inviscid Fluxes for Real Gases, J. Comp. Phys., Vol. 87, pp. 1-24, March 1990.
[12] Marmignon C., Joly V. et Coquel F.: Calculs d'Ecoulements Visqueux en Déséquilibre dans des Tuyères, AGARD Conference Proceedings 514, 1992.
[13] Mehlman G.: An Approximate Riemann Solver for Fluid Systems Based on a Shock-Curve Decomposition, Third International Conference on Hyperbolic Problems, Uppsala, Sweden, 1993.
[14] Millikan R. C., White D. R.: Systematics of Vibrational Relaxation, J. Chem. Phys., Vol. 39, No. 12, 1963.
Fig. 1 - Nozzle mesh
Fig. 2 - Mach number field, laminar case (1) (contour step = 0.25)
[Figures 3 to 5: plots for the laminar (1) and turbulent (2) computations; captions not recoverable from the source]
Fig. 6 - Mach number field (flare); iso-Mach contours, range 0 to 14, step 0.2
Fig. 7 - Mach number field (flight case); iso-Mach contours, range 0 to 15.5, step 0.5, maximum value 15.45
Fig. 8 - Pressure field (wind tunnel case); second domain: range 0 to 11000 Pa, step 1000 Pa, maximum 10599 Pa; third domain: range 0 to 20000 Pa, step 1000 Pa, maximum 19651 Pa
Fig. 9 - Pressure field (flight case); second domain: range 0 to 30000 Pa, step 2000 Pa, minimum 201 Pa, maximum 29457 Pa; third domain: range 0 to 60000 Pa, step 2000 Pa, minimum 201 Pa, maximum 55600 Pa
Fig. 10 - Temperature distributions along the axis
Fig. 11 - Stanton number distribution at the wall (wind tunnel case)
Fig. 12 - Stanton number distribution, wind tunnel case, flare
Fig. 13 - Stanton number distribution, flight case, flare
Fig. 14 - Friction coefficient distribution, wind tunnel case, flare
Fig. 15 - Friction coefficient distribution, flight case, flare
Fig. 16 - Residual decay curve, log(max. residual) versus iterations, second domain
Fig. 17 - Residual decay curve, log(max. residual) versus iterations, flare domain
A PROJECTION METHODOLOGY FOR THE SIMULATION OF UNSTEADY INCOMPRESSIBLE VISCOUS FLOWS USING THE APPROXIMATE FACTORIZATION TECHNIQUE

A. Pentaris
S. Tsangaris

Laboratory of Aerodynamics, National Technical University of Athens
PO Box 64070, 15710 Zografou, Athens, Greece

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

SUMMARY

In this paper, an implicit projection methodology for the solution of the two-dimensional, time-dependent, incompressible Navier-Stokes equations is presented. The basic principle of this method is that the evaluation of the time evolution is split into intermediate steps. The computational method is based on the approximate factorization technique. The coupled approach is used to link the equations of motion and the turbulence model equations. The standard k-ε turbulence model is used. The current methodology, which has been tested extensively for steady problems, is now applied to the numerical simulation of unsteady flows. Several cases were tested, such as plane or axisymmetric channels, a backward-facing step and the flow behind a square cylinder.

1. INTRODUCTION

The numerical prediction of unsteady incompressible flow fields has always been one of the most challenging areas of fluid dynamics. The primary difficulty is in finding a satisfactory way to link changes in the velocity field to changes in the pressure field. This interaction must be accomplished in such a manner as to ensure that the divergence of the velocity vanishes at each level of physical time. The most common solution to this problem is the use of an artificial compressibility methodology or a projection methodology.

The projection method for the solution of the time-dependent Navier-Stokes equations was introduced independently by Chorin (Ref 1) and Temam (Ref 2). Subsequently, an explicit version of the method was presented by Fortin et al (Ref 3). The projection method is an interpretation of a fractional-step method as adapted to the unsteady Navier-Stokes equations (Ref 4).

The advance of the physical time level is split into two steps. Following the decomposition of Chorin (Ref 1), a tentative velocity field is first calculated from the discretized momentum equations without the pressure gradient. In the second step, the velocity components at the new time level are evaluated by correcting the tentative solution in order to satisfy the incompressibility constraint.

The solution algorithm used in the present study is the approximate factorization technique. This is an implicit algorithm which was initially developed by Beam and Warming (Ref 5) for compressible flows but has been successfully used for incompressible steady flows as well (Ref 6, 7). Regarding the mathematical model, a projection method is developed which uses a Poisson equation for the explicit pressure derivation, while the numerical algorithm involves only the momentum equations.

Concerning the turbulence model, there are plenty of options. The standard k-ε model with the wall function equations (Ref 8) was selected because it is well tested and widely used, in spite of its disadvantages. In addition, small values of y-plus are not required, so coarse grids can be used near the walls and thus large time steps are possible. It is expected that this turbulence model will sometimes perform poorly, especially in recirculation zones.

The objective of this paper is to describe a new projection methodology developed for collocated grids and to present predictions for several test cases where the unsteadiness is either forced or inherent.

2. THE GOVERNING EQUATIONS

The full form of the momentum equations is used, where all variables are in non-dimensional form. For turbulent flows, the high-Reynolds-number form of the k-ε model (Ref 8) is used.

This formulation requires the use of wall functions to bridge the viscous and boundary layers in proximity to the solid wall. This approach is strictly valid only for attached shear layers and may perform poorly in recirculation zones. In addition, this model is valid under the hypothesis of equilibrium and may not perform satisfactorily in unsteady flows.

On the other hand, experimental observations have shown that the general behaviour of the boundary layer and the structure of the turbulence are not fundamentally affected by the unsteadiness of the flow (Ref 9, 10, 11). From these observations it is well founded to suppose that the hypotheses used in calculation methods for the steady case are still valid for the unsteady case.

The reference quantities are a reference velocity $u_{ref}$, a reference length $L_{ref}$, a reference density $\rho_{ref}$ and a reference kinematic viscosity $\nu_{ref}$. The reference value for the time is defined as $t_{ref} = L_{ref}/u_{ref}$ and for the pressure it is the
uct pref=prefuZref.
The reference quantity for the turbulent kinetic energy is $u_{ref}^2$ and for the dissipation rate $u_{ref}^3/L_{ref}$.

Performing a generalised coordinates' transformation from the physical $(x, y, t)$ to the computational $(\xi, \eta, \tau)$ domain, the following non-dimensional form of the equations is obtained (Ref 12):

$$\partial_\tau Q + \partial_\xi F + \partial_\eta G + \alpha E + K = \partial_\xi V + \partial_\eta W + \alpha C + D$$

where $\alpha = 0$ for the two-dimensional equations, $\alpha = 1$ for the axisymmetric equations, and the subscripts $x, y, \xi, \eta, \tau$ denote differentiation. For convenience we express the above equation in the following form:

$$\partial_\tau Q + K = [F(u, v)] \qquad (1)$$

where

$$[F(u, v)] = \partial_\xi V + \partial_\eta W + \alpha C + D - \partial_\xi F - \partial_\eta G - \alpha E$$

In equation (1), $Q$ is the vector of the conservative variables:

$$Q = \frac{1}{J}\,[u, v, k, \varepsilon]^T$$

$F, G, E$ are the convective fluxes:

[equations not legible in the source]

$V, W, C$ are the viscous fluxes:

[equations not legible in the source]

$D$ is a vector that contains the source terms of the $k$ and $\varepsilon$ equations:

[equation not legible in the source]

and, finally, $K$

[equation not legible in the source]

is a matrix that contains the pressure derivatives of the momentum equations.

In the expressions above, $\xi, \eta$ are the curvilinear coordinates, connected to the Cartesian ones $x, y$ through the generalised coordinates' transformation:

[equation not legible in the source]

and $J$ is the Jacobian of the transformation:

[equation not legible in the source]

In addition, $U, V$ are the contravariant velocities along the $\xi, \eta$ directions respectively, given by the following relations:

[equations not legible in the source]

$Re$ is the Reynolds number and $\mathcal{G}$ is the kinetic energy production term:

$$\mathcal{G} = 2\left[(u_x)^2 + (v_y)^2\right] + (u_y + v_x)^2$$

The stresses are:

[equations not legible in the source]

where $\nu_{eff}$ is the effective viscosity.

Finally, for the turbulence model equations:

[equations not legible in the source]

where $\nu_l$ is the kinematic viscosity and $\nu_t$ is the turbulent viscosity, which is given by the relation:

$$\nu_t = Re\, C_\mu \frac{k^2}{\varepsilon}$$

The constants are:

$$C_\mu = 0.09, \quad C_1 = 1.44, \quad C_2 = 1.92, \quad \sigma_k = 1.0, \quad \sigma_\varepsilon = 1.3$$

For the above model the concept of wall functions has been employed. The central idea is that the flow in the region near the wall can be assumed to behave as a one-dimensional Couette flow. This is a reasonable assumption except for regions of high pressure gradient, separation or reattachment. Once this assumption is made, it is rather easy to arrive at exact or semi-empirical relations (Ref 8, 14), which link the shear stresses and the other variables at the wall to the values of velocity, turbulence energy, etc. at the outer edge of the Couette layer, where the first interior grid point is located.
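
The eddy-viscosity relation of the k-ε model stated in this section, ν_t = Re C_μ k²/ε, is simple enough to evaluate directly. The following is a minimal sketch in Python, not the authors' implementation: the function name, the dictionary keys and the sample values of k and ε are illustrative assumptions, while the constants are the ones quoted in the text.

```python
# Model constants quoted in the text (key names are illustrative assumptions).
K_EPSILON_CONSTANTS = {
    "C_mu": 0.09,
    "C_1": 1.44,
    "C_2": 1.92,
    "sigma_k": 1.0,
    "sigma_eps": 1.3,
}


def turbulent_viscosity(k, eps, reynolds, c_mu=K_EPSILON_CONSTANTS["C_mu"]):
    """Dimensionless eddy viscosity nu_t = Re * C_mu * k**2 / eps."""
    if eps <= 0.0:
        raise ValueError("dissipation rate must be positive")
    return reynolds * c_mu * k * k / eps


if __name__ == "__main__":
    # Hypothetical dimensionless values of k and eps, chosen only for illustration.
    print(turbulent_viscosity(k=0.01, eps=0.001, reynolds=38000.0))
```

Because ν_t grows with k² and falls with ε, an overestimated k or an underestimated ε near a wall inflates the eddy viscosity, which is the reason near-wall treatments such as wall functions are attached to this model.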

3. NUMERICAL ALGORITHM

The time marching scheme

For the solution of the system of equation
(1) the implicit, factored, finite difference scheme of Beam and Warming (Ref 5) is used. The temporal derivative in equation (1) is approximated via a generalized time differencing:

[equation not legible in the source]

which takes the form:

$$\frac{\Delta Q^n}{\Delta\tau} = \frac{\theta}{1+\xi}\,\partial_\tau Q^{n+1} + \frac{1-\theta}{1+\xi}\,\partial_\tau Q^n + \frac{\xi}{1+\xi}\,\frac{\Delta Q^{n-1}}{\Delta\tau}$$

where $\Delta$ and $\nabla$ are the forward and backward differencing operators, respectively, the superscript $n$ denotes the time instant and $O$ denotes the order of the truncation error.

After substituting (1) into (2) and performing calculations the following relation is derived:

$$\frac{Q^{n+1} - Q^n}{\Delta\tau} = \frac{\theta}{1+\xi}[F(u, v)]^{n+1} + \frac{1-\theta}{1+\xi}[F(u, v)]^n - \frac{\theta}{1+\xi}K^{n+1} - \frac{1-\theta}{1+\xi}K^n + \frac{\xi}{1+\xi}\,\frac{\Delta Q^{n-1}}{\Delta\tau}$$

Using a fractional step method similar to that described by Anderson and Kristoffersen (Ref 13) the above relation is split in two parts:

$$\frac{Q^* - Q^n}{\Delta\tau} = \frac{\theta}{1+\xi}[F(u, v)]^* + \frac{1-\theta}{1+\xi}[F(u, v)]^n + \frac{\xi}{1+\xi}\,\frac{\Delta Q^{n-1}}{\Delta\tau} + O\!\left[\left(\theta - \tfrac{1}{2} - \xi\right)\Delta\tau + \Delta\tau^2\right] \qquad (3)$$

and

[equation not legible in the source]

where $Q^*$ is an intermediate, or tentative, flow field. Using equation (1) the above relation is written in the form:

[equation not legible in the source]

and after some simple calculations and assuming $K^{n+1} = K^*$, the following is obtained:

[equation (4) not legible in the source]

Equation (4) imposes the condition $1 + \xi - \theta \neq 0$. Thus we use $\theta = 1$ and $\xi = 0.5$, which leads to the second order three point backward scheme. Equation (3) is actually the same as (2), except that it contains equation (1) without the pressure gradients, and $\Delta Q^n = Q^* - Q^n$.

A non-linear expression, eq. (3), occurs for the time increment of the conservative variables' vector $\Delta Q^n$ (Ref 12, 14). In order to derive a linear algebraic system of equations, a linearization of the viscous and inviscid fluxes must be performed. The inviscid fluxes, which are functions of $Q$, are linearized using a Taylor series expansion, for example:

$$\Delta F^n = A^n \cdot \Delta Q^n + O(\Delta\tau^2)$$

where $A^n = \partial F^n / \partial Q^n$ is the Jacobian matrix of the vector $F^n$.

The above linearization of the inviscid fluxes ensures the second order time accuracy of the scheme. In order that this accuracy is retained in the corresponding linearization of the viscous fluxes, it must be taken into account that the latter are functions of all $Q, Q_\xi, Q_\eta$, for example:

$$V^n(Q, Q_\xi, Q_\eta) = V_1^n(Q, Q_\xi) + V_2^n(Q, Q_\eta)$$

The linearization of matrix $V_1^n$ leads to the following relation:

$$\Delta V_1^n = (P^n - R_\xi^n)\,\Delta Q^n + \partial_\xi(R^n \Delta Q^n) + O(\Delta\tau^2)$$

while the matrix $V_2^n$ is treated in an explicit way:

$$\Delta V_2^n = \Delta V_2^{n-1} + O(\Delta\tau^2)$$

where $P^n$ and $R^n$ are the Jacobian matrices. A detailed description of all the linearizations is given in Ref 14.

The substitution of the linear expressions of the flux vectors into the original non-linear equation for $\Delta Q^n$ leads to a strongly coupled system of equations in both spatial directions. This coupled system is solved by the Approximate Factorization Technique (Ref 5, 14), which leads to the following two tridiagonal systems, one for each of the two directions $\xi, \eta$:

$$\left\{ I + \ldots\left[\partial_\xi(A - P + R_\xi) - \partial_{\xi\xi}R - \alpha N_1 + \theta_1 H\right] \right\} \ldots$$

$$\left\{ I + \ldots\left[\partial_\eta(B - Y + S_\eta) - \partial_{\eta\eta}S - \alpha(N_2 + N_3 - T) + \theta_2 H\right] \right\} \ldots \qquad (5)$$

[only the bracketed operators of equation (5) are legible in the source]

where $A, B, P, Y, R, S, N_1, N_2, N_3, T$ and $H$ are Jacobian matrices (Ref 14), and

R.H.S. =

$$\frac{\Delta\tau}{1+\xi}\left[\partial_\xi(-F + V)^n + \partial_\eta(-G + W)^n + \alpha(C - E)^n + D^n\right] + \ldots$$

[remaining terms not legible in the source]

where $\bar{Q} = JQ$ is the vector of conservative variables in the physical domain, $D_e$ are the artificial dissipation terms (Ref 14), and $\theta_1, \theta_2$ are weighting functions (Ref 15) used to add the Jacobian matrix $H$ in both the sweeps.

The Poisson equation

Equation (4) leads to the following relations ($\theta = 1$):

[equations (6) not legible in the source]

Assuming that the continuity equation is satisfied at the $n+1$ time instant:

[equation not legible in the source]

the first two of (6) are combined to give the Poisson equation:

[equation (7) not legible in the source]

where $\vec{u} = (u, v)$ is the velocity vector.

The procedure that is used is the following. First the time-marching scheme of (5) is solved to provide the tentative velocity components $u^*, v^*$ and the turbulent variables $k, \varepsilon$. Next the Poisson equation (7) is solved using the classic ADI method and the pressure field is obtained. Finally the velocity components at the new time level are evaluated by correcting the tentative velocity field using (6a) and (6b). It is essential, for unsteady flows, to fully converge the Poisson equation at each time step in order for mass conservation to be satisfied.

The artificial dissipation terms

The spatial derivatives in the above system of equations are approximated by three point central second order differencing expressions. So the solution of the system of equations (5) requires the inversion of two block tridiagonal systems, one in each direction. On the other hand, the use of central differences on collocated grids leads to the necessity of adding external artificial dissipation terms, so that the stability is retained and high frequency oscillations from the solution are removed. In the present work only explicit terms $D_e$ are used in (5). These terms are a blended second and fourth order non-linear model which is widely used in compressible flows (Ref 16, 17, 18, 19) and was used for the first time in incompressible flows by Pentaris et al (Ref 14), where it is proved that the existence of the second order dissipation terms does not affect the spatial accuracy of the method.

The definition of the time step

Although the solution method is implicit, the actual stability of the scheme is not independent of the time step used. In this work small time steps are used, which help the fast convergence of the Poisson equation. When a problem with oscillating flow rate is to be simulated, the Navier-Stokes equations must be integrated for as many cycles as are needed to reach a periodic steady state, if such a state exists. In the periodic steady state, of period $T$, the solutions at time instants $t$ and $t + T$ must reach a specified convergence criterion, which in the present work is $1 \times 10^{-n}$ (the exponent is not legible in the source). With the present method this criterion is reached at the second period, because 10000 time intervals are used per period. Using fewer time intervals per period, more iterations are needed for the convergence of the Poisson equation. In addition more periods are necessary to reach the above criterion and thus the total computational cost is increased.

When a problem with steady upstream conditions is solved, where the Poisson equation is rapidly converged, the time step should be as large as possible. Then the time step is defined as:

$$dt = \frac{\mathrm{CFL}}{1 + \ldots}$$

[denominator only partly legible in the source]

where $J_{max}$ is the maximum of all the Jacobians in the computational domain and CFL is the Courant number.

4. BOUNDARY CONDITIONS

The use of a collocated grid allows the suitable boundary conditions to be imposed in a convenient form. Throughout the computations, explicit boundary conditions are used. For the Poisson equation these conditions are derived by integrating equation (7) over the solution domain and applying Gauss's theorem (Ref 20):

[equation not legible in the source]

where the last part of equation (7) vanishes for unsteady flows because $\Delta\tau = \mathrm{const}$ in the entire domain. In the equation above, $\vec{n}$ is the outward unit vector normal to the boundary $A$ which encloses the solution domain.
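
Integrating a Poisson equation over the solution domain and applying Gauss's theorem, as described for the pressure boundary conditions, ties the volume integral of the source term to the boundary integral of the normal pressure gradient. The sketch below checks that identity numerically in Python for a manufactured field. It is an illustration under stated assumptions, not the paper's discretization: the field p = x² + y³, the unit-square domain, the midpoint rule and all function names are choices made here.

```python
def laplacian_p(x, y):
    """Source term f = p_xx + p_yy for the assumed field p = x**2 + y**3."""
    return 2.0 + 6.0 * y


def dp_dx(x, y):
    return 2.0 * x


def dp_dy(x, y):
    return 3.0 * y * y


def compatibility_integrals(n=50):
    """Return (volume integral of f, boundary integral of dp/dn) on the unit square."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]  # midpoint-rule nodes

    volume = sum(laplacian_p(x, y) for x in mids for y in mids) * h * h

    surface = 0.0
    for s in mids:
        surface += dp_dx(1.0, s) * h  # right side, outward normal +x
        surface -= dp_dx(0.0, s) * h  # left side, outward normal -x
        surface += dp_dy(s, 1.0) * h  # top side, outward normal +y
        surface -= dp_dy(s, 0.0) * h  # bottom side, outward normal -y
    return volume, surface


if __name__ == "__main__":
    volume, surface = compatibility_integrals()
    print(volume, surface)
```

For a Poisson problem with gradient (Neumann) boundary conditions the two integrals must agree, otherwise no solution exists; this is why boundary conditions derived from the integrated equation are consistent with the source term by construction.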
Concerning the other variables, the velocity profiles $u_{in}$ are specified in the inlet boundary, while the kinetic energy $k_{in}$ and dissipation rate $\varepsilon_{in}$ are given by the following relations:

[equations (8) not legible in the source]

where $D_{in}$ is the inlet span.

On the outlet boundary all variables are calculated by extrapolation from the interior. At the symmetry axis the first derivatives of all variables are set equal to zero, except the v-component of the velocity, which is itself set equal to zero. On the solid surface the non-slip condition is applied for the velocity components. The kinetic energy and the dissipation rate are defined at the first grid point above the solid surface with the use of the wall functions (Ref 14).

Finally, as initial conditions, the u velocity component is set equal to unity, while the v velocity component and the pressure vanish. The initial data for the turbulence model variables are given by equations (8).

5. RESULTS AND VALIDATION

Some representative results of several test cases are shown in this section. It must be mentioned that all the quantities used are dimensionless. The dimensionless numbers Reynolds, Strouhal and Womersley are defined as:

[equations not legible in the source]

respectively, where $\omega_{ref}$ is the reference cyclic frequency.

Finally it must be noted that all the results have been tested for various grids and are independent of the grid density.

One-dimensional oscillatory flow

In order to check the reliability of the present method it was initially developed for one-dimensional flows and it was tested on an oscillatory channel flow (Ref 21). In this problem the back pressure of the channel is oscillating according to:

$$p_b(t) = p_0 + p_e \sin(Str \cdot t)$$

An analytic solution to this problem can only be obtained if the pressure perturbation $p_e$ is small compared to the mean back pressure $p_0$. In this work these parameters are $p_e = 0.1$ and $p_0 = 1$. The Strouhal number, based on the time mean inflow velocity $u_0$ and the channel length $l$, $Str = \omega_{ref}\, l / u_0$, is chosen to be equal to 10.

The analytic solution for the velocity and for the pressure are given in Ref 21. These solutions show that the velocity is a function of time only. This is a direct reflection of the incompressible continuity equation in a constant area tube. The pressure fluctuation is a linear function of x that vanishes at x = l to meet the downstream boundary condition. Some comparisons between numerical results and the analytic solution are shown in Fig 1. The calculated dimensionless velocity as a function of time, and the dimensionless pressure at three longitudinal positions of the tube, are compared to the analytic solution. Both numerical results are in excellent agreement with the analytic solution, demonstrating the reliability of the present method for unsteady flows.

Two-dimensional periodic flow between parallel plates

The oscillatory flow between two parallel plates with a span of 2b is the second test case we present. The Reynolds number is based on the half distance b between the two plates and the maximum inflow velocity $u_0$. At x = 0 the imposed inflow uniform velocity is given by:

$$u(t) = 1 \cdot \sin(Str \cdot t), \qquad v(t) = 0$$

The analytic solution for the velocity and the pressure gradient for the developed part of the channel is given by Moore (Ref 22). The Strouhal number is equal to 10 and the Reynolds number is equal to 1.6.

A 15x29 grid is used for the current test case, with 4b length and 1b height. The lower boundary is a solid wall and the upper one is a symmetry axis.

One cycle of the inflow velocity oscillation is split in 10000 time intervals and the dimensionless time step obtained is:

$$dt = \frac{2\pi}{Str \cdot 10000} = 2\pi \cdot 10^{-5}$$

In Fig 2 the developed velocity profiles at different physical time instants are presented. As can be seen the numerical results coincide with the analytic solution. In Fig 3 the velocity as a function of time at three different distances from the wall, and the pressure gradient in the developed part as a function of time, are presented. The agreement between the numerical results and the analytic solution is excellent. It is clear that the unsteady motion is predicted well after one fourth of the first period, and this is one reason for the use of small time steps.

Periodic flow in axisymmetric channel

The third test case under consideration is the periodic Stokes flow in a circular tube, extensively presented and analysed by many researchers (Ref 23, 24, 25, 26). In the present paper the Reynolds number,

based on the radius a of the tube and the maximum inflow velocity $u_0$, is considered to be equal to 0.1, in order to approximate the Stokes flow. At x = 0 the imposed velocity profile is (Ref 26):

$$u(t) = u(y) \cdot \cos(Str \cdot t), \qquad v(t) = 0$$

where u(y) is equal to unity except in the near-wall region, where it approaches zero parabolically. For the present case we select the typical Womersley number of $W = a\sqrt{\omega_{ref}/\nu_{ref}} = \sqrt{30}$ and the Strouhal number becomes $Str = a\,\omega_{ref}/u_0 = 300$. The time step used is $2.094 \cdot 10^{-6}$.

A 45x40 grid is used, with 1.2a length and 1a height. The lower boundary is a solid wall and the upper one is a symmetry axis. Solutions for the above relations are given by Goldberg et al (Ref 26), in their Table I.

In Fig 4 the comparisons between the semi-analytic solution and the numerical results provided by the current method are given, for the u-velocity component, at four instants of the physical time. The agreement of the current numerical results with the semi-analytic solution is very good at all the time instants. The discrepancies that occur in the centreline velocity at $\omega t = 0$ and $\omega t = \pi$ are due to the semi-analytic solution (Ref 26).

The main reason that this test case is examined is that the results provided by the analytic solution concern the entire flowfield along the tube, in contrast to the flow between the two parallel plates, where results were available only for the developed part of the flow. In addition the Strouhal number is much larger than it was in the previous test case.

Unsteady flow behind a square cylinder

The unsteady flow behind a square cylinder is presented in this paragraph. The objective is to examine the reliability of the methodology when the unsteadiness of the flow is due to the viscosity of the flow and not to an external cause.

The Reynolds numbers examined, based on the inflow uniform velocity $u_0$ and the square side a, are 100, 250, 500 and 750. Three different grids were used with 100x56, 200x110 and 145x111 points. The 200x110 grid is shown in Fig 5. The points inside the square are blocked. The position of the cylinder and of all the boundaries are those shown in Fig 5, and are the same for all the grids. The upper and lower boundaries are considered to be symmetry axes.

Indicative experimental studies concerning this flow are those of Purtell and Klebanoff (Ref 27) and Okajima (Ref 28). Typical numerical studies are those of Davis and Moore (Ref 29), Franke et al (Ref 30) and Kelkar and Patankar (Ref 31).

In Fig 6 the Strouhal numbers $Str = f a / u_0$ predicted for all the grids and for several time steps are shown. Comparisons are made to other experimental data and numerical results. The agreement is very good. It can be seen that the results are slightly affected by the grid density or the time step used. On the other hand, the disagreement between the experimental data presented in Fig 6 shows the uncertainty and the sensitivity of the flow.

In Fig 7 the vorticity isolines are presented for Reynolds numbers 100 and 250. In Fig 8 the time history of the v-velocity behind the cylinder and the corresponding power spectrum are presented. It must be mentioned that for Reynolds numbers 100 and 250 the flow is periodic. For larger Reynolds numbers the flow becomes transitional or turbulent, and the time histories of the velocity and the pressure show a chaotic behaviour.

Unsteady turbulent flow behind a backward-facing step

In the present paper a numerical investigation of the coherent vortices in turbulence behind a backward-facing step (Ref 32) is presented. The ratio of the channel height W to the step height H is 2.5. The geometry and the inflow velocity profile U(y) are the same as in the experiments of Eaton and Johnston (Ref 33). A 250x50 grid is used, a detail of which is shown in Fig 9. The total length of the channel is 50 step heights. Both the lower and the upper boundaries are solid surfaces. The Reynolds number based upon the step height H and the maximum inflow velocity $u_0$ is 38000. The time step used is 0.0075.

In the first run the original k-ε model was used. The flow that occurred was steady. The recirculation length was 7.1H. The main reason that a steady flow was predicted is the overestimate of the turbulent viscosity, which indirectly reduces the Reynolds number. Thus a second run was performed using a modified relation for the turbulent viscosity:

[equation not legible in the source]

where

[equation only partly legible in the source]

is a function proposed by Miner et al (Ref 34) in order to reduce the turbulent viscosity near the wall. The constants are $f_0 = 0.04$, $y_0^+ = 8$ and $A^+ = 26$.

Using the above modification the flow becomes unsteady. The pressure contours and the vorticity contours are shown in Fig 10. The presence of a mixing layer behind the step is clear. The recirculation length (temporal mean) is overestimated

and is 8.1H, versus the experimental result of 7.8H and the other numerical result of Silveira Neto et al (Ref 32) of 6.8H. The eddies which impinge on the lower wall, and are transported downstream, are shed with a frequency f that corresponds to a Strouhal number $Str = fH/u_0 = 0.068$. This is in excellent agreement with the experimental data, where $Str \approx 0.07$.

In Fig 11 the time mean velocity profiles at two different positions are shown, in comparison to the experimental data of Eaton and Johnston and the numerical results of Silveira Neto et al. The agreement of the results provided with the experimental data is very good. In Fig 12 the time mean kinetic energy profiles are compared to the experimental data. The agreement is very good. In both Fig 11 and Fig 12 the results of the steady case are also shown. In Fig 13 the temporal evolution of the longitudinal velocity component at x/H = 7.59, y/H = 0.1 and the corresponding spectrum analysis are shown.

An interesting phenomenon that can be observed in Fig 10 is the separation of the boundary layer from the upper wall; it generates a second street of coherent vortices which are transported toward the outlet of the channel with a Strouhal number Str = 0.068. This phenomenon has also been observed in experiments performed by Armaly et al (Ref 35) with $Str \approx 0.07$.

6. CONCLUSIONS

An implicit projection methodology for the solution of the unsteady Navier-Stokes equations in collocated grids is presented in this paper. The computational method is based on the approximate factorization technique and the incompressibility constraint is satisfied by a Poisson equation. Extended comparisons with analytic solutions, experimental data and numerical results provided by other researchers lead to the conclusion that the present methodology is a reliable tool for solving a large range of unsteady problems.

7. REFERENCES

1. Chorin A. J., "Numerical solution of the Navier-Stokes equations", Math. Comp., vol. 22, pp. 745-762, 1968.
2. Temam R., "Sur l'approximation de la solution des equations de Navier-Stokes par la methode des pas fractionnaires (II)", Arch. Ration. Mech. and Anal., vol. 32, pp. 377-385, 1969.
3. Fortin M., Peyret R. and Temam R., "Resolution numerique des equations de Navier-Stokes pour un fluide incompressible", J. Mecan., vol. 10, pp. 357-390, 1971.
4. Temam R., "Navier-Stokes equations", North-Holland, Amsterdam, 1979.
5. Beam R. M. and Warming R. F., "An implicit factored scheme for the compressible Navier-Stokes equations", AIAA Journal, vol. 16, No. 4, pp. 393-402, 1978.
6. Michelassi V. and Benocci C., "Prediction of incompressible flow separation with the approximate factorization technique", International Journal of Numerical Methods in Fluids, vol. 7, pp. 1383-1403, 1987.
7. Mansour M. L. and Hamed A., "Implicit solution of the incompressible Navier-Stokes equations on a non-staggered grid", Journal of Computational Physics, vol. 86, pp. 147-167, 1990.
8. Launder B. E. and Spalding D. B., "The numerical computation of turbulent flows", Computer Methods in Applied Mechanics and Engineering, vol. 3, pp. 269-289, 1974.
9. Zhong Q. and Olson M. D., "Periodic solution of turbulent oscillating channel flows", International Journal of Numerical Methods in Fluids, vol. 14, pp. 443-457, 1992.
10. Cousteix J., Desopper A. and Houdeville R., "Structure and development of a turbulent boundary layer in an oscillatory external flow", Turbulent Shear Flows I, pp. 154-171. Springer-Verlag, Berlin/Heidelberg/New York, 1977.
11. Binder G. and Kueny J. L., "Measurements of the periodic velocity oscillations near the wall in unsteady turbulent channel flow", Unsteady Turbulent Shear Flow, IUTAM Symp., pp. 100-108. Springer-Verlag, Berlin/Heidelberg/New York, 1977.
12. Tsangaris S., Thomadakis M. P. and Pentaris A., "Numerical investigation of axisymmetric compressible viscous flow", Proceedings of the 1st European Computational Fluid Dynamics, Elsevier pub., Brussels, Belgium, pp. 835-842, 1992.
13. Anderson H. I. and Kristoffersen R., "Numerical simulation of unsteady viscous flow", Arch. Mech., vol. 41, 2-3, pp. 207-223, 1989.
14. Pentaris A., Nikolados K. and Tsangaris S., "Development of projection and artificial compressibility methodologies using the approximate factorization technique", International Journal of Numerical Methods in Fluids, vol. 19, pp. 1013-1038, 1994.
15. Michelassi V. and Benocci C., "Efficient solution of turbulent incompressible separated flows", Proceedings of 8th GAMM Conference on Num. Methods in Fluid Mech. (1989), Vieweg Verlag, pp. 373-390, 1990.
16. Jameson A., Schmidt W. and Turkel E., "Numerical solutions of the Euler equations by finite volume methods using Runge-Kutta time-stepping schemes", AIAA paper, No. 81-1259, 1981.
17. Pulliam T. H., "Artificial dissipation models for the Euler equations", AIAA Journal, vol. 24, No. 12, pp. 1931-1940, 1986.
18. Thomadakis M. P. and Tsangaris S., "Improved artificial dissipation schemes for the Euler equations", International Journal of Numerical Methods in Fluids, vol. 14, pp. 1391-1405, 1992.
19. Dejean F., Vassilopoulos C., Simandirakis G., Giannakoglou K. C., Papailiou K. D., "Analysis of 2-D transonic turbomachinery flows using an explicit Low-Reynolds k-ε Navier-Stokes solver", Accepted for presentation at the 39th ASME Intl. Gas Turbine and Aeroengine Congress and Exposition, The Hague, June 13-16, 1994.
20. Sotiropoulos F. and Abdallah S., "Coupled fully implicit solution procedure for the steady incompressible Navier-Stokes equations", Journal of Computational Physics, vol. 87, pp. 328-348, 1990.
21. Merkle C. L. and Athavale M., "Time-accurate unsteady incompressible flow algorithms based on artificial compressibility", AIAA paper, No. 87-1137, 1987.
22. Moore F. K., "Theory of laminar flows", Princeton Univ. Press, Princeton, 1964.
23. Atabek H. B. and Chang C. C., "Oscillatory flow near the entry of a circular tube", Z. Angew. Math. Phys., vol. 12, pp. 185-201, 1961.
24. Chang C. C. and Atabek H. B., "The inlet length for oscillatory flow and its effects on the determination of the rate of flow in arteries", Phys. Med. Biol., vol. 6, pp. 303-317, 1961.
25. Atabek H. B., Chang C. C. and Fingerson L. M., "Measurement of laminar oscillatory flow in the inlet length of a circular tube", Phys. Med. Biol., vol. 9, pp. 219-227, 1964.
26. Goldberg I. S., Carey G. F., McLay R. and Phinney L., "Periodic viscous flow: a bench-mark problem", International Journal of Numerical Methods in Fluids, vol. 11, pp. 87-97, 1990.
27. Purtell L. P. and Klebanoff P. S., "A low velocity airflow calibration and research facility", National Bureau of Standards Technical Note, No 989, 1979.
28. Okajima A., "Strouhal numbers of rectangular cylinders", Journal Fluid Mechanics, 123, 379-398, 1982.
29. Davis R. W. and Moore E. F., "A numerical study of vortex shedding from rectangles", Journal Fluid Mechanics, 116, 475-506, 1982.
30. Franke R. and Rodi W., "Calculation of vortex shedding past a square cylinder with various turbulence models", 8th Symposium for Turbulent Shear Flows, Technical University of Munich, 9-11 September, 1990.
31. Kelkar K. M. and Patankar S. V., "Numerical prediction of vortex shedding behind a square cylinder", International Journal for Numerical Methods in Fluids, 14, 327-341, 1992.
32. Silveira Neto A., Grand D., Metais O. and Lesieur M., "A numerical investigation of the coherent vortices in turbulence behind a backward-facing step", J. Fluid Mech., vol. 256, pp. 1-25, 1993.
33. Eaton J. K. and Johnston J. P., "Turbulent flow re-attachment: an experimental study of the flow and structure behind a backward-facing step", Stanford University, Rep. MD-39, 1980.
34. Miner E. W., Swean T. F., Handler R. A. and Leighton R. I., "Examination of wall damping for the k-ε turbulence model using direct simulations of turbulent channel flow", International Journal for Numerical Methods in Fluids, 12, 609-624, 1991.
35. Armaly B. F., Durst F., Pereira J. C. and Schonung B., "Experimental and theoretical investigation of backward-facing step flow", J. Fluid Mech., vol. 127, pp. 423-496, 1983.

Figure 1. Time evolution of velocity (left) and pressure (right) in the one-dimensional flow. Comparison with analytic solution. (Symbols: analytic solution; line: current method.)

Figure 2. Longitudinal velocity profiles at several time instants, in the developed region of the two-dimensional channel. (Panels at ωt = 0, π/4, π/2, 3π/4, π, 5π/4, 3π/2 and 7π/4; horizontal axis U/Uref; symbols: analytic solution; line: current method.)

Figure 3. Time evolution of the velocity at three radial positions (left) and time evolution of the pressure gradient (right) in the developed region of the channel. Comparison with analytic solution. (Horizontal axis t/tref; symbols: analytic solution; line: current method.)

Figure 4. Longitudinal velocity component along the circular tube for one cycle of the flow. (Panels at ωt = 0, π/2, π and 3π/2; horizontal axis x/a; analytic solution at y/a = 1 and y/a = 0.025 shown as symbols, current method as lines.)

Figure 5. The 200x110 grid for the flow around a square cylinder.

Figure 6. Strouhal number as function of the Reynolds number. Comparison with experimental data and numerical results. (Horizontal axis Re, 0 to 800. Legend: experimental data of Okajima (1982) and of Davis and Moore (1982); current results on the 100x56, 200x110 and 145x111 grids; numerical results of Franke et al. (1990), Davis and Moore (1982), and Kelkar and Patankar (1992).)

Figure 7. Vorticity isolines for Reynolds numbers 100 (top) and 250 (bottom). Grid 200x110.

Figure 8. Time history of the v component of the velocity behind the square cylinder (left) and the corresponding power spectrum analysis (right). (Curves for the 100x56, 200x110 and 145x111 grids.)

Figure 9. A view of the 250x50 grid used for the solution of the unsteady turbulent flow in the backward facing step.

Figure 10. Unsteady flow in a backward facing step. (a) pressure isolines, (b) vorticity isolines.

Figure 11. Time mean longitudinal velocity profiles versus experimental data and other numerical results.

Figure 12. Time mean kinetic energy profiles versus experimental data.

Figure 13. Time evolution of the longitudinal velocity near the lower wall (left) and the corresponding spectrum analysis (right).

ADAPTION BY GRID MOTION FOR
UNSTEADY EULER AEROFOIL FLOWS
C. B. Allen

Department of Aerospace Engineering,


University of Bristol,
Bristol, BS8 1TR,
United Kingdom.

Abstract

A solution-adaptive structured grid technique is described for the computation of steady and unsteady Euler flows past aerofoils. Transfinite interpolation is used to generate the grids as this is well-suited to unsteady flows, since grid speeds required in the flux terms are available directly from the algebraic mapping. A novel approach to grid adaption is described. Adaption is performed by adapting the interpolation parameters, instead of the physical grid positions, so the adapted grid positions are available algebraically. Hence, the grid speeds required for unsteady computations are also available algebraically. For unsteady flows grid adaption is performed by imposing an 'adaption velocity' on grid points, thereby applying the adaption gradually over several time steps and avoiding the interpolation of the solution from one grid to another, associated with instantaneous adaption. Steady and unsteady aerofoil flows are considered. In both cases the adaptive grid technique is shown to produce sharper shock resolution for a very small increase in CPU requirements.

1 INTRODUCTION

Increases in computer power have meant that computational methods for unsteady flows have become commonplace. However, the CPU requirements of these methods can still be large. Moving grids are often used, and so repeated grid generation is required, and a large numerical integration time may be necessary to reach a periodic solution. Grid adaptivity is therefore desirable to improve solution resolution, in regions of high flow gradients, without significantly increasing the CPU requirements. There has been much recent discussion about whether structured or unstructured grids are best. Unstructured grids appear to have the advantage of lending themselves more naturally to grid adaption or enrichment, but the computational cost can be large, due to the grid connectivity data required. It has been shown [1] that for steady computations a solution computed using an unstructured grid requires 2 to 5 times the CPU time of that on a structured grid with the same number of nodes. The situation is likely to be worse for unsteady computations, where the grid must be recomputed at least once per time step.

This paper describes a solution-adaptive grid technique for steady and unsteady Euler flows using structured grids computed by the transfinite interpolation technique. Transfinite interpolation is well-suited to unsteady computations [2] since the grid speeds are available directly from the interpolation equation. The grid generation is remarkably simple: grid positions are obtained by interpolation of boundary positions and grid speeds by interpolation of boundary speeds, the interpolation being the same in each case.

Structured grid adaption is often achieved by solving a set of partial differential equations for the complete domain. For example, Catherall [3] solves a combination of Laplace, Poisson, and equidistribution equations with source terms added to control grid stretching, spacing, and orthogonality. Pericleous et al [4] solve an equidistribution equation, based on solution gradients, along each grid line in each coordinate direction, then solve a Laplace equation for the resulting grid positions to ensure orthogonality. Although these approaches are suitable for steady flows, they are less suitable for unsteady flows. When considering moving grids the grid speeds are required in the flux evaluation, and neither approach leads to an obvious method of evaluating these speeds.

A different approach is presented here, wherein a new interpolation technique is developed and grid adaption is performed by adapting the interpolation parameters instead of the physical grid positions. In this way it is possible to determine the adapted grid positions algebraically. This represents a significant advantage when considering unsteady computations on moving grids. The resulting grid position equation can simply be differentiated with respect to time to yield the grid speeds algebraically.
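
The statement that grid positions and grid speeds come from the same interpolation can be illustrated with a minimal Python sketch. It is not the paper's implementation: the normal-derivative term of the full transfinite interpolation is omitted, the blending functions (`psi2 = xi**st`, `psi0 = 1 - psi2`) are assumed for illustration, and the heaving circular inner boundary, the function names and the parameter values are all hypothetical.

```python
import math

def blending(j, jmax, st=1.2):
    """Assumed blending functions: psi2 = xi**st, psi0 = 1 - psi2."""
    xi = j / jmax
    psi2 = xi ** st
    return 1.0 - psi2, psi2

def interpolate(inner, outer, jmax, st=1.2):
    """Blend inner and outer boundary vectors along each i line.

    `inner` and `outer` are lists of (x, y) pairs. They may hold positions
    or speeds: the interpolation is identical in both cases.
    """
    grid = []
    for (xa, ya), (xb, yb) in zip(inner, outer):
        line = []
        for j in range(jmax + 1):
            psi0, psi2 = blending(j, jmax, st)
            line.append((psi0 * xa + psi2 * xb, psi0 * ya + psi2 * yb))
        grid.append(line)
    return grid

def inner_boundary(t, imax, radius=0.5, amp=0.1, omega=2.0):
    """Hypothetical moving body: a circle heaving as amp*sin(omega*t)."""
    pos, vel = [], []
    for i in range(imax):
        th = 2.0 * math.pi * i / imax
        pos.append((radius * math.cos(th),
                    radius * math.sin(th) + amp * math.sin(omega * t)))
        vel.append((0.0, amp * omega * math.cos(omega * t)))
    return pos, vel

def outer_boundary(imax, radius=20.0):
    """Fixed outer boundary: positions on a circle, zero speeds."""
    pos = [(radius * math.cos(2.0 * math.pi * i / imax),
            radius * math.sin(2.0 * math.pi * i / imax)) for i in range(imax)]
    vel = [(0.0, 0.0)] * imax
    return pos, vel

def grid_and_speeds(t, imax=16, jmax=8):
    """Grid positions and grid speeds at time t from one interpolation routine."""
    in_pos, in_vel = inner_boundary(t, imax)
    out_pos, out_vel = outer_boundary(imax)
    positions = interpolate(in_pos, out_pos, jmax)
    speeds = interpolate(in_vel, out_vel, jmax)
    return positions, speeds
```

Because the speeds are interpolated rather than differenced between successive grids, no grid from a previous time level needs to be stored to obtain them.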

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

However, there is a further problem encountered when adapting the grid during an unsteady computation. The conventional steady technique is to adapt the grid instantaneously and interpolate the solution to the new grid. This is less suitable for unsteady flows since many adaptions are required over several periods of motion, and repeated interpolation may result in a gradual loss of accuracy. Unstructured adaptive grids have been developed for unsteady flows, see for example [5, 6], and regions of high gradients are simply enriched with extra points. However, an interpolation step is still required, and this has been shown to lead to a conservation loss, even for unstructured grids [5].

The adaption for unsteady flows is carried out here by imposing an 'adaption velocity' onto each grid point, thereby moving the grid points from one adapted grid position to the next over several time steps. This avoids the instantaneous adaption approach and so interpolation is not required. It also requires no extra grid generation.

2 UPWIND DIFFERENCE SCHEME

A finite-volume upwind scheme is used to solve the two-dimensional unsteady Euler equations in integral form, for the domain $\Omega$ with boundary $\partial\Omega$,

$$\frac{d}{dt}\iint_{\Omega} \mathbf{U}\,dx\,dy + \oint_{\partial\Omega} (\mathbf{F}\,dy - \mathbf{G}\,dx) = 0. \qquad (1)$$

The vector of conserved variables $\mathbf{U}$ and the convective fluxes $\mathbf{F}$ and $\mathbf{G}$, for moving grids, are

$$\mathbf{U} = [\rho,\ \rho u,\ \rho v,\ E]^T, \qquad (2)$$
$$\mathbf{F} = [\rho U,\ \rho u U + P,\ \rho v U,\ (E + P)U + x_t P]^T, \qquad (3)$$
$$\mathbf{G} = [\rho V,\ \rho u V,\ \rho v V + P,\ (E + P)V + y_t P]^T, \qquad (4)$$

and

$$U = u - x_t, \qquad V = v - y_t, \qquad (5)$$

where $x_t$ and $y_t$ are the inertial grid speeds in the x and y directions respectively.

The Cartesian velocity components normal and tangential to each computational cell face, and the contravariant velocity normal to the cell face, are then

$$\bar{u} = u\frac{\Delta y}{\Delta s} - v\frac{\Delta x}{\Delta s}, \qquad \bar{v} = u\frac{\Delta x}{\Delta s} + v\frac{\Delta y}{\Delta s}, \qquad (6)$$
$$\bar{U} = (u - x_t)\frac{\Delta y}{\Delta s} - (v - y_t)\frac{\Delta x}{\Delta s}. \qquad (7)$$

Here $\Delta x$ and $\Delta y$ are the cell face components and $\Delta s$ is the face length. The general flux function in the direction normal to the cell face is then $\bar{\mathbf{F}}$ (equation (8)), and the flux across the face is simply $\bar{\mathbf{F}}\Delta s$. This general flux vector is split into a forward part $\bar{\mathbf{F}}^+$ associated with positive moving waves only, i.e. all eigenvalues of $\partial\bar{\mathbf{F}}^+/\partial\mathbf{U} \ge 0$, and a backward part $\bar{\mathbf{F}}^-$ associated with negative moving waves only, i.e. all eigenvalues of $\partial\bar{\mathbf{F}}^-/\partial\mathbf{U} \le 0$. At each cell face a pair of states, $\mathbf{U}^+$ and $\mathbf{U}^-$, is defined and a single numerical flux is derived from this pair. The split flux components, equations (9) to (11), are given by Van-Leer [7] and Parpia [8]. In them $\bar{M} = \bar{U}/a$ is the Mach number normal to the cell face and $a$ is the local acoustic speed. The splitting is only valid for $|\bar{M}| \le 1$. Otherwise

$$\bar{\mathbf{F}}^+ = \bar{\mathbf{F}}, \quad \bar{\mathbf{F}}^- = 0, \quad \text{if } \bar{M} > 1, \qquad (12)$$
$$\bar{\mathbf{F}}^+ = 0, \quad \bar{\mathbf{F}}^- = \bar{\mathbf{F}}, \quad \text{if } \bar{M} < -1. \qquad (13)$$

The general flux vector is split by

$$\bar{\mathbf{F}} = \bar{\mathbf{F}}^+(\mathbf{U}^+) + \bar{\mathbf{F}}^-(\mathbf{U}^-). \qquad (14)$$

A third-order spatial interpolation is used to evaluate $\mathbf{U}^+$ and $\mathbf{U}^-$ at each cell face, along with the continuously differentiable flux limiter due to Anderson et al [9].

Once $\bar{\mathbf{F}}$ has been split into its components the resulting flux must be rotated back to our original coordinate system. This is achieved by

$$\mathbf{F}\Delta y - \mathbf{G}\Delta x = R^{-1}\left[\bar{\mathbf{F}}^+(\mathbf{U}^+) + \bar{\mathbf{F}}^-(\mathbf{U}^-)\right]\Delta s, \qquad (15)$$

where $R$ is the rotation matrix.

An explicit three-stage Runge-Kutta scheme is used to integrate the equations forward in time. Local time-stepping is used for steady flows.

3 GRID GENERATION

Unsteady flows using structured moving grids will be considered. As the grid positions and speeds must be

repeatedly calculated during an unsteady computation, we require a method of grid generation which is simple, and which gives the speeds algebraically rather than having to evaluate numerical differences between grid positions on successive time levels. It was thus decided to use the transfinite interpolation method originally described by Gordon and Hall [10]. For the vector function

$$\mathbf{f}(\eta, \xi) = [x(\eta, \xi),\ y(\eta, \xi)]^T, \qquad (16)$$

which is known only on certain lines of the region

$$\eta_1 \le \eta \le \eta_2, \qquad \xi_1 \le \xi \le \xi_2, \qquad (17)$$

transfinite interpolation gives the interpolated function $\mathbf{f}(\eta, \xi)$ throughout the region by a direct algebraic mapping. The general transfinite interpolation method results in a recursive algorithm, see Eriksson [11]. However, for a C-grid the inner and outer boundaries are lines of constant $\xi$ where $\mathbf{f}$ is known. Defining one normal derivative only, at the inner boundary, the algorithm reduces to interpolation in the $\xi$ direction only,

$$\mathbf{f}(\eta, \xi) = \psi^0(\xi)\,\mathbf{f}(\eta, \xi_1) + \psi^1(\xi)\,\frac{\partial}{\partial\xi}\mathbf{f}(\eta, \xi_1) + \psi^2(\xi)\,\mathbf{f}(\eta, \xi_2). \qquad (18)$$

Here $\psi^{0,1,2}$ are the blending functions in the $\xi$ direction. The function $\mathbf{f}$ actually represents a transformation from $(\eta, \xi)$ space to $(x, y)$ space. The grid points are indexed by i and j in the $\eta$ and $\xi$ directions respectively, and then each i and j line are defined as constant $\eta$ and $\xi$ lines respectively. The variables are normalised such that

$$0 \le \eta,\ \xi,\ \psi^{0,1,2} \le 1. \qquad (19)$$

The boundaries $\mathbf{f}(\eta, 0)$ and $\mathbf{f}(\eta, 1)$ are known at imax discrete points, i.e. $\mathbf{f}_i(0)$ and $\mathbf{f}_i(1)$. The value of $\xi$ at each constant $\xi$ line is then defined as j/jmax. The blending functions $\psi^0$ and $\psi^2$ control the spacing in the $\xi$ direction, and $\psi^1$ controls how far the normal direction affects the line direction. The most effective blending functions, equations (20) to (24), have been found to be defined in terms of $\xi_j = j/jmax$, where st is a stretching exponent. The imax x jmax grid positions then come from

$$\mathbf{f}_{i,j} = \psi^0_j\,\mathbf{f}_i(0) + \psi^1_j\,\frac{\partial}{\partial\xi}\mathbf{f}_i(0) + \psi^2_j\,\mathbf{f}_i(1). \qquad (25)$$

Figure 1(a) shows the grid near a NACA0012 aerofoil, resulting from the above interpolation, using imax = 129 (99 points on the aerofoil surface, 15 in the wake either side), jmax = 30, st = 1.2, and an outer boundary 20 chords away. Figure 1(b) shows the corresponding variation of $\eta_i$ and $\psi^2_j$. (Grid points i = 13 to 117, j = 1 to 13 are shown.)

By differentiating (25) with respect to time the grid speeds can be obtained analytically (blending functions assumed constant, and outer boundary fixed),

$$\frac{d}{dt}\mathbf{f}_{i,j} = \psi^0_j\,\frac{d}{dt}\mathbf{f}_i(0) + \psi^1_j\,\frac{d}{dt}\frac{\partial}{\partial\xi}\mathbf{f}_i(0). \qquad (26)$$

Hence, grid positions are calculated by interpolation of the boundary positions, and grid speeds by interpolation of boundary speeds, the interpolation being the same in each case.

4 GRID ADAPTION

The grid is to be adapted, according to the solution, so that grid points are clustered in regions of high gradients. Adaption is normally performed in $(x, y)$ space. However, while this gives suitable grids for steady computations, the grid positions, and hence more importantly grid speeds, would not be available algebraically for unsteady computations. Only numerical values of dx/dt and dy/dt could be evaluated between different adapted grids during an unsteady computation, and these could cause problems of grid distortion and crossover when grid points move along highly curved lines.

Adaption is achieved here by writing the interpolation function in a more general form and adapting the interpolation parameters instead of the physical coordinates, such that grid positions are available algebraically.

Since each i line is a constant $\eta$ line, we can move points along an i line by simply varying $\xi$ (or $\psi$) along that line. The line remains unchanged; only the distribution of points along it is altered. Adaption in the $\xi$ direction is thus achieved by letting the blending functions vary with $\eta$ as well as $\xi$, and so we have

$$\mathbf{f}_{i,j} = \psi^0_{i,j}\,\mathbf{f}_i(0) + \psi^1_{i,j}\,\frac{\partial}{\partial\xi}\mathbf{f}_i(0) + \psi^2_{i,j}\,\mathbf{f}_i(1). \qquad (27)$$

Figure 2(a) shows the near aerofoil grid resulting from varying st from 1.1 to 1.3 depending on the $\eta$ spacing, i.e. clustering points near the leading and trailing edges, and figure 2(b) the corresponding $\eta_i$, $\psi^2_{i,j}$ variation. To adapt in the other direction, we must now change the interpolation so that each i line is no longer constrained to be a line of constant $\eta$. Along each j line $\eta$ is now varied to give the required

distribution in this direction (previously $\eta_i$ was the same on every j line). The inner and outer boundaries, $\mathbf{f}(\eta, 0)$ and $\mathbf{f}(\eta, 1)$, are determined in terms of $\eta$, so that they are known at any point, not just the specified points $\mathbf{f}_i(0)$. The interpolation is then

$$\mathbf{f}_{i,j} = \psi^0_{i,j}\,\mathbf{f}(\eta_{i,j}, 0) + \psi^1_{i,j}\,\frac{\partial}{\partial\xi}\mathbf{f}_i(0) + \psi^2_{i,j}\,\mathbf{f}(\eta_{i,j}, 1). \qquad (28)$$

By adapting $\eta$ and $\psi^2$ instead of x and y the grid positions are still available algebraically. This means that the grid speeds are also available algebraically, which is essential for efficient unsteady adaption.

4.1 Adaption in Each Direction

Instead of computing a completely new grid due to adaption, it is desirable to change only a small region of the grid where adaption is required.

Adaption in the j direction is achieved by varying $\psi^2$ along each i line. For adaption in the i direction $\eta$ is changed along each j line to give the required distribution.

Adaption is required in regions where flow quantity gradients are high, and the local Mach number gradient is used as a sensor. At each point the Mach number gradient in each direction is evaluated (equations (29) to (32)).

The gradients at each point are normalised by the largest value over the domain. If this gradient is greater than a threshold value then adaption is deemed to be required at that point. There will usually be regions of points where adaption is required, i.e. 2 to 3 points around a shock and 5 to 10 points around a stagnation point, and so in each region the point with the largest Mach number gradient is identified. At each adaption point, the spacing of grid points is controlled by defining two spacing factors, fr1 for stagnation points, and fr2 for other adaption points. Since parallel lines in $(\eta, \psi^2)$ space may not be parallel in $(x, y)$ space, the spacing at adaption points must be scaled (equations (33) and (34)), where $2.0 \le fr1, fr2 \le 5.0$, and $\Delta\eta_0$, $\Delta\psi_0$, $\Delta s_{\eta 0}$, and $\Delta s_{\psi 0}$ are the initial spacings.

Consider, for example, the variation of $\eta$ along the aerofoil surface, $\psi^2 = 0$. An intermediate variable, $\zeta$, is defined so that $\eta = \eta(\zeta)$ where clearly $0 \le \zeta \le 1$. A uniform distribution of $\zeta$ is used, and then $\eta(\zeta)$ is defined to give the required distribution of points. Figure 3(a) shows the initial distribution of $\eta$ along the aerofoil for 99 points on the aerofoil. This is the unadapted distribution of points on the aerofoil.

For a solution where adaption is required in the $\eta$ direction, for example if a normal shock is present, $\Delta\eta$ is defined at that point using equation (34), and a cosine variation in $\eta$ is then used to get back to the unadapted distribution of $\eta$ in as few points as possible. Figure 3(b) shows the variation of $\eta$ along the aerofoil surface for the flow considered in the next section, when normal shocks are present at approximately 0.64 chord on the upper surface and 0.32 chord on the lower. This simple sampling and adaption procedure is performed for each line in each direction.

5 STEADY FLOW RESULTS

The steady flow over a NACA0012 aerofoil at 1.25° incidence, in a flow of freestream Mach number 0.8, is considered. The initial grid is similar to that shown in Figure 2, i.e. a 129 x 30 C-grid, with st varying between 1.1 and 1.3 (clustering near the leading and trailing edges).

Figure 4(a) shows the pressure coefficient over the aerofoil computed on the non-adaptive grid; the dashed line is the reference AGARD solution [12].

The grid was then adapted by applying the Mach number gradient check along each line (in $x, y$ space) and simply clustering points (in $\eta, \psi^2$ space) where this is greater than the threshold level. Figure 5 shows the resulting near-aerofoil variation of $\eta$ and $\psi^2$, and the corresponding grid. The variation in the $\psi^2$ direction is unchanged, since where the Mach number gradient is above the threshold level the grid spacing is at the minimum value already. Clearly the grid needs to be smoothed. This is often done by solving a Laplace equation for the grid point coordinates (see for example [4]). However, that is not required here. As the grid was initially smooth, a smoothing can be applied to the whole grid in each direction and the unadapted regions of the grid will be unaffected. A simple three-point smoothing is applied (equations (35) and (36)).
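
The sensor and smoothing steps can be sketched in Python for a single grid line. This is only an illustration under stated assumptions: the central-difference gradient, the (1/4, 1/2, 1/4) smoothing weights and the names `normalised_gradient`, `adaption_points` and `smooth` are hypothetical stand-ins, since the paper's own expressions are not reproduced here.

```python
def normalised_gradient(mach, s):
    """|dM/ds| along one grid line, normalised by the largest value on the line."""
    n = len(mach)
    grad = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        grad.append(abs((mach[hi] - mach[lo]) / (s[hi] - s[lo])))
    gmax = max(grad)
    return [g / gmax for g in grad] if gmax > 0.0 else grad

def adaption_points(sensor, threshold):
    """Index of the largest sensor value in each contiguous flagged region."""
    points, region = [], []
    for i, g in enumerate(sensor):
        if g > threshold:
            region.append(i)
        elif region:
            points.append(max(region, key=lambda k: sensor[k]))
            region = []
    if region:
        points.append(max(region, key=lambda k: sensor[k]))
    return points

def smooth(values, passes=1):
    """Three-point smoothing with fixed end points (assumed weights 1/4, 1/2, 1/4)."""
    v = list(values)
    for _ in range(passes):
        interior = [0.25 * v[i - 1] + 0.5 * v[i] + 0.25 * v[i + 1]
                    for i in range(1, len(v) - 1)]
        v = [v[0]] + interior + [v[-1]]
    return v

# Hypothetical Mach number distribution with a shock smeared over a few points.
mach = [1.3, 1.3, 1.3, 1.3, 1.25, 1.0, 0.82, 0.8, 0.8, 0.8, 0.8]
arc = [float(i) for i in range(len(mach))]
shock_points = adaption_points(normalised_gradient(mach, arc), threshold=0.3)
```

With these weights a locally uniform distribution is returned unchanged, which mirrors the remark that smoothing the whole grid leaves the unadapted regions unaffected.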

Figure 6 shows the smoothed variation of $\eta$ and $\psi^2$, and the corresponding grid. Figure 4(b) shows the surface pressure coefficient computed on the adapted grid. The improved shock capturing is clear.

In many steady flow adaption procedures the grid is allowed to adapt gradually by effectively progressing with the solution, until a time-asymptotic grid and solution are reached. The grid adaption here is only applied once, as this will be the case when an unsteady solution is periodically sampled as in section 7. The adaption only has one 'chance' to compute a suitable grid at each adaption point.

6 UNSTEADY EULER METHOD

The explicit time-stepping scheme used for steady flows can be made time-accurate by using a global time-step, and applied to unsteady motion on a moving mesh by incorporating the cell area changes at each stage in the time-stepping scheme [13]. However, for a typical unsteady computation, with the grid size above, as many as 15000 time-steps, and two CPU hours, per period may be required. It is more efficient to solve the unsteady problem as a series of pseudo-steady problems. The implicit form of the differential equation for each computational cell is

$$\frac{d}{dt}\left[A^{n+1}\mathbf{U}^{n+1}\right] + R(\mathbf{U}^{n+1}) = 0, \qquad (37)$$

where $A$ is the cell area and $R$ is the upwinded flux integral. The implicit temporal derivative is then approximated by a second-order backward difference, following Jameson [14], giving

$$\frac{3}{2\Delta t}\left[A^{n+1}\mathbf{U}^{n+1}\right] - \frac{2}{\Delta t}\left[A^{n}\mathbf{U}^{n}\right] + \frac{1}{2\Delta t}\left[A^{n-1}\mathbf{U}^{n-1}\right] + R(\mathbf{U}^{n+1}) = 0. \qquad (38)$$

A new residual $R^*(\mathbf{U})$ is defined as

$$R^*(\mathbf{U}) = \frac{3}{2\Delta t}\left[A^{n+1}\mathbf{U}\right] - \frac{2}{\Delta t}\left[A^{n}\mathbf{U}^{n}\right] + \frac{1}{2\Delta t}\left[A^{n-1}\mathbf{U}^{n-1}\right] + R(\mathbf{U}), \qquad (39)$$

and then a new differential equation can be written in terms of a fictitious time $\tau$,

$$A^{n+1}\frac{d\mathbf{U}}{d\tau} + R^*(\mathbf{U}) = 0. \qquad (40)$$

This is simply time-marched to convergence in the fictitious time $\tau$, for each real time-step. There is now no stability limit on the size of the real time step, $\Delta t$, that can be taken, and this leads to a large reduction in CPU times. The time step is now limited by accuracy rather than stability. For each real time step equations (40) are solved to convergence using an implicit form of the three-stage time-stepping scheme with local time-stepping that is used for steady computations. This approach also means that the grid generation routine only needs to be called once every real time-step, to calculate the grid positions and speeds at the next time level.

6.1 Consideration of Cell Area Changes

If the cell areas at each time level or stage are simply calculated using the instantaneous physical coordinates of the cell faces, a numerical error is introduced which will increase with time. The cell areas must therefore satisfy a geometric conservation law of the same integral form as the mass conservation law [15] (equation (41)), and this must be solved using the same numerical scheme as for the flow quantities. The cell areas at the next real time level are thus calculated from equation (42), where k = 1, 2, 3, 4 represents the four cell faces.

7 UNSTEADY GRID ADAPTION

The normal steady flow adaption procedure is to compute the solution, sample it, and change the grid instantaneously. However, whether using structured grids, where a fixed number of grid points are redistributed to be clustered in regions of high gradient, or unstructured grids, where extra points are simply added in regions of high gradient, adaption results in grid points where the solution is not known. This then requires the interpolation of the solution from the old grid to the new. The repeated adaption and interpolation required over several periods in an unsteady computation can result in a gradual degeneration of the solution [5].

Also, the implicit scheme implemented here uses values of conserved variables and cell areas from previous time levels, which do not exist once the grid has been adapted, so instantaneous adaption cannot be applied to the unsteady solver used here.

To avoid this the grid adaption is spread over several (real) time steps and the motion of each point is described in terms of an 'adaption velocity'. The periodic nature of the unsteady solution is exploited by sampling the solution over one period and adapting the grid accordingly over the next period. Therefore over one unsteady period the solution is sampled every nsamp real time steps, and the resulting adapted $(\eta, \psi^2)$ distribution stored. When calculating the solution on the next period, over each set of nsamp

time steps the velocity of each point required for that point to reach its position at the next adapted grid is imposed on each point, and the grid moves gradually between each adapted state.

If k is the adaption index (k = 0, .., nadapt, where nadapt = nt/nsamp and nt is the number of real time steps per period), then $\eta^{(k)}_{i,j}$, $\psi^{2(k)}_{i,j}$ is the grid point distribution at adaption k. To move the grid points from one distribution to the next over nsamp time steps we calculate the speed of each point in $(\eta, \psi^2)$ space,

$$\dot{\eta}_{i,j} = \frac{\eta^{(k+1)}_{i,j} - \eta^{(k)}_{i,j}}{nsamp\,\Delta t}, \qquad (43)$$
$$\dot{\psi}^2_{i,j} = \frac{\psi^{2(k+1)}_{i,j} - \psi^{2(k)}_{i,j}}{nsamp\,\Delta t}. \qquad (44)$$

The grid speeds are obtained by differentiating equation (28) with respect to time,

$$\frac{d}{dt}\mathbf{f}_{i,j} = \frac{d\psi^0_{i,j}}{dt}\mathbf{f}(\eta_{i,j}, 0) + \frac{d\psi^1_{i,j}}{dt}\frac{\partial}{\partial\xi}\mathbf{f}_i(0) + \frac{d\psi^2_{i,j}}{dt}\mathbf{f}(\eta_{i,j}, 1) + \psi^0_{i,j}\frac{d}{dt}\mathbf{f}(\eta_{i,j}, 0) + \psi^1_{i,j}\frac{d}{dt}\frac{\partial}{\partial\xi}\mathbf{f}_i(0) + \psi^2_{i,j}\frac{d}{dt}\mathbf{f}(\eta_{i,j}, 1). \qquad (45)$$

Then, superimposing the adaption speeds onto the unsteady motion speeds, and applying the chain rule in $\eta$ where required, we obtain equation (46) (the outer boundary is fixed in time so $\frac{d}{dt}\mathbf{f}(\eta, 1) = 0$ due to motion), where the superscript M represents speeds due to the aerofoil motion. The implicit code is run with the unadapted grid for two periods, the adaptive grid data being stored during the nadapt samples of the second period, and then two periods of adaptive grid computations are performed.

8 UNSTEADY RESULTS

The scheme was applied to the Mach 0.755 flow about a NACA0012 aerofoil pitching about quarter chord. The aerofoil motion is defined by

$$\alpha = 0.016^\circ + 2.51^\circ \sin(\omega t). \qquad (47)$$

The reduced frequency parameter, $k = \omega c / (2U_\infty)$, was 0.0814, where $c$ is the aerofoil chord and $U_\infty$ is the undisturbed flow speed. The scheme was run at a CFL number, based on $\tau$, of 1.4, and local time stepping was used to accelerate convergence within each real time step. There were 180 real time steps per period and the same grid data was used as previously, 129 x 30 points, with 99 points on the aerofoil. In the adaptive computation nsamp was 10 and so nadapt was 18.

Figure 7 shows normal force and moment (about quarter chord) coefficient loops obtained by the implicit method, adaptive and non-adaptive, and from experiment [16]. The coefficient loops are quite similar, but the adaptive $C_n$ loop is slightly narrower, and the $C_m$ loop has larger 'steps', than the standard solution. The instantaneous pressure distributions are shown in figure 8. The improved shock capturing with the adaptive grid is clear. Figure 9 shows the near aerofoil adaptive grid at each of the incidences considered in figure 8.

The non-adaptive scheme required 19 CPU minutes per period on a Stardent 3000 machine, and the adaptive grid solution approximately 22 CPU minutes. An explicit time-stepping scheme required approximately 15000 time-steps and two CPU hours per period [13]. Thus the implicit method requires only one-fifth of the CPU time of an explicit scheme, even with an adaptive grid.

9 CONCLUSIONS

Steady and unsteady solutions have been computed using non-adaptive and adaptive grids generated by a new transfinite interpolation technique. Grid adaption is performed by adapting the interpolation parameters, instead of the physical grid positions, so that the adapted grid positions are still available algebraically. This interpolation has been shown to be ideal for generating structured moving grids, since it is very simple, thus requires little CPU time, and since the grid speeds, even for adapted grids, are available directly from the interpolation equations. The simplicity of the interpolation results in great flexibility, and we can adapt the grid during an unsteady computation by imposing an 'adaption velocity' onto each grid point, thus performing adaption gradually. This avoids the interpolation of the solution from the old grid to the new associated with instantaneous adaption.

An upwind Euler scheme is used to compute the solutions. This is implemented using a dual-time implicit method for unsteady flows which is very efficient, requiring only one-fifth of the CPU time of the explicit scheme. For steady and unsteady aerofoil computations, the adaptive grid method produces sharper solutions for very little increase in CPU requirements.
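
The dual-time idea, a second-order backward difference in real time with each real step converged in a fictitious time, can be sketched on a scalar model problem. The sketch below is an assumption-laden illustration, not the paper's solver: the cell area is taken as constant (A = 1), the flux integral is replaced by a linear decay term, and forward Euler is used in pseudo-time instead of the three-stage scheme. All names are hypothetical.

```python
import math

def flux_residual(u, lam=1.0):
    """Stand-in for the upwinded flux integral R(U): linear decay, R = lam*U."""
    return lam * u

def dual_time_step(u_n, u_nm1, dt, lam=1.0, tol=1e-13, max_iter=500):
    """Advance one real time step by driving the modified residual R*(U) to zero."""
    u = u_n                          # initial guess for the new time level
    dtau = 0.5 / (1.5 / dt + lam)    # pseudo-time step, chosen for stability
    for _ in range(max_iter):
        # Second-order backward difference plus the flux term (A = 1 assumed).
        r_star = (1.5 * u - 2.0 * u_n + 0.5 * u_nm1) / dt + flux_residual(u, lam)
        if abs(r_star) < tol:
            break
        u -= dtau * r_star           # forward Euler march in pseudo-time
    return u

def integrate(t_end, dt, lam=1.0):
    """Integrate dU/dt + lam*U = 0 from U(0) = 1 using dual-time stepping."""
    steps = round(t_end / dt)
    u_nm1, u_n = math.exp(lam * dt), 1.0   # exact values at t = -dt and t = 0
    for _ in range(steps):
        u_nm1, u_n = u_n, dual_time_step(u_n, u_nm1, dt, lam)
    return u_n
```

In this sketch the pseudo-time step is constrained by stability of the inner iteration, while the real time step `dt` is chosen purely for accuracy, which is the property the method relies on.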
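
The 'adaption velocity' itself amounts to a constant speed in parameter space between two stored distributions. A minimal Python sketch follows, assuming a single grid line of η values; the function names and the sample distributions are hypothetical, while nt = 180 and nsamp = 10 are the values quoted for the unsteady test case.

```python
def adaption_speeds(dist_k, dist_k1, nsamp, dt):
    """Constant speed taking each point from distribution k to k+1 in nsamp steps."""
    return [(b - a) / (nsamp * dt) for a, b in zip(dist_k, dist_k1)]

def move_grid(dist_k, dist_k1, nsamp, dt):
    """Yield the parameter distribution after each of the nsamp real time steps."""
    speeds = adaption_speeds(dist_k, dist_k1, nsamp, dt)
    current = list(dist_k)
    for _ in range(nsamp):
        current = [c + s * dt for c, s in zip(current, speeds)]
        yield current

nt, nsamp = 180, 10            # real time steps per period, sampling interval
nadapt = nt // nsamp           # number of stored adapted distributions
eta_k = [0.0, 0.25, 0.5, 0.75, 1.0]    # hypothetical distribution at sample k
eta_k1 = [0.0, 0.4, 0.5, 0.6, 1.0]     # hypothetical distribution at sample k+1
history = list(move_grid(eta_k, eta_k1, nsamp, dt=0.01))
```

Each intermediate distribution is a blend of two ordered distributions, so the points keep their ordering along the line while they move.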

Currently, only a fairly crude grid redistribution technique is employed. Future work will include developing a more sophisticated method, along with extending the adaptive technique into three dimensions. The method should be equally simple, the only difficulty arising from the third dimension being that the boundary definition will involve determining spline equations for surfaces rather than lines.

References

[1] Salas, M. D. (ed.), “Accuracy of Unstructured Grid Techniques Workshop”, (NASA Langley Research Centre, Hampton, VA), Jan. 1990, (NASA Proceedings to be published).

[2] Williams, A. L. and Fiddes, S. P., “Solution of the 2-D Unsteady Euler Equations on a Structured Moving Grid”, Bristol University Aero. Eng. Dept. Report 453, 1992.

[3] Catherall, D., “Adaptivity Through Mesh Movement”, in Proceedings of European Forum, Recent Developments and Applications in Aeronautical CFD, Bristol, 1993.

[4] Patel, M. K., Pericleous, K. A. and Baldwin, S., “The Development of a Structured Mesh Grid Adaption Technique For Resolving Shock Discontinuities in Upwind Navier-Stokes Codes”, Int J Numer Meth Fluids, Vol. 20, 1995.

[5] Morgan, K., “Unstructured Mesh Methods”, in Proceedings 16, Rutherford Appleton Laboratory EASE Community Club in CFD New Opportunities and Directions in Aeronautical CFD, April 1994.

[6] Webster, B. E., Shephard, M. S., Rusak, Z., and Flaherty, J. E., “Unsteady Compressible Airfoil Aerodynamics Using an Adaptive Time-Discontinuous GLS Finite Element Method”, AIAA Paper 93-0339.

[7] Van-Leer, B., “Flux-Vector Splitting for the Euler Equations”, Lecture Notes in Physics, Vol. 170, 1982, pp. 507-512.

[8] Parpia, I. H., “Van-Leer Flux-Vector Splitting in Moving Coordinates”, AIAA J, Vol. 26, January 1988, pp. 113-115.

[9] Anderson, W. K., Thomas, J. L. and Van-Leer, B., “Comparison of Finite Volume Flux Vector Splittings for the Euler Equations”, AIAA J, Vol. 24, September 1986, pp. 1453-1460.

[10] Gordon, W. J. and Hall, C. A., “Construction of Curvilinear Coordinate Systems and Applications of Mesh Generation”, Int J Numer Meth Eng, 1973, Vol. 7, pp. 461-477.

[11] Eriksson, L. E., “Generation of Boundary-Conforming Grids Around Wing-Body Configurations Using Transfinite Interpolation”, AIAA J, Vol. 20, No. 10, 1982, pp. 1313-1320.

[12] (Anonymous) “Test Cases for Inviscid Flow Field Methods”, AGARD Agardograph AR-211, 1985.

[13] Allen, C. B., “Central-Difference and Upwind-Biased Schemes for Steady and Unsteady Euler Aerofoil Computations”, Aero. J, Vol. 99, 1995.

[14] Jameson, A., “Time Dependent Calculations Using Multigrid, with Applications to Unsteady Flows Past Airfoils and Wings”, AIAA Paper 91-1596.

[15] Thomas, P. D. and Lombard, C. K., “Geometric Conservation Law and its Application to Flow Computations on Moving Grids”, AIAA J, Vol. 17, October 1979, pp. 1030-1037.

[16] (Anonymous) “Compendium of Unsteady Aerodynamic Measurements”, AGARD-R-702, 1982.

Fig.1. Near Aerofoil Grid (a) (x, y) and (b) (q, dp2).

Fig.2. Near Aerofoil Grid (a) (x, y) and (b) (q, $').

Fig.3. (a) Unadapted, and (b) Adapted, Variation of q. Axis label: ZETA.

Fig.4. Surface Cp, (a) Non-Adaptive and (b) Adaptive, NACA0012, M = 0.8, α = 1.25°.

Fig.5. Near Aerofoil Adapted Grid (a) (q, $') and (b) (x, y).

Fig.6. Near Aerofoil Smoothed Adapted Grid (a) (q, Q2) and (b) (x, y).

Fig.7. Unsteady Normal Force and Moment Coefficient.

Fig.8. Instantaneous Pressure Distributions. Legend: Standard Grid, Adaptive Grid, Expt. (Upper), Expt. (Lower).

Adaptive Computation of Unsteady Flow Fields with the DLR-τ-Code

O. FRIEDRICH, D. HEMPEL, A. MEISTER, TH. SONAR
Deutsche Forschungsanstalt für Luft- und Raumfahrt e.V.
Institut für Strömungsmechanik
Bunsenstraße 10, D-37073 Göttingen
Fax: +49 551 709 2446

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms” held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

1 Abstract

The features and abilities of the DLR-τ-code, a finite volume approximation of box type for the Navier-Stokes equations governing viscous, compressible fluid flow, are described in detail. The code is able to compute flow in moving reference frames and is built upon dynamically adaptive concepts to allow for grid refinement in the framework of non-stationary aerodynamics. Implicit as well as explicit time-stepping schemes can be used depending on the kind of application.

2 Introduction

The DLR-τ-Code is a finite volume approximation of the Navier-Stokes equations governing compressible, viscous flow. The method uses a box-type discretisation and works on general conforming triangulations. The discretisation of the convective fluxes is accomplished by means of an approximate Riemann solver while the diffusive fluxes are discretised in a central manner.

To achieve high resolution, recovery techniques of ENO-type are applied. New recovery techniques are presented which are based on radial basis functions. Although their use is at present restricted to small problems, it can be shown that they obey a certain optimality condition.

Explicit time stepping through TVD-Runge-Kutta methods is used in a parallelized version of the code. This parallelized version includes an intelligent load balancer for performance-controlled domain decomposition and can handle arbitrary message passing libraries like PVM or P4.

To effectively deal with unsteady flow problems such as pitching airfoils and moving bodies in general, the implementation of implicit time stepping schemes is also considered. The development of an implicit method on unstructured grids leads to a linear system of equations with a large, sparse and badly conditioned matrix. In this case, the fundamental mathematical task is the description of a fast solver for such linear systems of equations. Extensive investigations with several possible algorithms indicated the superiority of a preconditioned GMRES algorithm. The preconditioner is a simple incomplete LU-factorization which dramatically improves the convergence properties of GMRES in the case of the Euler equations. Experience gained by numerical investigations has shown that even fast unsteady flow phenomena like moving shocks in channels can be effectively treated by this combination of algorithms.

The τ-Code employs dynamically adaptive strategies based on insertion and removal of grid points. Conservative interpolation avoids mass errors during the process of adaptation. One of the main design goals was the use of reliable error indicators instead of refinement indicators based on gradients of flow variables. The indicators we consider are based on the finite element residual of the Euler equations.

3 Governing equations

We consider the Navier-Stokes equations in a moving reference frame. In this context the governing equations are given in the form

Integration is performed on time-dependent control volumes $\sigma(t) \subset \mathbb{R}^3$ with outer unit normal vector $n$. Here, $u = (\rho, \rho v_1, \rho v_2, \rho v_3, \rho E)^T$ denotes the vector of conserved variables, $f_j^c$ and $f_j^v$ are the convective and viscous fluxes, respectively, given by

The quantity $e$ denotes internal energy which is given by $e = E - \frac{1}{2}(v_1^2 + v_2^2 + v_3^2)$ and the enthalpy $H$ is defined as $H := E + p/\rho$. Pressure is given by the equation of state $p = (\gamma - 1)\rho\,(E - \frac{1}{2}(v_1^2 + v_2^2 + v_3^2))$, $\gamma$ being the ratio of specific heats. The temperature is given by $T = \gamma(\gamma - 1) Ma_\infty^2\, e$ and the elements of the shear stress tensor are $\tau_{ij} = \mu(\partial_{x_j} v_i + \partial_{x_i} v_j) + \delta_{ij}\,\lambda \sum_k \partial_{x_k} v_k$, with the viscosity assumed to follow the Sutherland law $\mu = T^{1.5}(1 + S)/(T + S)$, where $S = 110\,\mathrm{K}/T_\infty$. Moreover, the connection between the second viscosity coefficient $\lambda$ and the viscosity $\mu$ is defined by Stokes' hypothesis to be $\lambda = -\frac{2}{3}\mu$.

The velocity $v_{\mathrm{grid}}$ is the velocity of the moving reference frame and $\tilde{v}_j := v_j - v_{\mathrm{grid},j}$ denotes the contravariant velocity.
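The thermodynamic closure of this section can be sketched in a few lines. The sketch assumes the non-dimensional relations quoted above, a ratio of specific heats of 1.4 for air, and the reference temperature of 273 K mentioned for the viscous test case. The function names and the default values are illustrative assumptions and not part of the DLR-τ-Code.

```python
GAMMA = 1.4  # ratio of specific heats for air (assumed value)


def internal_energy(v, E):
    """e = E - |v|^2 / 2 for a velocity tuple v and total energy E."""
    return E - 0.5 * sum(vi * vi for vi in v)


def pressure(rho, v, E, gamma=GAMMA):
    """Equation of state p = (gamma - 1) * rho * e."""
    return (gamma - 1.0) * rho * internal_energy(v, E)


def enthalpy(rho, v, E, gamma=GAMMA):
    """H = E + p / rho."""
    return E + pressure(rho, v, E, gamma) / rho


def temperature(v, E, mach_inf, gamma=GAMMA):
    """T = gamma * (gamma - 1) * Ma_inf^2 * e (non-dimensional)."""
    return gamma * (gamma - 1.0) * mach_inf ** 2 * internal_energy(v, E)


def sutherland_viscosity(T, T_inf=273.0):
    """mu = T^1.5 * (1 + S) / (T + S) with S = 110 K / T_inf."""
    S = 110.0 / T_inf
    return T ** 1.5 * (1.0 + S) / (T + S)


def second_viscosity(mu):
    """Stokes' hypothesis: lambda = -2/3 * mu."""
    return -2.0 / 3.0 * mu
```

With this scaling a fluid at rest with $e = 1/(\gamma(\gamma-1)Ma_\infty^2)$ has $T = 1$, and the Sutherland law then returns $\mu = 1$, which is a quick consistency check on the reconstructed formulas.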

4 The DLR-τ-code

In order to simplify notation we describe the details of our numerical method in two space dimensions. The extension to three space dimensions follows by straightforward considerations based on the 2-d case.

4.1 Finite volume approximation

We consider conforming triangulations $\mathcal{T}^h$ consisting of tetrahedra (triangles in two-d) in the sense of Ciarlet [5] and define a discrete control volume $\sigma_i(t)$ as the volume of the barycentric subdivision of $\mathcal{T}^h$ enclosing the node $x_i = (x_{i,1}, x_{i,2})^T$ and bounded by the straight line segments $l_{ij}^k$, $k = 1, 2$, connecting the midpoint of the edge with the point $x_s$. The geometry of the control volumes is shown in figure 1. Figure 2 shows the boundary of a control volume and serves to define our notations.

Figure 1: Control volumes in 2-d

Figure 2: Boundary of a control volume in 2-d

The point $x_s$ is defined by $x_s := \sum_{m \in \{i,j,k\}} \alpha_m x_m$, with weights $\alpha_m$ chosen in order to account for highly stretched meshes in boundary layer regions.

Utilizing our notion of control volumes and denoting the cell average on $\sigma_i(t)$ by $u_i(t) := \frac{1}{|\sigma_i|}\int_{\sigma_i(t)} u(x,t)\,dx$, the Navier-Stokes equations (1) can be re-written in the form

$\frac{d}{dt} u_i(t) = \dots$

where $N(i) := \{ j \mid \partial\sigma_i \cap \partial\sigma_j \neq \emptyset \}$ is the set of indexes of nodes neighbouring node $x_i$. Since the line integrals are not defined if $u$ is discontinuous, two numerical flux functions are introduced, namely $H, S: \mathbb{R}^4 \times \mathbb{R}^4 \times \mathbb{R}^2 \to \mathbb{R}^4$, approximating the convective and viscous fluxes, respectively, and satisfying the fundamental consistency conditions

$H(u, u; n) = \sum_{i=1}^{2} f_i^c(u)\, n_i, \qquad S(u, u; n) = \sum_{i=1}^{2} f_i^v(u)\, n_i.$

In our implementation the combined Riemann solver AUSMDV following Liou and Wada [33] is used for the numerical flux $H$, which includes Hänel's scheme [10] and was extended in [18] for the use in an implicit formulation considering moving grids. Several other choices, like Roe's or Osher's Riemann solver, are easily implemented in the current framework. The viscous fluxes are discretised by the central difference

Applying the midpoint rule to the integral along $l_{ij}$ results in

$\frac{d}{dt} u_i(t) = \dots\, H(\hat{u}_i(x_{ij}, t), \hat{u}_j(x_{ij}, t); n_{ij}(t)) \dots + O(h^2) + O(h^q),$

where the first error term is due to the quadrature rule while the second error term depends on the functions $\hat{u}_i$. Using cell average values, i.e. $\hat{u}_i \equiv u_i$, results in a first order approximation, i.e. $q = 1$, due to the weak approximation property of the cell average operator, see [22], [23]. To increase the approximation order a recovery function $\hat{u}_i$ is sought on $\sigma_i$ which approximates $u$ at least with order $O(h^2)$. It is easily seen that linear polynomials recover $u$ up to this order.

4.2 Recovery algorithms

If $b_i$ denotes the barycentre of the control volume $\sigma_i$ then a linear polynomial

$\hat{u}_i(x,t) = a_{i,0}(t) + a_{i,1}(t)(x_1 - b_{i,1}) + a_{i,2}(t)(x_2 - b_{i,2})$    (2)

has to be recovered on $\sigma_i(t)$ such that it satisfies the recovery condition

$\frac{1}{|\sigma_i|}\int_{\sigma_i(t)} \hat{u}_i(x,t)\,dx = u_i(t).$

Recovery in box-type methods is best described in terms of a meta-triangulation $\tilde{\mathcal{T}}^h$ which is defined to be the triangulation of the barycentres of control volume $\sigma_i$ and the surrounding boxes $\sigma_j$, if $b_i$ is connected with each of the surrounding $b_j$, see figure 3. If $\tilde{\mathcal{T}}(i)$ denotes the set of the meta-triangles surrounding $b_i$ then a linear polynomial can be computed on each of the meta-triangles.

Figure 3: Meta-triangulation of the barycentres

In a TVD-like approach, compare [24], the gradient $(a_{i,1}, a_{i,2})^T$ of the recovery polynomial (2) can be obtained from the linear interpolants in a completely isotropic manner, namely

Then, defining $a_{i,0} := u_i$, the polynomial (2) will certainly satisfy the recovery condition. However, the isotropic recovery of the gradients does not take care of shocks in the solution and will thus lead to instabilities. According to the TVD methodology a slope limiter $\Phi_i$ has to be introduced such that the recovery polynomial is written in the form

$\hat{u}_i(x,t) = u_i + \Phi_i\,(a_{i,1}, a_{i,2})^T \cdot (x - b_i).$

We have good experience in using the limiter described by Barth and Jespersen in [4], but convergence to steady state is enhanced if one adds a modification as suggested by Venkatakrishnan in [31].

A simple ENO-type recovery can also be described in terms of the linear interpolants on the meta-triangles. The linear recovery polynomial $\hat{u}_i$ on the box $\sigma_i$ is then chosen to be the one linear polynomial on the surrounding meta-triangles for which the modulus of the gradient is minimal, i.e. for which

is valid, where the superscript $l$ denotes the $l$-th component. Experience with this type of recovery is reported in [22] and [25].

In order to further increase the spatial accuracy of the DLR-τ-code we are currently working on the extension towards a third order scheme by recovering quadratic polynomials close to the ideas of Abgrall, see [1], [2]. In [3] an algorithm based on Mühlbach expansions was developed which allows the efficient and stable computation of quadratic recovery polynomials in a step-by-step manner. Preliminary numerical results concerning a third-order τ-code are given in [15].

4.3 Optimal recovery

Although polynomial recovery functions seem to be attractive at first glance for their simplicity, their main drawback lies in the enormous widening of the stencil if higher order recoveries are sought. On the other hand, even locally defined polynomials of high degree exhibit weak properties concerning their oscillatory behaviour. Additionally, as can be seen from application of the theory of Optimal Recovery as reviewed in [22], polynomials do not exhibit any optimality condition with respect to their recovery properties. We do want to recover a function $\hat{u}_i$ on $\sigma_i$ for which the difference

between the recovery function and the true solution at the Gauss points is smallest. Functions minimizing semi-norms in their associated function spaces (i.e. splines) are exactly those functions for which the above quantity is minimal. In multiple space dimensions splines are found in the class of

radial basis functions, for example the well-known thin-plate spline. First experiments with this kind of recovery functions in [22], [23] showed an impressive increase in accuracy. Although recovery of radial basis functions is much too expensive as compared to polynomial recovery, the techniques developed in [3] could very well provide a framework in which these more complicated functions could be competitive with polynomial algorithms.

In recovery with radial basis functions a recovery function of the form

$\hat{u}_{i,l}(x,t) = \sum_{j=0}^{N-1} a_j(t)\, A(\sigma_j)^y\, \Phi(\|x - y\|) + \sum_{k=1}^{M} b_k(t)\, \pi_k(x)$

is sought for the $l$-th component, where $A(\sigma_j)^x f := \frac{1}{|\sigma_j|}\int_{\sigma_j} f(x)\,dx$ denotes the cell average operator. The radial function $\Phi$ is assumed to satisfy the fundamental condition of being conditionally positive definite and $\pi_k$, $k = 1, \dots, M$, denote a basis of the space of polynomials of a certain degree, which depends on the radial function $\Phi$ chosen. The number of nodes $N$ in the recovery stencil is another quantity which has to be chosen. Numerical experience gained so far has indicated that polynomial-based ENO stencil selection criteria work well also in the case of radial basis functions.

Using the well known thin plate spline

$\Phi(r) = r^2 \log r$

which, by construction, is able to reproduce linear polynomials, amounts to using at least four control volumes in the stencil. In an ENO-like manner one can think of the stencil selection according to figure 4, where the control volumes were chosen to be triangles.

Figure 4: The construction of four node sets out of a certain neighbourhood

If on each of the four stencil sets a radial basis recovery function is computed, the one with the smallest total variation norm is selected and assigned to the control volume. First results of this procedure are reported in section 6.

Meanwhile, radial basis functions with compact support are being constructed. We mention the class of Wu functions as designed in [36] and the very recent developments of Wendland [34]. These functions are unconditionally positive definite and thus do not need the polynomial augmentation as the thin plate spline. Furthermore, their compact support makes them very attractive for practical purposes. Whether these functions can be competitive in runtime with polynomial-based recovery algorithms is the subject of future research on ENO approximations.

4.4 Time stepping schemes and parallelism

The DLR-τ-code was originally supplied with an explicit Runge-Kutta time stepping algorithm designed by Shu and Osher in [21] which respects the TVD-properties of the spatial discretisation, see [24]. However, these schemes are limited in CFL number by 1, which is a dramatic upper bound for application in an adaptive framework where grid cells can be very small. In the meantime other Runge-Kutta schemes with up to five stages are in use and show satisfying behaviour especially when used in a multigrid environment for steady problems. For the computation of unsteady flows, such as pitching airfoils, the restrictions due to the CFL condition are still too strong. One way to overcome the limitations of explicit time stepping schemes is the use of parallel computers, which is easy in the case of finite volume approximations because domain decomposition is natural. A grid partitioner was developed in connection with an intelligent load balancing algorithm to redivide and redistribute grid patches depending on the load of the processors used. In that framework a parallel computation is easily done in an environment consisting of a cluster of workstations running PVM or P4 while the machines are still occupied by other users.

In figure 5 the grid of a channel with forward facing step is shown. The grid partitioner has divided the grid into 59 patches which contain nearly the same number of nodes. The possible speedup is documented in the diagram of figure 6 where speedup vs. number of processors is shown for an Intel PARAGON. The flow is the supersonic test case by Woodward and Colella [35] as discussed also in section 6.1. As can be seen from figure 6 the present approach towards parallelism through domain decomposition leads to a very efficient method.

Figure 5: Grid partitioning

Figure 6: Speedup on an Intel PARAGON

For use on conventional machines an implicit time stepping scheme according to

was designed. The numerical flux functions are evaluated at the time 'n + 1' whereby a linearisation is necessary which leads to a linear system of equations in the form

$A_{ii}\,\Delta u_i^n + \sum_{j \in N(i)} B_{ij}\,\Delta u_j^n = b_i, \quad i = 1, \dots, I,$

where $\Delta u_i^n = u_i^{n+1} - u_i^n$ and $A_{ii}, B_{ij} \in \mathbb{R}^{4 \times 4}$. Consequently, for each time step a linear system

$A y = b$    (3)

has to be solved, where $A$ is a large sparse non-symmetric matrix. For the solution of the system (3) the GMRES algorithm developed by Saad and Schultz [19], [20] is used. Therefore, the system is transformed into an equivalent minimisation problem. First, we define the function $f: \mathbb{R}^n \to \mathbb{R}_0^+$ by $f(y) = \|b - Ay\|_2^2$ and choose an arbitrary initial vector $y_0$. Starting with $m = 0$ the residual $r_m = \min_{y = y_0 + z,\ z \in K_m} f(y)$ is computed, where $K_m(A, r_0) := \mathrm{span}\{r_0, A r_0, A^2 r_0, \dots, A^{m-1} r_0\}$ denotes the $m$-th Krylov subspace and $r_0 = b - A y_0$. Now we increase $m$ until $r_m$ is below a given tolerance. Then we compute the optimal approximate solution $y_m = \arg\min_{y = y_0 + z,\ z \in K_m} f(y)$. Considering the fact that the expense to calculate the residual increases with the Krylov subspace dimension it is efficient to limit this dimension. If this limit is reached before the tolerance is met, the approximation $y_m$ has to be calculated and used as the initial value during a repetition. This technique is called "GMRES with restart".

Since the convergence rate of an iterative method depends on the condition number of the matrix $A$, an incomplete LU-factorisation is used as a preconditioner in order to decrease the condition number. Hereby the incomplete LU-factorisation is a pair of a lower left ($L$) and an upper right ($U$) matrix satisfying the following three conditions:

1. $L_{ii}$ represents the unit matrix for all $i$,
2. $L_{ij} = U_{ij} = A_{ij}$, if $A_{ij}$ is a null matrix,
3. $(LU)_{ij} = A_{ij}$, if $A_{ij}$ is not a null matrix

and the linear system is transformed into

A detailed description of this preconditioned GMRES algorithm in comparison with other implicit and explicit finite volume schemes is presented in [17].

5 Adaptive concepts

5.1 Refinement algorithms

Over the years experience was gained with several different refinement/coarsening strategies for triangulations. This work is documented in [26], [27], [11] and [13]. Numerical experience indicated that, at least for reliable Euler grids, a version of the isotropic red-green refinement as described in [14] gives superior grids. In this refinement strategy triangles which have to be refined are red-refined according to figure 7. Remaining triangles with two hanging nodes are also red-refined before green refinement turns the triangulation again into a conforming one. Note that at the beginning of each refinement cycle the previous green refinements are removed in order to keep the triangulations stable, i.e. in order to avoid too small angles occurring after several adaptation cycles.

In a corresponding re-coarsening strategy several topological configurations can be identified in which points can be removed from the grid. It can be

shown, see [14], that a refined mesh can always be completely coarsened up to its initial state. Note that in order to keep the process of coarsening conservative it is necessary to use interpolation procedures respecting conservation. Examples of such procedures are given in [14].

Figure 7: Red (left) and green refinement of a triangle

5.2 Error indicators

In contrast to classical approaches in CFD the DLR-τ-Code relies on residual-based error indicators which were developed in subsequent papers [26], [27], [11], [28]. This type of indicators was developed for use in codes for the Euler equations but we are currently working on extensions towards the Navier-Stokes equations. If $Lu = 0$ denotes the abstract form of the Euler equations in which $L$ is the corresponding differential operator of first order and $u_T$ denote the linear interpolants of the flow variables on triangle $T$, then the local error of the numerical method under consideration is defined by $e_T := u_T - u$. If the numerical approximation $u_T$ is inserted into the differential equation the deviation from the zero vector is a measure of closeness to the exact solution. The quantity

$r_T := L u_T$

is therefore called the residual. It was shown in [28] that a two-sided error bound of the form

$C_1 \|e_T\|_{L^2(T)} \le \|r_T\|_{D^*(T)} \le C_2 \|e_T\|_{L^2(T)}$

can be proved where $\|\cdot\|_{D^*(T)}$ denotes the dual graph-norm. First numerical results were presented in [28] which indicated that the use of the dual graph-norm leads to similar results as the use of the weighted $L^2$-norm

$h\,\|r_T\|_{L^2(T)},$

$h$ denoting the length of the longest edge of $T$, which was used for heuristic reasons before, see [11], [26], [27]. In [30] Süli was able to prove that the dual graph-norm is indeed essentially equivalent to $h\,\|r_T\|_{L^2(T)}$ and since this locally weighted $L^2$-norm is much easier to implement this is the error indicator of our choice.

The additional problem occurring with the Navier-Stokes equations lies in the second derivatives inherent in the diffusive fluxes. Although we are currently not able to prove error bounds it seems possible to approximate the second derivatives in the computation of the residual in a measure-theoretic way by sampling the jumps of the first derivatives across the edges in normal direction. This type of error indicators was inspired by the work of C. Johnson et al. on the adaptive streamline diffusion finite element method, see [7], [12], and developed by Göhner and Warnecke [9]. We are currently investigating this type of indicators for compressible flow [29].

6 Numerical results

6.1 Unsteady flow in a channel

To show the ability of the code to adaptively resolve the flow features we consider the test case of $Ma_\infty = 3$ flow through a channel with forward facing step. This case was used by Woodward and Colella [35] for extensive comparison of finite difference schemes. In figure 8 four grids at consecutive times are shown together with the corresponding density distributions. As can be seen the adaptive algorithm has not only resolved all of the flow features but also succeeded in coarsening those parts of the meshes which were previously refined. The computation was done with the parallel τ-Code on a cluster of workstations using PVM.

6.2 Pitching airfoil

Figures 9 and 10 show the results of a calculation of an unsteady inviscid flow about the NACA0012 airfoil in comparison with experimental data. In this case the airfoil is pitching harmonically about the quarter chord point with a reduced frequency of $k = 0.1628$ and an amplitude of $\alpha = 2.51^\circ$. The freestream Mach number is $Ma_\infty = 0.755$ and the angle of attack initially is $0.016^\circ$. Consequently, the time-dependent angle of attack is $\alpha(t) = 0.016^\circ + 2.51^\circ \sin(0.1628 \cdot t)$.

Figure 9 shows the obtained instantaneous pressure distribution in comparison with the experimental data for several times during the third cycle of motion. Figure 10 shows the comparison of the lift coefficient and the moment coefficient vs. the time-dependent angle of attack. The computational data are very close to the experimental ones. Note that no diffusive effects were included in this calculation.
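The pitching motion of section 6.2 can be written down directly from the quoted values. The following minimal sketch assumes that $t$ is the non-dimensional time for which the argument of the sine is $0.1628 \cdot t$. The function names, and the helper that samples the third cycle of motion, are illustrative assumptions and not part of the original code.

```python
import math

ALPHA_MEAN_DEG = 0.016  # mean angle of attack in degrees
ALPHA_AMP_DEG = 2.51    # pitching amplitude in degrees
K_REDUCED = 0.1628      # reduced frequency as it appears in alpha(t)


def angle_of_attack_deg(t, k=K_REDUCED):
    """alpha(t) = 0.016 deg + 2.51 deg * sin(k * t)."""
    return ALPHA_MEAN_DEG + ALPHA_AMP_DEG * math.sin(k * t)


def pitch_rate_deg(t, k=K_REDUCED):
    """Time derivative of alpha(t), in degrees per unit of non-dimensional time."""
    return ALPHA_AMP_DEG * k * math.cos(k * t)


def period(k=K_REDUCED):
    """Non-dimensional duration of one pitching cycle."""
    return 2.0 * math.pi / k


def third_cycle_times(n_samples, k=K_REDUCED):
    """Equally spaced sample times within the third cycle of motion."""
    T = period(k)
    return [2.0 * T + T * i / n_samples for i in range(n_samples)]


# Angle of attack at eight instants of the third cycle.
samples = [(t, angle_of_attack_deg(t)) for t in third_cycle_times(8)]
```

The pitch rate is what a moving-grid solver would need to turn the prescribed motion into grid velocities on the airfoil surface.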
Figure 8: Evolution of the adaptive grid and corresponding solutions

Figure 9: Comparison of the instantaneous pressure distribution between the numerical computation (inviscid) and experimental data

Figure 10: Comparison of lift and moment coefficient vs. time-dependent angle of attack between the numerical computation (inviscid) and experimental data

6.3 Viscous flow about a NACA0012 airfoil

The next case was chosen to test the method and the adaptation algorithm for viscous flow computations. We consider the steady laminar flow about a NACA0012 airfoil with a Reynolds number of 600, a Prandtl number of 0.72, a reference temperature of 273 Kelvin, a freestream Mach number of 0.85 and an angle of attack of 0°. The obtained Mach number distribution is shown in figure 11 and figure 12 presents the adapted grid. The adaptation indicator used is based on the finite element residual of the Navier-Stokes equations.

6.4 3-D transonic wing

As a three dimensional testcase the inviscid flow about an Onera M6 wing with $Ma_\infty = 0.84$ and
an angle of attack of 3.06° is considered. Figure 13 shows isolines of the Mach number distribution on a coarse hybrid grid with less than 40,000 gridpoints. Figure 14 shows the distribution of the computed $L^2$-norm of the finite element residual for the same solution. It can be seen that the much too coarsely resolved leading and trailing edge, the tip region as well as the shock were picked up. Thus, also in three space dimensions the finite element residual of the Euler equations can be used as an adaptation indicator.

To accelerate convergence to steady state, agglomeration multigrid as described by Venkatakrishnan and Mavriplis [32] is used. The coarse grid discretisation is done as described in [8]. In figures 15 and 16 the advantage of the multigrid solver over the single grid solver is shown. As one can see in figure 16, the lift coefficient in the single grid computation is oscillating with about 5 to 10 percent after 2500 timesteps while the lift coefficient of the multigrid computation is converged already after a CPU time that corresponds to one hundred single grid timesteps.

Figure 11: NACA0012 airfoil - Mach number distribution.

Figure 12: NACA0012 airfoil - Adapted grid.

Figure 13: Isolines of the Mach number distribution.

Figure 14: $L^2$-Norm of the residual displayed on the surface. Darker regions indicate higher $L^2$-norm.
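The weighted $L^2$-norm indicator $h\,\|r_T\|_{L^2(T)}$ of section 5.2, whose distribution is displayed in figure 14, can be illustrated on a scalar model problem. The sketch below is not the Euler residual of the DLR-τ-Code. It assumes the steady linear advection operator $Lu = a \cdot \nabla u$ as a stand-in first-order operator, for which the residual of a linear interpolant is constant on each triangle. All function names are illustrative assumptions.

```python
import math


def triangle_area(p):
    """Area of the triangle with vertices p[0], p[1], p[2]."""
    (x0, y0), (x1, y1), (x2, y2) = p
    return 0.5 * abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def longest_edge(p):
    """Length h of the longest edge of the triangle."""
    pairs = ((0, 1), (1, 2), (2, 0))
    return max(math.hypot(p[a][0] - p[b][0], p[a][1] - p[b][1]) for a, b in pairs)


def linear_gradient(p, u):
    """Gradient of the linear interpolant through the nodal values u[0..2]."""
    (x0, y0), (x1, y1), (x2, y2) = p
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    ux = ((u[1] - u[0]) * (y2 - y0) - (u[2] - u[0]) * (y1 - y0)) / det
    uy = ((x1 - x0) * (u[2] - u[0]) - (x2 - x0) * (u[1] - u[0])) / det
    return ux, uy


def residual_indicator(p, u, a):
    """h * ||r_T||_L2(T) with r_T = a . grad(u_T), which is constant on T."""
    ux, uy = linear_gradient(p, u)
    r = a[0] * ux + a[1] * uy
    return longest_edge(p) * abs(r) * math.sqrt(triangle_area(p))


# Unit right triangle with advection in the x-direction.
tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
eta_across = residual_indicator(tri, [0.0, 1.0, 0.0], (1.0, 0.0))  # u = x
eta_along = residual_indicator(tri, [0.0, 0.0, 1.0], (1.0, 0.0))   # u = y
```

A triangle would be flagged for refinement where the indicator is large, here where the interpolant varies along the advection direction, and left alone or coarsened where it vanishes.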
37-9

180' of rotation gives a reliable criterion concerii-


singlegrid
wing
M6 wing
M6 multigrid
iiig the accuracy of the recovery. In figure 17 the
10'

IO'S
10"

Figure li: Grid and solution without recovery


.-
10-'4 I I
0.0 1000.0 2000.0
I I I I
Figure 15: &Norm of the density residual vs. CPU-
time.

M6 wing
-singlegrid

0.50
0.40 I i

Figure 18: Solutions with linear polynomial (left)


and thin plate spline recovery
I

grid used in shown with a numerical solution of a


finite volume approximation without recovery. The
remaining cone height is a disappointing 0.382 and
the shape of the cone is dramatically corrupted. Us-
ing a linear polynomial recovery algorithm following
0.00 I I Durlofsky, Engquist, Osher [6]results in the solution
0.0 low.o 2wo.o
shown in the left part of figure 18. The cone height
Figure 18: Lift coefficient vs. CPU-time. now is 0.835 but the shape of the cone is still lacking
regularity. Using the thin plate spline recovery as
described in 4.3 results in a cone with proper shape
6.5 Radial recovery functions and height 0.886. This solution is shown in the right
part of figure 18.
The accuracy of recovery algorithms based on radial Experiments with radial basis functions with com-
basis functions can be seen in an application to a pact support have indicated even better numeri-
simple model problem. Consider the linear partial cal results than those obtained with the thin plate
differential equation spline. Additionally, methods for the fast construc-
tion of recovery functions are currently being devel-
oped [3] so that there is hope that these functions
can be implemented in the DLR-r-Code in the near
future.

References
and
-LR+1 ; RS0.01 [I] R. Abgrall - On essentially non-oscillatory
; else. schemes on unstructured meshes: Analysis and
implementatiou. J. Conp. Phys. 114, 45-58,
where R := ( X I - +
i)2 (x2 - I1) a' The initial func-
(1994)
tion is a cone of unit height which is rotated around
the origin under the action of the differential equa- [Z] R. Abgrall - Design of an essentially nonoscilla-
tion. Measuring the remaining cone height after tory reconstruction procedure on finite-element-
37-10

type meshes. 1CA.W Rtport No.91-84, (1991). [15] D. IIietel. A . Meister, Th. Sonar - On the eoni-
Revised version. INRIA Repod N o 2.942, parisoii of two different implementations of an
(1994) implicit third-order ENO scheme of box type for the computation of unsteady compressible flow. In preparation

[3] R. Abgrall, Th. Sonar - On the use of Mühlbach expansions in the recovery step of ENO methods. DLR Interner Bericht IB 229-95 A 34, (1995)
[4] T.J. Barth, D.C. Jespersen - The design and application of upwind schemes on unstructured meshes. AIAA paper 89-0366, (1989)
[5] P.G. Ciarlet - The finite element method for elliptic problems. North-Holland, 2nd edt. (1987)
[6] L.J. Durlofsky, B. Engquist, S. Osher - Triangle based adaptive stencils for the solution of hyperbolic conservation laws. J. Comp. Phys. 98, 64-79, (1992)
[7] K. Eriksson, C. Johnson - Adaptive finite element methods for parabolic problems I. A linear model problem. Chalmers University of Technology, Department of Mathematics, preprint 8891 (1988)
[8] M. Galle - Solution of the Euler and Navier-Stokes equations on hybrid grids. This volume, paper no. 30
[9] U. Göhner, G. Warnecke - A second-order finite difference error indicator for adaptive transonic flow computations. Num. Math. 70, 129-161, (1995)
[10] D. Hänel, R. Schwane - An implicit flux-vector splitting scheme for the computation of viscous hypersonic flow. AIAA paper 89-0274 (1989)
[11] V. Hannemann, D. Hempel, Th. Sonar - Adaptive computation of compressible flow fields with the DLR-τ-code. In: Numerical Methods for the Navier-Stokes Equations, F.-K. Hebeker, R. Rannacher, G. Wittum (Eds.), Notes on Numerical Fluid Mechanics, Volume 47, Vieweg Verlag, 101-110, (1994)
[12] P. Hansbo, C. Johnson - Adaptive streamline diffusion methods for compressible flow using conservation variables. Comp. Methods Appl. Mech. and Engrg. 87, 267-280, (1991)
[13] D. Hempel - Dynamic adaption of triangular grids. DLR Interner Bericht IB 223-95 A 38, (1995)
[14] D. Hempel - Isotropic refinement and recoarsening in 2 dimensions. DLR Interner Bericht IB 223-95 A 35, (1995)
[16] D.J. Mavriplis - A three dimensional multigrid Reynolds-averaged Navier-Stokes solver for unstructured meshes. ICASE report no. 94-29 (1994)
[17] A. Meister - Ein Beitrag zum DLR-τ-Code: Ein explizites und implizites Finite-Volumen-Verfahren zur Berechnung instationärer Strömungen auf unstrukturierten Gittern. DLR Interner Bericht IB 223-94 A 36, Göttingen, (1994)
[18] A. Meister - Development of an implicit finite volume scheme for the computation of unsteady flow fields on unstructured moving grids. To appear in Proceedings of the ICFD Conference on Numerical Methods for Fluid Dynamics, Oxford, (1995)
[19] Y. Saad, M. H. Schultz - GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comp. 7, 856-869, (1986)
[20] Y. Saad - Krylov subspace techniques, conjugate gradients, preconditioning and sparse matrix solvers. von Karman Institute of Fluid Dynamics, Lecture series 1994-05, (1994)
[21] C.-W. Shu, S. Osher - Efficient implementation of essentially non-oscillatory shock-capturing schemes. J. Comp. Phys. 77, 439-471, (1988)
[22] Th. Sonar - Multivariate Rekonstruktionsverfahren zur numerischen Lösung hyperbolischer Erhaltungsgleichungen. Habilitationsschrift, TH Darmstadt, (1995). Also: DLR Forschungsbericht 95-02, (1995)
[23] Th. Sonar - Optimal recovery using thin plate splines in finite volume methods for the numerical solution of hyperbolic conservation laws. DLR Interner Bericht IB 223-94 A 42, (1994)
[24] Th. Sonar - On the design of an upwind scheme for compressible flow on general triangulations. Numerical Algorithms 4, 135-149, (1993)
[25] Th. Sonar - On the construction of essentially non-oscillatory finite volume approximations to hyperbolic conservation laws on general triangulations: Polynomial recovery, accuracy, and stencil selection. Submitted: Journal of Computational Physics, (1995)

[26] Th. Sonar - Strong and weak norm refinement indicators based on the finite element residual for compressible flow computation. Impact of Computing in Science and Engineering 5, 111-127, (1993)
[27] Th. Sonar, V. Hannemann, D. Hempel - Dynamic adaptivity and residual control in unsteady compressible flow computation. Mathematical and Computer Modelling 20, 201-213, (1994)
[28] Th. Sonar, E. Süli - A dual graph norm refinement indicator for the DLR-τ-code. DLR Forschungsbericht 94-24, (1994)
[29] Th. Sonar, G. Warnecke - On a finite difference error indicator for adaptive approximations of conservation laws. In preparation
[30] E. Süli - A posteriori error analysis and global error control for adaptive finite element approximations of hyperbolic problems. 16th Int. Conf. on Num. Anal., Dundee, June (1995)
[31] V. Venkatakrishnan - On the accuracy of limiters and convergence to steady state solutions. AIAA paper 93-0880 (1993)
[32] V. Venkatakrishnan, D.J. Mavriplis - Agglomeration multigrid for the three-dimensional Euler equations. ICASE report no. 94-5 (1994)
[33] Y. Wada, M.-S. Liou - A flux splitting scheme with high-resolution and robustness for discontinuities. ICOMP-93-50, (1993)
[34] H. Wendland - Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Manuskript, Inst. Num. Ang. Math., Universität Göttingen, (1994)
[35] P. Woodward, P. Colella - The numerical simulation of two-dimensional fluid flow with strong shocks. J. Comp. Phys. 54, 115-173, (1984)
[36] Z.-M. Wu - Multivariate compactly supported positive definite radial basis functions. Manuskript, Inst. Num. Ang. Math., Universität Göttingen, (1994)

PARAMETRIC STUDIES OF A TIME-ACCURATE FINITE-VOLUME
EULER CODE IN THE NWT PARALLEL COMPUTER

L. P. Ruiz-Calavera
INTA, Aerodynamics Division, Fluid Dynamics Department,
Carretera de Ajalvir Km. 4.5, 28850 Torrejon de Ardoz
SPAIN
N. Hirose
NAL, Computational Science Division,
7-44-1 Jindaiji-Higashi, Chofu-shi, Tokyo 182,
JAPAN

Paper presented at the AGARD FDP Symposium on "Progress and Challenges in CFD Methods and Algorithms" held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

SUMMARY
A code to calculate unsteady aerodynamic loads on non-uniformly moving 3-D isolated wings has been prepared. The Euler equations are solved by means of a time-accurate Finite-Volume method with second order central spatial discretization and Runge-Kutta time integration. The code has been implemented in a parallel supercomputer. The numerical scheme used together with some representative results are presented.

LIST OF SYMBOLS
$c$ = local chord
$c_0$ = reference length
$c_l$ = section lift coefficient = lift$/(q_\infty c\,\alpha_1)$
$c_p$ = pressure coefficient = $(p - p_\infty)/(q_\infty \alpha_1)$
$d$ = dissipative flux
$D$ = dissipative operator
$E$ = specific total energy
$F, G, H$ = components of Euler flux vector
$k$ = reduced frequency = $\omega c / 2V_\infty$
$k^{(2)}, k^{(4)}$ = artificial viscosity constants
$L$ = scaling factor
$M_\infty$ = free-stream Mach number
$n$ = surface normal unit vector
$p$ = static pressure
$Q$ = convective operator
$q_\infty$ = free-stream dynamic pressure
$R$ = residual
$R'$ = averaged residual
$S$ = cell face area
$t$ = time
$t'$ = dimensionless time = $t V_\infty / c_0$
$U$ = vector of conservative variables
$u, v, w$ = components of flow velocity
$V_\infty$ = free stream velocity
$V_\Sigma$ = mesh velocity
$\alpha_0$ = mean angle of attack
$\alpha_1$ = pitching motion amplitude
$\beta$ = Runge-Kutta coefficients
$\Sigma$ = cell boundary
$\epsilon$ = residual averaging parameter
$\epsilon^{(2)}, \epsilon^{(4)}$ = artificial viscosity parameters
$\lambda, \mu, \sigma$ = spectral radius of flux Jacobian matrices in $\xi$, $\eta$, and $\zeta$ directions
$\rho$ = air density
$\xi, \eta, \zeta$ = curvilinear coordinates
$\nu$ = shock wave sensor
$\Omega$ = cell volume
$\omega$ = frequency of oscillation

subscripts
$i$ = cell column
$j$ = cell row
$k$ = cell plane
$n$ = cell face
$q$ = Runge-Kutta stage

1. INTRODUCTION
Aeroelastic problems appear to be of increasing importance in the design of aircraft. The size of the structures and their elastic behavior, the aerodynamic interference of different components, transonic effects, structural and control nonlinearities, etc., are becoming a severe limiting factor. There is thus a strong need to apply sophisticated and reliable aeroelastic simulation tools already in the early design stage of a new development. These tools have to couple highly accurate, robust and user friendly CFD codes with Structural Dynamics software. Whereas the latter is already well established, the former still need development before a generally recognized standard code is available.

To clear a configuration of aeroelastic problems, a very large number of cases have to be run. Time accurate CFD codes are generally considered to be computationally too expensive for industrial application. Potential theory is mainly used, whereas the next level of approximation, i.e. the Euler equations with or without boundary layer coupling, is only now slowly starting to find its way into the design offices despite the better approximation it provides. The application of high performance parallel computers to this kind of problem is obviously extremely interesting, not only because it allows larger problems to be tackled in a shorter time but also because it opens the possibility of performing parametric studies in a reasonable time.

A time-accurate Euler code has been prepared to calculate inviscid transonic flow around oscillating 3-D wings. The code has been implemented in the NWT (Numerical Wind Tunnel) parallel supercomputer of the National Aerospace Laboratory in Japan. The objective of the present work has been to study the influence on the unsteady results of the
different parameters that control the calculation.

The following presents a brief description of the scheme and its parallel implementation, together with some results.

2. NUMERICAL SCHEME
Among the different schemes which have been developed to solve the unsteady 3-D Euler equations [1-10], the very popular one of Jameson [10] has been selected for this study. In the following a brief description of the implementation made here is given. More details can be found in [11].

2.1 Governing Equations
The flow is assumed to be governed by the three-dimensional time-dependent Euler equations, which for a moving domain $\Omega$ with boundary $\Sigma$ may be written in integral form as:

$$\frac{\partial}{\partial t}\int_{\Omega} U \, d\Omega + \oint_{\Sigma} \left[(F, G, H) - \mathbf{v}_{\Sigma} U\right] \cdot \mathbf{n} \, d\Sigma = 0 \tag{1}$$

where $U$ is the vector of conservative flow variables; $(F, G, H)$ are the three components of the Euler flux vector; $\mathbf{v}_{\Sigma}$ is the velocity of the moving boundary; and $\mathbf{n}$ is the unit exterior normal vector to the domain.

(2)

Here $\rho$, $p$, $(u, v, w)$ and $E$ respectively denote the density, pressure, Cartesian velocity components of the flow, and specific total energy.

In order to close the system of equations (1) a sixth equation is needed, which is obtained from the thermodynamic relationships for a perfect gas

$$p = (\gamma - 1)\,\rho \left[E - \tfrac{1}{2}\left(u^2 + v^2 + w^2\right)\right] \tag{3}$$

2.2 Spatial Discretization
The domain around the wing is divided into an O-H mesh of hexahedral cells, for which the body-fitted curvilinear coordinates $\xi$, $\eta$, $\zeta$ respectively wrap around the wing profile (clockwise), normal and away from it, and along the span. Figure 1 shows an example. Individual cells are denoted by the subscripts $i,j,k$ respectively corresponding to each of the axes in the transformed plane $\xi, \eta, \zeta$.

The integral equation (1) is applied separately to each cell. Assuming that the independent variables are known at the center of each cell; calculating the flux vector as the average of the values in the cells on either side of the face; and taking the mesh velocities as the average of the velocities of the four nodes defining the corresponding face, the following system of ordinary differential equations (one per cell) results:

$$\frac{d}{dt}\left(\Omega_{i,j,k}\, U_{i,j,k}\right) + Q_{i,j,k} = 0 \tag{4}$$

where the convective operator $Q_{i,j,k}$

(5)

is a function of $U_{i,j,k}$, $U_{i+1,j,k}$, $U_{i-1,j,k}$, $U_{i,j+1,k}$, $U_{i,j-1,k}$, $U_{i,j,k+1}$ and $U_{i,j,k-1}$. Schemes constructed in this manner reduce to central difference schemes on Cartesian meshes, and are second order accurate if the mesh is sufficiently smooth.

This formulation is inherently non-dissipative (ignoring the effect of numerical boundary conditions), so that dissipative fluxes $D_{i,j,k}$ have been added

$$\frac{d}{dt}\left(\Omega_{i,j,k}\, U_{i,j,k}\right) + Q_{i,j,k} - D_{i,j,k} = 0 \tag{6}$$

The well known model of Jameson [12] is used. The idea of this adaptive scheme is to add 4th order viscous terms throughout the domain to provide a base level of dissipation sufficient to prevent non-linear instabilities, but not sufficient to prevent oscillations in the neighborhood of shock waves. In order to capture shock waves additional 2nd order viscosity terms are added locally by a sensor designed to detect discontinuities in pressure. To avoid overshoots near the shock waves produced by the combined presence of the 2nd and 4th order terms, the latter are cut off in that area by an appropriate switch.

For the dissipative flux across the face separating cells $i,j,k$ and $i+1,j,k$ we have (for the other faces similar expressions apply):

(7)
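As a rough illustration of the adaptive dissipation described above (a fourth-difference background term, a second-difference term switched on by a pressure sensor, and a switch that removes the fourth-difference term near shocks), the following one-dimensional Python sketch may help. It is not the authors' code: the function name, the periodic indexing, the particular normalised-second-difference sensor, and the scaling of the face flux by a face-averaged spectral radius `lam` are assumptions in the style of Jameson-type schemes. The default `k2` and `k4` are simply one of the pairs quoted in the figure legends.

```python
def jst_dissipation(u, p, lam, k2=1.0, k4=2.0 / 64.0):
    """Net dissipation D_i = d_{i+1/2} - d_{i-1/2} on a periodic 1-D grid.

    u   : one conserved variable per cell
    p   : pressure per cell (used only by the shock sensor)
    lam : spectral radius of the flux Jacobian per cell (assumed scaling)
    """
    n = len(u)
    # Shock sensor: normalised second difference of pressure (assumed form).
    nu = [abs(p[(i + 1) % n] - 2.0 * p[i] + p[i - 1])
          / (p[(i + 1) % n] + 2.0 * p[i] + p[i - 1]) for i in range(n)]

    d_face = []
    for i in range(n):  # face i+1/2, between cells i and i+1
        im1, ip1, ip2 = (i - 1) % n, (i + 1) % n, (i + 2) % n
        # Second-difference coefficient: switched on where the sensor is large.
        eps2 = k2 * max(nu[im1], nu[i], nu[ip1], nu[ip2])
        # Fourth-difference coefficient: cut off where eps2 exceeds k4.
        eps4 = max(0.0, k4 - eps2)
        lam_face = 0.5 * (lam[i] + lam[ip1])
        first = u[ip1] - u[i]
        third = u[ip2] - 3.0 * u[ip1] + 3.0 * u[i] - u[im1]
        d_face.append(lam_face * (eps2 * first - eps4 * third))

    # Difference of face fluxes; the sum over all cells telescopes to zero.
    return [d_face[i] - d_face[i - 1] for i in range(n)]


if __name__ == "__main__":
    # Odd-even (sawtooth) mode with uniform pressure: only the 4th-order term acts.
    u = [1.0, -1.0] * 4
    print(jst_dissipation(u, [1.0] * 8, [1.0] * 8))
```

For the sawtooth input with uniform pressure and `lam = 1` the sensor vanishes, so the routine returns `D_i = -16 * k4 * u_i`: the background term acts against the odd-even mode, which is the role the text assigns to the 4th order terms.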
The dissipation coefficients $\epsilon^{(2)}$ and $\epsilon^{(4)}$ are calculated as

$$\epsilon^{(2)}_{i+\frac{1}{2},j,k} = k^{(2)} \max\left(\nu_{i+2,j,k},\ \nu_{i+1,j,k},\ \nu_{i,j,k},\ \nu_{i-1,j,k}\right) \tag{8}$$

$$\epsilon^{(4)}_{i+\frac{1}{2},j,k} = \max\left(0,\ k^{(4)} - \epsilon^{(2)}_{i+\frac{1}{2},j,k}\right) \tag{9}$$

where $\lambda$ is the spectral radius of the flux Jacobian matrix in direction $\xi$

$$\lambda_{i+\frac{1}{2},j,k} = \frac{\lambda_{i,j,k} + \lambda_{i+1,j,k}}{2} \tag{10}$$

and with

(11)

as a sensor of the presence of a shock wave.

2.3 Time Integration
The system of ODEs in (4) is solved by means of an explicit 5 stage Runge-Kutta scheme with two evaluations of the dissipation terms.

$$
\begin{aligned}
\Omega^{(1)}_{i,j,k} U^{(1)}_{i,j,k} &= \Omega^{(n)}_{i,j,k} U^{(n)}_{i,j,k} - \tfrac{1}{4}\,\Delta t \left[Q^{(0)}_{i,j,k} - D^{(0)}_{i,j,k}\right] \\
\Omega^{(2)}_{i,j,k} U^{(2)}_{i,j,k} &= \Omega^{(n)}_{i,j,k} U^{(n)}_{i,j,k} - \tfrac{1}{6}\,\Delta t \left[Q^{(1)}_{i,j,k} - D^{(1)}_{i,j,k}\right] \\
\Omega^{(3)}_{i,j,k} U^{(3)}_{i,j,k} &= \Omega^{(n)}_{i,j,k} U^{(n)}_{i,j,k} - \tfrac{3}{8}\,\Delta t \left[Q^{(2)}_{i,j,k} - D^{(1)}_{i,j,k}\right] \\
\Omega^{(4)}_{i,j,k} U^{(4)}_{i,j,k} &= \Omega^{(n)}_{i,j,k} U^{(n)}_{i,j,k} - \tfrac{1}{2}\,\Delta t \left[Q^{(3)}_{i,j,k} - D^{(1)}_{i,j,k}\right] \\
\Omega^{(5)}_{i,j,k} U^{(5)}_{i,j,k} &= \Omega^{(n)}_{i,j,k} U^{(n)}_{i,j,k} - \Delta t \left[Q^{(4)}_{i,j,k} - D^{(1)}_{i,j,k}\right]
\end{aligned}
\tag{12}
$$

which is second order accurate in time and can be shown [11] to have good diffusion and dispersion error characteristics and less computational cost per time step than other schemes with a lesser number of stages.

2.4 Residual Averaging
This explicit time-integration scheme has a time step limit that is controlled by the size of the smallest cell.

(13)

Even though the CFL number of the 5-stage Runge-Kutta scheme is of the order of 4, the resulting $\Delta t$'s are usually too small for practical applications. This restriction can be relaxed by using a technique of residual averaging [13] which gives an implicit character to the time-integration scheme. Before each time-step the residuals $R_{i,j,k} = Q_{i,j,k} - D_{i,j,k}$ are replaced by modified residuals $R'_{i,j,k}$ which are calculated by means of an ADI method:

$$\left(1 - \epsilon_{i,j,k}\,\delta_\xi^2\right)\left(1 - \epsilon_{i,j,k}\,\delta_\eta^2\right)\left(1 - \epsilon_{i,j,k}\,\delta_\zeta^2\right) R'_{i,j,k} = R_{i,j,k} \tag{14}$$

where $\delta_\xi^2$, $\delta_\eta^2$ and $\delta_\zeta^2$ are the second difference operators in the $\xi$, $\eta$, and $\zeta$ directions and $\epsilon_{i,j,k}$ is the smoothing parameter [14]

(15)

with $\Delta t$ denoting the desired time step.

Within a linear analysis, the former technique assures unconditional stability for any size of the time step. However, as the resulting effective Courant number becomes large the contribution of the dissipation terms to the Fourier symbol goes to zero, and consequently, the high frequencies introduced by the non-linearities are undamped [15]. Thus the practical limit for the time step is determined principally by the high frequency damping characteristics of the integration scheme used. As the properties of the 5-stage Runge-Kutta time-integration method are very good from this point of view, CFL values as high as 240 have been successfully used, which significantly decrease the calculation time needed for a typical case.

2.5 Freestream Capturing
For the scheme to satisfy the freestream capturing condition [16] it must be

(16)

which is the discrete form (consistent with the numerical scheme here employed) of the Geometric Conservation Law as formulated by Thomas and Lombard [17]. It states that the cell volumes must be advanced in time in the same way as the fluid variables (even if they could be calculated analytically at each time step) to prevent grid-motion-induced errors in the numerical solution.

2.6 Boundary Conditions
The following boundary conditions are imposed:

a) Kinematic boundary condition on the
wing surface. The pressure on the surface is extrapolated from the internal points.

b) Symmetry condition at the k=1 plane.

c) Far field boundary condition in terms of Riemann invariants for a one dimensional flow normal to the outer computational boundary.

3. RESULTS
In the following, results for the LANN wing are presented. This is a high aspect ratio (AR=7.92) transport type wing with a 25° quarter-chord sweep angle, a taper ratio of 0.4, and a variable 12% supercritical airfoil section twisted from about 2.6° at the root to about -2.0° at the tip. The geometry used for the computational model is that of [18]. The results presented here correspond to the design cruise condition: $M_\infty$=0.82, $\alpha_0$=0.6°. The wing performs harmonic pitching oscillations about an axis at 62% root chord with an amplitude of $\alpha_1$=0.25° and a reduced frequency k=0.104.

The calculation proceeds as follows: first an initial steady solution is obtained and quality controlled; then the time-accurate calculation is started and is time-marched until the initial transients are damped out and a harmonic solution is obtained (typically three cycles of oscillation are needed); finally the results of the last cycle are Fourier analyzed to extract the mean value and harmonics of the different aerodynamic coefficients.

Because of the large memory and CPU time requirements of this type of method, very few studies are available in the literature that assess the relative influence on the unsteady results of the different parameters that control the calculation. Taking advantage of the benefits of parallelization to perform this task was one of the main objectives of the present work.

3.1 Artificial Viscosity
Calculations have been done for an 80x16x30 grid with different amounts of artificial viscosity, which has been varied by means of the two coefficients $k^{(2)}$ and $k^{(4)}$ in (8) and (9). Results in terms of mean part and real and imaginary parts of the first harmonic of the pressure distributions around wing sections at 17.5% and 82.5% semi-span are presented in Figures 2 to 5. Logically the main effect is on the shock resolution, which in turn influences the magnitude and positions of the corresponding peaks in the first harmonic component.

3.2 Grid Density
Two different grids, namely 80x16x30 and 160x32x30, have been considered. The spanwise grid distribution and outer boundary location were kept the same for both cases, with 20 grid planes on the wing and 10 grid planes between the wing tip and the side boundary of the computational domain, which is located at two semi-spans from the plane of symmetry. The outer boundary around the root section is at 9 chords. Results are shown in Figures 6 and 7, where the first harmonic of the pressure distributions around wing sections at 17.5% and 82.5% semi-span is presented. It can be seen that the influence is large as a consequence of the better shock resolution of the finer grid. On the other hand, as was to be expected, the discrepancies are much smaller when integrated along the chord to obtain sectional forces and moments, as can be seen in Figure 8 for the lift coefficient.

3.3 Time Step Size
Figures 9 and 10 respectively show the real and imaginary parts of the first harmonic of the pressure distribution around the wing section at 92.5% semi-span calculated with the 80x16x30 grid using dimensionless time-step sizes $\Delta t'$ ranging from 0.002 to 0.01 (which correspond to CFLs from 30 to 150). This section at the wing tip has been selected because at its trailing edge the smallest cells are to be found, for which the stability limit should first be reached in accordance with (13). This is indeed the case as can be clearly seen in the zoomed region. Outside of this area the results are time-step independent. Fortunately this instability is very well behaved, growing only at a very slow rate at the same time that it spreads inboard and towards the trailing edge, so that meaningful engineering calculations could be performed at larger $\Delta t'$ without a significant loss of accuracy.

3.4 Deforming vs. Rigid Moving Grids
In the present method the instantaneous grid is computed by deformation of an initial steady mesh in such a way that the grid points near the wing surface are forced to closely follow the wing (whose motion is known as a function of time) whereas the displacements of grid points far from the wing surface gradually decrease and vanish at the outer boundary.

Dynamic grid deformation such as this is computationally expensive as it involves re-calculation of grid position, kinematics and metrics at each time step. For those cases in which the wing has no elastic deformations it is also
possible to perform the calculation on a grid that moves with the wing as a rigid body. This option, although theoretically less accurate than the former, is obviously computationally less expensive. To evaluate its influence on the results, calculations have been performed both with the usual deforming grid and with a rigid one. It has been found that differences are negligible. No figure is given because the differences are within the resolution of the graph.

3.5 Freestream Capturing
Calculations have been performed both imposing and not imposing the freestream capturing condition (16). In the latter case the cell volumes at each time step have been calculated analytically. Again the differences in the results are totally negligible.

4. PARALLEL IMPLEMENTATION IN NWT
The above presented scheme was originally developed in a Cray-YMP computer and has been implemented in the NWT (Numerical Wind Tunnel) machine of the National Aerospace Laboratory [19]. This is a distributed memory parallel machine with 140 vector processing elements (PE) and two Control Processors connected by a cross-bar network.

Each PE is itself a vector supercomputer similar to Fujitsu VP400 with a peak performance of 1.7 GFlops and includes: 256 Mbytes of main memory, a vector unit, a scalar unit and a data mover which communicates with other PEs. The resulting total performance of NWT is 236 GFlops and 35 GBytes.

The code has been parallelized using Fujitsu NWT FORTRAN, which is a FORTRAN 77 extension designed to perform efficiently on distributed memory type parallel computers. The extension is realized by compiler directives. The basic execution method is the spread/barrier method.

The present scheme always has two directions in which the computation can be performed simultaneously. Accordingly we can use one direction for vectorization and the other for parallelization. For the O-H grid used here the most natural way of parallelizing, i.e. assigning different vertical grid planes to different processing elements, has been used. We thus divide every array evenly along the k-index and assign each part to different PEs. The vectorization is made in i-direction, which usually has the largest number of cells.

With this partition, i-derivatives and j-derivatives can be computed in each PE without any communication. The computation of k-derivatives in PE$_k$ requires data stored in PE$_{k+1}$ and PE$_{k-1}$ which, in principle, would imply the need to communicate with the neighbor PEs, thus increasing the overhead. This is avoided using overlapped partitioned arrays. Array partitions are defined in such a way that adjacent partitioned ranges automatically overlap and have some common indices (with a depth depending on the stencil) so that copies of selected data at the interfaces between two PEs are stored at both local memories. In this way k-derivatives can also be computed in each PE without any communication. At the end of each calculation cycle, data in the overlap range of the partitioned arrays is harmonized by copying its value from the parent PE.

The above explained procedure can be maintained throughout the code except at the residual averaging subroutine, where the alternating directions method (ADI) employed prevents its use as it requires a sequential calculation. The inversions in the i- and j-directions can be done in each PE independently so that the k-parallelization can be maintained, with the vectorization in j-direction for the i-inversion and in i-direction for the j-inversion. As for the k-inversion, the process must be sequential in the k-direction so that we transfer the affected data from a k-partition to a j-partition. Then we can compute the k-inversion on each PE with vectorization in i-direction. At the end of the calculation the data is transferred back to a k-partition. Figure 11 depicts the calculation flow.

In Figure 12 the speed-up factor (ratio of CPU time in 1 PE to CPU time in n PEs) vs. number of PEs used is presented for calculations performed for the LANN wing with the 160x32x30 grid. The results strongly depend on whether the residual averaging technique is used or not, because of the need to transfer data between partitions. Its relative importance in relation to the normal data transfer workload decreases as the number of PEs used increases and both curves tend to reach a common limit. It must be borne in mind that the 160x32x30 grid only fills about 20% of the main memory of a single PE (less than 1% when 32 are used), so that the granularity of the problem is extremely low. The parallel efficiency is expected to dramatically increase for larger grids, as has been the case with other codes [20].

An indication of the CPU times required to march in time the solution for one period of oscillation (using a $\Delta t'$ of 0.01 for the coarse grid and 0.004 for
the fine one which respectively correspond to CFLs of 150 and 240) is given in Table 1.

5. CONCLUDING REMARKS
A time-accurate Euler code to calculate unsteady transonic flow about oscillating wings has been prepared and implemented in the NWT parallel supercomputer. The achieved performance has shown the feasibility of using this type of computationally expensive methods in an engineering environment. The influence of different parameters on unsteady computations has been studied.

ACKNOWLEDGEMENTS
This work was supported by a fellowship from the SCIENCE AND TECHNOLOGY AGENCY of Japan.

REFERENCES
[1] Whitfield, D.L.; Janus, J.M.; "Three-Dimensional Unsteady Euler Equations Solution Using Flux Vector Splitting"; AIAA Paper 84-1552; 1984
[2] Salmond, D.J.; "Calculation of Harmonic Aerodynamic Forces on Airfoils and Wings from the Euler Equations"; RAE Tech. Memo Aero 2011; 1984
[3] Sankar, L.N.; Wake, B.E.; Lekoudis, S.G.; "Solution of the Unsteady Euler Equations for Fixed and Rotor Wing Configurations"; Journal of Aircraft, Vol. 23, No. 4, pp. 283-289; 1986
[4] Sankar, L.N.; Malone, J.B.; Schuster, D.; "Euler Solutions for Transonic Flow Past a Fighter Wing"; Journal of Aircraft, Vol. 24, No. 1, pp. 10-16; 1987
[5] Belk, D.M.; Whitfield, D.L.; "Time-Accurate Euler Equations Solutions on Dynamic Blocked Grids"; AIAA Paper 87-1127; 1987
[6] Anderson, W.K.; Thomas, J.L.; Rumsey, C.L.; "Extension and Applications of Flux-Vector Splitting to Unsteady Calculations on Dynamic Meshes"; AIAA Paper 87-1152; 1987
[7] Brenneis, A.; Eberle, A.; "Application of an Implicit Relaxation Method Solving the Euler Equations for Time-Accurate Unsteady Problems"; J. of Fluid Engineering, Vol. 112, pp. 510-520; 1990
[8] Guruswamy, G.P.; "Unsteady Aerodynamic and Aeroelastic Calculations for Wings Using Euler Equations"; AIAA J., Vol. 28, No. 3, pp. 461-469; 1990
[9] Brenneis, A.; Eberle, A.; "Evaluation of an Unsteady Implicit Euler Code Against Two and Three-Dimensional Standard Configurations"; AGARD CP-507, Paper 10; 1992
[10] Jameson, A.; Venkatakrishnan, V.; "Transonic Flows about Oscillating Airfoils using the Euler Equations"; AIAA Paper 85-1514; 1985
[11] Ruiz-Calavera; "Calculation of Unsteady Transonic Aerodynamic Loads on Wings Using the Euler Equations"; INTA OAT/TN0/4510/005/INTA/95; 1995
[12] Jameson, A.; "A non-oscillatory Shock Capturing Scheme using Flux Limited Dissipation"; Princeton University MAE Report 1653; 1984
[13] Jameson, A.; "Transonic Flow Calculations for Aircraft"; Lecture Notes in Mathematics, Vol. 1127; Numerical Methods in Fluid Dynamics; Editor: F. Brezzi; Springer Verlag; pp. 156-242; 1985
[14] Batina, J.; "Unsteady Euler Airfoil Solutions Using Unstructured Dynamic Meshes"; AIAA J., Vol. 28, No. 8, pp. 1381-1388; 1990
[15] Radespiel, R.; Rossow, C.; "An Efficient Cell-Vertex Multigrid Scheme for the Three-Dimensional Navier-Stokes Equations"; AIAA Paper 89-1953; 1989
[16] Obayashi, S.; "Freestream Capturing for Moving Coordinates in Three Dimensions"; AIAA Journal, Vol. 30, No. 4; 1991
[17] Thomas, P.D.; Lombard, C.K.; "Geometric Conservation Law and its Application to Flow Computation on Moving Grids"; AIAA Journal, Vol. 17, No. 10, pp. 1030-1037; 1979
[18] "AGARD Three-Dimensional Aeroelastic Configurations"; AGARD-AR-167; 1982
[19] Hirose, N.; "Numerical Wind Tunnel Project and Computational Fluid Dynamics at National Aerospace Laboratory, Japan"; NAL TM-648T
[20] Miyoshi, H. et al.; "Development and Achievement of NAL Numerical Wind Tunnel (NWT) for CFD Computations"; Proceedings of IEEE Super Computing 1994

Table 1. CPU times to march the solution over one period of oscillation.

Grid         NWT (1 PE)   NWT (32 PE)   CRAY-YMP (M92) (1 PE)
80x16x30     15'          2.5'          59'
160x32x30    187'         31'           --
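The timings in Table 1 can be turned into the speed-up factor defined in the text (CPU time on 1 PE divided by CPU time on n PEs). The short Python sketch below does that arithmetic. The helper names are hypothetical, the parallel efficiency is taken here as speed-up divided by the number of PEs (a common definition, assumed rather than stated in the paper), and the two NWT columns are assumed to be in the same time unit.

```python
def speed_up(t_one_pe, t_n_pe):
    """Speed-up factor: CPU time on 1 PE divided by CPU time on n PEs."""
    return t_one_pe / t_n_pe


def parallel_efficiency(t_one_pe, t_n_pe, n_pe):
    """Assumed definition: speed-up divided by the number of PEs used."""
    return speed_up(t_one_pe, t_n_pe) / n_pe


# NWT timings from Table 1: (1 PE, 32 PE), same unit in both columns.
timings = {"80x16x30": (15.0, 2.5), "160x32x30": (187.0, 31.0)}

for grid, (t1, t32) in timings.items():
    s = speed_up(t1, t32)
    e = parallel_efficiency(t1, t32, 32)
    print(f"{grid}: speed-up {s:.2f}, efficiency {100.0 * e:.1f}%")
```

Both grids give a speed-up of about 6 on 32 PEs, i.e. an efficiency below 20% under the assumed definition, which is consistent with the remark in Section 4 that the granularity of these problems is extremely low.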

Fig. 1: LANN Wing. 80x16x30 Grid.
Fig. 2: Mean Part. 17.5% semispan. (Plotted against x/c for k2=0.5, k4=1/64; k2=1.0, k4=2/64; k2=1.5, k4=3/64.)
Fig. 3: Mean Part. 82.5% semispan. (Same three k2, k4 combinations.)
Fig. 4: First Harmonic. 17.5% semispan. (Same three k2, k4 combinations.)
Fig. 6: First Harmonic. 17.5% semispan. (80x16x30 and 160x32x30 grids.)
Fig. 7: First Harmonic. 82.5% semispan. (80x16x30 and 160x32x30 grids.)
Fig. 8: Lift Coefficient. 1st Harmonic.
Fig. 9: Real Part. First Harmonic. 92.5%
Fig. 10: Imaginary Part. First Harmonic. 92.5% (With zoomed region 0.95 to 1.00 x/c.)
Fig. 11: Calculation Flow. (An initial step, parallel for k, vector for i; OVERLAP; FLUXES, parallel for k, vector for i; OVERLAPFIX; INVERT i, parallel for k, vector for j, sequential for i; INVERT j, parallel for k, vector for i, sequential for j; MOVE ARRAYS, k to j; INVERT k, parallel for j, vector for i, sequential for k; MOVE ARRAYS.)
Fig. 12: Speed-up Factor. (Speed-up vs. number of PEs, 4 to 32, with and without residual averaging.)
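The INVERT i, INVERT j and INVERT k steps of the calculation flow in Fig. 11 are the one-dimensional implicit sweeps of the residual averaging. As a minimal sketch of one such sweep, the Python function below solves (1 - eps δ²) R' = R along a single grid line with the Thomas algorithm. The function name, the constant smoothing parameter `eps`, and the reflecting treatment of the two end points are assumptions for illustration, not details taken from the paper.

```python
def smooth_residual_1d(r, eps):
    """Solve -eps*x[i-1] + (1 + 2*eps)*x[i] - eps*x[i+1] = r[i] along one line.

    End rows use (1 + eps) on the diagonal (assumed reflecting ends), so a
    constant residual is left unchanged and the sum of the residuals is kept.
    """
    n = len(r)
    if n < 2:
        return list(r)
    diag = [1.0 + 2.0 * eps] * n
    diag[0] = diag[-1] = 1.0 + eps
    off = -eps  # both off-diagonals of the tridiagonal matrix

    # Forward elimination (Thomas algorithm): inherently sequential in i.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = r[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (r[i] - off * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # A single spike is spread over its neighbours while its sum is preserved.
    print(smooth_residual_1d([0.0, 0.0, 1.0, 0.0, 0.0], 2.0))
```

The recurrence in the elimination loop is what makes each inversion sequential along its own direction, which is why the text keeps the i- and j-sweeps local to each PE and moves the arrays from a k-partition to a j-partition before the k-sweep.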


Parallel Implicit Upwind Methods for the Aerodynamics of Aerospace Vehicles

K.J. Badcock¹ and B.E. Richards²

Abstract. Research at the University of Glasgow, based around implicit methods for solving the Euler and Reynolds' Averaged Navier-Stokes equations and to be reported in this paper, has targeted advanced CFD methods for tackling the complex flow fields of interest to aerospace vehicle designers. The requirements for this application are for efficient, high resolution schemes which can be ported to various MPP systems and implemented with robustness to give fast turn round times at competitive cost. It is recognised that the most demanding topics concern unsteady viscous flows and thus time accuracy and efficiency is pursued as a high priority. This paper then reviews the work, ongoing and planned, by the team at Glasgow in code developments embracing future computing environments and including some results not previously published. The example test cases used in the performance and sensitivity studies include the transonic flow results on the RAE 2822 aerofoil and ONERA M6 wing selected by AGARD. The computing environments to which the codes port include workstations, either used singly or clustered to provide a parallel computing domain, and also integrated distributed memory supercomputers such as CRAY T3D and Intel Hypercube systems. The paper outlines these technologies also.

1 Introduction

Aerodynamics has been established as a foundation technology for the design of aerospace vehicles. Good application of aerodynamics will lead to substantial economic benefits for future aircraft designs. Particularly important target areas include drag reduction to improve direct operating costs and better prediction of steady and unsteady loads on aircraft to overcome structural conservatism at the time of freezing the design. For the majority of aircraft, this requires the particular capability of predicting the phenomena of shock waves and flow separation. This can be achieved through a better understanding of the fluid mechanics of flow interactions using either experimental techniques or computational methods. Wind tunnel testing at simulated conditions, particularly for flutter, for example, is becoming increasingly expensive. On the other hand, with the rapid developments in computer hardware and computational techniques, the topic of computational fluid dynamics is reaching maturity as a viable way of providing design solutions.

A reasonable simulation of the fluid dynamics of high Reynolds' number can be obtained by solving the Reynolds' averaged Navier-Stokes (RANS) equations. Increasing computer power now makes the solution of these equations feasible. The level of turbulence model used needs to be a compromise between a simple eddy viscosity model such as Baldwin Lomax and a more complex second moment closure model. In this work the former is used, but the codes are starting to use the more general k-ω two equation model.

To satisfy the general requirements for a code suited for aircraft design, it should be accurate, efficient and robust and usable on future computer architectures. The general approach chosen by the University of Glasgow CFD Team in this work is to use high order upwind differencing to provide accuracy and robustness and to mostly use implicit methods to provide efficiency [5] [9]. Unstructured grids are also being considered by the Team as a way forward for dealing with geometric complexity but there are developmental difficulties in tackling viscous flows near boundaries and calls for high memory. Geometric complexity using structured meshes can be accommodated using multi-block grids which lend themselves to distributed memory computing architectures using a multidomain approach. The combination of an implicit approach on a structured grid for wall turbulent flows provides an efficient code, particularly for unsteady flows.

There exists a considerable variety of computer architectures from which to choose. The general consensus, however, is that competitively priced distributed memory massively parallel processors (MPPs) will provide the Teraflops facility (or greater) that will be required to tackle CFD solutions using the RANS model for flows over complete aircraft configurations. A number of vendors promise production of such Teraflops facilities in the near future, although the cost is likely to be beyond the means of all but the largest organisations. Also there needs to be a further investment in adapting the majority of existing codes to use it. There is a trend to provide a similar architecture at a much lower cost using workstation clusters.

¹ Lecturer, Department of Aerospace Engineering, University of Glasgow, Glasgow, G12 8QQ. UK
² Professor, Department of Aerospace Engineering, University of
I Glasgow, Glasgow, G12 8QQ, UK

Paper presented at the AGARD FDP Symposium on “Progress and Challenges in CFD Methods and Algorithms”
held in Seville, Spain, from 2-5 October 1995, and published in CP-578.

On the sort of broad-bandwidth networks that are planned for the future in Corporate networks, the type of powerful high memory workstations that are being used for detailed CAD/CAM design work in the engineering industry in the daytime are amenable to be turned loose at off peak times to provide a powerful high-memory system.

Creating the parallel computing environment for the Glasgow CFD Team has proven to be an interesting case history that it is appropriate to relate as a contribution towards the theme of this conference. Before 1990, computer systems within the Universities in the UK had undergone a major upgrade each seven years, funded by the University Funding Council, and this generally enabled the acquisition of a useful multi-user mainframe. Numerically intensive computer users then had access to off peak cycles through batch facilities. Before 1994 at Glasgow, for example, the sizeable University central Computing Service operated a CMS environment on an IBM 3090 150E vector facility for scientific work, with the help of a technological agreement with the vendor, as well as VME and VMS environments on sizeable ICL and DEC facilities, respectively. From another initiative the University also acquired a 32 transputer distributed memory Meiko Computing Surface, along with a systems manager, on which some early experience on parallel computing was developed by the CFD Team members to complement time awarded by peer review on National Facilities such as CRAY-XMP and YMP vector multi-processors. The Team's work could be classed at this stage in the category of high performance computing (HPC).

In 1994, the Funding Council support changed to a system of IT support on an annual basis. At the same time the University adopted an IT strategy to distribute the monies involved thinly to all Departments whilst providing a core support for: the overall Campus Network including a FDDI backbone (later an ATM backbone); and a UNIX cluster for core Computing (with a cost imposed on groups who used cycles above a threshold which was set at a low level). The implication of this University strategy was the need for HPC users to prise a proportion of their Department's allocation of funds and add it to other initiatives to secure the computing environment that they needed. Also at about the same time at National level, resources targeted for research (managed by EPSRC) were used to purchase a CRAY T3D with 320 DEC-Alpha nodes and following bids this was placed at the Edinburgh Parallel Computing Centre at Edinburgh University. This facility was designated for the exclusive use of a limited number of University Consortia to tackle Grand Challenge problems only.

With this background, the team then fronted two main initiatives to achieve an acceptable computing resource for its ambitions to develop state-of-the-art code that might be useful for aircraft designers. The first was to develop a University Consortium (finally, this included the Universities of Bristol, Glasgow, Oxford and Swansea and UMIST) that proposed a topic on Physically and Geometrically Complex Aerodynamic Flows for Aircraft Flight to use the Cray T3D facility. The title was changed to the shortened form Computation of Complex Aerodynamic Flows or CCAF Project after the proposal was accepted [2]. The other initiative was to develop a Consortium of Departments within the University to bid for resources under the New Technologies Initiative (NTI) for the development of a High Performance Parallel Computing facility from Spare Capacity on a Network of Workstations. NTI was developed by the Joint Information Systems Committee from funds that the Committee had secured themselves from the Higher Education Funding Councils to promote pilot studies towards developing state-of-the-art computing capabilities across the Universities. When the funds were awarded the University project was designated the HNW Project. These projects (both now have a year's maturity) give access to a world class resource to the CFD Team. These projects will now be described separately.

One target area of application for the CCAF project is towards the study of aeroelasticity at the edges of the flight envelope, an area in which the non-linearity of the problem poses considerable uncertainties and is likely to reveal interesting new mechanics. The challenge is to be in a position to complement experimental and analytical studies of these complex physical phenomena using facilities as powerful as the EPCC Cray T3D. Electrodynamics radiation is also included in the programme because of the commonality in grids and solution techniques and the opportunity to widen the application base of the project. The resource awarded is modest (around 64,000 processor hours per year), but with development being done on local computing environments with production tests done on the National facility, the resource is useful.

Two main computational approaches are being pursued in CCAF: structured grid work is at a more mature stage, parallelisation of multi-block codes and dealing with boundary layers is straightforward but dealing with geometric complexity is problematic; unstructured grid work copes well with geometric complexity but partitioning causes problems. The project includes comparisons between codes developed in order to determine the best future strategy. In the area of aeroelasticity, there is a dearth of experimental data of the quality and appropriateness for CFD validation. Nevertheless the Consortium has identified a suitable unsteady test case involving the AGARD LANN swept wing to provide an appropriately challenging common test case. The Glasgow Team is involved particularly with the development of a multi-block structured grid flow code meshed with a structural code made available from industry and uses on average 1,600 processor hours of T3D resource per month on this. Some preliminary results are reported below.

At the other end of the cost scale, the HNW cluster project was awarded 8 man years of effort by JISC over a period of 3 years. The six collaborating Departments in the University of Glasgow provided funds to purchase equipment and software for a pilot facility, which could also be used as a demonstrator for a dedicated cluster as well as a base for testing different
see http://www.aero.gla.ac.uk/ResearcWWfor full details

cluster technologies. Following a stringent selection process and based on the company's strong interest in the cluster technology, six Silicon Graphics Indys with MIPS R4400 processors, 64 MB memory and 17 inch monitors were selected and purchased. These were assembled together in one laboratory and connected using grade 5 UTP cabling to a 10baseT Ethernet switch, which is the standard presently used by the University, and this itself was connected to the network. Using PVM 3.3 message passing, excellent performance was achieved using the Team's CFD codes, with little latency [6]. An upgrade to ATM switching on the UTP cabling is planned in the near future to improve communication speed, as well as a multi-cluster activity with an adjoining University linked to the local ATM based Metropolitan Area Network (MAN) called ClydeNet.

From other research projects, ten more Indys have recently been added to the Departmental 10baseT network. The combined resource is available generally as a computing domain to users given an account. Apart from PVM being installed on the cluster as the message passing software for the parallel implementations, alternatives for users include MPI (CHIMP and LAM versions) as well as Oxford Parallel BSP. Clusters in other Departments in the University are beginning to be set up in a similar way.

Because of the heterogeneous nature of the user base of the cluster, a resource management system was required to optimise use of these cluster resources. The public domain software NQS, CONDOR and DQS and demonstration versions of the supported software CODINE™ and LSF™ were obtained and assessed. LSF™, written by Platform Computing Inc. of Toronto, had the best ingredients for the University based project, particularly a multi-cluster capability, and has been selected by a number of Industries, particularly some Aerospace Industries, as a means of managing the cluster resource. A University agreement, which included technological exchanges towards the future development of LSF™, made available a multi-platform site license to explore its use in a University environment and particularly this presently unique facility of managing multiclusters. The experience to date in its implementation is that improved load balancing, and hence a considerably better use of cycles, is made by now submitting jobs to the domain, rather than to a specific workstation. The software identifies the best resource for a job and carries it out transparently to the user. If a user wishes to reclaim use of a machine for interactive work, the part of the job being done on that machine is automatically checkpointed and migrated to another machine with spare capacity. PVM is embedded in the software so that it provides an ideal system for queuing and implementing parallel programmes at low cost. LSF™ provides excellent user interfaces, which help system managers of clusters to improve their service to users.

With continued development of the cluster technology, there is evidence that this type of affordable computing could be a norm in design offices within the Aerospace Industry. With this background on the technology used at Glasgow, Sections 2 and 3 of the paper outline the discretisation methodologies for the two and three dimensional codes and provide some new examples, and Section 4 discusses the parallel coding methodology used.

2 Two-Dimensional Method

The two dimensional thin-layer Reynolds averaged Navier-Stokes equations in generalised curvilinear co-ordinates $(\xi, \eta)$, with $\eta$ normal to the surface, can be denoted in non-dimensional conservative form by

[Equation (1)]

where $w$ denotes the vector of conserved variables, $f$ the convective streamwise flux, $g$ the convective normal flux and $s$ the normal viscous flux.

One implicit step, updating the primitive variables $P$, can be written as

$$\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_\xi}{\partial P} + \Delta t\,\frac{\partial R_\eta}{\partial P}\right)\delta P = -\Delta t\,(R_\xi + R_\eta) \qquad (2)$$

where $R_\xi$ and $R_\eta$ are terms arising from the spatial discretisation in the $\xi$ and $\eta$ directions respectively and $\delta P$ is the update to $P$ over the step.

In the present work the spatial terms are discretised using Osher's flux approximation with MUSCL interpolation and the van Albada limiter for the convective terms and central differencing for the viscous fluxes. The Baldwin-Lomax turbulence model is employed to provide a turbulent contribution to the viscosity but this is not linearised in time in the present work, i.e. turbulence contributions only appear on the right-hand side of equation (2). This has been found not to degrade the stability properties of the methods examined in this paper.

The alternating direction implicit version of equation (2) is

$$\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_\xi}{\partial P}\right)\left(\frac{\partial w}{\partial P}\right)^{-1}\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_\eta}{\partial P}\right)\delta P = R_{ap} \qquad (3)$$

where

$$R_{ap} = -\Delta t\,(R_\xi + R_\eta).$$

The ADI factorisation which appears on the left hand side of equation (3) has been widely used to approximate a solution to the system (2) because the banded structure of each of the factors makes it relatively easy to solve. However, the solution of the ADI system is not an exact solution of equation (2) and in practice the factorisation error (the error introduced by solving equation (3) rather than equation (2)) leads to a practical limit on the time step and introduces another source of error into the calculation. This motivates the use of a preconditioned conjugate gradient solution of the unfactored system.
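The origin of the factorisation error can be seen on a small model problem. The following Python sketch is illustrative only and is not taken from the codes described in this paper: it assumes that the matrix ∂w/∂P is the identity, and it uses two hypothetical 2 x 2 matrices, A_XI and A_ETA, in place of the Jacobians of the ξ and η residual terms. Under those assumptions the factored operator differs from the unfactored one by the cross term Δt² A_ξ A_η, so the discrepancy grows quadratically with the time step.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity_plus(dt, a):
    """Return I + dt * a for a square matrix a."""
    n = len(a)
    return [[(1.0 if i == j else 0.0) + dt * a[i][j] for j in range(n)]
            for i in range(n)]


def factorisation_error(a_xi, a_eta, dt):
    """Largest entry of (I + dt*A_xi)(I + dt*A_eta) - (I + dt*(A_xi + A_eta)).

    Assumes dw/dP = I, so the factored operator reduces to a plain product.
    """
    n = len(a_xi)
    factored = matmul(identity_plus(dt, a_xi), identity_plus(dt, a_eta))
    summed = [[a_xi[i][j] + a_eta[i][j] for j in range(n)] for i in range(n)]
    unfactored = identity_plus(dt, summed)
    return max(abs(factored[i][j] - unfactored[i][j])
               for i in range(n) for j in range(n))


# Hypothetical stand-ins for the residual Jacobians (not data from the paper).
A_XI = [[2.0, -1.0], [-1.0, 2.0]]
A_ETA = [[1.0, 0.5], [0.5, 1.0]]

for dt in (0.1, 0.2, 0.4):
    print(dt, factorisation_error(A_XI, A_ETA, dt))
```

In this model, doubling the time step quadruples the discrepancy between the factored and unfactored operators, which is consistent with the factorisation error imposing a practical limit on the time step.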

Conjugate gradient methods find an approximation to the solution of a linear system by minimising a suitable residual error function in a finite dimensional space of potential solution vectors. Several algorithms are available including BiCG, CGSTAB, CGS and GMRES. These methods were tested in [3] and it was concluded that the choice of method is not as crucial as the preconditioning. However, the CGS method was found to be the quickest of the three methods that do not require re-orthogonalisation and is used here. CGS has the additional advantage that the transpose of the matrix on the left-hand side of equation (2) is not required, reducing implementation difficulties. The CGS algorithm was derived in [10] and is restated in [12].

Denoting the linear system to be solved at each time step by

$$Ax = b \qquad (4)$$

we seek an approximation $A^{-1} \approx C^{-1}$ which yields a system

$$C^{-1}Ax = C^{-1}b \qquad (5)$$

more amenable to conjugate gradient methods. The ADI method provides a fast way of calculating an approximate solution to equation (4) or, restating this, of forming the matrix vector product

$$C^{-1}b = x. \qquad (6)$$

Hence, if we use the inverse of the ADI factorisation as the preconditioner then multiplying a vector by the preconditioner can be achieved simply by solving a linear system with the right-hand side given by the multiplicand and the left hand side matrix given by the approximate factorisation. The factors in $C$ are put in triangular form once at each time step with the row operations being stored for use at each multiplication by the preconditioner. This roughly doubles the storage requirements of the method.

To illustrate the performance of this method we present results for flow over an RAE2822 aerofoil at a free stream Mach number of 0.73, an angle of attack of 2.73° and a Reynolds number of 6.5 x 10^6. The comparison of convergence rates for the present method (called AF-CGS), straight ADI and an explicit local time-stepping method was made in [5] and an improvement in time to convergence of 25 per cent was noted for the present method when compared with ADI. When the free stream flow is used as starting conditions, the CFL number which yields fastest convergence for the AF-CGS method is 35 but CFL numbers of up to 50 can be used. The largest CFL number which yields a solution for ADI is 18 and hence removing the factorisation error allows the use of larger time steps. A further reduction in the time to convergence by a factor of five has been achieved by mesh sequencing. Three levels of mesh sequencing were used to provide a good starting solution on the finest mesh (257x65). Using this approach the optimal CFL number was increased to 100 and the overall time to converge to within 0.25 per cent of the fully converged lift value was reduced by a factor of 5. The time to convergence as a function of CFL number is plotted in figure 1 and shows a clear minimum. This is because there is a balance between increasing the CFL number to minimise the number of implicit steps and reducing the CFL number to minimise the number of CGS steps at each implicit step. The comparison of the pressure distribution with experiment for various levels of convergence is shown in figure 2 and shows good agreement with experiment.

[Figure 1. Time to converge to within 0.25 % of drag as function of CFL number on finest grid. Horizontal axis: CFL number.]

[Figure 2. Pressure distribution at various levels of convergence at a CFL number of 100 on the 257x57 grid. Horizontal axis: chord ratio [x/c].]

A similar approach has been used for unsteady flows over pitching and plunging aerofoils and aerofoils with moving flaps [4]. The main conclusion from this work was that AF-CGS does not allow the choice of time step from purely accuracy considerations because of the need to limit the time step to ensure the reasonable performance of the linear solver. However, AF-CGS does allow for larger time steps and a reduced computational cost when compared with ADI. For one particular case the stability restriction on the size of the time step is a global CFL number of 1000. The average CFL number during one cycle for the unfactored method is around 2000, translating into a saving in CPU

time of around twenty-five percent.

3 Three-dimensional Extensions

The extension of the method to three dimensions is complicated by two considerations. First, computer storage becomes a limiting factor due to the need to store large Jacobian matrices. Secondly, the ADI factorisation in three dimensions is significantly worse than in two dimensions, making its use as a preconditioner less favourable. This fact however means that there are increased gains to be made in three dimensions by the use of an alternative to ADI.

One step of the method considered can be written as

[Equation (7)]

where

$$R_{ap} = -\Delta t\,(R_x + R_y + R_z).$$

This two factor step can be loosely described as unfactored in each spanwise slice and approximately factored in the spanwise direction. A stability analysis [7] has shown that the method has similar stability properties to the two factor ADI method in two dimensions, representing a significant improvement on the behaviour of the three factor method in three dimensions. The linear system resulting from the first factor in equation (7) has a more complicated structure than the block pentadiagonal systems which are encountered for each factor in the three factor method. However, this system can be solved using a direct generalisation of the method described for two dimensions above, i.e. we solve the system

[Equation (8)]

by the CGS method where

$$A = \left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_x}{\partial P} + \Delta t\,\frac{\partial R_y}{\partial P}\right), \qquad (9)$$

$$C = \left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_x}{\partial P}\right)\left(\frac{\partial w}{\partial P}\right)^{-1}\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_y}{\partial P}\right) \qquad (10)$$

and

$$b = -\Delta t\,(R_x + R_y + R_z), \qquad (11)$$

followed by the solution of a block pentadiagonal system for the updates

$$\left(\frac{\partial w}{\partial P} + \Delta t\,\frac{\partial R_z}{\partial P}\right)\delta P = x.$$

The two factor method has substantially reduced memory requirements compared with the fully unfactored method. For the third order spatial discretisation there are 13 non-zero 5 by 5 blocks for the rows in the unfactored matrix associated with any one grid cell. This means that the number of floating point numbers which must be stored for the coefficient matrix for a mesh with $N$ cells is $325N$. Since $N$ can be of the order of one million for flows around basic wings, this implies that even if we can solve the linear system efficiently, storage requirements will be a limiting factor. For the two factor method only the matrix for one spanwise slice or one line in the spanwise direction need be stored at any one time. This has the effect of reducing the matrix storage requirements at any one time in the calculation to $\max(225N_{slice}, 125N_{line})$ where $N_{slice}$ is the number of grid points in a spanwise slice and $N_{line}$ is the number of grid points in the spanwise direction. Since $N_{line}N_{slice} = N$ it can be seen that the storage requirements have been reduced substantially (by around two orders of magnitude for the test case examined in this paper).

As a test case we shall consider flow over the ONERA M6 wing in transonic conditions. The experimental data for this wing is available in [13] with several previous computational results including those in [11]. The flow problem we consider here has a free stream Mach number of 0.84, an incidence of 6.06° and a Reynolds number of 11 million. For this case 250 explicit steps were required before FUN was used with a CFL number of 10. The residual is reduced about 4 orders of magnitude from its initial value. This was also observed for flows over aerofoils in [8] and was due to small oscillations in the pressure at the far field.

The comparison of the computed pressure distribution with the experimental results of [13] at six spanwise slices is shown in figure 3. Good agreement is obtained for the flow except for the position of the shock and the very last station at 99% span. This has also been observed in [11] for this test case. Shock induced separation occurs after the strong shock near the tip and the Baldwin-Lomax model is known to be inadequate for this phenomenon. In [11] the Johnson-King model was also implemented which significantly improved the results. Figure 3 shows that mesh refinement in the streamwise direction has very little effect on the solution apart from sharpening the strong shock near the tip. However refinement in the spanwise direction improves not only the resolution of the tip of the C-H grid, and hence the pressure distribution before the shock close to the tip, but also the strength of the first shock in the mid span region. This can be more clearly seen from the upper wing surface pressure contours shown in figure 4.

4 Parallel Implementation

A detailed description of the parallel implementation of the 2 and 3-D methods can be found in [6]. In the present section we summarise the main features and give sample results.

The major obstacle to an efficient parallel implementation of the AF-CGS method is the inherently sequential nature of the ADI procedure. This was overcome in [1] by using a transposition of the data to allow complete ADI sweeps to proceed independently on each processor. We use this approach here although extra communication is required for the present method because of the matrix-vector products required in the CGS algorithm.

The computational space is mapped onto the nodes by grouping complete mesh lines in both the $\xi$ and the $\eta$ directions onto a single node. Care has to be taken to make sure that $\xi$ lines on either side of the wake cut are mapped to the same processor. The computation then falls into three phases.

First, the matrix is generated and the factors are put in upper triangular form. The next phase is the multiplication of a vector by the matrix and finally we have multiplication of a vector by the preconditioner which reduces to back substitution on the triangular factors of the ADI factorisation. For each phase data is held on a node for complete lines in one direction in the mesh and the entire computation relating to that direction is completed. The data is then communicated so that information for complete lines in the other direction is held on a single node and the computation for that direction proceeds.

The parallel code was also implemented on a cluster of Silicon Graphics Indy workstations at the University of Glasgow. The message passing was accomplished by using PVM version 3.3. The comparison of algorithm speeds (time in μsec/grid point/time step) on the SGI cluster is shown in table 1. The results were obtained on a coarse mesh with only 2400 points and hence the loss in efficiency is quite small when this is considered.

Machine                 algorithm speed    efficiency
SGI cluster 1 node      958                1.00
SGI cluster 6 nodes     230                0.69
SGI cluster 8 nodes     194                0.62

Table 1. Algorithm speeds in μsec/grid point/time step on the SGI cluster.

The three-dimensional algorithm has two distinct phases. First, there is the generation and solution of the large linear system arising from each spanwise slice of the mesh. Secondly, there is the solution of the banded linear systems arising from the second factor in the spanwise direction.

The first phase is split between processors in two ways. First, the spanwise sections are split into groups. Each group is then assigned to a set of processors with each spanwise slice in the group being treated in a similar way to the two dimensional algorithm described above by those processors. The communication between the different groups of processors, each treating a different set of spanwise slices, is simply that which would be required by an explicit method so that the contributions to the residual (or the right-hand side of the linear system) from the spanwise fluxes at the interfaces between the spanwise groupings can be evaluated. Since there is significantly less communication involved at this stage than is required to solve a spanwise slice in parallel, it is clear that the most efficient partition of the problem will arise when as large a number of spanwise groups as possible is used. For a fixed number of total processors this will reduce the number of processors which operate on a spanwise section.

The second phase of the calculation involves assigning complete spanwise lines in the mesh to single processors. Again, a transposition of the data is used so that the calculation involving a single line can proceed on a single processor without further communication. Once the updates are available a second transposition is used to restore storage by spanwise slices for the next time step.

The method has been implemented in parallel on a range of machines. The algorithm speeds for the Cray T3D and the SGI cluster are given in table 2 for grids with 140000 grid points for the T3D and roughly half this number on the SGI cluster. The parallel efficiencies will increase when the grid is refined; however, a high parallel efficiency has been obtained on 128 nodes, even for this relatively small problem. Excellent efficiency is obtained on the SGI cluster.

No. of nodes    Explicit time steps       Implicit time steps
                speed     efficiency      speed     efficiency
T3D 1           417       1.00            1510      1.00
T3D 16          29.6      0.88            107       0.88
T3D 32          15.6      0.84            55.0      0.86
T3D 64          8.25      0.79            28.9      0.82
T3D 128         4.63      0.70            15.7      0.75

No. of nodes    speed     efficiency
SGI 1           2372      1.00
SGI 6           416       0.95

Table 2. Algorithm speeds in μsec/gp/ts and parallel efficiency for the Cray T3D and SGI cluster.

5 Conclusions

The programmes that are providing a world class computing environment for the development of CFD codes at the University of Glasgow were described. A high quality access to the 320 processor EPCC Cray T3D was obtained through forming the CCAF consortium on the problem targeted in this report. At the other end of the cost scale, the development and description of a parallel environment based on the spare capacity on workstations mounted on a quality network under the HNW project was described. It was predicted that this latter type of computing environment would be a standard within the design offices of Aerospace Companies in the future.

An implicit method for simulating three-dimensional compressible and viscous flow developed to run on a distributed memory parallel environment is outlined. The AF-CGS method is based on a two-dimensional approach which consists of an iterative solution of the linear system by the conjugate gradient squared algorithm with preconditioning by the alternating direction implicit factorisation. The FUN (factored-unfactored) method tackles three dimensional flows and builds on the two dimensional method by factoring the linear system into a factor arising from spanwise slices in the mesh and a block penta-diagonal factor arising from strips in the spanwise direction. The more complicated factors arising from the spanwise slices are solved by the two dimensional method. This approach yields a method which has similar properties to the 2d ADI method, a situation which is substantially better than

a three dimensional version of A D L A study concerning the [5] K.J.Badcock and B.E.Richards, ‘Implicit time stepping meth-
optimisation of the AFCGS codeusing RAE 2822 Case9 was ods for the NavierStokes equations’, in 12th AlAA CFD con-
carried out. Three levels of mesh sequencing were used to ference, San Diego. A M , (1995).
obtain a starting solution on a h e mesh of 257 x 65. Then [6] K.J.Badcock and B.E.Richards, ‘Implicit time stepping meth-
ods for the NavierStokes equations’, to appear in AlAA Jour-
the optimal CFL number used was increased to 100, and the
~ l(1995).
,
overall time to converge to within 0.25 per cent of the fully
[7] K.J.Badcock, I.C.Glover, and B.E.Richards, ‘Convergence ac-
converged lift value was reduced by a factor of 5. When ap- celeration for viscous aerofoil flows using an unfactored
plied to unsteady flows AFCGS was shown to allow for larger method’, in Second European conference on CFD, pp. 333-
time steps and a reduced computationalcost when compered 341.ECCOMAS. (1994).
to ADL [8] KJ.Badcock, 1.C.Glover. and B.E.Richards, ‘A preconditioner
The FUN code was tested through the prediction of the for steady two-dimensional turbulent flow simulation’, submit-
flows over the ONERA M6 Wing using the Cray T3D. Even ted for publication, May 1994, (1994).
for the relatively course grid tested parallel efficiencies of 75 [9] L.Dubuc KJ.Badcock, X.Xu and B.E.Richards, ‘Precondition-
per cent were achievedusing 128nodes. Improved efficiencies ers for high speed flows in aerospace engineering’, to appear
will be achievableusingfiner grids. The comparisonswith the in NumericalMethodsfor Fluid Dynamics V . Institute for Com-
putational Fluid Dynamics, Oxford, (1995).
experiment using a Baldwin-Lomax turbulence model were
[ 101 P.sonneveld, ‘CGS: A fast Lanczos-type solverfornonsymmet-
found to be satisfactory,but improvements are expected when ric linear systems’, SIAMJOUrMlStatiStiCSandcomputing, 10,
the k w turbulence model is implemented. 36-52, (1989).
Future work includes the development of multi-block ap- [ l l ] R. Radespiel, C.Rossow, and R.C. Swanson, ‘Efficient cell-
proach and the testing of the unsteady 3 d code and its cou- v&ex multigrid scheme for the three-dimensional Navier
pling with a structural code to tackle aeroelasticity cases. Stokes equations’,AIAA Journal, 28,1464- 1472, (1990).
Work is underway on multiblock extensionsof the methodol- 1121 M. Wtaletti, ‘Solver for unfactored schemes’, AlAA Journal.
ogy presented. 29,1003-1005. (1991).
[13] V.Schmitt and E C h q i n , ‘Pressure distributions on the
ONERA-M&Wmg at transonic Mach numbers’, Technical Re-
port AR-l38,AGARD, (1979).
ACKNOWLEDGEMENTS
This work has been carried out with the support of the Ministry
of Defence, the Engineering and Physical Sciences Research
Council, British Aerospace and the Joint Information Systems
Committee of the Joint Higher Education Funding Councils
under the following grants: EPSRC/MOD GW47371,
DRA/MOD/BAeFRNlC/407, EPSRC GR/K42264, NW65.
The authors would like to thank Mark Woodgate for obtaining
the three-dimensional results shown in this paper and Dr Ian
Glover for his contribution to the early part of the work. The
work on the cluster has been carried out by Bill McMillan,
Dr Xiaokun Zhou and Angus McCuish. The mesh generation
subroutines were supplied by Dr.A. L. Gaitonde of Bristol
University.

[Figure 3. Comparison of computed pressure distribution with experiment for ONERA M6 wing, plotted against x/c in panels labelled 0.44, 0.65, 0.90, 0.95 and 0.99. Solid line: 129 x 33 x 33 grid; dashed line: 129 x 33 x 97 grid; dotted line: 257 x 33 x 33 grid.]

[Figure 4. Surface pressure for 6.06 deg ONERA M6 wing, shown for the 129 x 33 x 33 grid and the 129 x 33 x 97 grid.]

PROGRESS AND CHALLENGES IN CFD METHODS AND ALGORITHMS


GENERAL DISCUSSION
J.W. Slooff, NLR, Netherlands
After Dr. Kroll has given his opening remarks we will open up to the
floor and try to get a, hopefully, lively discussion on various
issues and aspects that we may wish to address. But first, Dr.
Kroll, please give us your on-the-spot evaluation.
N. Kroll, DLR, Germany
Thank you for the invitation. I appreciate that I can act as the
evaluator for the CFD Symposium. Before I go into details, I would
like to mention that this evaluation reflects my personal thoughts
and is based mainly on the oral presentations. Only a few papers
reached me before the Conference, so I did not get the time to go
into the details of the papers. Therefore, in the written version
some of my statements may be revised, but I think the essential
messages will not change.
The background of this Symposium is the fact that CFD, as we all
know, is widely accepted as a key tool for aerodynamic design.
However, on the other hand, we also know that CFD still has
deficiencies in accuracy, complexity, robustness, and efficiency.
Due to this, in industry CFD is not yet being exploited as
effectively as one would expect. Therefore, this Symposium has been
set up with the aim to present and discuss those topics which are
considered as likely to constitute pacing items and new challenges in
CFD. The work presented here will be evaluated against the ambitious
theme of the Conference. From my point of view and from what I saw
in the invited papers, for the aeronautical industry CFD is expected
to deliver: 1. detailed viscous flow analysis for complex geometries
at high Reynolds numbers, 2. accurate prediction of aerodynamic
data, 3. fast turnaround calculations at acceptable costs, 4.
aerodynamic design and optimization of aircraft components or
complete aircraft and 5. interdisciplinary analysis. There may be
many other key problems, but in my opinion, these are among the most
important ones in order to raise the confidence level of CFD in the
aeronautical industry.
With respect to the scope of the Conference, I expected contributions
to the following topics: improvement of basic algorithms including
space discretization, time integration and fast iterative methods;
advanced techniques to treat complex configurations including
block-structured methods, unstructured and hybrid grids as well as
Cartesian and Chimera techniques; adaptive methods; parallel
computing; effective algorithms for more complex applications such as
turbulent flows, chemically reacting flows and unsteady flows; design
and optimization methods; effective methods for multidisciplinary
physics.
Three invited and 34 technical papers were presented. First I would
like to make some general remarks on the technical quality of the
papers. Many of the papers were of high quality because they
represented the current status of the CFD community and they
identified or presented new important directions of algorithmic
development in CFD. But in my opinion, also many papers of lower
quality were delivered which either were not within the scope of this
Symposium or did not represent the current status and progress of
CFD. Moreover, some of them reinvented well-established knowledge in
CFD. The Symposium covered eight major topics as shown in this
vu-graph.
Although many papers addressed several subjects, I categorized them
according to their central focus. The result of this classification
is somewhat different from the session grouping which was set up by
the Program Committee. There were 10 or 11 papers on advanced
discretization schemes, only three papers on fast implicit iterative
solvers and a bunch of papers on parallelization. Several papers
were given on unstructured meshes, overlapping grids and hybrid
grids. This morning we heard papers on adaptive grids and two papers
concerning specific algorithmic aspects on chemically reacting flows.
We had some papers on DNS/LES and on unsteady flows. In the
following, I would like to go through each subject and to make some
comments on what was presented and whether the major challenges were
addressed by the papers.
First let me say a few words about the invited papers. The first
paper was presented by Anthony Jameson. He gave an excellent
overview about the present status, challenges and future development
of CFD. He identified some important challenges, in particular the
3-D viscous flow simulation for high Reynolds numbers. He mentioned
that about 8 to 10 million points are needed in order to accurately
resolve turbulent flows and to predict the drag coefficient within
one count. His presentation on the unified theory for 1-D shock
capturing methods was very interesting. He showed that unifying
different schemes may help in designing new improved methods such as
his newly developed CUSP or HCUSP scheme. Jameson also addressed the
important topic of aerodynamic design and optimization.
The second invited paper was delivered by Paul Rubbert. He talked
about CFD research in the changing U.S. aeronautical industry. I
think it was a very interesting paper because he identified the
challenges which are beyond the technical ones. He made an analysis
of the process by which CFD capabilities are created. He mentioned
that research can be improved by introducing new principles like
customer focus and customer satisfaction. From the technical
reviewer's point of view, some comments and statements of the
aeronautical industry on the status of CFD and future requirements
would have been desirable.
The third paper was presented by Doyle Knight. He gave a nice
overview on parallel computing. Since he explained the terminology
used in parallel computing, he formed the basis for the audience to
follow the technical papers on parallelization. He discussed
important issues, and he gave several examples of experience with
parallel computing in the aerospace industry.
From my point of view, the Program Committee did a good job
concerning the selection of the keynote papers. In my opinion,
however, an invited paper on status and progress on grid generation
for complex configurations was missing. Although grid generation was
not a subject of this Symposium, I think that such a paper would have
been very helpful for the assessment of structured and unstructured
methods. The issues of turnaround time and accuracy of a numerical
method very often depend on the capability of the available grid
generation procedure.
Now let me come to the specific subjects. Several papers on parallel
computing were presented. It is obvious that routine use of CFD and
future large applications in aeronautics require parallel computing.
The papers addressed several important issues such as parallelization
strategies, portability, performance and load balancing. Some papers
were devoted to the adjustment of algorithms designed for sequential
computers to parallel architectures. The issue of scalability was
only barely addressed although it is one of the key features for
efficiently exploiting parallel computing. Only a few 3-D
applications on parallel computers have been presented. The
efficient use of parallel architectures for 3-D complex industrial
configurations seems to be still a major challenge. The reason for
this may be the problem of load balancing. In the case of
unstructured meshes, much work has been done with respect to domain
partitioning and some public domain software is already available.
However, for structured meshes the load-balanced partitioning is much
more complicated mainly due to the geometrical restrictions.
Furthermore, I believe that the adjustment of sequential algorithms
to parallel computers is not sufficient. New parallel algorithms
have to be developed. As an example, it is well known that the
multigrid method is not fully scalable, and therefore may not exploit
the full performance of massively parallel computers. In conclusion,
challenges for parallel computing for the near future are scalable
implementations of 3-D applications, load balancing for structured
mesh calculations and development of new parallel algorithms. Future
work should address these issues.
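To illustrate why load balancing is harder for structured meshes, the following minimal Python sketch distributes whole grid blocks over processors with a simple largest-first greedy rule and reports the resulting balance. The greedy rule, the function names and the block sizes are assumptions chosen for illustration; they are not taken from any paper presented at the Symposium.

```python
def greedy_partition(block_sizes, n_procs):
    """Assign whole blocks to processors, largest block first,
    always giving the next block to the least-loaded processor."""
    loads = [0] * n_procs
    assignment = [[] for _ in range(n_procs)]
    for size in sorted(block_sizes, reverse=True):
        p = loads.index(min(loads))      # least-loaded processor so far
        loads[p] += size
        assignment[p].append(size)
    return loads, assignment


def load_balance_efficiency(loads):
    """Average load divided by maximum load; 1.0 means perfect balance."""
    return sum(loads) / (len(loads) * max(loads))


# Hypothetical multi-block grid: cell counts per block (illustrative only).
blocks = [129 * 33 * 33, 65 * 33 * 33, 65 * 33 * 17,
          33 * 33 * 17, 33 * 17 * 17, 33 * 17 * 17]
loads, _ = greedy_partition(blocks, 4)
print("cells per processor:", loads)
print("load balance efficiency:", round(load_balance_efficiency(loads), 3))
```

Because blocks cannot be split freely, a single large block bounds the achievable balance; an unstructured mesh partitioner can cut the cell graph almost anywhere and does not face this geometric restriction.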
Now let me discuss the topic of advanced space discretization.
Various promising schemes have been presented showing some
improvements over conventional methods. For example, papers were
devoted to quadratic reconstruction with flux-limiters, improved
flux-splitting schemes, multidimensional upwinding and kinetic flux-
splitting. My criticism here is that in some of these papers, the
assessment of the new algorithms was restricted to only one or two
aspects of spatial discretization. In designing new discretization
schemes, several different aspects have to be addressed, including
high resolution of viscous shear layers, sharp shock resolution,
conservation, robustness at shocks and in expansion regions, overall
efficiency and compactness of the stencil. In my opinion, a detailed
comparison of available schemes covering all these issues is needed
in order to assess potentials and limitations of advanced algorithms.
I also would like to mention that very often a detailed accuracy
assessment of new or modified methods is not carried out. An
assessment study should include investigations with respect to grid
refinement and other important numerical sensitivities. Well-
established test cases for the Euler and Navier-Stokes equations
should be calculated to raise the confidence level of the new
techniques. Of course, the advanced methods have to be applied to
those problems for which the standard schemes show substantial
deficiencies. Multidimensional upwinding, from my point of view,
made large progress in the last few years, but I still think that
these schemes - and also kinetic algorithms - are not yet at the
stage to be used in a 3-D production code. The major challenge for
advanced discretization schemes is the accurate calculation of 3-D
viscous flows. Besides high resolution schemes, improved turbulence
models are required. Turbulence modelling, however, was not a topic
of this Symposium.
Let me come to the third subject on fast implicit and iterative
methods. Here good papers on Newton-Krylov subspace methods were
presented. For standard cases, like 2-D inviscid flows or viscous
flows with moderate Reynolds numbers, the more sophisticated methods
such as multigrid, Newton-Krylov subspace methods and advanced
implicit schemes perform almost equally well. From the literature,
it is obvious that multigrid is mostly used with structured meshes,
whereas for unstructured grids, very often Newton's method with
Krylov subspace iteration is applied. For 3-D flows around complex
configurations the situation is not clear. We do know that for
generic configurations and moderate Reynolds numbers multigrid is
quite efficient. There is not much known about the Newton-Krylov
methods. Open questions are the subspace dimension, the memory
requirements and the computational costs. Much more effort is
required to explore the capabilities and limitations of these
methods. In my view, the real challenge concerning the development
of efficient time integration algorithms was not addressed here. It
is the simulation of realistic Reynolds number flows in 2-D and 3-D.
Due to efficiency reasons, for these flows high aspect ratio cells
are required. Due to the lack of efficient smoothers, the
convergence behavior of the multigrid method gets worse. In the case
of Krylov subspace methods, a suitable preconditioner has to be
designed. Future work should be devoted to the development of
efficient time integration algorithms for stiff discrete equations
due to high aspect ratio cells.
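The open questions about subspace dimension and memory can be made concrete with a rough estimate. The sketch below assumes a restarted GMRES(m) iteration, which keeps m + 1 basis vectors of the full solution size; the evaluation above does not name a particular Krylov method, so GMRES, the function name, the five unknowns per point and the double-precision storage are all assumptions. The 8 million points are the lower figure quoted earlier for drag prediction within one count.

```python
def krylov_basis_memory_bytes(n_points, n_vars, subspace_dim, bytes_per_value=8):
    """Storage for the m + 1 basis vectors kept by restarted GMRES(m).

    Preconditioner and Jacobian storage are NOT included; they come on top.
    """
    n_unknowns = n_points * n_vars
    return (subspace_dim + 1) * n_unknowns * bytes_per_value


# Assumed case: 8 million grid points, 5 flow variables per point.
for m in (10, 20, 40):
    gigabytes = krylov_basis_memory_bytes(8_000_000, 5, m) / 1e9
    print(f"GMRES({m}): about {gigabytes:.1f} GB for the Krylov basis alone")
```

The estimate grows linearly with the subspace dimension, which is why the choice of m and the quality of the preconditioner (which lets m stay small) are tied together.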
There was a nice paper on time-preconditioning, which I think is a
very interesting approach to achieve Mach number independent
convergence. A key concern is the development of a unified flow
solver covering incompressible flows up to hypersonic flows. I do
not think that the technique is already mature; however,
preconditioning is a good candidate to reach that goal.
Another subject dealt with unstructured and hybrid methods. The
paper from Rockwell stated that unstructured grids are well suited
for inviscid flows including flows around complex 3-D configurations.
On the other hand, experience shows that for accurate viscous
calculation, some kind of regular cells are required in the boundary
layer. So the approach of hybrid grids may be a good choice because
it combines all the advantages of structured and unstructured meshes
and offers the possibility for an automatic simulation of 3-D complex
configurations. The work presented here on hybrid meshes is in an
early stage and substantial effort is required to establish a
valuable tool for complex viscous applications. For the simulation
of configurations with moving bodies, the overlapping grid technique
seems to be very interesting. The meshless technique approach is an
interesting idea, but it shows many deficiencies. For example,
conservation is not guaranteed and the control of accuracy is quite
difficult. Much work is required to get some confidence in this
approach. For all discretization strategies, considerable effort is
still required to significantly reduce the turnaround time for
viscous simulations for complex 3-D configurations. Some promising
results have been presented here.
The next subject covers the activities on adaptive methods. It is
well known that adaption is an important issue for cost-effective
calculations. Various strategies have been presented including mesh
movement and mesh refinement for both structured and unstructured
grids. In my opinion, there are several open questions, some of which
were addressed at the Conference. A key issue is the selection of
suitable criteria for grid adaption. As proposed by several papers,
finite element error indicators seem to be the right choice. They
ensure that the solution will not be sensitive to the adaption
pattern. However, so far in most applications local flow gradients
are used as sensors. In these cases, the estimation of the overall
accuracy is quite difficult and a grid independent solution may not
be obtained. With respect to parallel computing, the problem of
dynamical load balancing occurs, especially in the case of structured
meshes. The challenge of adaptive methods is the application to 3-D
viscous flow fields. As mentioned here by several authors,
considerable work is required to extend error based indicators to
viscous flows.
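For readers unfamiliar with gradient-based sensors, the following minimal one-dimensional Python sketch flags cells whose local solution gradient is large and splits them. It is an illustration of the kind of sensor criticised above, not an error-based indicator; the threshold rule, the function names and the tanh test profile are assumptions.

```python
import math


def flag_cells_by_gradient(x, u, threshold_fraction=0.5):
    """Return indices of cells whose |du/dx| is at least a given
    fraction of the largest cell gradient (a purely local sensor)."""
    grads = [abs((u[i + 1] - u[i]) / (x[i + 1] - x[i]))
             for i in range(len(x) - 1)]
    g_max = max(grads)
    if g_max == 0.0:
        return []                      # uniform solution: nothing to refine
    return [i for i, g in enumerate(grads) if g >= threshold_fraction * g_max]


def refine(x, flagged):
    """Insert a midpoint node into every flagged cell."""
    flagged = set(flagged)
    new_x = []
    for i in range(len(x) - 1):
        new_x.append(x[i])
        if i in flagged:
            new_x.append(0.5 * (x[i] + x[i + 1]))
    new_x.append(x[-1])
    return new_x


# Steep tanh profile standing in for a shock or shear layer.
x = [i / 10 for i in range(11)]
u = [math.tanh(20.0 * (xi - 0.5)) for xi in x]
flagged = flag_cells_by_gradient(x, u)
x_refined = refine(x, flagged)
print("flagged cells:", flagged)
print("nodes before/after:", len(x), len(x_refined))
```

Such a sensor only says where the solution changes quickly. It gives no estimate of the discretization error, which is why the overall accuracy of the adapted solution is hard to judge.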
Concerning unsteady flows, several papers presented time accurate
calculations for incompressible and compressible flows. Some
attempts were made to cut the cost of time accurate calculations.
However, it is obvious that new innovative concepts have to be
developed in order to efficiently simulate 3-D viscous unsteady
flows.
A few papers addressed LES and DNS. At the moment both simulation
techniques focus on fundamental research of flow physics, especially
turbulence. Specific requirements on LES and DNS solvers were
discussed including high resolution in time and space, adaptive grids
and parallel computing. Based on these sophisticated methods, one
paper held out a prospect of large eddy simulation of a clean wing at
moderate Reynolds number in the near future. In my opinion,
significant research work on both algorithms and subgrid model is
required to enable this simulation. Some of the papers dealing with
this subject did not meet the scope of this Symposium.
The topic of chemically reacting flows was covered only by two
papers. Both presented modifications and improvements of numerical
methods to meet the requirements of hypersonic reacting flows,
namely, sharp capturing of strong shocks, high resolution of viscous
regions, robustness in regions of flow expansion and efficient
solution of stiff equations. Promising results for 2-D and 3-D flows
were presented.
This concludes my technical comments on the various subjects. Now I
would like to give a few concluding remarks. Measured against the
theme of the Symposium, in my opinion, many papers of high quality
but also many papers of lower quality were given. Concerning the
technical standard, there was quite some difference between this
Symposium and other conferences such as the AIAA conferences in the
U.S. or ECCOMAS in Europe. In order to improve the quality of
the papers, one should ask for extended abstracts. CFD has so many
aspects and facets that it is very difficult to assess the quality of
a paper with only a few pages of abstract. Furthermore, one should
define some criteria or certain procedures which have to be met by
the abstracts. This could include the calculation of specific test
cases.
Nevertheless, I would like to say that all in all the Symposium was
interesting. We saw some recent developments and achievements in CFD
which I have mentioned before. Several problems were identified,
being pacing items for algorithmic improvements and new developments.
However, from my personal point of view, the Symposium did not
reflect the actual status of the CFD community compared to other CFD
conferences. Many leading experts were not present. Especially,
there was only a small contribution by the U.S. Many aspects and
recent developments were not addressed in this Conference.
Furthermore, in some areas, I think CFD is much more developed than
it was presented here. I would like to say a few words concerning
the challenges I have mentioned previously. Many of the key issues
important for industry were not covered here. No paper tackled the
problem of high Reynolds number flows. There were not many papers on
accurate drag calculation for viscous turbulent flows. Concerning
the problem of short turnaround time for complex configurations, some
advanced approaches including unstructured and hybrid methods were
presented. However, no paper addressed 3-D viscous calculations
around more complex geometries. There was only one paper - Jameson's
invited lecture - dealing with design optimization. No paper
addressed interdisciplinary methods, an issue which is definitely a
future challenge in CFD.
Finally, I would like to remark that, in my opinion, the scope of the
Symposium was too encompassing. It is almost impossible to cover all
important new directions in CFD within an AGARD conference of three
and one-half days. It would have been better to restrict the
Symposium to some specific subjects, say adaptive methods. In that
case a comprehensive overview and review of this particular subject
would have been possible. New directions and developments and their
critical assessment could have been addressed in more detail.

J.W. Slooff, NLR, Netherlands


Thank you very much Dr. Kroll for what I think was a very appropriate
and to-the-point evaluation with a good balance of critique and
praise. One small remark from my side, I think that there is an
internal conflict between two of your statements in the sense that on
the one hand you think not enough subjects of CFD were covered and on
the other hand, you said the scope was too wide for only three and
one half days.
N. Kroll, DLR, Germany
Measured against the scope and aim of this Symposium, not all
important issues and subjects were covered.
J.W. Slooff, NLR, Netherlands
We are now at the Open Discussion part of the final session. The
last thing I would like to do is to put the discussion into some sort
of a straitjacket, but on the other hand, it occurred to me that
if we go about this in a completely unstructured way, the discussion
might quite easily become pointless. What I suggest is that we
proceed in the following way. First of all, I would like to spend 5
or 10 minutes to give direct comments on some of the statements made
by the Technical Evaluator. Some of you may have the urge to do so.
After that, I suggest that we try to look at things from some
distance in order to get a better perspective of what we are doing
and what we are doing it for. In doing so, I think, we should add a
background as a sort of framework against which we can project our
comments, remarks and suggestions. On this background we should keep
questions in mind like: What are industry's requirements? What do we
have to offer as the CFD research community, what kind of new
developments? Which of these developments have the best prospects
for better meeting industry's requirements?
Before we start the actual discussion, I have to point out a few
administrative things. This Round Table Discussion is being recorded
and a transcript of the tape will be made. That does not mean that
you should confine yourself. You don't have to be afraid of saying
things that you will be confronted with afterwards. You will be sent
a copy of the transcript of the tape and you will have the
opportunity to edit your comments and remarks. In order to be able
to do so, it is necessary that you clearly state your name and
affiliation so that we know who spoke and who to send the transcript
to. I will come back later on some of the basic questions and
provide a little bit more framework for our discussion. But first,
who would like to give some direct comments on some of the remarks
that the Technical Evaluator gave us just a few minutes ago? I
imagine the Program Committee Chairman might have to say something.
J.A. Essers, University of Liège, Belgium
First of all I think that Dr. Kroll did a very good job, and perhaps
you will be surprised to note that I agree with almost everything he
said. But anyway, I have a technical remark to make on one point.
This point is concerning the fact that it is better to use a
structured grid on viscous flows, and perhaps an unstructured grid
is good for inviscid flows. I disagree with that. Many people
believe that with unstructured grids, you cannot compute accurately a
boundary layer or other shear layers. That is wrong. We made some
calculations with quadrilateral and triangular grids that were
extremely irregular and got very good accuracy, with for example,
parabolic re-construction techniques. My feeling is that many people
use discretizations which are not accurate enough for the viscous
terms when the grid is distorted, but if you use schemes which are
weakly sensitive to mesh distortions, it can work fine. Anyway, I
must confess that you usually need more points in a boundary layer if
you use unstructured grids than if you use a structured grid, so it
is anyway worthwhile to use the hybrid grids for that reason. Now I
would like to make another comment concerning Dr. Kroll's remarks. I
said that I essentially agree with what he said, but he should be
aware of the fact that the Program Committee of an AGARD meeting has
constraints that organizers of large conferences don't have. For
example, we are almost not allowed to have parallel sessions. And
parallel sessions would have been necessary. I proposed parallel
sessions, but I immediately had 10 opponents in the group, so it was
impossible to make it. That is, namely, because we need a Technical
Evaluator who cannot attend all sessions at the same time of course.
Something else, also, is that we have some, let us say, political
constraints, in the sense that it is important to AGARD that all NATO
countries can participate in such conferences, to present the status
of the research in their country, and that is a constraint other
conferences don't have.
B. Masure, STREHNA, France
The Technical Evaluator said that many papers did not address the
problem of the accuracy assessment. I ask the Technical Evaluator to
tell us what exactly an accuracy assessment for a code is.

N. Kroll, DLR, Germany
For structured meshes, one should make sure that by refining the grid
the results will become independent of the grid. Furthermore, the
order of accuracy claimed by theory can be checked by grid refinement
studies. In case of unstructured meshes the accuracy assessment may
be more difficult, but I think we can borrow some techniques from
finite element theory.
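A grid refinement study of the kind described in this answer is commonly evaluated with the standard three-grid formula for the observed order of accuracy, followed by Richardson extrapolation. The sketch below is a minimal Python illustration under the assumptions of a constant refinement ratio and solutions in the asymptotic range; the function names and the manufactured data are hypothetical and do not come from the discussion.

```python
import math


def observed_order(f_coarse, f_medium, f_fine, r):
    """Observed order of accuracy p from one scalar result (e.g. a force
    coefficient) on three grids with constant refinement ratio r > 1."""
    return math.log(abs(f_coarse - f_medium) / abs(f_medium - f_fine)) / math.log(r)


def richardson_extrapolate(f_medium, f_fine, r, p):
    """Estimate of the grid-independent value from the two finest grids."""
    return f_fine + (f_fine - f_medium) / (r ** p - 1.0)


# Manufactured data: f(h) = 1 + 0.5 h^2, i.e. a second-order scheme.
h = (0.4, 0.2, 0.1)
f_coarse, f_medium, f_fine = (1.0 + 0.5 * hi ** 2 for hi in h)

p = observed_order(f_coarse, f_medium, f_fine, r=2.0)
f_limit = richardson_extrapolate(f_medium, f_fine, r=2.0, p=p)
print("observed order:", p)
print("extrapolated value:", f_limit)
```

If the observed order falls well below the theoretical order of the scheme, the grids are probably not yet in the asymptotic range, or limiters and boundary treatment are degrading the accuracy.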

P.W. Sacher, DASA, Germany
Six years ago we had a big Symposium on the subject of Code
Validation, CFD Validation. This was a subject that I missed here,
specifically in your remarks, in your evaluation. Does it mean that
code validation is no longer an issue for CFD?
J.W. Slooff, NLR, Netherlands
I am pretty sure that Dr. Kroll is going to say that it is still an
issue, but that it was, on purpose, outside of the scope of this
Conference. Are there any further direct comments on the Technical
Evaluator's remarks?
F. Mokhtarian, Canadair, Canada
I just wanted to say that I agree with most of the comments of the
Technical Evaluator regarding the quality of the Conference and the
papers. However, the comment I would like to make is that I thought
his comparison of the Conference with some of the other conferences
was perhaps a bit unfair. This isn't the first time I have attended
an AGARD Conference. I have been to many other conferences in Canada
and the U.S. and you always get a variety of papers, you can't always
be very critical. It is very difficult to tell the quality of some
of the papers ahead of time. There were some papers I thought
perhaps were not exactly up to par, but there were lots and lots of
papers that were very high quality and I was glad I was able to
attend. The only comment I wanted to make is that I think he was a
bit harsh comparing this Conference with some of the other
Conferences.
N. Kroll, DLR, Germany
Essentially, I think you are right, but you have to read again the
title of the Conference. The Conference has the title, "Progress and
Challenges of CFD Methods". This is an ambitious title. You have to
make sure that most of the papers of the Conference will meet the
high demands of the Symposium.
J. van Ingen, Delft University, Netherlands
In regard to the remark by Prof. Essers about the boundary conditions
on an AGARD meeting I think we as a Panel have to think about what
especially is the place of AGARD in all of these CFD conferences. I
think CFD conferences organized by AGARD should have some specific
ideas in it to bring together the users and developers and maybe we
should leave the more fundamental subjects to a special conference on
that. The present criticism may be due to the boundary conditions we
have to put on AGARD conferences.
J.A. Essers, University of Liège, Belgium
Just to see if we agree, let me just make a few comments and ask a
question to Dr. Kroll. I think that, as you said, there were very
good papers in this Conference, there were some that were not as
good, of course. Unfortunately, very few papers here had a
sufficient vision of the future. But that remark is also valid for
many conferences. Nevertheless, I agree that the title of the
conference was perhaps too ambitious, but we couldn't know that
before receiving the submitted abstracts. I also would have liked to
hear more people saying why they use this technique instead of
another; why it is more appropriate because in the future they want
to tackle that problem and that problem. For example, I would have
liked somebody making an Euler calculation, explain that he uses that
scheme instead of another one because he thinks that that scheme will
be better for future viscous flow calculations. Nobody discussed
such issues. Is that perhaps what you wanted to say, Dr. Kroll?
N. Kroll, DLR, Germany
This is exactly what I wanted to say. I tried to define challenges
on the different subjects which should be addressed by the CFD
community.
A.G. Panaras, HAF Academy, Greece
I think that it should be more appropriate to state that progress has
been reported on some new ideas and not to make the distinction
between good and not good papers. Many authors have made substantial
efforts in preparing their work and certainly there is always
something new that comes in a conference like this.
J.W. Slooff, NLR, Netherlands
Now with your permission I would like to switch to the second part of
this discussion. I would like to get you thinking, if not talking,
about three key questions that we have to deal with and that we have
to get answers for. To trigger the discussion a bit further and to
perhaps provoke you a little bit into making comments, let me try to
list what I think are industry's three most important requirements.
One is to increase the confidence level of the codes, which means,
from my point of view, that for every application you would like to
know what the accuracy is. I don't think there are many codes that,
together with the Cp distributions and what have you, provide an
estimate of the accuracy. I think that is something that we should
strive for. Robustness, that is clear, is another aspect. Reduction
of the problem turnaround time has also been mentioned by Dr. Kroll,
of course. Here grid generation is the bottleneck, in particular for
the structured grid approach. That leads us to efficiency. We may
loosely define efficiency as accuracy divided by cost, and cost is
more or less proportional to time. We can distinguish between
manpower time needed, particularly for preprocessing, that is
geometry handling and grid generation and the pure computer time.
I don't think the post processing part of it is a big deal here. If
we look at that formula and address the different parts in it, we
know that accuracy is in the first place a function of the physical
model, including the turbulence model (that was on purpose not
addressed here at this Conference). The other important parameters
are the number of grid points, the distribution of the grid points,
the "order" of the method plus the artificial dissipation models and
whatever flux upwinding or multidimensional upwinding scheme is used.
On the cost side we have preprocessing, as I already mentioned, plus
the CPU time. For the latter, the number of grid points is again a
parameter, plus the "numerical" scheme, the solution algorithm, and
of course, the hardware. The latter is, however, beyond the scope of
this Conference.
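The loose definition given above can be written compactly as follows. This is only a restatement of the speaker's words in symbols; the symbols themselves (N for the number of grid points, t_pre for preprocessing manpower time, t_CPU for computer time) are assumptions introduced here, and post-processing is left out because the speaker considers it minor.

```latex
\[
  \text{efficiency} \;\approx\; \frac{\text{accuracy}}{\text{cost}},
  \qquad
  \text{cost} \;\propto\; t_{\text{pre}} + t_{\text{CPU}},
\]
\[
  \text{accuracy} = f(\text{physical model},\, N,\, \text{grid point distribution},\,
                      \text{order},\, \text{dissipation/upwinding}),
  \qquad
  t_{\text{CPU}} = g(N,\, \text{numerical scheme},\, \text{solution algorithm},\, \text{hardware}).
\]
```

Read this way, a development improves efficiency either by raising accuracy for a fixed N, by lowering the N needed for a given accuracy, or by cutting one of the two time contributions.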
What I would like to do is to discuss the current main developments
in CFD against the background of industry requirements, the
efficiency requirement in particular. If, as a baseline, we take the
currently well-established multi-block structured type of codes with
conventional types of schemes, let us say the Jameson Flo-57, -67
level of technology, we can try to estimate where we can improve,
relative to that baseline situation. For the unstructured grid
approach we may, for the same number of grid cells, have perhaps
somewhat less accuracy than for a structured grid solution. I am not
completely sure about that, and you may wish to comment. However,
there is certainly a lot of gain in the grid generation part. If we
look at adaptive grids, it is clear that unstructured as well as
structured adaptive grids have a potential for increasing the
accuracy for a given number of grid cells. However, grid adaptivity
also has the prospect of reducing the preprocessing and the grid
generation calendar time. This is because with adaptation the first
grid you start with doesn't have to be as good as is the case when
you do not have an adaptive grid approach. The CPU aspect for given
accuracy is also clear. I think the biggest advantages for adaptive
grids are in the unstructured case. I think there we have a bigger
potential for gain in accuracy for a given number of grid cells or
reduction of the number of grid cells for given accuracy than in the
case of structured grids. Higher order schemes, multi-dimensional
upwinding and similar refinements, are, of course, good for accuracy.
I am not quite sure what it means for grid generation. I have the
feeling that some of these more subtle schemes may require better
meshes than more conventional schemes. They also may require some
more computer time, at least for the same number of grid points.
However, because the number of grid points required for a certain
accuracy level may be less, we may still gain something. I don't
know what the balance is. You may wish to comment on that.
One further remark, on adaptive grids. We might wonder what is more
efficient - to implement a highly sophisticated higher order scheme
with the best thinkable multi-dimensional upwinding with only one
grid point in the shock wave, or to have an adaptive grid scheme
with, for example, two or three grid points in the shock. I am not
sure which of the two is more efficient. I will stop here. This is
just a little bit of provocation in order to get you out of your
seats, so to speak. Who would like to shoot at this or anything
else?
P.E. Rubbert, Boeing Commercial Airplane Group, U.S.
It is important to speak to how good you have to be, what the
target is. Not just faster, but how fast, etc. One of the things I
seem to detect at this Conference is that many of the speakers had in
their mind a different definition of the decimal point than I do. I
heard talk about working hard on grid generation to reduce the time
from three weeks to maybe one week or maybe one day. My experience
in using CFD in an airplane design environment is that when you are
talking about designing a wing, it wasn't too many years ago that
that involved a sequence of about 75 full blown CFD runs, part
analysis, part inverse design, etc. One day turnaround was
unacceptable. We do not want to take 75 days to design wings. The
decimal point belongs in terms of hours, not days. In our old design
environment, our target was to get three turnarounds in an 8 hour day
in the design environment. The challenge is now to reduce cycle time
even more. So I think it is worth saying that some of the targets
that I hear people setting for themselves will produce a capability
which is not really acceptable and useable in a real airplane company
environment.
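The arithmetic behind this remark can be checked with a short sketch. It assumes the roughly 75 runs are strictly sequential, so that turnaround time alone sets the calendar time; the function name is illustrative.

```python
import math

def design_days(n_runs, turnarounds_per_day):
    """Working days needed for a sequence of dependent CFD runs."""
    return math.ceil(n_runs / turnarounds_per_day)

one_per_day = design_days(75, 1)    # one-day turnaround
three_per_day = design_days(75, 3)  # three turnarounds in an 8-hour day
hours_per_turnaround = 8 / 3

print(one_per_day, three_per_day, round(hours_per_turnaround, 2))
```

Under that assumption, one-day turnaround means 75 working days for the wing, while three turnarounds per day brings it down to 25, with each cycle taking under three hours.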
J.W. Slooff, NLR, Netherlands
Thank you for that comment, and it reminded me that I forgot to
mention one aspect in relation to high order schemes and accuracy.
We are not looking for infinite improvements in accuracy. What we
need is, for a given accuracy that we want to obtain, but not
necessarily want to exceed, the highest efficiency, the shortest
preprocessing turnaround time and the lowest CPU cost. Higher order
methods usually have their greatest benefit if you require very high
accuracy. If you have lower accuracy requirements they may not be so
well suited for the purpose. In industry, you probably will agree
with me, different levels of accuracy are needed in different phases
of the design process. Industry is not always looking for the
highest accuracy. That is something we also have to bear in mind in
considering higher order methods.
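The trade-off described here can be illustrated with a simple cost model. It is only a sketch under stated assumptions: the error is taken to behave as C·h^p with the same constant C for both schemes, the mesh spacing h scales as N^(-1/d) for N grid points in d dimensions, and the higher order scheme is assumed to cost four times as much per grid point. None of these numbers comes from the discussion.

```python
def cost_for_error(target_error, order, cost_per_point, dims=3, error_constant=1.0):
    """Cost to reach target_error if error = error_constant * h**order
    and h = N**(-1/dims), so that N = (error_constant/target_error)**(dims/order)."""
    n_points = (error_constant / target_error) ** (dims / order)
    return cost_per_point * n_points

for target in (0.5, 0.01):
    low = cost_for_error(target, order=2, cost_per_point=1.0)
    high = cost_for_error(target, order=4, cost_per_point=4.0)
    cheaper = "second order" if low < high else "fourth order"
    print(f"target error {target}: cheaper scheme is {cheaper}")
```

In this toy model the second order scheme is cheaper at the loose tolerance and the fourth order scheme is cheaper at the tight one, which is the qualitative point made above. Where the crossover falls depends entirely on the assumed constants.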
P. Rubbert, Boeing Commercial Airplane Group, U.S.
The subject of accuracy - another thing I did not hear at the
Conference was any discussion of CFD with respect to the environment
in which we use it. I think it is very important that we learn how
to think about what we want from CFD in the presence of wind tunnel
analysis and the other tools that we have for doing airplane design.
For example, the question of accuracy. I heard many times people
setting goals like we would like CFD to be able to calculate drag at
this level of accuracy, and so forth, but the way we really do it in
industry is that we don't depend on any one tool to give us the total
answer. The total answer is arrived at by utilizing all of the
information at your disposal; the information that CFD provides, the
information that wind tunnels provide, your previous experience, etc.
Integrate that all together into a judgement as to what something
like the drag would be. Again, when we talk about accuracy of CFD,
if it is going to take us 6 months to build a wind tunnel model and
test it, that means one thing in terms of the amount of accuracy you
need out of CFD. But I heard some discussion this week about
stereolithography methods, and things like that that could lead you
in the direction of what one might call overnight model
manufacturing. If that happens in the wind tunnel, that has a major
influence on the type of accuracy levels you would need out of CFD.
If you could rapidly get a number out of the wind tunnel, maybe you
don't need to focus so hard on CFD accuracy. I guess my point is we
have to stop looking at CFD by itself. We have to learn to look at
it with respect to the total environment.
D. Knight, Rutgers University, U.S.
I would like also to focus on this question of accuracy. As I have
often understood it, it seems to be more of a question of accuracy as
a function of resource rather than resource as a function of
accuracy. Typically, for example, if you want to compute the total
pressure recovery in an inlet in an industrial environment, the
question is how long it will take to get within a certain accuracy.
That may be 1% for the total pressure recovery; if it is a design, it
may in fact be even smaller or perhaps larger. I think that we in the
CFD community do not yet focus enough on the question: given a level of
accuracy of a particular type, like total pressure recovery, what is
the resource required to get that. If you are in industry and you
have a week to do a computation, can you actually predict the total
pressure within 1%, or should you not try at all. Maybe that will
take 2 weeks and that is the information that you need to know. This
also raises the question of optimal design: the optimal design of
your algorithm in terms of reconstruction of high order methods, and
also the optimal design of your grid structure within that algorithm.
That, of course, brings to the fore the question of an estimate for
the accuracy of your scheme. That is an issue that was mentioned in
a number of papers including the earlier one this morning by
Friedrichs. In the CFD community we still do not have a good
measure of accuracy, or a way to predict it from our solution.
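One widely used way to estimate the accuracy of a computed quantity from the solutions themselves is Richardson extrapolation over three systematically refined grids. It was not proposed in this discussion; the sketch below is only an illustration. It assumes a constant refinement ratio and monotone convergence, and the total pressure recovery values are made up.

```python
import math

def richardson_estimate(f_coarse, f_medium, f_fine, refinement_ratio=2.0):
    """Return (observed_order, extrapolated_value, relative_error_of_fine)."""
    d_cm = f_coarse - f_medium
    d_mf = f_medium - f_fine
    # The differences must shrink with the same sign, otherwise the
    # solutions are not in the asymptotic range and no estimate is given.
    if d_mf == 0 or d_cm / d_mf <= 1.0:
        raise ValueError("solutions do not converge monotonically")
    order = math.log(d_cm / d_mf) / math.log(refinement_ratio)
    extrapolated = f_fine + (f_fine - f_medium) / (refinement_ratio ** order - 1.0)
    rel_error = abs(f_fine - extrapolated) / abs(extrapolated)
    return order, extrapolated, rel_error

# Hypothetical total pressure recovery on coarse, medium and fine grids.
order, extrapolated, rel_error = richardson_estimate(0.870, 0.930, 0.945)
print(f"observed order ~ {order:.2f}, estimated error ~ {100 * rel_error:.2f}%")
```

An estimate of this kind lets the user judge whether a target such as 1% has been reached on the finest grid, at the price of computing three solutions.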
S.V. Ramakrishnan, Rockwell Science Center, U.S.
I have one comment on hybrid grids. From our experience in
generating such grids, I can say that most of the difficulty lies in
the region near the body surface for complex configurations. If you
can develop a structured grid near the body, you might as well
develop such a grid everywhere, because it doesn't take too much work
to develop the grid away from the body surface. Therefore, if we
cannot solve the viscous problem with an unstructured grid, we may as
well not use it at all. There is no point in using hybrid grids.
P.G.C. Herring, British Aerospace Ltd., U.K.
A comment first on adaptive grids, I am not sure if the size of the
pluses and minuses is an indication of the potential benefit, but
some of the work that we have been doing is beginning to indicate
that in some cases it may not be worth using adaptive grids.
The time it takes you to develop the procedure and run the codes is
often longer than it would take just running 2 or 3 cases of an
ordinary grid. The other thing that surprises me is your last line
on parallel algorithms which indicates that, for CFD fluid problems,
there is not much benefit to be gained by going to parallelization.
You have a small positive in the last column. As I say, if the size
of the plus is an indication of the benefit, it appears we are not
yet ready for parallelization with CFD.
J.W. Slooff, NLR, Netherlands
I hasten to say that I have not been very consistent with the sizing
of the pluses. But I do think personally that a good grid adaptivity
scheme is one of the most important things that we have to go after.
J.A. Essers, University of Liège, Belgium
I have two very specific questions. The first one is about
chemically reacting flows. It is a question for Dr. Radespiel or Dr.
Marmignon. Perhaps someone can answer it. Well, in the abstracts we
received no proposals on the following subject. In the past, we
expected that there would be some developments on techniques using
different grids for different equations, for example, relatively
coarse grids for the flow equations and the finer grids for the
chemical reaction equations. I heard nothing on that issue in this
Symposium. Is that idea still around or is it forgotten now?
The second question is concerning the DNS method. In the past, I
expected that DNS would provide some kind of numerical wind tunnel or
experiment to construct better turbulence models, classical
turbulence modelling, or for example, for LES. Nobody addressed
that subject. My question is, at this time, are there some people
who use DNS to try to construct better models for turbulence or not?
Usually, when I attend a talk on DNS I hear nothing about that.
C. Marmignon, ONERA, France
I would like you to be more specific with the question if possible.
J.A. Essers, University of Liège, Belgium
In the past some people in the U.S., and I am thinking in particular
of Marsha Berger of the Courant Institute, although I am not sure,
were working in the field mentioned in my first question, i.e.,
the use of different grids when you have chemical reactions. Some of
them have very short relaxation times, leading to very sharp
gradients in a shock wave for example, so it could be worthwhile to
discretize the kinetic equation corresponding to that reaction with a
very fine grid and perhaps to use a coarser grid for another chemical
reaction and still a coarser grid for the flow equations. So you
could imagine having a series of grids, three grids, for example.
Of course, the points of the coarse grid would also be points of the
finest grid, like in multigrid techniques I would say, and for some
equations you would only discretize them on the coarse grids and
interpolate the results for the fine grids in order to save time. Is
there some research going on in that field? Maybe it is not
interesting, I don't know.
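The nesting idea in this question, where the coarse-grid points are a subset of the fine-grid points and quantities discretized only on the coarse grid are interpolated to the fine one, can be sketched in one dimension. The grid sizes, the linear interpolation, and the stand-in "flow variable" are illustrative assumptions, not a method presented at the Symposium.

```python
def nested_coarse_indices(n_fine, stride):
    """Indices of fine-grid points that also belong to the coarse grid."""
    if (n_fine - 1) % stride != 0:
        raise ValueError("stride must divide the number of fine intervals")
    return list(range(0, n_fine, stride))

def prolong_linear(coarse_values, stride):
    """Linearly interpolate coarse-grid values onto the nested fine grid."""
    fine = []
    for left, right in zip(coarse_values[:-1], coarse_values[1:]):
        for k in range(stride):
            fine.append(left + (right - left) * k / stride)
    fine.append(coarse_values[-1])
    return fine

n_fine, stride = 9, 4
x_fine = [i / (n_fine - 1) for i in range(n_fine)]
idx = nested_coarse_indices(n_fine, stride)

# Stand-in for a smooth flow variable computed only on the coarse grid.
flow_coarse = [2.0 * x_fine[i] + 1.0 for i in idx]
# Values made available on the fine grid, where a stiff kinetic
# equation would be discretized.
flow_on_fine = prolong_linear(flow_coarse, stride)
print(len(flow_coarse), len(flow_on_fine))
```

Here the smooth variable is stored at three points and used at nine. Whether the saving outweighs the interpolation error, and the stiffness issue raised later by Dr. Kroll, is not addressed by this sketch.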
C. Marmignon, ONERA, France
We have not looked at this point.
J.W. Slooff, NLR, Netherlands
On the last question, that is the DNS, LES question, I think I saw
three hands up there.
B. Geurts, University of Twente, Netherlands
The question you raised is, as explicitly mentioned, not within the
scope of this Symposium. If you are interested in it, I would like
to refer you to some of the work at Twente where we try to use DNS as
a data base for developing subgrid models for LES which is an
intermediate step for possible extension to Reynolds averaged
turbulence modelling improvement. We are not unique in the world,
there are several groups that have similar approaches in which they
start from DNS.
P. Comte, LEGI, Institut de Mécanique de Grenoble, France
I think all the LES community has tested models in comparison with
DNS, however DNS are currently restricted to fairly low Reynolds
number flows. If we want to use LES for higher Reynolds numbers,
maybe those comparisons wouldn't be that relevant.
N. Kroll, DLR, Germany
I just want to make a short comment on chemically reacting flows. In
my opinion the most severe problem is the stiffness of the discrete
system. I think you cannot overcome this problem by using different
mesh types. You have to develop efficient algorithms to overcome
that stiffness problem.
J. Jimenez, Escuela Superior de Ingenieros Aeronauticos, Spain
The question of the relationship between DNS and modelling is
something that has been considered for several years. It is a
question of what to expect. You cannot expect DNS to give you a
model. That has to be done by modellers. What DNS gives you is
"ground truth". It gives you what the real flow is doing, and it
gives you constraints on which models work and which ones do not.
This has been practiced extensively now, at the CTR in Stanford, at
Twente, as reported in this meeting, and at many other places. There
are cases in which DNS is almost the only data available, as in the
case of stress balances in separated flows, which are difficult to
measure and difficult to model, but which have been computed with
DNS. You can use those data to check whether a particular model
works or not and, if it does not, it is up to the modeller to come up
with a better one. That last step was, of course, outside the scope
of the present meeting. DNS can do this, and it can give you some
ideas of how to improve your model, but it cannot produce a model by
itself. It is as difficult to get good models out of DNS as it has
been to get them from experimental data. There is nothing magic
about DNS. It is just a better experiment.
J.A. Essers, University of Liège, Belgium
I don't know, but I suppose it is easier to get a lot of data from a
calculation like DNS than from experiments. I don't know a lot in
that field, but I would see DNS as a kind of experimental facility
that can provide you with a lot of information if you need it.
J. van Ingen, Delft University, Netherlands
I think we should not forget that there has been a time, I refer to
the Stanford trials in '68 and '80, when the idea was that we just
had to wait for the ultimate turbulence model and all our problems
would have been solved. Then, I think around the '80s, people
started to realize that there is not a single turbulence model. You
will have models for different kinds of flows. So if you say DNS is
providing a different approach to experiments, yes, but you will need
experiments, hence also these numerical experiments, in these
different kinds of flow. Having a problem with calculating high
Reynolds numbers will remain as long as you cannot do DNS for these
high Reynolds numbers.
J.W. Slooff, NLR, Netherlands
I thank you all for your contribution to this discussion, and in
particular Dr. Kroll for giving us the starting point for the Round
Table Discussion. I think Prof. Essers would now like to formally
close this Conference.
J.A. Essers, University of Liège, Belgium
I will make it very short. First of all, I would like to say that in
my opinion, the Conference was satisfactory from several viewpoints.
First of all in terms of attendance, if the attendance is a measure
of the usefulness to the NATO people. I just would like to let you
know that we had 124 attendees, including observers, Panel Members
and authors. For those of you who are interested in this, I just
show you a distribution of attendance per country so it could be
perhaps useful to some of you. That is just for statistics, let us
say. Now concerning the technical content, let me just say that I
agreed with many of the things Dr. Kroll said. Anyway, I think that
during this week we could at least answer some questions. For
example, we know, which was perhaps not obvious to all of us, that
there is still a lot of exciting future for CFD. That is great
because otherwise many of us would become unemployed in the near
future. I feel also that we are all convinced that there is no way
that CFD could replace wind tunnels in the future. They have to work
together and they are very complementary to each other. Their
complementary role should still be reassessed and used more
intensively in the future. Then I have some conclusions concerning
some work that could be done in the future. For example, I believe
that that issue on grids will be very important and I think that
there is no way to say that structured grids or unstructured grids
will be better. They have to be used together. For example, I would
like to remind you of that idea of hybrid grids and overlapping
grids, and all these things. I have already been interested in the
fact that you can have good error detectors and good error
estimators. It would be nice if we could generalize them to the
transient error and to evaluate the error due to the time
discretization. That would help a lot in unsteady flow calculations.
I also appreciated the talks on DNS. For example, it was very clear
that DNS could only be used in some small parts of the configuration,
for example, and then the issue would be to develop multi-block
techniques using DNS in some blocks, and some other models in other
blocks. The communication between blocks obviously still has to be
defined. This is an important issue for the future. Finally, I
believe that to accelerate the calculations, which I believe is
something very important, we should both use more efficient numerical
techniques, like implicit techniques and so on, and have
efficient computers. I liked a lot the talks on parallel computing,
namely the talk by Prof. Knight. I must confess that many of us who
don't use parallel computing are a little bit scared of using it.
But I think we should do it anyway, or we will be out of business. I
would however feel concerned by the portability issue. If you say it
could become very portable, it would be nice to use it.
To close this meeting, I would like to thank all of the people who
contributed to the success of this Conference. I will not thank each
of them separately because this will be done by Christian Dujarric in
a few minutes. I just would like to say that I am grateful to the
authors who prepared good papers: in particular, I feel very
satisfied by the fact that we received a copy of all of the papers
now, which is not so usual in AGARD Conferences, so you can go back
home with copies of all the papers. That is good in itself. I would
also like to thank the Programme Committee members, the session
chairmen and the technical evaluator, Dr. Kroll, who did a great job
in my opinion, and also the Spanish organizers. They had planned
everything including very good weather and they had a great party on
Monday. There were very nice facilities. I would like to thank you
for your attendance at this Symposium. I hope that you will go back
home and remember this Conference as useful for your work. I hope
that it will be very rewarding for your career, and wish you a good
trip back home.
C. Dujarric, Chairman, Fluid Dynamics Panel
Thank you Prof. Essers. Ladies and Gentlemen. We have now come to
the end of our Symposium. I think that we have identified together
promising research orientations. The scientific material will permit
each of us to formulate recommendations for our respective
organizations on those aspects of the use of numerical methods for
aerodynamics which particularly merit our efforts for their
development. The Fluid Dynamics Panel will use the results of this
Conference as one of the elements for its contribution to Working
Group Aerospace 2020. This Working Group will present to the highest
authorities of NATO the recommendations concerning the technological
efforts required to provide to the Alliance by 2020, radically
improved military capacity in spite of the tight expected budgetary
pressures.
A symposium bringing together all the panels of AGARD is planned in Paris in
the Spring of 1997 to present the conclusions of this Working Group.
This meeting will have in attendance military authorities, industry
representatives and researchers. This will be for us an occasion to
deliver our message, and I hope that many of you will participate.
This Symposium which has just finished, has been very well followed
as Prof. Essers has mentioned. In spite of some of the points made by
Dr. Kroll, it was very fruitful. The Program Committee deserves our
congratulations. We thank Prof. Essers, the Chairman of the Program
Committee. We also thank the members of the Committee, Prof.
Deconinck, Prof. Kind, Prof. Bonnet, M. Jacquotte, M. Lacau, Dr.
Korner, Prof. Panaras, M. Borsi, Prof. Slooff, Dr. Ytrehus, Prof.
Falcao, Dr. Corral, Prof. Jimenez, Prof. Kaynak, Dr. Poll, Prof.
Cantwell and Dr. Lekoudis. We warmly thank all the authors and all
of you who have helped us to have a lively discussion. We also thank
the Technical Evaluator, Dr. Kroll, who has presented his point of
view regarding our work. These comments will be attached to the
publication of the Round Table Discussion. A remarkable job of
organization was done to permit us to have our Symposium. I would
like to thank on behalf of the Fluid Dynamics Panel, the Spanish
authorities, in particular the National Delegates, for the invitation
to hold this meeting in Seville. I remind you that the Minister of
Defense for Spain and INTA have contributed to making our stay so
agreeable by financing the organization of our Conference. We are
very grateful. We thank in particular, Lt. General Mira Perez for
the wonderful evening we had last Monday.
We especially thank our Local Coordinator, Prof. Javier Jimenez as
well as Miss C. Gonzalez Hernandez, Spanish National Coordinator.
This Conference would not have been possible without the complicated
logistics whose operation relies largely on good will. So we thank
the interpreters who have succeeded in translating in spite of the
very technical character of our remarks, considering especially the
level of difficulty of doing so with certain speakers, perhaps myself
included.
We thank the technicians for keeping the equipment functioning, the
hostesses, as well as the people who welcomed us and helped in the
smooth running of the Conference.
Lastly, we thank the Secretary of our Panel, Anne-Marie Rivault, who
has just received the AGARD Personnel Medal for her devotion to the
FDP and who participates for the last time in a Symposium before
taking her well-deserved retirement.
We also thank the Panel's Executive, Mr. Jack Molloy for his very
effective support in the preparation of this Conference.
Now I would like to present you with our program for 1996. We will
have in the Spring a Symposium on the Characterization and
Modification of Wakes from Lifting Vehicles. This will take place in
Trondheim in Norway from the 20th to the 23rd of May, 1996. In the
Fall, if everything goes well, we will organize for the first time in
the history of AGARD, a Symposium in Moscow. It demonstrates the
recent opening up toward the countries of the old Soviet bloc. It
will cover the Aerodynamics of Wind Tunnel Circuits and Their
Components. The Russians have a great deal of experience in this
field and have promised to share this expertise. We begin a new
level of cooperation which will be technically extremely fruitful for
us. We will also have in 1996, two special courses at VKI, one on
Advances in Cryogenic Wind Tunnel Technology, and the other on
Aerothermodynamics and Propulsion Integration for Hypersonic
Vehicles. You are all invited to participate in our future programs,
and I hope to have the pleasure to meet you. Thank you for your
attention.
REPORT DOCUMENTATION PAGE
1. Recipient's Reference:
2. Originator's Reference: AGARD-CP-578
3. Further Reference: ISBN 92-836-0026-6
4. Security Classification of Document: UNCLASSIFIED/UNLIMITED
5. Originator: Advisory Group for Aerospace Research and Development
   North Atlantic Treaty Organization
   7 rue Ancelle, 92200 Neuilly-sur-Seine, France
6. Title: Progress and Challenges in CFD Methods and Algorithms
8. Author(s)/Editor(s): Multiple
9. Date: April 1996
10. Author's/Editor's Address: Multiple
11. Pages: 488
12. Distribution Statement: There are no restrictions on the distribution of this document.
    Information about the availability of this and other AGARD
    unclassified publications is given on the back cover.
13. Keywords/Descriptors:
    Computational fluid dynamics
    Design
    Algorithms
    Aerodynamics
    Parallel processing
    Parallel programming
    Computer architecture
    Unsteady flow
    Turbulent flow
    Aeroelasticity
    Fluid dynamics
    Chemical reactions
    Computerized simulation
    Computation
    Grids (coordinates)
14. Abstract
The papers prepared for the AGARD Fluid Dynamics Panel (FDP) Symposium on "Progress and
Challenges in CFD Methods and Algorithms", which was held 2-5 October 1995 in Seville,
Spain are contained in this Report. In addition, a Technical Evaluator's Report aimed at
assessing the success of the Symposium in meeting its objectives, and an edited transcript of the
General Discussion held at the end of the Symposium are also included.
Papers presented during nine sessions addressed the following subjects:
- parallel computing;
- advanced spatial discretization techniques;
- unstructured, hybrid and overlapping grids;
- adaptive meshes;
- fast implicit and iterative solvers;
- large eddy and direct numerical simulations of turbulent flows;
- chemically reacting flows;
- unsteady aerodynamics.
7 RUE ANCELL;
NATO -+-
\,'
92200%EbILLY?SUR-SEINE
OTW'
I

DISTRIBUTION OF UNCLASSIFIED
' I
t
FRANCE AGARD PUBLICATIONS
t f

~~
Te'lefax (1.)47.38.k7.99
~ ~
Telex 610 176 . .
AGARD hoMs limited,quantities:of the publications that accompanied Lecture Series and Special Courses held in 1993 or later, and of
AGARDographs and \"(orking.Orobp reports ubli'hed from 1993 onward. For details, write or send a telefax to the address given above.
Please do not telephone: * . % [
. AGARD does not hold stocks of .publications'$,t accompanied earlier-Lecture Series or Courses r bf 'any other publications. Initial
distribution fall AGARD publications is made t NATO nations through'the National Distribution Cektres listed below. Further copies are
sometimes itailable froythese centres (except incthe United States). If you have a need to rkceive all AGARD publications, or just those
relating to one or more specific AGARD Panels, thef ma)i.be willing to ,jnclude you (or{&our organisgtion) on their distribution list.
AGARD. publications-may be purctiased from the Sales Agencies listed below, in photocop$ or microfiche form.
I.

1
NATIONAL DISTRIBUTION CENTRES

BELGIUM
Coordonnateur AGARD - VSL
Etat-major de la Force aérienne
Quartier Reine Elisabeth
Rue d'Evere, 1140 Bruxelles

CANADA
Director Scientific Information Services
Dept of National Defence
Ottawa, Ontario K1A 0K2

DENMARK
Danish Defence Research Establishment
Ryvangs Allé 1
P.O. Box 2715
DK-2100 Copenhagen Ø

FRANCE
O.N.E.R.A. (Direction)
29 Avenue de la Division Leclerc
92322 Châtillon Cedex

GERMANY
Fachinformationszentrum Karlsruhe
D-76344 Eggenstein-Leopoldshafen 2

GREECE
Hellenic Air Force
Air War College
Scientific and Technical Library
Dekelia Air Force Base
Dekelia, Athens TGA 1010

ICELAND
Director of Aviation
c/o Flugrad
Reykjavik

ITALY
Aeronautica Militare
Ufficio del Delegato Nazionale all'AGARD
Aeroporto Pratica di Mare
00040 Pomezia (Roma)

LUXEMBOURG
See Belgium

NETHERLANDS
Netherlands Delegation to AGARD
National Aerospace Laboratory, NLR
P.O. Box 90502
1006 BM Amsterdam

NORWAY
Norwegian Defence Research Establishment
Attn: Biblioteket
P.O. Box 25
N-2007 Kjeller

PORTUGAL
Estado Maior da Força Aérea
SDFA - Centro de Documentação
Alfragide
2700 Amadora

SPAIN
INTA (AGARD Publications)
Carretera de Torrejón a Ajalvir, Pk.4
28850 Torrejón de Ardoz - Madrid

TURKEY
Millî Savunma Başkanlığı (MSB)
ARGE Dairesi Başkanlığı (MSB)
06650 Bakanlıklar-Ankara

UNITED KINGDOM
Defence Research Information Centre
Kentigern House
65 Brown Street
Glasgow G2 8EX

UNITED STATES
NASA Headquarters
Code JOB-1
Washington, D.C. 20546

The United States National Distribution Centre does NOT hold stocks of AGARD publications.
Applications for copies should be made direct to the NASA Center for Aerospace Information (CASI) at the address below.
Change of address requests should also go to CASI.

SALES AGENCIES

NASA Center for Aerospace Information (CASI)
800 Elkridge Landing Road
Linthicum Heights, MD 21090-2934
United States

ESA/Information Retrieval Service
European Space Agency
10, rue Mario Nikis
75015 Paris
France

The British Library
Document Supply Centre
Boston Spa, Wetherby
West Yorkshire LS23 7BQ
United Kingdom

Requests for microfiches or photocopies of AGARD documents (including requests to CASI) should include the word 'AGARD'
and the AGARD serial number (for example AGARD-AG-315). Collateral information such as title and publication date is
desirable. Note that AGARD Reports and Advisory Reports should be specified as AGARD-R-nnn and AGARD-AR-nnn,
respectively. Full bibliographical references and abstracts of AGARD publications are given in the following journals:

Scientific and Technical Aerospace Reports (STAR)
published by NASA Scientific and Technical Info...

Government Reports Announcements and Index (GRA&I)
published by the National Technical Information Service