M. Ramakrishna
Department of Aerospace Engineering
Indian Institute of Technology Madras
Chennai 600 036 India
Published 2011
Elements of Computational Fluid Dynamics
Ramakrishna Mokkapati, PhD
Department of Aerospace Engineering
Indian Institute of Technology Madras
Chennai 600 036
ISBN 978-93-80689-05-0
Revision Number: PRN20160217
© 2011 by Ramakrishna Mokkapati,
Department of Aerospace Engineering
Indian Institute of Technology Madras
Chennai 600 036
Contents
Preface
Chapter 1. Introduction
1.1. What is Computational Fluid Dynamics?
1.2. Modelling the Universe
1.3. How do we develop models?
1.4. Modelling on the Computer
1.5. Important ideas from this chapter
Chapter 8. Closure
8.1. Validating Results
8.2. Computation, Experiment, Theory
Appendix A. Computers
A.1. How do we actually write these programs?
A.2. Programming
A.3. Parallel Programming
Bibliography
Preface
Information regarding most topics of interest is available on the Internet. There
are numerous wonderful sources of CFD paraphernalia and knowledge. This book
sets out to cut a path through this jungle of information. In spirit, it is a journey
rather than a systematic exposition of CFDiana. This is not a historical monograph
of the author's personal journey. It is more of a guided tour that the author
has been conducting into the realm of applied mathematics. The target audience
has typically been engineering students. The pre-requisites are calculus and some
matrix algebra.
Computational fluid dynamics[CFD] requires proficiency in at least three fields
of study: fluid dynamics, programming, and mathematics. The first two you can
guess from the name CFD. The last two are languages; the last is a universal
language.
This is a book for an introductory course on CFD and, as such, may not be that
demanding in terms of fluid mechanics. In fact, it is possible to pitch this material
as an introduction to numerical solutions of partial differential equations[Smi78].
A good bit of the fluid mechanics used here will be basically one-dimensional flow
of gases. There is a lot of CFD that can be learnt from one-dimensional flows, so
I would suggest that a review of gas dynamics and the attendant thermodynamics
would help, especially in chapter 4.
That was about fluid mechanics; next is programming. Clearly, programming
skills are required, though in these days of canned packages one may hope to get
away without acquiring the programming skills. However, I feel that one learns best
from first-hand experience. So, there are programming assignments in this book,
and it would really help the student to be able to write programs. I strongly urge
you to do all the assignments in the book. They are not an addendum to the book.
They are an integral part of the book. Never ask someone whether something can
be done when you can try it out and find out for yourself.
Fluid mechanics and programming done, finally, we come to mathematics. Acquisitive is the word that comes to mind when I think of what should be the attitude of a student to mathematics. The following will see you through most of your
mathematics requirement as far as the manipulation part goes: product rule,
chain rule, integration by parts and the implicit function theorem. Most importantly,
a review of linear algebra [Str06][Kum00] would help, at least that of matrix
algebra.
As we go along, I will indicate material that you may choose to revise before
proceeding[Str86].
An unsolicited piece of advice on learning a new field: Learn the lingo/jargon.
You get a group of people together to work on some problem, after a time everyone
knows that the context of any conversation typically will be around that problem.
They get tired of describing things all the time. So, they could make a long-winded
statement like: "those lines in the flow field to which the velocity vector is tangent
at every point seem to indicate that the fluid is flowing in nice smooth layers". They
will eventually say: "the streamlines seem to indicate that the flow is laminar".
It is important to pick up the jargon. Many text books will supply the jargon in
terms of definitions. Other bits of jargon you may have to pick up from the context.
Try to understand the definition. Definitely commit the definitions along with the
context to memory.
This book is biased. There are times when I try to remove some of my personal
bias, for the most part however, it reflects a lot of my opinions on how things should
be done. It will not reflect all of the things that are done in CFD. I have however,
tried to give enough information for the student to pursue his or her own study.
I hope we lay the foundation for the analysis and intelligent use of any new
technique the student encounters. I hope students take away an understanding of
the subject that will allow them to read other material intelligently.
Throughout the book I will try to suggest things that you can do. The objective
is not so much to provide answers as it is to provoke questions.
I originally had a chapter on visualisation. I then realised that a chapter at the
end of the book on visualisation was not a help to the student. I have distributed
the material through the book. Any data that you generate, learn to plot it and
visualise it. Try to get a better understanding of what is happening.
It is clear from what I have written so far that I do not, in my mind, see
this book as being restricted to providing some skills in CFD. That is the primary
objective of the book. However, it is not the only objective. I truly believe that
while trying to study the specific one should keep an eye on the big picture.
Finally, there are other books that the student can look at for a variety of
related, advanced and background material[Ham73],[Wes04],[Act96] and so on.
This is not a comprehensive book on CFD. It is not a reference. It is meant to
be a journey.
Over the years, students have asked me why they are learning something. I
will give an example here. Why am I learning about the second moment of inertia
in statics? What use is it?
Engineering curricula have been designed by engineers. We have abstracted
out bits that are fundamental and invariant and moved them to a point in the
curriculum where they can be taught. Sometimes the student wonders why they
are learning something. They do not see any immediate application. Teachers
then jump to some future course as a justification: buckling of columns is easy
to illustrate with a yardstick and can be presented as evidence for the use of the
second moment of inertia to a student who has not studied differential equations and
eigenvalue problems. Spinning the yardstick about various axes is also useful in the
illustration as students can actually perform these experiments to see a correlation
with the second moment of inertia. The example, though not satisfactory, at least
allows the teacher to move on gracefully rather than bludgeoning the student with
authority.
Computational fluid dynamics is, fortunately, an advanced course and I am
able to determine the content and sequence in which the material is presented to
the student. In my courses, I have tried to arrange the material in a fashion that I
create the need before I present the solution. However, some of the strategies that
I employ in the class room do not translate well to paper.
This book is meant to be an introductory one. Though it contains some advanced topics, most of the emphasis is on why a few simple ideas actually work.
Some of the analysis techniques do not carry over to complex problems that you
may encounter later; the questions that these tools answer do carry forth. It is very
important indeed to ask the right questions.
We will start chapter 1 with the question: what is CFD? We will expand the
idea of modelling and show that we have a need to represent various mathematical
entities on the computer.
Having determined this need, we will demonstrate the representation of numbers, arrays, functions and derivatives in chapter 2. We will also answer the question: How good is the representation?
In chapter 3 we will look at some simple problems as a vehicle to introduce
some fundamental ideas. Laplace's equation is a good place to start as a confidence builder; then we go on to the wave equation and the heat equation. Through
these equations we will study the ideas of convergence, consistency and stability of
schemes that we develop to solve them.
Having established a foundation in modelling (the student hopefully has written
a variety of codes), we look at systems of non-linear equations in chapter 4. We use
the one-dimensional flow equations as a vehicle to extend the ideas from chapter 3
to fluid flow equations. A little background in gas dynamics would help. However,
in my experience, students without the background have managed. We will look at
some details of the algorithms and application of boundary conditions.
Now, up to this point, we have been looking at a hierarchy of problems which
we posed in the form of differential equations and boundary conditions. In chapter
5, we look at tensor calculus and derive the governing equations in a general framework. This material is independent of the preceding material and can be read at
any time by the student. On the other hand, the first few chapters can also be read
independent of chapter 5. Tensor calculus and the derivation of the general form
of the governing equations is important for the rest of the book.
Chapter 6 deals with flows in multiple dimensions, techniques of representing
the equations and applying the boundary conditions. It also gives an overview of
grid generation. We will look at just enough of grid generation so that you can
solve problems using grids other than Cartesian coordinates in two-dimensions.
A taste of variational techniques, random walk, multi-grid acceleration and
unsteady flows is given in chapter 7. The book ends with a closure in chapter 8.
This is meant to round off the discussion started in chapter 1.
There are a few appendices at the end. The first deals in a simple manner with
computers and programming. The rest provide the jargon and an exposition of the
minimal manipulative skill required for complex variables, matrices and Fourier
series.
I would suggest that a student work through the book in a linear fashion. For
the teacher there is more choice. I tend to start with Taylor's series, representation
of derivatives and quickly get to the iterative solution of Laplace's equation. This
allows the student to start coding Laplace's equation in the first week. I then start
each class with a conversation on the numerical solution of Laplace's equation, make
a few suggestions based on the conversation, and then proceed with the rest of the
material.
CHAPTER 1
Introduction
In this chapter, we try to find the motivation for all the things that we do in
this book. Further, we will try to lay the foundation on which the rest of this book
depends. At the end of the chapter, I hope you will have an idea of what you will
get out of this book.
1.1. What is Computational Fluid Dynamics?
We will start by putting computational fluid dynamics, CFD as it is often
called, in context. We will try to see what it is and what it is not. Here we go.
An initial attempt at definition may be along the lines: the use of computers
to solve problems in fluid dynamics. This unfortunately is too broad a definition of
CFD. Let us see why. We will first check out what CFD is not.
The general attempt at understanding and then predicting the world around
us is achieved by using three tools: experiment, theory and computation. To understand what computational fluid dynamics is not, remember that computation
is used along with experiment and theory in numerous ways. Experiments can
be automated. Raw experimental data can be reduced to physically meaningful
quantities. For example, one may have to do some post-processing to clean up
the data, like de-noising and so on. Data can be processed to convert many measurements to useful forms: like obtaining streamlines or master curves that capture
behaviour so as to aid design. All of this can be done with the help of computers.
Similarly, computers can be used to perform symbolic manipulation for a theoretician. They can be used for visualising closed-form solutions or solving intermediate numerical solutions. So, I repeat, defining computational fluid dynamics
as using computers to solve fluid dynamic problems is too broad a definition.
On the other hand, computational fluid dynamics is the use of computer algorithms to predict flow features based on a set of conservation equations. However,
you may have encountered computer packages that simulate flow over and through
objects. One could classify the use of these packages as experimental computational
fluid dynamics (ECFD). After all, we are using something like a simulated wind
tunnel to understand the problem at hand. In order to be good at ECFD, some
knowledge of CFD in general and the algorithms used in that package in particular would help. Though, an understanding of the physical principles and fluid
mechanics may often suffice.
We have seen some of the things that I would not consider to be CFD. So, what
is CFD? We use the conservation laws that govern the universe to build computer
models of reality. We want to understand the computer model and would like to
predict its behaviour. How faithful is it? How robust is it? How fast is it? All
of these questions are asked and answered in the context of fluid dynamics, so the
discipline is called Computational Fluid Dynamics. You should have a better
understanding of these questions by the end of this chapter, and some answers to
them by the end of the book. Let's now take a very broad view of affairs, before we
start getting into the nitty-gritty details of modelling/representing things on the
computer and predicting their behaviour.
1.2. Modelling the Universe
Modelling the universe is a very broad view indeed. The idea is that we are
interested in modelling any aspect of our surroundings. To illustrate the breadth of
application and the importance of these models, consider the following scenarios.
Scenario I:
It is 3:30 in the afternoon in Chennai. The sea breeze has set in
and is blowing in from the east. The residents heave a sigh of relief
as they recover from the hot afternoon sun. Leaves flutter in this
gentle breeze. Some flap up and down as though nodding in assent.
Some sway left to right. If it cools down enough, it may even rain
in the night. This is nature at work.
Scenario II:
The oil company has been prospecting. They have found oil off
the coast and want to pipe it to a refinery north of the city. The
crude oil that comes out of the ground is a strange concoction of
chemicals. We may lay roads with some of the chemicals, we may
drive on the roads using other chemicals. The flow of the oil in the
pipeline is clearly very important to the economic well being of a
nation.
Scenario III:
"At the mark the time will be T minus 10 seconds," says the voice over
the loudspeaker, which then proceeds to tick off the countdown: 9, 8,
7, 6, 5, 4... As the countdown proceeds, many scheduled events in
the launch sequence take place automatically. The strap-on rocket
motors are fired. Once all of them reach a nominal value of thrust, the
main rocket motor is fired. The thrust generated by the hot gases
main rocket motor is fired. The thrust generated by the hot gases
rushing out of the nozzles of all the rocket motors is greater than
the weight of the launch vehicle and it lifts off slowly from the pad.
There is a gentle breeze from the sea, enough to make the vehicle
drift ever so little. This is not a concern, however. The vehicle
accelerates quite rapidly as it thrusts its way to its destination of
delivering a communication satellite or a remote sensing satellite;
all part of the modern way of life.
All of these scenarios involve fluid dynamics. These are all situations where we
would like to predict the behaviour of our system. To add a bit of urgency to the
state of affairs, we suspect that the second and third scenario may be having an
effect on the first one.
We will use the first scenario to indicate that we may have issues of interest
that are small in scale or we could have questions that are enormous in scale. In the
first scenario, consider one small puzzle. Why do some leaves flap up and down,
while the others sway left to right? Call it idle curiosity, but one would like to
know. Is the direction of the breeze with reference to the leaf important? One
would think so. How about the shape of the leaf? Yes, I would think the shape
of the leaf would be a factor. What about the weight of the leaf and the stiffness
of the stem to which it is connected? All determining factors and more. The first
scenario also has a problem in the large embedded in it. Why does the sea breeze
set in? Is it possible that it sets in at a later time in the day, making Chennai an
uncomfortable city? Can we predict the weather somehow? Can we see the effect
of our activity on the weather? Can we do anything to avert a disaster or to mitigate
the effects of our activity? These are all very relevant questions and hot topics
of the day.
In the second scenario, fluid mechanics and modelling again play a very big
role. The oil field was likely located by using technology similar to echo location.
The oil company used a combination of explosions to infer the substructure of the
earth and drilled some sample locations for oil. There is still some guess work
involved. It is, however, better than wildcat prospecting where you take a random
shot at finding oil. Having struck oil or gas, we then come to handling the material.
Pumping to storage yards. How large should the pipes be? How much power is
required to pump the crude oil? How fast should we pump? Is it possible that the
pipeline gets clogged by the constituents of this crude?
Having fetched the crude where we want, we now process it to generate a wide
variety of products from fabrics to plastics to fuels. Are there safety related issues
that should worry us?
What happens to the reservoir of oil (or the oil field as it is normally called)
as we pump out the oil? How do we ensure we maximise the amount that we can
get out economically? If we are removing all of this material, what happens to the
space left behind? Do we need to fill it up?
The final scenario is one of exploration on the one hand and lifestyle on the
other. Access to space, especially low-cost access to space is something for which we
are striving. Putting something in orbit was traditionally done using rocket motors.
A rocket motor basically consists of a chamber in which the pressure of a working
medium is raised and maintained at a high level, say sixty times the atmospheric
pressure at sea level. This working medium, which may in fact be made up of
the products of combustion that led to the increased pressure, is then vented out
through a strategically placed nozzle. A nozzle is a fluid-dynamic device that causes
fluid that flows through it to accelerate while its pressure drops. The high pressure
acts on the interior of the rocket motor walls. This high pressure results in a net
force acting on the rocket motor walls. This is the thrust generated by the rocket.
Meanwhile, the launch vehicle powered by this rocket motor is accelerating through
the atmosphere. The flow over the vehicle becomes important.
Finally, it takes ages for the carbon in the atmosphere to be trapped by trees.
It takes very little time for us to chop the tree down and burn it. The consequences
of scenarios II and III on scenario I may be quite severe. Instead of the sea breeze
coming in, the sea itself may rise and come in if the polar ice melts and the sea
water expands, like all things do as they warm. With this in mind we can now ask
the question: How do we develop these models?
1.3. How do we develop models?
Let us look at the process that leads to a model. We observe the world around
us. It seems complex and very often unpredictable. Scientists and engineers take a
sceptical view of the astrologer[GU72]; all the same, they would like to predict the
behaviour of any physical system. In seeking order in our universe, we try mirroring
the universe using simpler models. These models necessarily exclude
(1) things that we do not perceive,
(2) features that we deem unimportant,
(3) phenomena that are too complex for us to handle.
We have no control over the first reason. How can we control something we cannot
sense? We can only try to keep our eyes open, looking for effects that we are not
able to explain. In this fashion, we can start coming up with ideas and theories
that explain the behaviour. We may even employ things that we cannot see; like
molecules, atoms, subatomic particles...
The last two reasons for excluding something from our model we handle by
making assumptions. In fact, a feature we deem unimportant is typically an
assumption that it is unimportant. We sometimes confuse the reasons for our
assumptions. These come in two forms.
We make assumptions that certain effects are unimportant. These can
and should always be checked. They are not as easy to check as you
may think. Let us look at a problem that has a parameter in it, say ε.
Consider a situation where assuming ε small makes the problem easier to
solve. So, we solve the resulting simpler problem. If ε is indeed small,
the assumption is consistent, but may be self-fulfilling. We do not know
if it makes physical sense and will actually occur. If ε turns out to be
large, our assumption is definitely wrong. (There is of course the more
complicated possibility of ε being small when it is considered in the model
and turning out to be large when neglected.)
A more subtle kind of an assumption comes from assuming that a
small parameter leads to small effects. In this context, the reader should
check out the origins of the theory of boundary layers. The tale starts
with the assumption that viscosity is very small, the creation of
d'Alembert's paradox, and the final resolution with the realisation that
small viscosity does not always mean small viscous effects.
We make assumptions that make our lives easier. Whether we admit it
or not, this is the fundamental drive to make assumptions. We can and
should try to find out the price of that ease.
There can be a difference of opinion as to whether an assumption is worthwhile or
too expensive.
Having made all the necessary assumptions, we come up with a model. We
now study this model and try to make sense of the world around us and maybe
try to predict its behaviour. If we can analyse a physical system and predict its
behaviour, we may be able to synthesise a physical system with a desired behaviour
and enter the wonderful world of design.
random motion about the mean may have to be accounted through an associated
new variable which is internal to our model, called the temperature.
This is what we have managed to do. We have come to a continuum. We
have field properties like mass, momentum and energy. We also have their densities
associated with every point in that continuum. Now, we can generate equations to
help us track the evolution of these properties. We do this by applying the general
principles of conservation of mass, conservation of momentum and conservation of
energy. The equations associated with these balance laws can be derived in a very
general form. We typically use the set of equations known as the Navier-Stokes
equations. These equations are derived in chapter 5. We will now look at a typical
application.
1.3.2. Example II - A Nozzle. We have seen that a series of assumptions
led us to the Navier-Stokes equations. These assumptions were quite general in
nature. We will proceed with that process in the context of an actual problem and
see where that takes us.
Let us consider the problem of the flow through a converging-diverging nozzle.
This is a very popular problem and these C-D nozzles, as they are called, are
cropping up everywhere. They are used to accelerate fluid from subsonic speeds to
supersonic speeds, when employed as nozzles. They will also function as supersonic
diffusers and decelerate supersonic flow to subsonic flow. They are used in Venturi
meters to measure fluid flow rate and as Venturis, just to create a low pressure
region in a flow field. The fact is, they have been studied in great detail and you
may have spent some time with them too. C-D nozzles continue to be of great
interest to us.
Figure 1.2. Nozzle with an imagined realistic flow. I say imagined since I just made this up. A zoomed in view of part of the
flow field is also shown. In this case the velocity variation along a
direction perpendicular to the wall is indicated.
So, we continue with the process of making assumptions in the solution to
the flow through a nozzle. In the context of fluid mechanics, for example, the flow
of a fluid in the region of interest may be governed by the Navier-Stokes equations.
This is an assumption (in fact, a whole set of assumptions). I am not going to
list all the assumptions in great detail here because there are too many of them.
We will look at a few that are relevant. The governing equations will capture the
laws of balance of mass, momentum and energy. Our objective is to predict the
parameters of interest to us using these general principles.
The walls of the nozzle are assumed to be solid. In this case, from fluid mechanics, we know the following three statements are valid and equivalent to each
other.
(1) The fluid cannot go through the wall.
(2) The velocity component normal to the wall is zero.
(3) The wall is a streamline
Footnote: As you build up your intuition, it is nice when you expect something and it
works out that way. After you have built up a lot of experience, it's a lot more fun when it does
not work out the way you expected.
Figure 1.3. Nozzle flow with the assumption that the flow is
quasi-one-dimensional.
Figure 1.3 shows a flow that has all the velocity vectors pointed in one direction.
The flow is uni-directional. However, it is more than that. A closer examination
of the figure shows that the speed of the flow seems to change along the length
of the nozzle. So, one would think that the flow depends only on one dimension.
It is directed along only one coordinate direction and therefore should be one-dimensional flow. Why does the speed change? Well, the area changes along the
length of the nozzle and that results in a speed change along the length of the nozzle.
To differentiate this situation from one where there is no such geometrical change
in the other coordinate directions, we call this a quasi-one-dimensional model.
We see that we can make a variety of assumptions that lead to models of
different complexity. The validity of our assumptions needs to be determined by
us. For instance, in the quasi-one-dimensional assumption, we assume the velocity
vectors are pointed along the axis and that the area variation alone contributes to
a change in the magnitude of the velocity. Our intuition tells us that the nozzle
problem will be better represented by a quasi-one-dimensional model as the area
variations get smaller.
So, from these two examples we see that we take a physical system of interest,
and we then come up with a model that we hope captures all the important features
of this system, at least those in which we are interested. Having come up with a
mathematical model, we need to study how it behaves. To this end, we turn to
mathematical analysis and the computer. Before we go on, let us summarise what
we are up to.
(1) We have the real world system. We would like to predict its behaviour.
(2) To this end, we come up with an abstract model. The behaviour of the
model, we hope, is the same as that of the real world system. So, we have
reduced our original problem to understanding our abstract model.
(3) Since the abstract model may not be amenable to direct prediction of its
behaviour, we create a computer model of the abstract model. Again, we
hope this computer model captures the behaviour of our abstract model.
So, we have reduced our problem to running the computer model.
I repeat, this book is about understanding the computer model and trying to
predict some of its more generic behaviour. The things we would like to know are
(1) How faithful is the model? This property is called fidelity. You have a
recording of a concert hall performance. Is it Hi-Fi, meaning is it a high
fidelity recording? Does it reproduce the performance?
(2) How robust is it? Does the model work for simple problems and not
work at all for other problems? A model which fails gently is said to be
robust. This means that the answer you get degrades (it is not as faithful
as we like it to be, the model fidelity degrades), however, you still get an
answer. Small changes in parameter values do not cause drastic changes
to the quality of the model.
(3) How fast is it? Will I get an answer quickly? Meaning: will I get the
result in time to use it?
All of these questions about our computer model are asked and answered in the
context of fluid dynamics, so the discipline is called CFD.
1.4. Modelling on the Computer
Now that we understand where CFD fits in the big picture, we can focus on
the computer model. In order to model something on the computer, we must
first be able to represent it on the computer. If we are not able to represent
something exactly on the computer, we approximate it. Even though we make an
approximation, which may or may not be the best approximation, we shall still refer
to it as our representation on the computer or simply the computer representation.
This whole book is about representations, especially those of mathematical entities related to fluid mechanics, on the computer. Some things we can represent
exactly, some not so well.
In the next chapter, we will look at how to represent mathematical ideas on
the computer. Before we do that, let us try to see what we need to represent. To
answer this, we take a step back to see what we actually want to do.
Computational fluid dynamics falls into the general realm of simulation. The
object of simulation is to create an automaton that behaves like the system we are
simulating. We would like to create an automaton that can solve the flow past an
object or a flow through a system and give us the data that is of interest to us. We
may use the data for design or for a better understanding of the behaviour of our
models and from there infer/predict the behaviour of our system.
Directly or indirectly, it involves the computational solution to a variety of
equations, especially differential equations. Which leads to the immediate question:
What does it mean to ask:
find the solution of a differential equation?
We will try to answer this question throughout this book. Here is the first
shot at it. A differential equation is a description of a function in terms of the
derivatives of that function. A closed-form solution is a description of the same
function/behaviour in terms of standard functions. A numerical solution could be
a description of the function by tabulated data, say.
Figure 1.4. A function g(x) and its derivative f(x) = g'(x). The
extremum of the function g(x) occurs at x_a. Calculus tells us that
f(x_a) = 0. So finding the extremum involves finding the zero of
f(x).
What does it mean to find the solution of any equation? We will look at
an example that just involves elementary calculus. Suppose you are told that the
profit that your company makes is determined by a function g(x), where x is some
parameter that determines the operation of your company. You have a simple question
that you ask: what is the value of x for which your profit g(x) is a maximum?
You have learnt in calculus that given a function g(x), one may try to obtain its
maxima or minima by setting the derivative f(x) = g'(x) to zero and solving for
x. The prime (') here indicates differentiation with respect to x. As I have picked an
example where g(x) is differentiable, we are reduced to finding a solution to the
equation f(x) = 0. Let us see how this works. Figure 1.4 shows the scenario we
just discussed. If I were to give you a candidate x_a and claim that it is an extremum for g(x),
how would you test it out? Well, you could find f(x) = g'(x), if that were possible,
and then substitute my candidate and see if f(x_a) was indeed zero.
So there you have it. You have a predicate: is g'(x) = 0? This will tell you,
given a value for x, whether it is a stationary point or not. You just need to
come up with ways by which you can generate candidate values for x. Most of the
problems/assignments that you have encountered so far would have involved finding
x. Maybe even simpler than the find-the-maximum problem given here. Finding
the square root of two can be posed as finding the zero of the function f(x) = x² − 2.
Ultimately, we find an x ≈ 1.414 which is a zero of this function (a small sketch of
such a search follows the two points below). There are two
points to take away from this simple example.
Footnote: I refrain from using the term analytic or analytical solution so as not to confuse it
with the technical meaning in other disciplines of mathematics.
(1) Given a candidate solution we can substitute back into the equation to
determine whether it is a solution or not. This is an important idea and
there are times later on in the book where we may have to take a weaker
stand on the substitute back into the equation part.
(2) The important lesson to take from the simple example is that the problems
that we have been solving so far have solutions which are numbers.
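To make this generate-and-test idea concrete, here is a minimal bisection sketch for the square root of two example (the bracketing interval and the tolerance are choices made for this illustration, not prescribed by the text):

    def f(x):
        return x * x - 2.0              # a zero of f is the number we seek

    # bisection: keep a bracket [a, b] on which f changes sign
    a, b = 1.0, 2.0
    while b - a > 1.0e-12:
        c = 0.5 * (a + b)               # generate a candidate
        if f(a) * f(c) <= 0.0:          # substitute back: keep the half with the sign change
            b = c
        else:
            a = c
    print(a)                            # approximately 1.41421356

Each pass generates a candidate and substitutes it back into the equation, using only the sign of the residue to decide where to look next.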
Let's study situations that are a little more complicated. We need to find the
zeros of functions of the form f(u, x), meaning

(1.4.1)    f(u(x), x) = 0
and the objective is to find the function u(x) that satisfies this equation. We could
call this a functional equation as we are trying to find a function. For example we
may seek a solution u(x) such that
(1.4.2)    u²(x) + x² − R² = 0,

which can be solved explicitly to give

(1.4.3)    u(x) = ±√(R² − x²).
It is important to note here that one is trying to find the function u(x). The
equations (1.4.1), (1.4.2) are called implicit equations, as the entity we want to
determine is embedded in an expression from which it needs to be extracted. As
opposed to equation (1.4.3), which is an explicit equation. The term we seek, in
this case u(x), is available on one side of the equals sign and does not appear on
the other side. The implicit function theorem tells us when we can extract out the
u(x) and write the implicit equation in an explicit form[CJ04].
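As a small aside, a symbolic algebra package can sometimes do this extraction for us. The sketch below (it assumes the Python package sympy, and uses the circle example above) asks for the explicit form of u(x):

    import sympy as sp

    x, R = sp.symbols('x R', positive=True)
    u = sp.symbols('u')

    # the implicit equation u**2 + x**2 - R**2 = 0, solved explicitly for u;
    # the two branches +/- sqrt(R**2 - x**2) should be reported
    print(sp.solve(u**2 + x**2 - R**2, u))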
So far we have looked at algebraic equations. Let us now consider a differential
equation that is, I hope, familiar to all of us: Bernoulli's equation. Consider the
streamline shown in Figure 1.5. s is a measure of length along that streamline
from some reference point on the streamline. On this streamline, we have for the
incompressible flow of an inviscid fluid

(1.4.4)    (1/ρ) dp/ds + d/ds (u²/2) = 0,

where ρ is the density, p(s) is the static pressure and u(s) is the speed at the point
s. Given u(s), and since ρ is constant for incompressible flow, one can actually
integrate along s and solve the equation for p(s) as

(1.4.5)    p(s)/ρ + u(s)²/2 = C
Figure 1.5. A segment of a streamline. s = 0 is the origin from
which we measure length along the streamline. In incompressible
flow, the speed of flow is a function of s, u = u(s), as is the pressure.
By the definition of a streamline, the direction of flow is tangent
to the streamline and is also a function of s.
A third problem we will consider is even more explicit in its application. Say you
walk from your room/home to the classroom every day. There are many possible
routes that you could take. Which is the shortest? Which is the fastest? Which
is the safest? There can be any number of selectors or predicates that we can
invent in order to pick an optimal path. Remember, we are picking a path and
not some point in our three spatial dimensions. We are picking a path that we may
be able to define as a function. Figure 1.6 shows this scenario in two dimensions.
It shows three possible paths between the two points A and B. The path we select
depends on which property of that path is important to us. However, it is clear
from the figure that we are talking about three different functions.
This then is the focus of this book: We are looking for solutions which are
functions. We will spend a little time looking at how to organise functions so that
we can search in a systematic fashion for solutions. We will generate algorithms
that will perform just such a search and study these algorithms on how well they
hunt for the solution.
When we do a search, it is clear that we will pick up candidates that are actually
not solutions. We check whether a given candidate is a solution to the equation by
actually substituting it into the equation. If it satisfies the equation, the candidate
is a solution. If it does not satisfy the equation it leaves behind a residue. This
Figure 1.6. Different paths to go from point A to B. Some could
be shortest distance, others could represent shortest time or least
cost.
residue is a signal that we do not have a solution. It can be used effectively in the
search algorithms of our computer model.
We see now, when we model physical systems on the computer, very often, we
are trying to find solutions to equations. Whether the solutions are functions or
otherwise, we need a mechanism to represent them on the computer. Why should
we be able to represent them on a computer? Does it sound like a stupid question?
Obviously if you want to solve a problem on the computer you need to represent
both the problem and potential solutions on the computer.
This is not quite the whole story. You may have heard of equations having a
closed-form solution. Let us find out a little more about closed-form solutions.
What exactly is a closed-form solution? Think of all the functions that you
learnt in your earlier classes. You learnt about the functions shown in Table 1.1.
You know polynomials. You know of a class of functions called transcendentals:
trigonometric, exponentials and logarithms. Typically you know how to construct
new functions by taking combinations of these functions. This could be through
algebraic operations performed on functions or by compositions of these functions
when possible.
Think about it, Table 1.1 about sums it up. Yes, there are a category of
functions called special functions that expand on this set a little. You can take
combinations and compositions to form new functions. If we are able to find a
solution to our problem in terms of these primitive functions, we say that we have
a closed-form solution. If I make up a function from these primitive functions, it is
likely you will be able to graph it on a sheet of paper (This idea was first proposed
by Descartes). You may have difficulty with some of them; sin(1/x) would be nice
to try and plot on an interval containing the origin. So, there are some functions that we can
write down which are difficult to graph. How about the other way around? Does
every graph that you can sketch have a closed-form representation? No!! That is
the point. Look at Figure 1.6 again. Can every possible path from your home to
the classroom be represented in combinations of these simple functions? Maybe,
Table 1.1.

    Basic Function                                  Examples of Combinations
    Monomials, f(x) = axⁿ                           g(x) = ax² + bx + c,
                                                    h(x) = (ax² + bx + c) / (dx² + ex + f)
    Transcendentals, f(x) = sin x, cos x, eˣ,       g(x) = eˣ sin x,
    log x                                           h(x) = log(|cos x|)
Consider the simple differential equation

(1.4.6)    dy/dx = f(x),    y(x = a) = c
This is a fairly simple equation. You think it is easy to integrate? How about

(1.4.7)    f(x) = exp(−x²)?

Even the simplest differential equation that we write may not have a closed-form
solution. The definite integral of the function given in equation (1.4.7) is extremely
important in many fields of study and is available tabulated in handbooks. The
indefinite integral does not have a closed-form.
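In fact, the tabulated definite integral just mentioned is so common that most standard libraries provide it through the error function. A quick check (a sketch using Python's math module; the particular upper limits are arbitrary) is:

    import math

    # erf(x) = (2/sqrt(pi)) * integral of exp(-t*t) from 0 to x, so the
    # definite integral of exp(-t*t) from 0 to x is:
    def integral_exp_minus_t2(x):
        return 0.5 * math.sqrt(math.pi) * math.erf(x)

    print(integral_exp_minus_t2(1.0))    # about 0.7468
    print(integral_exp_minus_t2(50.0))   # approaches sqrt(pi)/2, about 0.8862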
"Okay, big deal, there is no closed-form solution," you say. "Why do you want
a closed-form solution?" Closed-form solutions very often give us a better handle
on the solution. We can perform asymptotic analysis to find out how the solution
behaves in extreme conditions. We can use standard calculus techniques to answer
a lot of questions regarding maxima, minima, zeros and so on. Lastly, closed-form
solutions are great to check out computer models. More about this later. So, we
would like to have closed-form solutions. Failing which, we would like to represent
the solution on the computer.
Many fluid flow models that we use involve differential equations and finding a
solution is basically integration. Before we go further, we will take a closer look at
the process of integration. Integration is more difficult to do than differentiation.
You most probably knew that already. Why is this so? We have a direct method
by which we get the derivative at a point through its definition. Bearing in mind
that we are talking of a closed-form expression, the process of differentiation is
in fact pretty easy to automate. Integration is the inverse of this process. This
means that in order to integrate a function, we come up with a candidate integral
and differentiate it. If the derivative turns out to be right, we have the integral,
otherwise we try again. The process of integration fundamentally involves guessing.
Either we are good at guessing or we look up tabulations of guesses of others. So,
again, we need to be able to hunt in a systematic fashion for these functions.
This brings us back to our original need: The equations that govern the behaviour of our systems typically have functions as their solution. We need a mechanism to represent functions on the computer and algorithms to hunt for solutions
in a systematic fashion. Please note, there is many a response that will answer
this need. We will look at one class of answers.
First, we will see how to represent numbers on the computer. Normally, in
calculus we construct the real line so as to solve problems like finding the zero of the
function f(x) = x² − 2. Here, we will start at the same point and then we can
get on with vectors, matrices and functions. We will go through this exercise in
chapter 2.
1.5. Important ideas from this chapter
• We make assumptions to generate models that capture what we want, or
can, of the actual real-world system: always check your assumptions. This
is the general idea of abstraction. We remove the nonessential (or assumed
nonessential) and leave only the abstract model.
• Our models are mathematical and the solutions we seek here are likely
to be functions and not just numbers. We do not have a whole host
of functions that we can use to get closed-form solutions and therefore
have to consider, carefully, how we are going to represent functions on the
computer.
• Finding the solution to a differential equation is like integration. Integration is a process of guessing at a function whose derivative would be the
function at hand.
• Finally, remember that if you are fortunate enough to have your computer model
agree with your experiments, it may be that the errors in both of them
have gone in the same direction. A pinch of scepticism is always good.
We started this chapter talking of the sea breeze and pumping oil. By the time
you are done with this book, the hope is that you will have an idea as to how to
model them. At the least, you should have an idea of some of the difficulties that
you will encounter.
Footnote: Integration predates differentiation by millennia; that it is the anti-derivative is, however, a much more recent realisation.
CHAPTER 2
With one swipe we eliminated an infinity of rationals. Of course, the slippery part
of infinity is that we still have an infinite number of rationals from which to choose,
leaving us with only one question. Now, how do we distribute the numbers that
we choose to represent between [−L, L]? It may strike you that one may as well
distribute them uniformly. It turns out most people like to know what's happening
around zero. So, we pack more points close to zero. In the bad old days (good old
days?) everybody represented numbers in their own way. Now we have standardised
this representation on the computer. It is called the IEEE754 standard. A lot of
CPU (Central Processing Unit) manufacturers conform to this standard. In the
IEEE754, 32 bits are allocated to a floating-point number as follows:
    bit 31      bits 30-23      bits 22-0
    sign        exponent        mantissa
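One quick way to see this layout for yourself (a sketch; it uses Python's struct module, and the helper name fields is just a convenient label) is to pack a number as a four-byte single and pull the three fields back out of the resulting bits:

    import struct

    def fields(x):
        # pack x as an IEEE 754 single and read the 32 bits back as an integer
        bits = struct.unpack('>I', struct.pack('>f', x))[0]
        sign = bits >> 31                  # bit 31
        exponent = (bits >> 23) & 0xFF     # bits 30 to 23
        mantissa = bits & 0x7FFFFF         # bits 22 to 0
        return sign, exponent, mantissa

    print(fields(1.0))     # (0, 127, 0): the exponent is stored with a bias of 127
    print(fields(-2.5))    # (1, 128, 2097152): 2.5 = 1.25 x 2**1, sign bit set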
2.1.1. Machine Epsilon. The epsilon of a floating-point data type on a computing platform is defined as follows.
It is the smallest instance of that data type such that

(2.1.1)    1.0 + ε_m ≠ 1.0

on that particular machine. It can also be defined as the largest instance of the
data type such that

(2.1.2)    1.0 + ε_m = 1.0

on that particular machine. These two definitions will not give us the same value.
They will differ by a tiny amount. (Can you guess by how much they will differ?)
We will use the definition given in equation (2.1.1).
With hardware becoming compliant with the IEEE standards, the ε_m's are
gradually turning out to be the same across vendors. However, with optimising
software and hardware architectures it requires more effort to actually measure
ε_m.
Before you do the assignment: given what we have seen about representing
numbers using IEEE 754, can you guess what ε_m should be?
Assignment 2.1

Here is a simple algorithm to evaluate ε_m. Implement it in your favourite
programming language. Try it for single precision, double precision and long double
if it is available. Are you surprised by the results? Fix the problem if any.

    Candidate ε_m = 1.0
    while 1.0 + Candidate ε_m ≠ 1.0
        Candidate ε_m = Candidate ε_m * 0.5
    endwhile
    print "epsilon_m = ", Candidate ε_m
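A near-literal translation of this pseudocode (a sketch; it assumes numpy so that single precision can be forced, since a plain Python float is already a double) is:

    import numpy as np

    def candidate_epsilon(dtype):
        one = dtype(1.0)
        candidate = dtype(1.0)
        # halve the candidate until 1.0 + candidate is indistinguishable from 1.0
        while one + candidate != one:
            candidate = candidate * dtype(0.5)
        return candidate

    print(candidate_epsilon(np.float32))
    print(candidate_epsilon(np.float64))

The loop exits with the first candidate for which the test fails; whether that value is the ε_m of equation (2.1.1) is precisely the "fix the problem if any" part of the assignment.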
Did you get the ε_m the same for float and double? A float will typically occupy
four bytes in memory. A double will occupy eight bytes in memory. Now,
computers usually have temporary memory called registers with which they perform
their mathematical operations. So,

    Candidate ε_m = Candidate ε_m * 0.5

will result in Candidate ε_m and 0.5 being stored in two registers. A multiply operation is then performed employing the contents of these registers and should in theory
be stored back in memory as Candidate ε_m. The registers are usually at least as
large as the double precision variable, that is, eight bytes in size. So, even the single
precision computations are performed in double precision. However, the minute
Candidate ε_m is stored in memory it goes back to being a single precision value
since you have set aside only four bytes for it.
Here is one other test that you can try out. What happens if the test in
equation (2.1.1) is 1.0 − ε_m ≠ 1.0, that is, if we use a minus instead of a plus? You
can rework assignment 2.1 with this test.
Figure 2.2. The real line and some sample points on it that are
represented on the computer, along with the intervals that they
represent. Zero actually represents a smaller interval than shown
in the figure if we allow leading zeros in the mantissa. As the
points are clustered towards the origin, given a representation r,
the next available representation to the right is rε_m away and the next available
representation to the left is ½rε_m to the left.

What can we conclude from the definition of ε_m and the study we have made
so far? It is clear that any number that lies in the interval (1 − ½ε_m, 1 + ε_m) will
be represented on the computer by 1.0. So, here is the fundamental point to be
remembered when dealing with floating point numbers on the computer. Each
number represents an interval. This is shown in Figure 2.2.
The arithmetic we do on the computer is interval arithmetic. If some computation generates a number b, which falls into the interval represented by the number
a, then that number will become a on the computer. The difference r = b − a is
the roundoff error.
Definition:
The difference between a number and its representation on the computer is called
roundoff error in that representation.
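A small demonstration of numbers collapsing onto the interval owned by a representation (a sketch in Python, whose floats are IEEE 754 doubles) is:

    import sys

    eps = sys.float_info.epsilon       # the double precision epsilon

    print(1.0 + 0.4 * eps == 1.0)      # True: this sum falls in the interval represented by 1.0
    print(1.0 + eps == 1.0)            # False: this lands on the next representable number
    print(0.1 + 0.2 == 0.3)            # False: each literal is itself a rounded representation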
Assignment 2.2

Redo assignment 2.1. Find the epsilon of the machine using the test
(1) 1.0 + ε_m ≠ 1.0,
(2) 10.0 + ε_m ≠ 10.0, and values other than 10 that you may fancy,
and generating a new candidate using
(1) Candidate ε_m = 0.5 * Candidate ε_m,
(2) Candidate ε_m = 0.1 * Candidate ε_m, and any values other than 0.1 that you may fancy.
How does roundoff error affect us and why should we be worried about it? First,
it is clear from the definition and everything we have seen so far that the act of
representing a number on the computer with finite precision is very likely to result
in roundoff error. If you combine two numbers with roundoff error in any particular
arithmetic operation, you may have a significant error in the result which is greater
than the original roundoff error due to the representation of the two numbers. That
is, if c = a + b the error in c may be much more than the roundoff error in a and
b. Unfortunately, colloquially, this error in c is often referred to as roundoff error.
We will call it the cumulative roundoff error or accumulated roundoff error.
Every numerical computation we perform, we are actually doing operations
between intervals and not numbers [Ded63], [Moo66]. On the computer, we start
with an uncertainty when we represent our number. As we perform operations
upon these intervals... if the intervals grow, our uncertainty grows. We should
be concerned that the roundoff errors in our computations may accumulate over
numerous such operations.
We define cumulative roundoff error as the net error in a numerical value
which is the result of a sequence of arithmetic operations on numbers that may
have either roundoff error or cumulative roundoff error.
Before we go on with our discussion we will look at cumulative roundoff error
a little more closely as a justification for the remarks that we have just made. It
seems that cumulative roundoff error is an accumulation of roundoff errors. Is it
just a matter of a lot of ε's accumulating? Actually, it can be worse than that.
Case 1: Consider the difference between two positive numbers that are very close
in value. This is the whole point of the calculation. We try to compute
1 + 2ε − 1. If all the digits in our original number are significant, how
many significant digits do we have now? Very often not all the digits are
significant digits. We happen to have a number A which is 1 + 2ε, which
we have arrived upon after many calculations. It is possible that it should
actually have been 1. Maybe the actual answer to A − 1 should have been
zero and not 2ε. (A short demonstration of this loss of significance follows Case 3 below.)
Case 2: There is a second kind of a problem that can occur. Bear in mind our
experience with polynomial algebra. We can add the coefficients of two
terms only when the terms are of the same degree in the variable. That is,
3x² + 2x² = (3 + 2)x² = 5x². The same holds when we do floating-point
arithmetic on the computer. After all, a floating-point number on the
computer is really a signed mantissa coefficient multiplying xⁿ, where n is
the exponent part of the floating-point number and in the binary system
x = 2. So, when we combine two numbers with addition or subtraction,
we need to make sure that the exponent n is the same for the two numbers
before we actually perform the operation. Which means, one of the mantissas will have as many leading zeros added to it as required to make the
exponents equal. So, in performing this operation we have fewer significant digits. In fact, you could possibly have none at all! You may protest,
saying this is why we dropped fixed-point arithmetic in the first place.
This is not quite right. If we had a situation where n = 20 for the first
number and n = −25 for the second, fixed-point arithmetic would not be able to
represent either, especially if the nominally expected values are near one.
In the case of floating-point arithmetic, we have possibly a more accurate
representation for both the numbers, and one of them loses accuracy
when combined with the other in an addition or subtraction.
Case 3: The two cases we have seen just now deal mainly with errors caused by the
limited mantissa size. There is one more situation where the mantissa limit
will haunt us. We may encounter numbers where the exponent cannot be
adjusted and we are essentially stuck with fixed-point arithmetic in the
representation of the numbers.
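The first two cases are easy to provoke on any machine. The sketch below (plain Python doubles; the particular numbers are just convenient choices) shows significant digits quietly disappearing:

    a = 1.0 + 1.0e-12           # a number we "arrived at after many calculations"
    print(a - 1.0)              # not exactly 1.0e-12: most of the printed digits are noise

    big, small = 1.0e16, 1.0
    print((big + small) - big)  # 0.0: the 1.0 vanished when the exponents were aligned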
If you have had a course in probability and statistics you can have a little fun
playing around with this. Most programming languages will allow you to round off
numbers to a certain number of decimal places. For example round(x, r) may
round x to r places. We will use this to perform an experiment[KPS97].
Assignment 2.3

First let us cook up some iterative scheme to generate a sequence of numbers
{x_0, x_1, x_2, ..., x_n, ...}:

(2.1.3)    x_{n+1} = α x_n,    α = 1.01,    x_0 = 1

Let us decide to round to four places, that is, r = 4. Call the rounded number x̄_n.
Then the roundoff error that we have made is

(2.1.4)    E = x_n − x̄_n

What are the extreme values (the maximum and minimum values) that E can take?
Plot a histogram to get an idea of the distribution of roundoff error.
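One possible way of setting the experiment up (a sketch; the number of iterates and the use of matplotlib for the histogram are choices of convenience, not part of the assignment) is:

    import matplotlib.pyplot as plt

    alpha, x, r = 1.01, 1.0, 4
    errors = []
    for n in range(1000):
        x = alpha * x                  # x_{n+1} = alpha * x_n
        x_bar = round(x, r)            # the number rounded to r places
        errors.append(x - x_bar)       # the roundoff error E of equation (2.1.4)

    plt.hist(errors, bins=50)
    plt.xlabel('roundoff error E')
    plt.ylabel('count')
    plt.show()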
Did the histogram look about uniform? That is not good news. It is not bad
news either. We would have been happy if the histogram had a central peak and
died off quickly since that would have meant that our roundoff error is not as bad as
we had feared. On the other hand it could have been bad news and the histogram
could have had a dip in the middle with most of the results having the largest
possible error. We can live with a uniform distribution. In fact this is the basis of
generating pseudo random numbers on the computer.
We have seen two possible structures or schemes by which we can represent
numbers. Generally, floating-point arithmetic tends to be more resource hungry.
However, it is very popular and it is the representation that we will use routinely.
Now, we go on to look at representing other data, which are made up of multiple
numbers. The preferred data structure in mathematics for this entity is a matrix.
We will look at the representation of matrices and multidimensional arrays in the
next section.
2.2. Representing Matrices and Arrays on the Computer
Matrices are easy to represent on computers. Many of the programming languages that you will learn will allow you to represent matrices and you should find
out how to do this. Some programming languages may call them arrays. You
should be aware that the programming language may not directly support matrix
algebra. We just need to implement the necessary matrix algebra or use a matrix
32
algebra library and we should be done. Into this happy world we now shower a
little rain.
Your favourite programming language may allow you to represent a square matrix. However, the memory in the computer where this matrix is actually stored
seems linear. Meaning, we can only store vectors. Now, how does your programming language manage this trick of storing something that is two-dimensional in
memory that is one-dimensional? Simple, store the two-dimensional arrays as a
set of one-dimensional arrays. Do we store them by rows or by columns? They
make an arbitrary decision: Store the matrix one row after another as a vector.
Remember now that someone else can make the decision (and they did) that they
will store the matrix one column after another. There are two reasons why we
should worry.
A lot of the mathematics that we have learnt is more easily applicable
if we actually store the data as vectors. This will allow us to naturally
translate the mathematics into our data structure and the algorithm.
If we do our matrix operations row-wise and it is stored column-wise we
may (and often do) pay a performance penalty.
Instead of using the word matrix, which has many other properties associated
with it, we will use the term array. This is also to convey to the reader that we
are talking of how the object is stored. Arrays come in different flavours. A one-dimensional array of size N made up of floating-point numbers can be indexed using
one subscript: a[i]. These can be thought of as being written out in a row

(2.2.1)    a[1], a[2], a[3], ..., a[i], ..., a[N]
They are stored as such on most computers. On the other hand, if we are interested
in a two-dimensional array then we have
(2.2.2)
a[1, 1]
a[2, 1]
..
a[i, 1]
..
a[n, 1]
a[1, 2]
a[2, 2]
..
.
..
a[i, 2]
..
.
..
.
a[i,j]
..
.
a[n, 2]
a[n, j]
a[1, j]
a[2, j]
a[1, m]
a[2, m]
..
a[i, m]
..
a[n, m]
Here, we have laid out the two-dimensional array in two dimensions with two
subscripts i and j. Now, most programming languages will allow you to use two
subscripts to index into the array. One can continue to use this. However, we
3Actually memory is very often organised at the chip level as a two-dimensional array which
makes it possible to reduce the number of physical electrical connections to the chip. This is a
very serious cost consideration. This reduction happens as the address of any memory location
can now be split into two: a row address and a column address. So, one can use half the number
of pins and pass it two pieces of data corresponding to the row address and column address one
after the other.
33
have already seen that the memory in which it is actually to be stored is typically
one-dimensional in nature. Poor performance may result if one were to access
the array contrary to the order in which it is stored. Let us see how storing a
multi-dimensional array as a one-dimensional array works. If we were to lay a twodimensional array out row by row and employ only one subscript p to index the
resulting one-dimensional array we would have
a[1, 1], a[1, 2], , a[1, m], a[2, 1], a[2, 2], , a[2, m], , a[i, j],
(2.2.3)
Since it is now a one-dimensional array we use one subscript p to index into this
array as follows
a[1], a[2], , a[p], , a[nm].
(2.2.4)
In a similar fashion we can rewrite the same one-dimensional array with a single
subscript in the form of a matrix as shown below. This allows us to relate the two
sets of subscripts to each other.
(2.2.5)
a[1]
a[m + 1]
..
a[(i 1)m + 1]
..
a[(n 1)m + 1]
a[2]
a[m + 2]
a[j]
a[m + j]
..
.
..
a[(i 1)m + 2]
..
.
..
.
a[(i-1)m+j]
a[(n 1)m + 2]
a[(n m) + j]
..
a[m]
a[2m]
..
a[im]
..
a[nm]
In the two methods of indexing shown in (2.2.2) and (2.2.5) the element indexed
by (i, j) are boxed. Clearly, the relationship between p and i and j is
(2.2.6)
p(i, j) = (i 1)m + j
where m is called the stride. This is the number of elements to go from a[i, j] to
a[i + 1, j]. Every time you use two subscripts to access this array, your program
needs to perform the multiplication and addition in equation (2.2.6). On the other
hand, if you were to create a one-dimensional array of size nm so that you can
store two-dimensional data in it you can very often avoid the multiplication shown
above. For example, if we wanted to evaluate the expression a[i 1, j] + a[i + 1, j],
we would write a[p m] + a[p + m].
This way of storing a row at a time is often referred to as row-major. We could
also store the data one column at a time, which would be called column-major.
We will use row-major here.
We have seen how to represent two-dimensional arrays and matrices on the
computer. What if we had data that was in three dimensions? The solution is
simple: we view the three-dimensional array as a set of two-dimensional arrays
that are stacked in the third dimension. So, we just end up with another stride
consisting of the size of the two-dimensional array that is being stacked. If this
34
stride is designated by s, then our formula relating the indices i, j, and k of our
array to the index p of our representation is given by
(2.2.7)
s = nm
p(i1 , i2 , ..., id ) =
d1
X
(il 1)sl + id
l=1
We have not said anything about the memory efficiency of the representation
so far. We will look at only one scenario here. If we have a diagonal matrix of
size 10000 10000 we would have 108 elements to store. We do note that all but
10000 of them are guaranteed to be zero. What we do in this case is just store
the diagonal of the matrix as a one-dimensional array and then replace our formula
given by equation (2.2.6) with a small algorithm.
ZeroElement = 10001
def p(i, j):
if i == j:
return i
return ZeroElement
The 10001 element in the representation needs to be set to zero. There are more
sophisticated ways of doing this, but they are outside the scope of this book. You
can check out the section on programming A.1.
Assignment 2.4
(1) What happens to the formulas relating indices if our array subscripts
started at zero instead of one as done in many programming languages.
(2) How would you modify the algorithm given above to handle tridiagonal
matrices? (A tridiagonal matrix may have non-zero elements on the main
diagonal and each of the diagonals above and below the main diagonal.)
We are now able to represent numbers and arrays of numbers. Let us now look
at functions and the domains on which the functions are defined. The latter part
is a little more difficult and we will devote a large portion of a chapter to it later.
We will restrict ourselves to representing intervals on the computer and functions
that are defined on such intervals.
2.3. Representing Intervals and Functions on the Computer
We concluded at the end of Chapter 1 that we need to be able to represent
functions on the computer. A function is defined on a region that we call the
domain of definition. This means that at every point in the domain of definition,
the function takes on a value. In one dimension, this domain of definition may be
an interval.
We have already seen that floating-point numbers on a computer represent
intervals. What if we want to represent a part of the real line, say the interval
[0, 1]? Yes an array with the two values (0, 1) will suffice. However, unless we
35
3x2 + 2x + 1,
3 + 2
+ k,
(2.3.3)
(2.3.4)
3 cos + 2 sin + 1,
3 + 2 + 1,
(2.3.5)
(2.3.6)
(2.3.7)
(3, 2, 1),
321,
3dx + 2dy + dz,
3
+2
+
,
x
y z
(2.3.1)
(2.3.8)
where , , k are unit vectors, dx, dy, dz are differentials and the last term consists
of partial derivatives in the respective coordinates. What do all of these expressions (2.3.1-2.3.8) except one have in common? There is really only one case where
one would actually perform the addition and simplify the expression further, that
is expression (2.3.4), 3 + 2 + 1 = 6. Of course, for x = 1, the polynomial produces
the same result.
Why would you think it crazy if I tried to do this kind of a summation with
the other expressions. How come I can perform the addition when x = 1 and not
otherwise? Clearly, the x2 and the x prevent us from adding the coefficients 3 and
2. We look at this further. Let us perform an operation. How about this:
(2.3.9)
(3x2 + 2x + 1) + (2x2 + 5x + 2)
36
+ (2 + 5
(3 + 2
+ 1k)
+ 2k)
(2.3.10)
You see that this argument seems to work in the other cases where something
prevents you from adding the 3 to the 2. That something is our notation and
meaning that we ascribe to what we write and type. The x2 and x stop you
from adding the coefficients together just as the cos , sin or the , . In fact,
this is exactly the role of the , between the 3 and 2. The 321 is in fact the
polynomial with x = 10. The decimal notation is something with which we have
been brainwashed for a long time.
If you were to take a course in linear algebra they will point out to you that
you are dealing with a three-dimensional linear vector space. They will proceed to
talk about the properties of entities that deal with linear vector spaces. The point
that we should make note of here is that
In our examples, the space looks like our usual three-dimensional space or at least
promises to behave as such. The expressions
(2.3.11)
and
(2.3.12)
(3
+2
+
) + (2
+5
+
)
x
y z
x
y z
seem also to behave in the same fashion. However, we will not pursue these two
expressions at this point. They are stated here so that one can puzzle over them.
So what is the essence of the examples? The triple (3, 2, 1). In fact, (3, 2, 1)
is something that can be represented by an array on the computer. It can then
represent any of the functions in our list; it is our interpretation that makes it so.
We make the following observations
(1) Functions can be considered as points in some space. If we were to organise
this space as we did the real line it would be possible for us to come up
with algorithms to hunt down a function in a systematic fashion.
(2) Right now we have one way of representing a function on the computer
using arrays as long as we interpret the array entries properly. We can
use the ability to combine simple functions to represent a general function
as a linear combination of simple functions.
We not only want to represent functions on a computer, we would also like to
have them organised so that we can search for one of interest in a systematic and
efficient fashion. We will review the process of organising vectors in vector algebra
so that we understand how it works with functions. You would have done all of
this before when you learnt vector algebra, where the power of algebra is brought
to bear on the geometry of vectors.
37
V~
~B
~ = |A|
~ |B|
~ cos
A
~ is the magnitude of A
~ and |B|
~ is the magnitude of B.
~ |A|
~ and |B|
~ would
where, |A|
~
be the lengths of the line segments in the graphical representation of the vectors A
~ is the angle between the line segments also called the included angle. The
and B.
~ with unit vector a
dot product of B
is shown in Figure (2.4). It shows us that the
~
B
~a
~ cos
|B|
~
Figure 2.4. Graphical representation of the dot product of B
~ in the direction
with unit vector a
. |B| cos is the projection of B
of a
.
~ onto the line parallel to the unit vector
dot product is the projection of the vector B
a
. This definition of the dot product allows us to transition from drawing vectors
to algebraic manipulation. In particular one can immediately derive an expression
for the magnitude of the vector as
(2.3.14)
~ =
|A|
p
~A
~
A
38
~ as
We can consequently define a unit vector along A
(2.3.15)
a
=
~
A
~
|A|
~ R
~ = 0,
For two vectors whose magnitudes are not zero, it is now clear that if Q
then the two vectors are orthogonal to each other.
(2.3.16)
~ =
~ =
|Q|
6 0, |R|
6 0,
~ R
~ =0Q
~ R
~
and Q
~ and B
~ are two vectors on our page and they are not parallel to each
Again, if A
~ = B,
~ or a
other (if they were parallel to each other A
= b), we can represent any
other vector on the page as a linear combination of these two vectors. The plane
containing the page is called a linear vector space or a Banach space. Since any
~ and B,
~ they are said to span the
vector in the plane can be generated using A
~
plane. That is some other vector P in the plane of the page can be written as
~
~ + B
P~ = A
(2.3.17)
(2.3.18)
How do we find and ? The dot product maybe useful in answering this question.
If we find the projection of P~ along a
and b, we would get
(2.3.19)
Pa
(2.3.20)
Pb
P~ a
= + b a
P~ b =
ab+
So, to find and we would need to solve this system of equations. However, if
a
and b were orthogonal to each other, things would simplify and then P~ can be
written as
b
(2.3.21)
P~ = Pa a
+ Pbb, when a
see Figure(2.5). This simplification that orthogonality gives is so great that quite
~ and B
~ started of by not being orthogonal
often we go out of our way seeking it. If A
P~
|P~ | sin
~b
~a
|P~ | cos
~ and B
~ using
Figure 2.5. P~ is resolved into components along A
the dot product and hence can be represented by these components
39
~ R
~ or an
to each other we could do the following to obtain an orthogonal set Q,
orthonormal set q, r.
~ =A
~
(1) Set Q
(2) then q = a
is the corresponding unit vector
~ q is the projection of B
~ onto q and the vector component is (B
~ q)
(3) B
q.
~ =B
~ (B
~ q)
~ This can be easily
(4) R
q is vector that is orthogonal to Q.
checked out by taking the dot product.
~ which is
(5) if the problem was in three dimensions and a third vector C
~
~
~ which is
independent of A and B is available, we can obtain a vector S
~
~
~
~
~
~
orthogonal to Q and R as S = C (C q)
q (C r)
r, where r is the unit
~
vector along R
~ =B
~ {B
~ q} q
R
~
B
r
q
~ =A
~
Q
~ q
B
~
Figure 2.6. The Gram-Schmidt process applied to two vectors A
~
and B.
The first few steps to obtain two orthogonal unit vectors is shown in Figure 2.6.
The next step to find the third vector perpendicular to the first two is shown in
Figure 2.7.
~
C
~s
~q
~r
~ ~q }~q + {C
~ ~r }~r
{C
Figure 2.7. Having applied the Gram-Schmidt process to the two
~ and B
~ as shown in Figure 2.6 we now include C.
~
vectors A
This is referred to as the Gram-Schmidt process and can be carried out to as
~ 1, E
~ 2 , ..., E
~ n,
many dimensions as required. For example, if we have a set of vectors E
4
which span an n-dimensional space, we can then generate an orthonormal set as
follows.
4Remember that this means that any vector in that n-dimensional space can be represented as a
40
~1
E
~ 1|
|E
~ 2 (E
~ 2 e1 )
then E~2 = E
e1
(2) Set e2 =
E~2
|E~2 |
..
.
~ i Pi1 (E
~ i ej )
E~i = E
ej
j=1
E~i
|E~i |
(i) Set ei =
..
.
~ n Pn1 (E
~ n ej )
E~n = E
ej
j=1
(n) Set en =
E~n
|E~n |
Thus, we generate n unit vectors that are orthogonal to one another. This technique
is illustrated here as it is quite easy to appreciate. However, as has been pointed
out before, it is liable to accumulate roundoff errors with all those subtractions.
There are more robust techniques to achieve the same end of obtaining an
orthogonal set of vectors. We could perform rotations[GL83]. For the sake of our
discussion here, the Gram-Schmidt process will suffice. It is clear that the process
requires not only the addition and subtraction of vectors, but also the dot product.
Once we had the dot product, we were able to build a basis of vectors with which
we could represent any other vector in our space. With this in mind, we are now
in a position to extend this idea of vectors to functions. To this end, we look for
functions that will act as a basis with which we can represent functions of interest
to us.
2.4. Functions as a Basis: Box Functions
We will start off by looking at a very specific example. We will define box
functions and check out their properties. In fact it seems, we will find an easy way
to build a basis without using the Gram-Schmidt process.
2.4.1. Box Functions. Consider the two functions
(2.4.1)
f(x)
=
=
g(x)
=
=
and
(2.4.2)
support of f
support of g
The support of a function is that part of its domain of definition where it, the
function, is non-zero. The two functions f and g are shown in Figure (2.8).
41
Figure 2.8. Plots of f and g defined on the interval [0, 1]. These
functions are orthogonal as the supports are non-overlapping. We
will call these functions box functions.
It should be clear to you that we can add/subtract these functions to/from
each other. That is, af + bg for two numbers a and b makes sense and the operation
can actually be performed. As we noted earlier, the whole process works with the
definition of the dot product. If we define the dot product or the inner product as
(2.4.3)
hf, gi =
f()g()d
0
we can actually take dot products of functions defined on the interval [0, 1]. Of
course, here we assume that the integral exists. We use the h, i notation for the dot
product since the is usually used for the composition of functions and the use of
may create confusion.
In general, for functions f and g defined on the interval [a, b], we can define the
dot product as
Z b
f ()g()d.
(2.4.4)
hf, gi =
a
Just as a matter of general knowledge, a linear vector space of functions where the
inner product is defined is called an inner product space or a Hilbert space. You
will notice that even here, as with the earlier dot product, we have
hf, gi = hg, f i.
(2.4.5)
The definition of the inner product immediately allows us to define the following
terms. The magnitude of f (or the norm of f) is given by
(2.4.6)
p
kfk = hf, fi =
f()2 d =
0
0.5
d =
0.5
We use kk for the norm as | | is already used for the absolute value. Consequently,
we have the norm (or magnitude) of a function f defined on a general interval as
s
Z b
p
f ()2 d.
(2.4.7)
kf k = hf, f i =
a
42
Without any loss of generality we will look at examples on the interval [0, 1].5
It is clear that hf, gi is zero. From our analogy with vector algebra, we conclude
that f is orthogonal to g. In fact, we can generalise the definition of the angle
between two vectors to the angle, , between two of these functions as follows
(2.4.8)
= cos
hf, gi
kfk kgk
kfk 6= 0, kgk 6= 0
In particular, as we had already noted if the = /2, then the functions are said
to be orthogonal to each other. It is likely that you have seen this earlier if you
have already studied Fourier series.
Look closely at the definitions of f and g and their graphs in Figure 2.8. Do
you see why they are orthogonal? f is non-zero where g is zero and vice-versa. So,
the support of f is [0, 0.5] and the support of g is (0.5, 1.0]. Since the supports of
the two functions do not overlap, the functions turn out to be orthogonal to each
other. We chose f as being the constant 1.0 in the interval (0, 0.5). Now we see that
it could have been almost any function as long its support is (0, 0.5). In a sense, the
intervals are orthogonal to each other. We can pick any function on the interval;
for now the constant function suffices. It would be nice if f and g were orthonormal,
meaning their norms (or magnitudes) are one.
In order to make f orthonormal, we
need to divide the function f by its norm, 1/ 2. We can do the same for g.
We can define the set of all functions generated by taking linear combinations
of f and g to be S2 . In fact we could have defined three functions f, g, and h with
corresponding support [0, 1/3], (1/3, 2/3] and (2/3, 1]. These form S3 . We can even
add 3f + 2g + h to our list of expressions (2.3.1).
Assignment 2.5
(1) Given two arrays (3, 2, 1) and (5, 1, 4), perform the following operations
+ (5 + 1
and (3f + 2
+ (5f + 1
(a) (3 + 2
+ k)
+ 4k)
g + h)
g + 4h)
(5 + 1
and h(3f + 2
(5f + 1
(b) (3 + 2
+ k)
+ 4k)
g + h),
g + 4h)i
43
Pf f + Pg g + Ph h
where Pf would be the f-component of P . Likewise, the other two are the gcomponent and the h-component, respectively.
Taking the scalar product with f we see that
hP, fi = hPf f + Pg g + Ph h, fi
(2.4.10)
Therefore,
(2.4.11)
Pf =
hP, fi
kfk2
With the new definition of f, g, and h we have kfk, kgk, and khk as 1/ 3. We have
by definition
1/3
Z 1/3
Z 1
x2
1
xdx =
P fdx =
(2.4.12)
hP, fi =
=
2
18
o
0
0
giving us Pf = 1/6. Verify that Pg = 1/2 and that Ph = 5/6. This results in the
approximation to the straight line in our coordinate system as shown in figure
2.10.
Clearly, the function and the representation are not identical. This, we see, is
different from our experience with vectors in three spatial directions. We have here
a situation very much like roundoff error. There is a difference between the original
function and its representation. We should really write
(2.4.13)
P (x) P = Pf f + Pg g + Ph h
Can you make out that the area under the two curves is the same? We will refer
to this generalised roundoff error as representation error. How do we get an
estimation of this representation error? We need the distance between two points
in our function space. We can use the norm that gives us the magnitude to define
a metric which will give the distance between two points as
p
(2.4.14)
d(F, G) = kF Gk = hF G, F Gi,
F and G are points in the function space. This allows us to get a measure for the
error in our representation as
v(
)
u Z b
2
u
(2.4.15)
E(P , P ) = d(P , P ) = t
P (x) P (x) dx = d(P, P )
a
Let us now calculate E. From Figure 2.9, it is clear that the difference between our
function and its representation over the interval (0, 1/3) is identical to that over
44
1.0
1
3
2
3
x (xi , xi+1 )
hi
(2.4.17)
Bih =
0
otherwise
The Bih are orthonormal. That is
hBih , Bjh i
(2.4.18)
1.0
1
0
45
i=j
i 6= j
1.0
1
3
2
3
1
3
2
3
(2.4.19)
P (x)
N
X
ai Bih
i=1
The basis functions Bi defined so far are called box functions as suggested by the
graph of any one basis function. In all of the equations above, we have used the
equals sign to relate the function to the representation on the right hand side.
However, we see that the right hand side is only an approximation of the left hand
side. Remember that h (xi+1 xi ) really determines the definition of the box
functions. We can use the symbol P h to indicate the representation on the given
basis. So, we really should write
(2.4.20)
P h (x) =
N
X
i=1
ai Bih
46
where, Bih are defined on a grid of size h. The error at any point in the representation is
e(x) = P (x) P h (x).
(2.4.21)
E = kek =
hP P h , P P h i =
b
a
(P (x) P h (x))2 dx
For the function f (x) = x, we can work out the general expression for the error E.
It is given by
(2.4.23)
E=
1
h
= .
2N 3
2 3
The reader is encouraged to verify this. The point to note is that we get closer and
closer to the actual function as N increases and this happens at the rate proportional
to 1/N . The approximation is said to converge to the original linearly. The error
goes to zero in a linear fashion. The error is said to be of first order.
On the other hand, we can represent a constant function ax0 exactly (this is
a times x raised to the power zero). The representation is said to accurate to the
zeroth order, or simply the representation is of zeroth order.
It is important to note that the first corresponds to the rate of convergence
and the second represents the order of the polynomial that can be represented as
exactly up to roundoff error.
As promised, we have come up with a basis that allows us to represent our
functions in some fashion. However, the more accurate is the representation, the
jumpier it gets. Instead of breaking up the region of interest into intervals, we
could have tried to use polynomials defined on the whole interval of interest. These
polynomials can then be used to represent functions on the whole interval [0, 1].
Assignment 2.6
(1) I would suggest that you try to represent various functions using the box
functions on the interval (0, 1). Do this with 10 intervals and 100 intervals.
(a) x2 ,
(b) sin x,
(c) tan x
(2) For the first two items repeat the process for the interval (0, ).
(3) Find e(x) and E(P, P h ) for all the approximations that you have obtained
in the first two problems. Is e(x) orthogonal to P h (x)?
2.4.2. Polynomial on the Interval [0, 1]. We have already asked in an earlier assignment whether the functions p0 (x) = 1 and p1 (x) = x, both defined on
the interval [0, 1], are orthogonal to each other. That they are not, is clear from
the fact that
Z 1
1
xdx =
(2.4.24)
hp0 , p1 i =
2
0
47
If we define
(2.4.25)
pi (x) = xi ,
you can verify that none of these are orthogonal to each other. Why not try to
apply the Gram-Schmidt process and see where it takes us? For convenience we
will write p instead of p(x). We will indicate the orthonormal basis functions as pi .
Let us find the expression for a few of these pi . The first one is easy. It turns
out p0 = p0 . In order to find the second one we need the component of p1 along
p0 . We have already taken the dot product in equation (2.4.24). Using this, the
component of p1 along p0 is given by the function h(x) = 21 . So, we get
p1 h(x)
1
=2 3 x
(2.4.26)
p1 =
kp1 h(x)k
2
Along the same lines we have for the third basis function We need to find the
components of p2 along p0 and p1 . We take this out of p2 and normalise to get
1
1
1
(2.4.27)
p2 = 6 5 x2 x +
= 6 5 x2 x
6
2
3
You can try a few more of these and other problems from the assignment.
Assignment 2.7
(1) Find p3 and p4 .
(2) Repeat this process on the interval [1, 1].
(3) Find p0 and p1 on the interval [1, 2].
(4) Repeat the last question with the arguments of the functions taken as x1
instead of x. How about if we calculate the other two basis functions p3
and p4 ?
If you have had a course in differential equations, you may have recognised the
polynomials that you got from problem 2 of assignment 2.7 as being the Legendre
polynomials. We can clearly apply this scheme to obtain a set of basis functions
on any interval [a, b], bearing in mind that as x increases all polynomials of degree
one and greater will eventually diverge.
It looks like we have something here that we can use. The functions are smooth
and right now look as though they are easy to evaluate. The scheme of using these
polynomial bases does suffer from the same problem that Fourier series representation does. That is, they lack a property that we call locality. Just say you were
trying to fit a function and there was some neighbourhoods where you were particular that the representation should be good. You find coefficients and pick enough
terms to fit one neighbourhood and then find that at some other spot it is not that
good. Adding terms and tweaking may fix the second spot but then change the
representation in the first spot on which we have already spent a lot of time. By
locality, we mean that if we were to change something (for example a coefficient)
in one location it is not going to affect our representation in any other location.
On the other hand, if we think back to our box functions, we see that we
did indeed have the locality property. That is, changes in coefficients of one basis
function did not affect the representation elsewhere. We will now try to merge these
two ideas in the following section.
48
1.0
1.0
xi
xi+1
xi
Ni1
xi+1
0
Ni+1
0
Figure 2.11. Plots of two functions Ni1 and Ni+1
defined on the
interval [xi , xi+1 )
Consider two functions shown in Figure 2.11). They are defined as follows
(2.5.1)
Ni1 (x)
(
1 i (x)
=
0
and
(2.5.2)
0
Ni+1
(x) =
i (x)
0
where,
(2.5.3)
i (x) =
x xi
xi+1 xi
0
The functions Ni1 and Ni+1
are shown in Figure 2.11. They are first degree polynomials in the interval [xi , xi+1 ) and zero outside the interval. [xi , xi+1 ) is the
0
support for the two functions Ni1 and Ni+1
.
What is the graph of the function
(2.5.4)
0
f (x) = aNi1 + bNi+1
?
This function is graphed in Figure 2.12. It is zero outside the interval [xi , xi+1 ).
49
0
f (x) = aNi1 + bNi+1
b
1.0
0
Ni+1
Ni1
xi
xi+1
(2.5.6)
f (x) =
n
X
i=1
0
ai Ni1 + bi+1 Ni+1
h is the size of a typical xi+1 xi and is used as a superscript here to distinguish the
representation of the function from the function. This representation is illustrated
in the Figure 2.13.
Look carefully at this now. The function is continuous at the point xi . The
value of ai must be the same as that of bi . In this case, we can expand the summation in equation (2.5.6) to get
(2.5.7)
n
X
i=1
ai Ni0 + Ni1
50
bi+1
1.0
1
0
f (x) = ai1 Ni1
+ bi Ni0 + ai Ni1 + bi+1 Ni+1
1
Ni1
Ni1
bi
ai
ai1
Ni0
0
Ni+1
xi
xi+1
Ni = Ni0 + Ni1
1.0
Ni0
xi1
Ni1
xi+1
xi
Figure 2.14. The sum Ni0 + Ni1 gives us Ni which is called the
hat function. It also called the tent function
Now, we can write
(2.5.8)
f h (x) =
n
X
a i Ni
i=1
where the Ni are hat functions that have unit value at the grid point xi . Again,
we should get a closer approximation to the function if we increase the number of
intervals. Functions represented by the hat function can be continuous, but the
51
derivatives at the nodal points will not be available. However, one could take as
the derivative at a node, the average of the slopes of the two line segments coming
into that node.
It is clear that we can represent polynomials of first order exactly. Hat functions
give us a first order representation. Consequently, the error is of second order. By
checking the error in representation of a quadratic we can see that the rate of
convergence also happens to of second order. That is as h 0, the error goes to
zero as h2 . The error goes to zero in a quadratic fashion.
Is there a difference between our approach to the box functions and the hat
function here? There is. In the definition of the Bi of the box function, we made
sure that they were orthonormal. In the case of the hat functions, we have an
independent set that allows us to represent any function. We could, but do not
normalise them. The advantage with defining the hat functions in this fashion is
that linear interpolation of tabulated data is very easy. The coefficients ak are
the nodal values taken directly from the tabulated data. No further processing is
required.
1.0
x=a
i= 0
N0
N1
N2
N3
N4
x=b
4
52
(3) For the first three items, repeat the process for the interval (0, ).
(4) Given n grid point that are equally spaced on the interval [a, b] so as to
divide that interval into n 1 subintervals and n values of a function at
those points do the following.
(a) Fit hat functions and graph the function.
(b) Write a function interface, h(x), to your representation so that given
an x in [a, b] the function will return h(x) which is the value given
by your representation.
(c) Write a function mid(x) that return an array of the function values at
the midpoints of the n 1 intervals as obtained from the hat function
representation
(2.5.9)
2h
hNi (x), Nj (x)i =
j <i1
j =i1
j=i
j =i+1
j > i + 1,
(2.5.10)
f (x) =
n
X
ai Ni
i=1
Ni (x) =
(2.5.11)
53
x < xi1
x (xi1 , xi )
x (xi , xi+1 )
x > xi+1
1
h
xi1
xi
1
h
xi+1
=
+
(2.5.13)
f (xi ) =
2
h
h
2h
(2.5.12)
f () = ai Ni () + ai+1 Ni+1
() =
54
sin nx,
55
sin(x)
1
0.5
0
-0.5
-1
Figure 2.17. sin x sampled at 11 grid points on the interval [0, 2].
essential features are captured.
The extrema are not captured by the representation. Since we are not
sampling the sin x function at /2 and 3/2, that is, we do not have
a grid point at /2 and 3/2, the extrema of sin x are not represented.
Clearly, this problem will exist for any function. If we do not sample the
function at its extrema, we will loose that information.
The zero crossings, the point at which a function traverses the x-axis,
are quite accurate. Try to generate these graphs and verify that the zero
crossings are good (see first problem of assignment 2.10).
sin(2x)
1
0.5
0
-0.5
-1
Figure 2.18. sin 2x sampled at 11 grid points on the interval [0, 2].
In Figure 2.18, we double the frequency of the sine wave. Or, equivalently, we
halve the wavelength. Again, we see that the extremal values are not represented
well. With some imagination, the graph looks like the sine function. Again,
verify that the zero crossings are fine. So, with ten intervals we are able to pickup
the periodicity of the sine function along with the zero crossings. The inherent
anti-symmetry of the sine function is also captured by the representation.
Now we check out what happens at three times the fundamental frequency.
The wavelength is consequently one third the original length. We have not lost the
anti-symmetry in the representation. Our peaks now are quite poor. The number of
zero crossings is still fine. The location of the zero crossings is now a bit inaccurate.
56
sin(3x)
1
0.5
0
-0.5
-1
Figure 2.19. sin 3x sampled at 11 grid points on the interval [0, 2].
Through the zero crossings, our estimation of the basic frequency of the signal is
still accurate.
sin(4x)
1
0.5
0
-0.5
-1
Figure 2.20. sin 4x sampled at 11 grid points on the interval [0, 2].
It would take some pretty good imagination to see the sine wave in our representation for the sin 4x employing ten intervals as shown in Figure 2.20. The
extrema are off. The zero crossing are off. The number of zero crossings is still
correct.
sin(5x)
1
0.5
0
-0.5
-1
Figure 2.21. sin 5x sampled at 11 grid points on the interval [0, 2].
The sin 5x curve represented on the ten intervals is shown in Figure 2.21. In fact
you could ask: What curve? Here, we have clearly sampled the original function
at exactly those points at which it is zero. We have completely lost all information
on the extrema. This is a disaster as far as representing functions goes.
We have seen a continuous degeneration in our representation as the wave
number increased. We are using ten intervals for our representation. At a wave
57
number five, that is half of the number of intervals that we are using, we have seen
complete degeneration. It has no amplitude information or frequency information
that we can discern. We will press on and see what happens as we further increase
the wave number.
sin(6x)
1
0.5
0
-0.5
-1
Figure 2.22. sin 6x sampled at 11 grid points on the interval [0, 2].
We see in Figure 2.22 the representation for sin 6x. Does it look familiar. Go back
and compare it to sin 4x. They are in fact negatives of each other.
Consider the graph of the representation of sin 7x shown in Figure 2.23 This is
sin(7x)
1
0.5
0
-0.5
-1
Figure 2.23. sin 7x sampled at 11 grid points on the interval [0, 2].
the negative of the representation of sin 3x. We should be able to guess that the
next representation shown in Figure 2.24 would look like sin 2x. The most shocking
of them all is the fact that the representation of sin 9x on ten intervals looks like
the representation of sin x on the same ten intervals. The order of graphs is just
reversed from the fifth one. For ten intervals, wave number five is called the folding
frequency.
Just for the sake of completeness we plot the representation of sin 10x on ten
intervals in Figure 2.26. We are not surprised by the graph that we get and expect
this whole drama to repeat as we increase the wave number. We see that there is
a highest wave number that we can represent on this grid. I repeat:
A given grid on which we are going to sample functions
to obtain a representation of that function, has associated with the grid a maximum wavenumber that can be
captured. The function representation using hat functions will not be good at the higher wave numbers.
58
sin(8x)
1
0.5
0
-0.5
-1
Figure 2.24. sin 8x sampled at 11 grid points on the interval [0, 2].
sin(9x)
1
0.5
0
-0.5
-1
Figure 2.25. sin 9x sampled at 11 grid points on the interval [0, 2].
sin(10x)
1
0.5
0
-0.5
-1
Figure 2.26. sin 10x sampled at 11 grid points on the interval [0, 2].
Looking at all the figures that we have plotted it looks as if sin 4x is the highest
frequency sinusoid that can be captured by our grid of 11 points. Wave number
four corresponds to a high frequency on a grid of 11 points. It corresponds to a
lower frequency on a grid of 101 and an even lower frequency on a grid of 1001
and so on. The point being, when we say high frequency or low frequency, we are
talking about a given frequency in comparison to the size of the grid.
Assignment 2.10
(1) Generate the graphs shown in figures 2.17 2.26. Verify amplitudes and
zero crossings.
(2) What happens if we try the same thing on 10 grid points? Is there a
difference between an even number of grid points and an odd number of
grid points.
59
(3) Try out the following. Represent the sin 4x function using 41, 81, 101, 161
grid points.
(4) Repeat the previous problem using sin 4x + sin 40x.
N 0 (x)
i (x)2
(2.6.2)
N 1 (x)
2i (x)(1 i (x))
(2.6.3)
N (x)
(1 i (x))2
where, again
(2.6.4)
i (x) =
x xi
.
xi+1 xi
As in the case of the hat functions the sum of the three functions on the interval
is always one. So any quadratic on the interval (xi , xi + 1) can be represented as
a linear combination of these three functions. Instead of dwelling further on the
quadratic representation let us, instead skip directly to a cubic. We will follow the
same process that we did with the hat function. We consider an interval and ask
0
ourselves what kind of cubics would we use akin to Ni1 and Ni+1
.
60
0.8
N(t)
0.6
0.4
0.2
0.2
0.4
0.6
0.8
N 0 (x)
i (x)3
(2.6.6)
N 1 (x)
(2.6.7)
N 2 (x)
3i (x)2 (1 i (x))
(2.6.8)
N 3 (x)
where, again
3i (x)(1 i (x))2
(1 i (x))3
x xi
.
xi+1 xi
Before we proceed any further, it should be noted again that in all the four
cases studied here,
(1) The sum of the functions in given interval add to one (go ahead and check
it out.)
(2) takes values in [0, 1] and is kind of a non-dimensionalised coordinate on
any particular interval.
(3) the coefficients seem to correspond to those of a binomial expansion of the
same order. If you remember, it was just a matter of combinatorics.
On the interval (xi , xi + 1) we can represent any function as
(2.6.9)
i (x) =
(2.6.10)
f h (x) =
3
X
i=0
ci N i (x)
61
0.8
N(t)
0.6
0.4
0.2
0.2
0.4
0.6
0.8
c0
(2.6.12)
c1
(2.6.13)
c2
(2.6.14)
c3
a1 = f (xi )
d1 h
+ a1 , d1 = f (xi ), h = xi+1 xi
3
d2 h
a2 , d2 = f (xi+1 )
3
a2 = f (xi+1 )
62
sin(x)
0.5
0
-0.5
-1
sin(x)
0.5
0
-0.5
-1
sin(x)
0.5
0
-0.5
-1
Figure 2.29. Three cubic spline representations of sin x, the function was sampled at 5, 6, 11 grid points along with derivatives at
those points. The constituent components that add up to the final
function are also shown. The spline in each interval is sampled at
ten points for the sake of plotting the graphs.
Figure 2.29 shows three plots of cubic spline representations of the function
f (x) = sin x. The first graph is generated by using five grid points or four equal
intervals with h 1.57. The second graph is generated with five intervals, h 1.26.
Finally, the third one is with ten intervals and an interval size h 0.628. The first
and last graph in Figure 2.29 have a grid point at x = . The second graph does
not. Using our superscript notation f h (x) for the representation, we will refer to
each of the above representations as f 1.57 , f 1.26 , and f 0.628 . On paper, the three
function representations may look equally good. However, that just says that our
visual acuity is not good enough to make the judgement. We can estimate the error
by numerically evaluating the norms kf f h k for the three different h values. We
get
(2.6.15)
(2.6.16)
(2.6.17)
kf f 1.57 k
kf f
kf f
1.26
0.628
= 0.01138,
= 0.00474,
= 0.000304.
For the last two graphs, the interval size was halved and you can check that the
error dropped by a factor of about 15.6 16. This seems to tell us that on the
63
interval the error goes as h4 . This is an empirical estimate. We could follow the
same process that we did with box functions and hat functions to derive the analytic
expression. Again, it must be emphasised that the representation is accurate to a
third degree polynomial and as h 0 the error goes to zero as h4 . We will look at
another way to determine this relationship between the error and the interval size
h in the next section.
As was indicated earlier, there are quite a few ways by which one can fit a cubic
to a set of points. At the grid points one could provide the function and the second
derivative instead of the function and the first derivative. Of course, the ultimate
solution would be to use only the function values at the grid points to obtain the
cubics.
Assignment 2.11
(1) Try out the cubic representation for various wave numbers and compare
them to the examples shown using hat functions. Compute the error in
the representation. How does this error change as a function of wave
number?
All the function bases that we have seen so far, starting at the linear hat function, ensure continuity at the nodal points. The higher order bases offer continuity
of higher derivatives too. Is it possible to get a better representation for the function if we did not seek to have this continuity? For the hat functions, you can go
back to equation (2.5.6) and see where we imposed the continuity condition. We
will now relax that requirement and see what we get. Some analysis is done in
section 7.1.2.
2.6.1. Linear Interpolants on an Interval. We want to go back and see if
we can use a function that is linear on an interval as the mechanism of building a
function basis. The only difference now is that we will not impose the requirement
that the representation be continuous at the grid points, even if the the function
being represented is continuous. We will go back to equation (2.5.6) which is
rewritten below
(2.6.18)
f h (x) =
n
X
i=1
fih =
n
X
i=1
0
.
ai Ni1 + bi+1 Ni+1
We could ask the question what are the best values of ai and bi+1 to approximate
the given curve on the interval (xi , xi+1 )? By best we mean kf fih k in (xi , xi+1 )
is a minimum. This question is beyond our scope right now, we will answer this
question in section 7.1.2.
We will end this section with one more question. We have suggested that there
could be different linear interpolants using straight lines. We have suggested a little
before that there are numerous cubics that we can use to represent our function.
We have to ask: How do we know that we can always find this representation?
Let us see what we have. Consider the space of functions defined on the interval
64
[a, b] spanned by a linearly independent basis bi (x). For any given function F (x)
defined on [a, b], as long as hF, bi i is defined for all i we can find the components.
To keep the discussion simple, let us assume that the basis is orthonormal. Then
the components of F , we have seen, are ci and F can be written as
X
(2.6.20)
F =
c i bi
i
Is the representation found by you going to be the same as the one found by me?
Let us say for the sake of argument that you have gone through some process
and obtained the coefficients, Ci . Then, the function can be written by the two
representations as
X
X
(2.6.21)
F =
c i bi =
C i bi
i
X
i
c i bi
X
i
C i bi =
X
i
(ci Ci )bi = 0
65
to measure of the difference between functions over the whole domain of definition.
So, if we have a function and its representation, we can find the magnitude of the
difference between them. This gives us a global measure of the difference between
a function and its representation. At each point in the domain of definition of the
function, we could get a local error by just subtracting the representation from
the original function. Is there a way we can get an a priori estimate of the error?
Put another way, can I say my error will be of this order if I use this kind of a
representation? If I have this mechanism to estimate the error before I actually go
about my business (that is the a priori part), I can make a choice on the basis
functions, size of grid - meaning how many grid points, distribution of grids and so
on.
I repeat: We have an idea of how our computations using these approximations
of numbers affect our final result. We need to look at how we approximate functions
more clearly. We have so far seen a few methods to approximate functions. We will
now try to figure out what exactly is the approximation. That is, what is the error
in the representation? We will look at this problem of approximation of functions
and related entities anew.
Very often, we are interested in estimating and approximating things You say,
wait a minute this is what we have been talking about so far. Not quite. Until
now, we have been looking at a situation where we are given a function and we
want to represent it exactly, failing which, we will seek a good approximation. Now
consider a situation where we do not have the function ahead of time. How is this
possible? you ask. Well, lets look at the following example.
You are a student who has completed 140 credits of a 180 credit
baccalaureate program. You have applied to me for a job. You plan
to take up the job after the 180 credits are completed. If you provided
me with your CGPA (Cumulative Grade Point Average) at the end
of your 140 credits, how would I estimate your CGPA at the end of
your studies?
Let us state it as a mathematics question. If I knew that a function f (x) had a
value 1.0 at x = 0.0, then is there a way I could estimate the value at x = 0.1? See, it
is different from having a function ahead of time and then finding an approximation
to it. A year from now, I will have your actual CGPA. However, I want to have an
estimate now as to what it will be and I want to get an idea of the error in that
estimate.
So, if f (0) = 1 what is f (0.1)? Let us use Taylors series and see what we get.
(2.7.1)
0.12
f (0) +
f (0.1) = f (0) + 0.1f (0) +
2!
|
{z
}
truncation error
and are used to indicate derivatives. If we were to assume the value of f (0.1)
to also be 1.0 then, we are essentially truncating the Taylors series after the first
term. This looks like we were using our box functions to represent the function f .
The terms we have thrown away are identified as the truncation error. Very often,
you will find that the truncation error is declared using the lowest order term
in the truncation series. If we were to rewrite the above example in general terms
66
f (x + x) = f (x) + x
f (x) +
x2
f (x) +
2!
f (x + x) f (x) + xf (x).
x
2!
f (x) +
and the new estimate is expected to be better than the zeroth order estimate. This
can used to represent up to a first order polynomial accurately. The truncation
error is second order. This is as good as representing a function using hat functions
as the basis. The hat function is a first degree polynomial, a polyline to be precise.
So, the representation is only first order. I repeat this for emphasis so that there is
no confusion between the order of the representation and the order of the truncation
term. As we go along, you will see that the order of the truncation term will be
source of confusion since it is tagged by the exponent of the increment and not by
the order of the derivative occurring in the truncation term.
Of course, if you want an a priori estimate of the truncation error, you need
an estimate of the derivative involved in the truncation error. How do we get an
estimate of the derivative? Look at equation (2.7.3) again. It gives us an idea of
how to proceed from here.
(1) The way it is written, it tells us that if we have the function and the
derivative at some point x, we can use that derivative to step off by x
to get the function at the new point x + x. Now all we need to do is
to find a way to get the derivative at x + x and we are in business.
We can integrate in x to construct the function. We would effectively be
solving a differential equation. We will see how this works at the end of
this chapter.
(2) If we have the value of the function at two points x and x + x, we can
get an estimate of the derivative.
We will start by looking at approximating the derivative. Equation (2.7.2) gives
us a way to get an estimate for the first derivative and the associated error. First
we rewrite it as
(2.7.5)
xf (x) = f (x + x) f (x)
x2
f (x)
2!
67
or
(2.7.6)
f (x) =
f (x + x) f (x) x
f (x)
x
|2!
{z
}
trucation error
We see that an estimate with a first order truncation error for the derivative at the
point x is
(2.7.7)
f (x) =
f (x + x) f (x)
.
x
(2.7.8)
Please note that if the derivative f (x) is known, then it is a first order estimate
of f (x + x) from Taylors series (2.7.8). This estimate is with an error xf (x),
which is twice the error of (2.7.7).
To summarise, if we have a function at hand or know the nature of the function
( linear, quadratic, . . . , periodic, . . . ), we can decide on the representation and
get an estimate of the error in our representation. On the other hand, if we are
constructing the function as we go along we can still get an idea as to how good or
poor is our estimate. We also see that though there are two different view points
of representing the derivative of a function, they are actually related. We can tie
the finite difference scheme with representing the function by, say, a hat function.
As a side effect from our look for local error estimates, we have found that we
can estimate the first derivative of a function. We are interested in solving differential equations. Is there a systematic way by which we can generate approximations
to derivatives? Can we again get an idea of the error involved and the sources of
the error?
2.8. Representing Derivatives - Finite Differences
In the previous section, we have seen that we can get a representation for
the first derivative of a function in terms of the nodal value of the function. The
representation of the function was linear and the derivative was a constant on the
interval of interest. We will now start over and see if we can build up estimates of
various orders for derivatives of various order. For example, we could be interested
in first order, second order..., estimates of derivatives. The derivatives of interest
68
maybe first derivatives, second derivatives and so on. Lets see how we work this.
In many calculus texts, the derivative is defined as follows
df
f (x + x) f (x)
(x) = lim
x0
dx
x
In order to estimate the derivative at a point x, one can use the values known at
x and x + x and eliminate the limiting process. Thus, we have a finite difference
approximation for the first derivative.
f (x + x) f (x)
df
(x)
dx
x
The question that remains to be answered is how good is this estimate?.
To this end, we turn to Taylors theorem.
(2.8.1)
x
x2
x3
f (x) +
f (x) +
f (x)
1!
2!
3!
x4
+
f (x) +
4!
We see that by rearranging terms we can extract out the approximation (2.8.2)
for the first derivative and write an equation instead of the approximation. We get
(2.8.2)
f (x + x) = f (x) +
f (x) =
(2.8.3)
f (x + x) f (x) x
f (x) +
x
2!
2
d f
The error therefore is of the order of x
2! dx2 (x) As opposed to starting with a
classical definition of the derivative, we see that using Taylors series to expand the
function at a point (x+x) about the point of interest we can isolate the derivative
of interest.
f (x + x) f (x)
.
x
The superscript h is to remind us that this is an approximation with h x.
We will drop the use of the superscript unless there is ambiguity. The expression
on the right hand side of equation (2.8.4) is called a forward difference and
the truncation error is referred to as first order. This approximation for the first
derivative of f is called a forward difference as the approximation is at the point
x and it involves the points x and x + x. We can proceed to derive a backward
difference formula in a similar fashion. Using the Taylors series expansion for the
function at x x, that is
f h (x) =
(2.8.4)
x2
x3
x
f (x) +
f (x)
f (x)
1!
2!
3!
x4
+
f (x) +
4!
we get the backward difference approximation for the first derivative at x as
(2.8.5)
f (x x) = f (x)
(2.8.6)
f (x) =
f (x) f (x x) x
x2
+
f (x)
f (x)
x
2!
3!
+
x3
f (x) +
4!
69
Again we see that the truncation error is first order. We will now inspect the
truncation error in the two approximations.
x
f (x),
2
and
Central Chord
Tangent
Backward Chord
p1
x
f (x)
2!
Tangent
Forward Chord
f (x)
x
p
p+1
(2.8.7)
f (x)
1
{Forwardf (x) + Backwardf (x)}
2
1 f (x + x) f (x) f (x) f (x x)
+
=
2
x
x
f (x + x) f (x x)
=
2x
70
Well, thats not bad, we have a centred difference approximation to the first
derivative now. Can we derive it from Taylors series? Lets take equation (2.8.2)
and subtract equation (2.8.5) from it. We get
f (x + x) f (x x) = 2
(2.8.8)
x
x3
f (x) + 2
f (x)
1!
3!
(2.8.9)
f (x + x) f (x x)
2x
x2
f (x)
3!
ab =
f (b) f (a)
= f (),
ba
This tells us that ab is the exact derivative of f (x) somewhere in the interval
[a, b]. We just do not know where. Given no other information, our expression
for the truncation error tells us that ab has a first order truncation error as an
approximation to the derivative at the points x = a and x = b. It has a second
order truncation error as an approximation at the midpoint x = (a + b)/2. The
two examples in Figure 2.30 show the part of the hat function used to approximate
the function from which the derivative is inferred. It is clear comparing the two
functions that even the centred difference can be off quite a bit.
x
Forward Difference
Backward Difference
Central Difference
0.1
3.31
2.71
3.01
0.01
3.0301
2.9701
3.0001
0.001
3.003001
2.997001
3.000001
0.0001
3.00030001
2.99970001
3.00000001
71
Study Table (2.1) carefully. Look at the error carefully. For convenience the
error is tabulated in Table (2.2). The error term in the forward and backward differences are indicated as the sum of two terms. These can be identified as the first and
second leading terms in the truncation error. The central difference approximation
x
Forward Difference
Backward Difference
Central Difference
0.1
0.3+ 0.01
-0.3+0.01
0.01
0.01
0.03+0.0001
-0.03+0.0001
0.0001
0.001
0.003+ 106
-0.003 + 106
106
0.0001
0.0003 + 108
-0.0003+108
108
72
Order
type
Difference formula
truncation error
forward
fi+1 fi
x
backward
fi fi1
x
central
fi+1 fi1
2x
forward
x2
f (x)
3
backward
x2
f (x)
3
backward
central
x
f (x)
2
x
f (x)
2
x2
f (x)
3!
x3 iv
f (x)
12
x4 v
f (x)
30
We see from Figures 2.31 and 2.32 that the plot of the error is quite complex.
Note that on the x-axis we have log x. This increases left to right, indicating a
coarsening of the grid as we go from left to right. Let us consider the first order
error term to understand the graph better.
(2.8.11)
fh f
error =
=
f
| {z }
calculated
x f
2 f
| {z }
truncation error
log|error| = log x + c
where c is a constant. So, if we plot the absolute value of the relative error of a first
order scheme versus log x, we expect to get a straight line with slope one. That
73
log(abs(error))
0.0001
1e-08
1e-12
1e-16
1e-16
1e-12
1e-08
log(DeltaX)
0.0001
74
log(abs(error))
0.0001
1e-08
1e-12
1e-16
1e-16
1e-12
1e-08
log(DeltaX)
0.0001
f (x + x) = f (x) + xf (x) +
f (x + x)
f (x + x) f (x)
x1
0.0
-0.1234
-0.1234
x2
0.1
-0.0234
-0.02345
x3
0.12
-0.0034
-0.003456
x4
0.123
-0.0004
-0.0004567
75
On the roundoff side, we see that once roundoff error equals or exceeds the
truncation error, for every bit in the representation of x that we reduce, we lose
one bit in the relative error in the derivative. Which explains/is a conclusion drawn
from the unit negative slope of the error curve.
One lesson that you pick up from here is that for some reason if you want to
take very small increments, remember you may be just accumulating round off error
instead of getting the accuracy that you wanted.
Another point that must be noted here is that if we were to use the finite
difference schemes to evaluate the first derivatives of polynomials of various degrees,
there is a degree of the polynomial up to which a given scheme will give the exact
derivative.
A finite difference approximation for the first derivative will give the exact value
for all degrees of polynomial up to a maximum n for the given scheme. This scheme
is called an nth order scheme.
Now we look at obtaining approximations for higher order derivatives. We try
first to get second derivatives. We add equations (2.8.2) and (2.8.5) to eliminate
the first derivative term and get an expression for the second derivative as
(2.8.14)
f (x) =
f (x + x) 2f (x) + f (x x) x2
f (x) +
x2
12
Like we did earlier, we can get different linear combinations of the function evaluated at grid points. So, we see that in general, using the Taylors series one can
approximate a derivative of order n, written here as f hni , as
(2.8.15)
f hni =
X
i
i fi ,
fi = f (xi )
76
where i are the weights[Gea71]. For example the second derivative at some point
xi can be approximated by
(2.8.16)
fi = 1 f1 + 2 f2 + 3 f3
where
1
2
1
, 2 =
, and 3 =
x2
x2
x2
We get these weights by adding equation (2.8.2) to equation (2.8.5) You can try
this out for yourself to check the derivation and also to get the truncation error.
Again one can try to derive one-sided expressions to the second derivative just
as we did the first derivative. In a similar fashion, we can find expressions for the
third and fourth derivatives. Here is a third derivative that is forward biased.
(2.8.17)
1 =
f
f + ...
x3
2
4
We say forward biased because it has the i1 grid point included in the expression.
If we derived an expression with the fi+3 instead of fi1 we would had a a pure
forward difference expression.
(2.8.18)
fi =
(2.8.19)
hivi
fi
(2.8.20)
x
2
f (iv)
As x 0 this error goes to zero linearly or the representation has first order
convergence. On the other hand the error is zero for polynomial up to the order
three, that is, the representation is third order which we infer from the fourth
derivative in the truncation error.
Assignment 2.14
Make sure you are able to derive the difference approximation to derivatives of
various orders.
(1) Verify the truncation error for the third derivative given in equation
(2.8.18).
(2) Derive the expression for the fourth derivative given in equation (2.8.19)
and the associated truncation error in the representation.
77
(2.9.1)
u(0) = uo
In order to solve for a particular problem we need the boundary conditions. In this
case we may have u(0) = uo .
If we were to use a first order forward difference representation for the derivative
at some time, tq , and evaluated the right hand side at tq , we would get
(uq+1 uq )
= f (uq , tq )
t
(2.9.2)
(2.9.3)
(2.9.4)
We already know that u0 = uo . Hence, we can find u1 . We can repeat this process
to find u2 , u3 ...uq ...
This scheme is called the Eulers explicit scheme. Using the definition of the
derivative we could also write equation (2.9.1)
(2.9.5)
du = f (u, t)dt
u(t)
u0
du = u(t) u0 =
f (u( ), )d
t0
We could discretise the integral on the right hand side of equation (2.9.6) using
the rectangle rule and we would get the automaton given by equations (2.9.2) and
(2.9.3). The point is that the same automaton may be obtained through different
paths and for historical reasons may have different names. The objective is to
recognise this and not get too hung up on names. Instead of using the rectangle
rule one could use the trapezoidal rule and would simultaneously get the modified
Euler's scheme which is given by two steps
(2.9.7)
u^* = u^q + \Delta t \, f(u^q, t^q)
(2.9.8)
u^{q+1} = u^q + \Delta t \, f(u^*, t^q)
The second equation can of course now be iterated to get the iterated Euler's scheme
and so on.
You can try to use this to solve simple differential equations for which you
already know the solution. This way you can compare your computed solution to
the actual solution.
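Here is a minimal sketch, in Python, of the Euler explicit scheme of equations (2.9.2)-(2.9.3). The right hand side chosen, f = cos t, is one of the cases in the assignment that follows; the step size and interval are arbitrary choices for the illustration.
\begin{verbatim}
# A minimal sketch of Euler's explicit scheme for du/dt = f(u, t).
import math

def euler_explicit(f, u0, t0, t_end, dt):
    """March from t0 to t_end in steps of dt; return lists of t and u."""
    t, u = t0, u0
    ts, us = [t], [u]
    while t < t_end - 1e-12:
        u = u + dt * f(u, t)      # u^{q+1} = u^q + dt f(u^q, t^q)
        t = t + dt
        ts.append(t)
        us.append(u)
    return ts, us

ts, us = euler_explicit(lambda u, t: math.cos(t),
                        u0=0.0, t0=0.0, t_end=4.0, dt=0.01)
print(us[-1], math.sin(4.0))   # compare with the exact solution u = sin t
\end{verbatim}
Repeating the run with a few different values of dt will show how the error in the computed solution depends on the step size.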
Assignment 2.15
Try solving the following equations:
(2.9.9)
\frac{du}{dt} = t^2, \qquad u(0) = 0
and
(2.9.10)
\frac{du}{dt} = \cos t, \qquad u(0) = 0
Try these with different values of the boundary conditions and different values for
\Delta t. Solve them for t \in (0, 4).
We are generating a sequence made up of the u^q in all our schemes. You will see
throughout this book that we tend to do this very often. In mathematics, we have
studied the properties of sequences. Mostly, we were interested in sequences that
converged. Here, we explicitly define a sequence to be divergent if the magnitude
of its terms eventually gets larger and larger. One way to guarantee that the sequence
does not diverge is to make sure that the gain in magnitude across any given time
step is not greater than one. This is a stronger requirement than our definition,
since we do not mind something that grows and decays as long as it does not
diverge. Anyway, this is conventionally what is done and we will be happy if
(2.9.11)
\left| \frac{u^{q+1}}{u^q} \right| < 1
(2.10.1)
E = \| e(\{x_i\}) \| = \sqrt{ \int \left( \sin x - \sum_{i=0}^{10} a_i h_i(x) \right)^2 dx }
is a minimum. Remembering that x_0 and x_{10} in this example are fixed, we can
differentiate the equation with respect to x_i, i = 1, \ldots, 9, and set it to zero. If we
solve the resulting system of equations we should get a good distribution of the
nine interior points. This is easy to describe in this fashion. It is quite difficult
to do. Remember that a_i = \langle h_i, \sin x \rangle. We will see problems like this in a later
section 7.1. The sub-discipline is called adaptive grid generation. The name is an
indication that the grid adapts to the function that we are trying to capture.
Can we do something with whatever we have learnt so far? We can look at
equation (2.10.1) and see what it signifies. It is an accounting of the total error in
the interval of our representation. We have derived expressions for the local error
in terms of the truncation error. So, we can find the total error by adding up the
magnitudes of the truncation errors. We now have an estimate of E. If we were
using box functions E can be written as
(2.10.2)
E = \sum_i \left| \cos(x_i) \, \Delta x_i \right|, \qquad \Delta x_i = (x_{i+1} - x_i)
This E is the total error over the whole domain. Given that we have N intervals
with which we are approximating our function the average error is E/N .
Very often, we do not have the solution at hand to get the grid. One obvious
thing to do is to get an approximate solution on an initial grid and then adapt the
grid to this solution. We can then proceed to get a solution on the new grid and
repeat the process. This is not always an easy proposition. If we have an idea of
the solution we can cluster the grids ahead of time.
If for example we know that the function we are representing has a large derivative at x_0 in the interval [x_0, x_1], we can generate a grid to accommodate this
behaviour. The easiest thing to do is geometric clustering or stretching. If we knew
a solution varied very rapidly near the origin we could take fine grids near the origin, x_0 = 0, and stretch them out as we move away from the origin. For example,
we could take the first increment to be \Delta x_0. Then, if we propose to stretch the
grid using a geometric scheme with a stretching factor, say \beta, we could take the
next increment to be \Delta x_1 = \beta \Delta x_0. In general we would have \Delta x_{i+1} = \beta \Delta x_i.
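A small sketch of this geometric stretching follows; the values of \Delta x_0 and \beta used here are made up purely for illustration.
\begin{verbatim}
# Generate a geometrically stretched grid: the first increment is dx0 and
# each subsequent increment is beta times the previous one.

def stretched_grid(x0, dx0, beta, n):
    """Return n+1 grid points starting at x0 with geometrically growing spacing."""
    x = [x0]
    dx = dx0
    for _ in range(n):
        x.append(x[-1] + dx)
        dx *= beta              # dx_{i+1} = beta * dx_i
    return x

print(stretched_grid(x0=0.0, dx0=0.01, beta=1.1, n=10))
\end{verbatim}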
CHAPTER 3
Simple Problems
We have enough machinery in place to tackle a few simple and classical problems. We are going to do the following here. We will take the naive approach and
try some obvious schemes for solving simple equations. We will try to develop the
analysis tools that help us to ask and answer questions such as
One: Is the proposed technique going to produce anything at all?
Two: Are we generating garbage or a solution?
Three: How does the technique behave?
Four: Can we do better?
These questions help us improve on the simple schemes. Some of the improvements
we will study in this chapter. Others we will study in later chapters.
We propose to look at equations which are prototypes for a class of problems.
These are: Laplace's equation, which happens to be the prototypical elliptic equation,
the heat equation, which will represent the parabolic problems, and finally the wave
equation for the hyperbolic problems. We will see more about the nature of these
problems and shed some light on the classification at the end of the chapter.
3.1. Laplace's Equation
Laplace's equation is a good place to start our discussions. It is easy to conjure
simple problems that it describes. The corresponding program is easy to write and
is well behaved. As we shall see, it is amenable to the simple-minded analysis that
is done in this book [ZT86], [Ame77], [Arn04], [Sne64].
Let's first place Laplace's equation in a physical context. Consider the irrotational flow of a fluid. We will assume for the purpose of this discussion that the
flow is two-dimensional and incompressible. The equations governing the motion
of fluid are derived in section 5.3. The law of conservation of mass can be stated as
(3.1.1)
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = 0
u and v are the velocity components along x and y respectively. As the flow is
irrotational, we can define a potential function \phi(x, y) such that
(3.1.2)
u = \frac{\partial \phi}{\partial x}, \qquad v = \frac{\partial \phi}{\partial y}
Substituting back into equation (3.1.1) we get the potential equation or Laplace's
equation.
Laplace's equation is the prototype equation for an elliptic problem [ZT86]. It
should be pointed out here that the equation is referred to as Laplace's equation or
the Laplace equation. In two dimensions, using the Cartesian coordinate system,
the equation is
(3.1.3)
\nabla^2 \phi = \frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} = 0
Since the flow is irrotational and two-dimensional, we also know from the definition of vorticity that
(3.1.4)
\omega_z = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y} = 0
The stream function \psi is defined such that
(3.1.5)
u = \frac{\partial \psi}{\partial y}, \qquad v = -\frac{\partial \psi}{\partial x}
If this is substituted into equation (3.1.1), we see that the stream function generates an associated velocity field (equation (3.1.5)) that automatically satisfies the
equation governing conservation of mass. In turn, if we were to substitute from
equation (3.1.5) into equation (3.1.4), we see that the stream function also satisfies
Laplace's equation.
\nabla^2 \psi = \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} = 0
In the discussions that follow, one could use either the stream function or the
potential function. In using these equations to describe the solution to a problem,
the boundary conditions will depend on whether the stream function is being used
or the potential function is being used.
Consider the following problem.
(3.1.6)
\nabla^2 u = 0, \quad u(x, 0) = x^2, \quad u(1, y) = 1 - y^2, \quad u(0, y) = -y^2, \quad u(x, 1) = x^2 - 1
figure. It would be convenient at this point if we had a problem for which we had
a closed-form solution. Then, we could act as though we did not have the solution
and proceed to solve the problem and check the answer that we get. To this end,
the boundary conditions are chosen so that the solution is \phi(x, y) = x^2 - y^2 (verify
that this is a solution). So, we have the problem definition as shown in Figure
3.1. Well, what does this mean? We want to find a \phi(x, y) that satisfies equation
(3.1.3) everywhere inside the square and satisfies the boundary conditions given.
Boundary conditions where the dependent variable, in this case \phi, is specified are
called Dirichlet boundary conditions.
Remember that by satisfies the equation we mean that we can substitute
it into the equation and find that we do indeed have an equation: the left hand side
equals the right hand side. In order to substitute into the equation, we need to be
able to evaluate the second derivatives with respect to x and y. So, we are already
searching for \phi within a class of functions that have second derivatives everywhere
inside the unit square. You may have studied techniques to solve this problem
analytically. Here, we are going to try and get an approximation for \phi. This means
that we will have approximations for its second derivatives. Consequently, we will
have an approximation for Laplace's equation. This then is our plan of action.
Given three points that are \Delta x apart, we have already seen that the second derivative of f(x) can be approximated at the middle point in terms of its neighbours
as
(3.1.7)
\frac{\partial^2 f}{\partial x^2} = \frac{f(x+\Delta x) - 2f(x) + f(x-\Delta x)}{\Delta x^2}
Figure 3.2. Points used to approximate Laplace's equation in two
spatial dimensions: (x, y), (x \pm \Delta x, y), and (x, y \pm \Delta y).
We can then approximate both the x-derivative and the y-derivative at the
central point (x, y). So, Laplace's equation (3.1.3) can be rewritten at the point
(x, y) as
(3.1.8)
\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2} \approx
\frac{\phi(x+\Delta x, y) - 2\phi(x, y) + \phi(x-\Delta x, y)}{\Delta x^2}
+ \frac{\phi(x, y+\Delta y) - 2\phi(x, y) + \phi(x, y-\Delta y)}{\Delta y^2} = 0
We can solve for \phi(x, y) in terms of the \phi at the neighbouring points to get
(3.1.9)
\phi(x, y) = \frac{\Delta x^2 \Delta y^2}{2(\Delta x^2 + \Delta y^2)}
\left\{ \frac{\phi(x+\Delta x, y) + \phi(x-\Delta x, y)}{\Delta x^2}
      + \frac{\phi(x, y+\Delta y) + \phi(x, y-\Delta y)}{\Delta y^2} \right\}
What are we actually doing when we solve this equation? For clarity, we have
so far taken \Delta x and \Delta y to be constant. To get a better picture, we will set
\Delta x = \Delta y = h as shown in Figure 3.3. In this case, equation (3.1.9) reduces to
(3.1.10)
\phi(x, y) = \frac{\phi(x+h, y) + \phi(x-h, y) + \phi(x, y+h) + \phi(x, y-h)}{4}
Figure 3.3. Points employed in approximating Laplace's equation, \Delta x = \Delta y = h.
Referring to Figure 3.3, we see that the value of \phi at the point (x, y) is in fact
the average of the values from the neighbouring points. We can use this to generate
a solution to the problem numerically.
We know how to get an approximation of \phi at a point based on four neighbours.
Clearly, we cannot compute \phi at every point in the unit square using equation
(3.1.10). So, we represent the unit square with a discrete set of points. We refer
to these points as grid points. We will now try to approximate the differential
equation on these points. We can identify a grid point by its location. However, it is
easier for us to index them in some fashion. Figure 3.4 shows one such arrangement.
We use two indices to identify a grid point. i is an index along the x direction and j
is an index along the y direction. The coordinates of the grid point (i, j) in general
are (x_{ij}, y_{ij}). Since we constrained \Delta x = \Delta y = h, and the mesh is Cartesian, the
(i, j) grid point in fact has coordinates (x_i, y_j). The approximation to \phi at that point
would be \phi_{ij}.
Figure 3.4. Sample grid to represent and solve Laplace's equation.
If we focus again on one grid point (i, j), we get
(3.1.11)
\phi_{ij} = \frac{\phi_{i-1\,j} + \phi_{i+1\,j} + \phi_{i\,j-1} + \phi_{i\,j+1}}{4}
or to put it in a programming style
(3.1.12)
\phi_{ij} = 0.25 \left( \phi_{i-1\,j} + \phi_{i+1\,j} + \phi_{i\,j-1} + \phi_{i\,j+1} \right)
The \phi_{ij} on the boundary grid points are determined using the given boundary
conditions. At the grid points in the interior we use equation (3.1.11). So, we do
the averaging only at the internal grid points. Taking the average at one grid point
is called relaxing the grid point. In order to start taking the averages we need to
assume some initial value. We could, for example, assume all the interior values
to be zero. That is, \phi^0_{ij} = 0 for the interior values. What does the superscript 0
mean? We propose to calculate a new set of values \phi^1_{ij} by averaging the \phi^0_{ij}'s. This
is called one iteration or one relaxation sweep. Of course, we only iterate on the
interior points. By this, we mean that we do not change the boundary values of \phi^1_{ij}.
The \phi^1_{ij} is, hopefully, a better approximation to \phi than is \phi^0_{ij}. We can then iterate
one more time to get \phi^2_{ij}. \phi^0_{ij}, \phi^1_{ij}, \phi^2_{ij}, \ldots, \phi^q_{ij}, \ldots are called iterates or candidate
solutions. We can now write an iterative version of equation (3.1.10) as
(3.1.13)
\phi^{q+1}_{ij} = 0.25 \left( \phi^q_{i-1\,j} + \phi^q_{i+1\,j} + \phi^q_{i\,j-1} + \phi^q_{i\,j+1} \right)
averaging can be done in parallel. When do we decide to quit iterating? For what
value of q can we stop taking averages? In order to answer this critical question,
we need to take a good look at what we are doing. We seem to be generating a
sequence of \phi_{ij}'s. So we are really asking the question: when does the sequence \phi^n
converge? We could go with the answer: when \|\phi^{q+1} - \phi^q\| < \epsilon_c. This is called
the convergence criterion for the iterative scheme. How do we evaluate it? One
way would be
(3.1.14)
\|\phi^{q+1} - \phi^q\| = \sqrt{ \sum_{i,j} \left( \phi^{q+1}_{ij} - \phi^q_{ij} \right)^2 } < \epsilon_c
where \epsilon_c is specified by us. Let us write the steps involved in this discussion so that
you can actually code it.
One: At any grid point (i, j) we can define a \phi^q_{i,j}. The q superscript tells us
that it is the q-th iterate or approximation.
Two: In our problem the \phi^q_{i,j} is given on the boundaries. This is called a
Dirichlet boundary condition.
Three: In order to use equation (3.1.13), we need \phi^0_{i,j} in the interior. We
will assume this value, for example, \phi^0_{i,j} = 0.
Four: We can now repeatedly apply equation (3.1.13) to find the next,
and hopefully better, approximation to \phi_{i,j}.
Five: We stop iterating when the condition given by equation (3.1.14) is
satisfied. When this occurs, we say our code has converged.
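A minimal Python sketch of these five steps follows, for the problem of Figure 3.1 with the boundary data taken from \phi = x^2 - y^2. The grid size m and tolerance are illustrative; this is one way to code the Jacobi iteration (3.1.13) with the convergence test (3.1.14), not the only one.
\begin{verbatim}
# Jacobi iteration for Laplace's equation on the unit square.
import math

def solve_laplace_jacobi(m=21, eps_c=1.0e-6):
    h = 1.0 / (m - 1)
    # Steps one and two: boundary values; step three: zero initial guess inside.
    phi = [[0.0] * m for _ in range(m)]          # phi[i][j], i along x, j along y
    for k in range(m):
        x, y = k * h, k * h
        phi[k][0] = x * x                        # phi(x, 0) = x^2
        phi[k][m - 1] = x * x - 1.0              # phi(x, 1) = x^2 - 1
        phi[0][k] = -y * y                       # phi(0, y) = -y^2
        phi[m - 1][k] = 1.0 - y * y              # phi(1, y) = 1 - y^2
    while True:
        # Step four: one relaxation sweep over the interior points.
        new = [row[:] for row in phi]
        change = 0.0
        for i in range(1, m - 1):
            for j in range(1, m - 1):
                new[i][j] = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                                    + phi[i][j - 1] + phi[i][j + 1])
                change += (new[i][j] - phi[i][j]) ** 2
        phi = new
        # Step five: stop when the change in the solution is small enough.
        if math.sqrt(change) < eps_c:
            return phi

phi = solve_laplace_jacobi()
print(phi[10][10])   # at the centre x^2 - y^2 = 0
\end{verbatim}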
Assignment 3.1
(1) Write a program to solve Laplace's equation on a unit square, see Figure
3.1. Use the boundary conditions provided there.
(2) You can iterate away, till convergence.
(3) Try solving the problem with grids of various sizes: [11 \times 11], [21 \times 21],
[41 \times 41], [101 \times 101] \ldots [m \times m].
(4) Pick three different convergence criteria: \epsilon_c = 10^{-2}, 10^{-4} and 10^{-6}. Call
\Delta_c = \|\phi^{q+1} - \phi^q\| the change in solution. For a given grid [m \times m], define
N(m, \epsilon_c) as the last value of q, when the code has converged. That is
\Delta_c < \epsilon_c.
(a) Plot \Delta_c versus q.
(b) For each m, plot N versus \epsilon_c.
(c) For each \epsilon_c, plot N versus m.
(d) For each \epsilon_c, plot N versus (m - 2)^2.
You may have already written a solver for Laplace's equation using the scheme
we just developed. The hope is that you have been playing around with it trying
to learn as much as possible from that code. In the assignment, I have suggested
some things that you can try out. What I mean by playing around with the code is
that you try out a variety of boundary conditions, grid sizes, convergence criteria,
the order in which the points are picked for iteration. How about if we pick points
at random and take averages? Try out things like this and other things that may
occur to you.
Figure 3.5. The interior grid points numbered serially with a single index i;
the index increases by one to the next point along a row and by the stride n
from one row to the next.
If, instead of using only the values from the previous iterate, we use the latest
available values of \phi as we sweep through the grid points, we get the Gauss-Seidel
scheme
(3.1.15)
\phi^{q+1}_{ij} = \frac{\phi^{q+1}_{i-1\,j} + \phi^q_{i+1\,j} + \phi^{q+1}_{i\,j-1} + \phi^q_{i\,j+1}}{4}
Assignment 3.2
(1) Repeat assignment 3.1 using the Gauss-Seidel scheme.
(2) Plot \Delta_c versus q for both Jacobi and Gauss-Seidel schemes on a semi-log
scale (\Delta_c is plotted on the log-scale). Compare the slopes of the two
graphs.
(3) Find the above slopes for various grid sizes and tabulate. How does the
slope depend on the size of the grid? Is the relationship between the slopes
corresponding to the two schemes independent of the grid size?
(4) What about the CPU times or the run times for the two schemes? Plot
time versus number of interior grid points. If you fit a straight line to this
curve, where does it intersect the axis for zero grids?
(5) Compute the time per iteration per grid point for the various grid sizes.
Plot it versus grid size to see if it goes to an asymptote with the number of grid
points increasing.
A suggestion that I had made as an example of playing around was to pick
grid points at random and take averages. The whole point is that we normally do
not pick points at random, we pick them up in a natural numerical sequence. Is
there a more convenient way to number these grid points? It all depends on what
we want. The way we have indexed grid points so far seemed pretty convenient
and natural. How about the one shown in Figure 3.5? One could number all the
interior points in a sequential order with one index say i as shown in Figure 3.5.
The boundary points can be numbered separately after that. This has done two
things for us.
(1) It has allowed us to represent \phi using a one-dimensional array instead of
a two-dimensional array.
(2) It has clustered all the interior points together at the beginning (or top)
of the array and all the boundary conditions at the end (or bottom) of
the array.
Incidentally, if you do not care to have all the interior points clustered you could
number the grid points serially starting at the bottom left hand corner. However,
we will stick to the version shown in Figure 3.5.
Referring still to Figure 3.5, an interior grid point with index i has a left
neighbour i - 1, a right neighbour i + 1, a bottom neighbour i - n and a top
neighbour i + n. Here n is called the stride. We have to be careful now. What are
the neighbours of the first interior grid point? It has neighbouring points from the
boundary. At a generic interior point, the approximation for Laplace's equation
becomes
(3.1.16)
\phi_i = \frac{\phi_{i-1} + \phi_{i+1} + \phi_{i-n} + \phi_{i+n}}{4}
We can then proceed to iterate again and solve the problem. Nothing should
change in the solution as all we have done is change the symbols. Redo the assignment and make sure there is no change to your solution and any of the other
parameters that you have checked (like convergence plots and so on).
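As a sketch of what the single-index version looks like, here is one possible sweep in Python. It does not use the book's exact numbering: for simplicity all m x m points, boundary included, are numbered row by row from the bottom left corner, so that an interior point i always has neighbours i-1, i+1, i-n and i+n with stride n = m, and the boundary entries of phi are assumed to already hold the boundary values.
\begin{verbatim}
# One Gauss-Seidel sweep using a single index and a stride.

def gauss_seidel_sweep(phi, m):
    """phi is a flat list of m*m values; one in-place sweep over the interior."""
    n = m                                   # stride to the vertical neighbour
    for j in range(1, m - 1):               # skip the boundary rows
        for k in range(1, m - 1):           # skip the boundary columns
            i = j * n + k                   # single index of the grid point
            phi[i] = 0.25 * (phi[i - 1] + phi[i + 1] + phi[i - n] + phi[i + n])
    return phi
\end{verbatim}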
This renumbering of the grid points gives us another way of looking at this
problem and solving it. Let us first consider the way we have numbered the grid
points. Take another look at this equation for a unit square at the origin. n = m
since \Delta x = \Delta y. In reality, we have n^2 unknown interior points and n^2 equations.
We are solving the equations here using either Gauss-Seidel or Jacobi schemes. So,
the equation (3.1.16) should be written as part of a system of equations. The i-th
equation of this system of equations can be written as
(3.1.17)
\phi_{i-n} + \phi_{i-1} - 4\phi_i + \phi_{i+1} + \phi_{i+n} = 0
Figure 3.6. A [6 \times 6] grid: the 16 interior points are numbered 1 to 16 and
the boundary points 17 to 36.
We will write the full system for a [6 \times 6] grid with 16 interior points as shown
in Figure 3.6. We have 16 unknowns and the corresponding 16 equations. These
equations can be written as
(3.1.18)
Ax = b
where x is the vector made up of the \phi_i on the interior grid points. This equation is
written out in full detail in equation (3.1.19).
90
1
4
1
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
4
1
0
0
1
0
0
0
0
0
0
0
0
0
0
0
1
4
0
0
0
1
0
0
0
0
0
0
0
0
1
0
0
0
4
1
0
0
1
0
0
0
0
0
0
0
0
1
0
0
1
4
1
0
0
1
0
0
0
0
0
0
0
0
1
0
0
1
4
1
0
0
1
0
0
0
0
0
0
0
0
1
0
0
1
4
0
0
0
1
0
0
0
0
0
0
0
0
1
0
0
0
4
1
0
0
1
0
0
0
0
0
0
0
0
1
0
0
1
4
1
0
0
1
0
0
0
0
0
0
0
0
1
0
0
1
4
1
0
0
1
0
0
0
0
0
0
0
0
1
0
0
1
4
0
0
0
1
0
0
0
0
0
0
0
0
1
0
0
0
4
1
0
0
0
0
0
0
0
0
0
0
0
1
0
0
1
4
1
0
0
0
0
0
0
0
0
0
0
0
1
0
0
1
4
1
0 1 36 18
0
19
3
20
4
21 23
5
35
0
6
0 8
24
=
0
9
34
10
0
0
11
12
25
0
13
33 31
14
0
30
15
29
4
16
28 26
3. SIMPLE PROBLEMS
(3.1.19)
4
1
0
0
91
From this it is clear that for an arbitrary grid with [n \times m] interior grid points
the stride is n (or m). The diagonal will have a -4 on it. The sub-diagonal will
have a 1 on it if possible. The super-diagonal will have 1 on it if possible. All the
possible diagonals above and below will be zero excepting the n-th above and below
which will be one. Where it is not possible to place a one on any diagonal the
corresponding entry will show up in b as a term subtracted from the right hand
side.
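Although, as noted below, the matrix is rarely assembled in practice, it is worth building it once for a small grid to see the structure just described. The following sketch (points numbered from zero rather than one, purely a programming convenience) fills a dense numpy array for an [n x m] block of interior points.
\begin{verbatim}
# Assemble the dense matrix A for an n x m block of interior points.
import numpy as np

def laplace_matrix(n, m):
    """A for an n x m block of interior points numbered row by row, stride n."""
    N = n * m
    A = np.zeros((N, N))
    for i in range(N):
        A[i, i] = -4.0
        if i % n != 0:           # left neighbour exists (not first in its row)
            A[i, i - 1] = 1.0
        if (i + 1) % n != 0:     # right neighbour exists (not last in its row)
            A[i, i + 1] = 1.0
        if i - n >= 0:           # neighbour one row below
            A[i, i - n] = 1.0
        if i + n < N:            # neighbour one row above
            A[i, i + n] = 1.0
    return A

print(laplace_matrix(4, 4).astype(int))   # the 16 x 16 matrix of (3.1.19)
\end{verbatim}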
This matrix is clearly a sparse matrix. Most of the entries in a sparse matrix
are zero as is the case here. For this reason, we very rarely construct/assemble this
matrix in CFD. Once we have identified the problem as that solving a system of
equations, many ideas will come to mind. For small problems, you may actually
consider assembling the matrix and using a direct method like Gaussian elimination
to solve the problem. However, as we have done here, for bigger problems we usually
use an iterative scheme like Gauss-Seidel or Jacobi schemes. There is a very simple
reason for this. Direct methods like Gaussian elimination, which is identical to LU
decomposition, involve an elimination step.
(3.1.20)
U x = L^{-1} b
(3.1.21)
x = U^{-1} L^{-1} b
Both U and L are triangular matrices. The calculations, once you have the decomposition, are easy to do as at any given time you have an equation with just one
unknown. For example, the first step in equation (3.1.20) would be to solve for the
first unknown.
(3.1.22)
\{U x\}_1 = b_1 / L_{1,1}
where the subscripts indicate the corresponding component of that matrix. The
second equation would be in terms of the known right hand side and the first term
just calculated from the first equation. At the end of evaluating equation
(3.1.20), we would have all the terms that make up the vector U x. The fact of
the matter is that the last term would have been calculated based on all preceding
terms. In an n \times n system of equations, where n is the number of unknowns, this would
have involved of the order of n(n + 1)/2 calculations. We repeat a similar process in
the backward direction to solve for x using equation (3.1.21). So, it turns out that
the very first element of the vector x is a consequence of n2 operations performed
one after the other. The potential accumulation of cumulative roundoff error is
enormous. However, the solution from this direct method may be a good guess for
an iterative scheme. As a personal bias, I do not use direct methods to solve linear
systems of equations specified by anything more than a 400 \times 400 matrix, though
people have come up with some clever techniques to get around the size issue.
Note that the matrix A is symmetric, has negative terms on the diagonal and
positive ones on the off-diagonals. It has a lot of nice properties [Var00] based
on its structure. Anyway, now that we recognise that we are solving a system of
linear equations we can bring to bear other methods from numerical linear algebra
[GL83] to aid us in the solution of the system.
Assignment 3.3
(1) Rewrite a program to solve Laplace's equation on a unit square using the
new numbering system.
(2) Is there any difference in the time per grid point per iteration?
(3) Is there a difference in the solution, convergence rate? The solution and
convergence rate should be the same as before.
(4) What happens to the convergence rate if we iterate in a fashion alternating
relaxation of the interior points between the ends of the array? That is
relax the first point, then the last point, then the second point, the last
but one point, and so on, till we do the middle point at the end of the
iteration.
(5) If you feel up to it, you can try writing a solver using Gaussian elimination,
LU decomposition or, better, LDL^T decomposition. Refer [GL83] for more
details.
x^{q+1} = g(x^q)
x^{q+1} = P x^q + C
Finally, as you should have guessed, the matrix D is a diagonal matrix with the
entries from the diagonal of A. In this particular case, D is a diagonal matrix with
-4 on the diagonal and zeros for all other entries. With this framework in place,
we can write the Jacobi scheme as
(3.2.5)
x^{q+1} = D^{-1} \{ b - L x^q - U x^q \}
For an interior point, with no neighbouring boundary point, for example the seventh
equation from the system of equations (3.1.19) is
(3.2.6)
\phi_3 + \phi_6 - 4\phi_7 + \phi_8 + \phi_{11} = 0
Clearly,
(3.2.8)
P_J = -D^{-1} \{ L + U \}, \qquad C_J = D^{-1} b
The corresponding Gauss-Seidel iteration can be written as
x^{q+1} = \{ D + U \}^{-1} \{ b - L x^q \}, \qquad P_{GS} = -\{ D + U \}^{-1} L
(3.2.11)
x^{q+1} = \lambda x^q
x is a real number. For a given finite, real number \lambda, equation (3.2.11) is a map
from the real line back onto the real line. Does this equation always have a fixed
point? That is, is there a \xi such that \xi = \lambda \xi? Is x = 0 a fixed point? Yes, for
equation (3.2.11), x = 0 is a fixed point. If I guess any old x^0, will the iterations
always converge to zero? How does \lambda affect the process? Clearly |\lambda| > 1 is not going
to get to any fixed point unless the initial guess is x^0 = 0. How about if \lambda = 1?
\lambda = 1 seems to make every point on the real line a fixed point. Finally, when |\lambda| < 1,
we do generate a sequence of numbers that shrink towards the origin. All of this
makes sense if we notice that we are generating a sequence x^0, x^1, \ldots, x^q, \ldots and
that the ratio test for the convergence of a sequence tells us that we have a fixed
point if |\lambda| < 1. We are now looking to get some understanding using this simple
example, so that when we encounter a system of equations we are able to cope with
it a little better.
Consider a region of points containing the origin, -r \le x \le r. If this were in
two dimensions we would have called it a circular region. So, we will just call it a
circle of radius r about the origin. What happens to the points in this circle when
they are run through equation (3.2.11) for |\lambda| < 1? This is illustrated in Figure 3.7.
Clearly, the equation maps the circle into a circle that is smaller, in fact in this case
(3.2.12)
r^{q+1} = \lambda r^q
Figure 3.7. The map takes the interval of radius r^q about x^q into an interval
of radius r^{q+1} = \lambda r^q about x^{q+1}.
Meanwhile, you can ask the question: Isn't it enough if the mapping confines
us to a fixed region in space? Do we really need |\lambda| < 1? Consider the map given by
(3.2.13)
x^{q+1} = \begin{cases} \lambda x^q & \text{for } x^q < \frac{1}{2} \\[4pt]
\lambda (1 - x^q) & \text{for } x^q \ge \frac{1}{2} \end{cases}
Try this out for various initial conditions x0 and you will see that though we are
confined to the unit interval at the origin we may never converge to a fixed point.
Assignment 3.5
Run the above map for various starting points x^0 and \lambda = 2.0 and \lambda = 1.9.
(1) Try x^0 = 0, 1, \frac{1}{2}, 0.25. Draw conclusions.
(2) Try other values of \lambda: 0.5, 1.2, 1.4 and so on.
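A small Python sketch for running the map in this assignment follows; the number of iterations and the particular starting points are just the ones suggested above.
\begin{verbatim}
# Iterate the map (3.2.13) from a few starting points.

def tent_map(x, lam):
    return lam * x if x < 0.5 else lam * (1.0 - x)

for x0 in (0.0, 1.0, 0.5, 0.25):
    x = x0
    history = [x]
    for _ in range(20):
        x = tent_map(x, 2.0)
        history.append(x)
    print(x0, ["%.4f" % v for v in history[-5:]])   # the last few iterates
\end{verbatim}
Even though the iterates stay in the unit interval, you will see that for many starting points they do not settle down to a fixed point.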
In all our endeavours here, we would like to look at spaces that are complete.
What do we mean by this? Let us say that we are generating a sequence of solutions.
We would like the sequence to converge to a point that is in the space in which
we are looking for the solution. A policeman would like the hunt to end within his
jurisdiction. If the person being pursued enters another country with which there
is no extradition treaty, then the hunt does not converge. So, let's say we have a
complete space S and we have a map P that takes a point in S and maps it back
into S. We could restate this as, if you have some x \in S, then
(3.2.14)
y = P x, \qquad y \in S
(3.2.15)
x^{q+1} = P x^q, \qquad x^q, x^{q+1} \in S
Further, if for any two points a, b \in S the map is such that
(3.2.16)
d(P a, P b) < d(a, b)
where d(a, b) is the distance between the points a and b and is called the metric of
the space S. What equation (3.2.16) says is that the map P maps the points a and
b into points that are closer together. It turns out that if this is the case, we can
assert that there is a unique point, \xi, in S, called a fixed point, such that
(3.2.17)
\xi = P \xi
That is, P maps this particular point into itself. I have paraphrased the Banach
fixed point theorem here.
Fine, we see how this works for a scalar equation. However, we are looking at
a system of equations. How does it work for a system of equations? To answer
this question we will consider a simple problem using two scalar equations. We
will convert this problem into a slightly more complicated problem by performing
a rotation of coordinates. You can review the material on matrices given in the
appendix B.2 and try out the following assignment.
Assignment 3.6
Consider two equations
(3.2.18)
x^{q+1} = \lambda_x x^q
(3.2.19)
y^{q+1} = \lambda_y y^q
Write a program to try this out, or better still, plot the region as it is mapped by
hand on a graph sheet. Take both x and y to be in the range (-1, 1). You can
run each equation separately. Choose different combinations of \lambda_x and \lambda_y.
We look at the problem in the assignment. As we now expect, if |\lambda_x| < 1
the x sequence will converge. Similarly if |\lambda_y| < 1 the y sequence will converge.
So, in order for the combined sequence (x^q, y^q) to converge we require \lambda =
\max(|\lambda_x|, |\lambda_y|) < 1. The condition to converge seems coupled. Right now it looks,
sort of, contrived. We are at times fortunate to pick the right coordinate system
and get a problem that is simple. In this case, the simplicity comes from the fact
that the two equations are decoupled. Since we are not always this fortunate, we
will now perform a rotation of the coordinate system to convert this simple problem
into something that you would generally encounter.
First, let's rewrite the two equations as a matrix equation. Equations (3.2.18),
(3.2.19) can be written in matrix form as
(3.2.20)
\begin{pmatrix} x^{q+1} \\ y^{q+1} \end{pmatrix} =
\begin{pmatrix} \lambda_x & 0 \\ 0 & \lambda_y \end{pmatrix}
\begin{pmatrix} x^q \\ y^q \end{pmatrix}
This equation can be rewritten as
(3.2.21)
\vec{x}^{\,q+1} = \Lambda \vec{x}^{\,q}
where
(3.2.22)
\vec{x} = \begin{pmatrix} x \\ y \end{pmatrix}
and
(3.2.23)
\Lambda = \begin{pmatrix} \lambda_x & 0 \\ 0 & \lambda_y \end{pmatrix}
The iteration matrix \Lambda looks nothing like the matrix P that we got with the
Gauss-Seidel scheme for Laplace's equation. Let us rotate the coordinate system through
an angle \theta. We can do this by pre-multiplying equation (3.2.20) by the matrix
(3.2.24)
R = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
You will notice that this matrix performs a pure rotation of any vector on which it is
applied. By pure rotation, we mean that it does not perform a stretch. Performing
the multiplication, we will get an equation of the form
(3.2.25)
x^{q+1} = P x^q
where
P = R \Lambda R^{-1}
and x = R\vec{x}. R is very often called the modal matrix. It should be obvious that
\lambda_x and \lambda_y are the eigenvalues of the iteration matrix P. The largest of these in
magnitude, \rho, is called the spectral radius of the iteration operator P.
(3.2.27)
\rho(P) = \max(|\lambda_x|, |\lambda_y|)
A = D + F,
Okay, now we know what to do and how long to do it for convergence. We have
plots of the solution and other parameters related to the behaviour of our code.
Can we say something about the solution so that we can be confident that it is a
good approximation to the solution to the original problem?
Is it possible for two different people to solve this problem and get different
answers? In this particular case, we chose the boundary conditions from a function
that already satisfies our equation. How do we know for some other boundary
condition whether we got the solution or not? Are there any kind of sanity checks
that we can make to assure ourselves that we have a solution? Can we say something
98
3. SIMPLE PROBLEMS
about the solution to a problem without actually solving it? We will take a shot at
it.
Let's now take a look at what we are actually doing. Equation (3.1.10) says
that the value of \phi at a given point is the average of its neighbours. So, what is
the average? The average cannot be larger than the largest neighbour or smaller
than the smallest one. This is true of all the interior points: no point is larger than
the largest neighbour or smaller than the smallest one. Therefore, we can conclude
that the maximum and minimum of \phi cannot occur at the interior points. The
maximum or minimum will be on the boundary.
Though we have not actually done all the mathematics to make this statement,
we will go ahead and extend our conclusion to the Laplace equation by saying:
The solution to Laplace's equation will have its maximum
and minimum on the boundary
We can pursue this line of reasoning to come up with an interesting result. If
two of us write a solver to Laplace's equation, is it possible to get two different
answers? Let us for the sake of argument assume this is possible. So, you get an
answer \phi_1 and I get an answer \phi_2 and both of them satisfy Laplace's equation.
That is
(3.3.1)
\nabla^2 \phi_1 = 0
(3.3.2)
\nabla^2 \phi_2 = 0
They also satisfy the same boundary conditions on \phi. Subtracting one from the
other tells us that
(3.3.3)
\nabla^2 (\phi_1 - \phi_2) = 0
especially true for the larger grid sizes. What can we do to get to the solution
faster?
Clearly, if we started with the solution as an initial guess, we would converge
in one iteration. This tells us that a better initial guess would get us there
faster. Possible initial conditions for our problem on the unit square in increasing
complexity are
the value on one of the sides can be taken.
linear interpolation of boundary conditions for two opposite sides of the
square.
the solution to a coarser grid, say [5 \times 5], can be used to determine the
initial condition on a finer grid, say, [11 \times 11].
You have seen that the spectral radius of the iteration operator P determines
convergence. It also determines the convergence rate. Let us see how this happens.
The iteration that leads to the solution \phi_h to Laplace's equation approximated on
a grid of size h is
(3.4.1)
\phi^{n+1} = P \phi^n + C
where \phi^n is a candidate fixed point to equation (3.4.1) and consequently a candidate solution to Laplace's equation. On the other hand \phi_h is the solution to the
discrete Laplace's equation and is the fixed point of the iteration equation (3.4.1).
Therefore,
(3.4.2)
\phi_h = P \phi_h + C
We will designate the difference between the actual solution and the candidate
solution as e, that is
(3.4.3)
e^n = \phi^n - \phi_h
Subtracting equation (3.4.2) from equation (3.4.1) we get
(3.4.4)
e^{n+1} = P e^n
This works as both P and C, in our case, do not depend upon \phi. Now, if we
premultiply equation (3.4.4) by the inverse of the modal matrix R, the equation
becomes
(3.4.5)
E^{n+1} = \Lambda E^n
where E^n = R^{-1} e^n and \Lambda is the diagonal matrix of the eigenvalues of P. After
many iterations the component corresponding to the largest eigenvalue dominates,
so that
(3.4.6)
E^{n+1} = s \rho E^n
where s is the sign of the largest eigenvalue and \rho is the spectral radius. Or,
(3.4.7)
\frac{E^{n+1}}{E^n} = s \, \rho(P)
Now, it is clear that for \rho very nearly one, the number of iterations required to get
|E^n| below a predetermined \epsilon_c will be large. Let us see if we can come up with an
estimate of \rho. The \phi^n, for all n, and \phi_h represent functions that satisfy the boundary
conditions of our problem. e^n and E^n represent the error, e(x, y), which is zero on
the boundaries. We will expand e(x, y) in terms of the Fourier series as
(3.4.8)
e(x, y) = \sum_{l=1}^{N-1} \sum_{m=1}^{N-1} a_l b_m
\exp\left\{ i \frac{\pi l x}{L} \right\} \exp\left\{ i \frac{\pi m y}{L} \right\}
where N is the number of intervals of size h, (N h = L), and L the side of the square
on which we are solving the problem. You will notice that we have restricted the
sum to the range 1 to N - 1. Wave number zero will not contribute anything as the
boundary condition is zero (homogeneous). At the other end of the summation, we
already know (see section 2.9) that the highest frequency that we can represent is
2\pi(N - 1)/2 = \pi(N - 1). Since both Laplace's equation and our iteration equation
are linear, we can check out what happens to each wave number separately.
Assignment 3.7
Verify that e_{lm}(x, y) = \exp\left\{ i \frac{\pi l x}{L} \right\} \exp\left\{ i \frac{\pi m y}{L} \right\}
is an eigenvector (or an eigenfunction) of Laplace's equation. That is, show that
\nabla^2 e_{lm} = \lambda_{lm} e_{lm}.
What happens to e_{lm} when we crank it through one iteration of our Jacobi
iteration?
(3.4.9)
e^{n+1}_{lm}(x_p, y_q) = 0.25 \left\{ e^n_{lm}(x_{p+1}, y_q) + e^n_{lm}(x_{p-1}, y_q)
+ e^n_{lm}(x_p, y_{q+1}) + e^n_{lm}(x_p, y_{q-1}) \right\}
where p and q are indices along x and y respectively. Since x_{p+1} - x_p = y_{q+1} - y_q =
h, we can write this as
(3.4.10)
e^{n+1}_{lm}(x_p, y_q) = 0.25 \left[ \exp\left\{ i \frac{\pi l h}{L} \right\} + \exp\left\{ -i \frac{\pi l h}{L} \right\}
+ \exp\left\{ i \frac{\pi m h}{L} \right\} + \exp\left\{ -i \frac{\pi m h}{L} \right\} \right] e^n_{lm}(x_p, y_q)
3.4.1. Successive Over Relaxation - SOR. Since the gain is the ratio
of two successive iterates, our hope lies in using the two iterates to reduce the
maximum gain. This is an acceleration technique that works by taking a linear
combination of the new iterate and the old iterate. Take an \omega \in (0, 2). Why this
restriction on \omega? We will come to that later. This is how the algorithm works
when applied to the Gauss-Seidel scheme. Instead of calling the average of the four
neighbouring points the new iterate, we treat it as some intermediate value in
our computation. We give it a new name, \phi^*, so as not to confuse it with the new
iterate. Our new algorithm is a two step process as follows.
(3.4.14)
\phi^*_{ij} = \frac{\phi^{n+1}_{i-1\,j} + \phi^n_{i+1\,j} + \phi^{n+1}_{i\,j-1} + \phi^n_{i\,j+1}}{4}
(3.4.15)
\phi^{n+1}_{ij} = \omega \phi^*_{ij} + (1 - \omega) \phi^n_{ij}
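A minimal sketch of this two step process in Python follows; it performs one point-SOR sweep over the interior of an m x m grid stored as a list of lists, with omega as the relaxation factor.
\begin{verbatim}
# One point-SOR sweep, equations (3.4.14)-(3.4.15), done in place.

def sor_sweep(phi, m, omega):
    for i in range(1, m - 1):
        for j in range(1, m - 1):
            star = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                           + phi[i][j - 1] + phi[i][j + 1])      # phi*, (3.4.14)
            phi[i][j] = omega * star + (1.0 - omega) * phi[i][j] # (3.4.15)
    return phi
\end{verbatim}
Setting omega = 1 recovers the Gauss-Seidel sweep, so the same routine can be used for both.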
For Laplace's equation on an N \times N grid, the optimal value can be shown to be
(3.4.16)
\omega_{opt} = \frac{2}{1 + \sin(\pi/N)} = \frac{2}{1 + \sqrt{1 - \rho_J^2}}, \qquad \rho_J = \cos(\pi/N)
where \rho_J is the spectral radius of the Jacobi iteration operator, and \omega_{opt} - 1
is the spectral radius of the iteration operator corresponding to SOR.
For a general problem we may not be able to obtain an expression for the
optimal \omega. What we usually do is perform several test runs with different values
of \omega so as to locate an optimal one. Clearly, there is no sense solving the problem
many times since we could just take \omega = 1 and solve it just once. Instead, we
hunt systematically for the best \omega. We could, for instance, iterate ten times with
different values of \omega and take the value that resulted in the largest drop in the
residue. Different values of \omega? How do we pick them? Why did I say "Take an
\omega \in (0, 2)"? We will find out now.
Before we embark on a campaign to hunt down the optimal \omega, let us try to
understand what we are doing when we use SOR. Remember that we had shown
that we were actually solving a system of linear equations, Ax = b. Let us now
push this a little further. We had also pointed out earlier that A is symmetric,
meaning, A = A^T. Consider the following scalar function of x
(3.4.17)
Q(x) = \frac{1}{2} x^T A x - x^T b
Q maps an n-dimensional vector into the real line. How would you find an extremum
for this function? We find the gradient with respect to x and set it equal to zero
and solve the resulting equation for x. We take the gradient. Lo and behold, we
get Ax = b. So, with symmetric A, solving the linear system of equations is like
finding the extremum of Q(x).
To get a geometrical idea of the SOR algorithm let us see what we are doing
in the process of finding this minimum. We will look at a specific simple example
so that we understand SOR and get an idea as to why \omega is constrained to the
interval (0, 2) in our case. We will look at the quadratic in one single variable.
This works for us, since, when we do the operation (averaging) at a grid point,
i, corresponding to one iteration, we are working on minimising Q along that one
dimension \phi_i. That is, b has all the other terms absorbed into it. This scenario is
graphed in Figure 3.8.
Figure 3.8. Graph of Q(x) and finding its minimum in one spatial
dimension. For simplicity the axis of the parabola is parallel to the
Q-axis.
Consider the scenario where we have x^n and would like to find x^{n+1} to get
at the next approximation to the minimum. In the one-dimensional case we could
solve the problem directly as
(3.4.18)
x_{min} = \frac{b}{a}
and reach the minimum of a quadratic in one iteration. We act as though we are not
aware of this and see what we get with SOR. The quadratic that we are minimising
is
(3.4.19)
q(x) = \frac{1}{2} a x^2 - b x
resulting, as we know, in the equation for the minimum as ax = b. Our iteration
equation is
(3.4.20)
x^* = \frac{b}{a}
and we use it in the SOR algorithm as follows
(3.4.21)
x^{n+1} = \omega x^* + (1 - \omega) x^n
where x^* is the solution that we get from our iteration equation (3.4.20). We subtract x^n from both sides of this equation to get
(3.4.22)
x^{n+1} - x^n = \omega (x^* - x^n)
Now in the one-dimensional case we would get the exact solution to the minimisation problem in one iteration. \omega = 1 would indeed get you the answer. However,
what happens if we take \omega = 0? Nothing. The solution does not progress. On the
other hand, what happens if we take \omega = 2? We end up oscillating between point
A and B (see Figure 3.8). \omega < 0 and \omega > 2 would cause the resulting Q(x) to
increase, causing the iterations to diverge. We realise from this simple argument
that we must seek our optimal \omega in the interval (0, 2).
If you are not sure if this is true for a general quadratic as opposed to the one
shown in Figure 3.8, we can work it out algebraically. Equation (3.4.18) tells us
that the minimum that we seek is at b/a. Let us say that our guess is off by an
amount d. This means x^n - b/a = d. That is
(3.4.23)
x^n = \frac{b}{a} + d
Q(x^n) then works out to be
(3.4.24)
Q(x^n) = \frac{1}{2} a \left( \frac{b}{a} + d \right)^2 - b \left( \frac{b}{a} + d \right)
The question is, does Q(x^{n+1}) take the same value? Where
(3.4.25)
x^{n+1} = \frac{b}{a} - d
It does! That is, if x^n = b/a + d, and x^{n+1} = b/a - d, Q(x^n) is indeed the same as
Q(x^{n+1}). You can verify this.
Figure 3.9. Plot of the norm of the residue versus iteration for
the solution to Laplace's equation using the Gauss-Seidel method
for various grid sizes (11x11 up to 101x101).
To summarise
(1) For \omega \in (0, 2), the value of Q decreases. That is Q(x^{n+1}) < Q(x^n).
(2) For \omega = 0 or \omega = 2 the value of Q remains the same: Q(x^{n+1}) = Q(x^n).
(3) Finally, if \omega < 0 or \omega > 2, then the value of Q increases, leading to a
situation where Q(x^{n+1}) > Q(x^n).
Now that we have an idea of why \omega needs to be in the range (0, 2), how do
we find the optimal value? A plot of the residue versus number of Gauss-Seidel
iterations is shown in Figure 3.9. It is clear that as the grid size gets finer, or the
number of grids increases for the given domain, that the rate of convergence to the
solution slows down. For now it should be noted that the initial drop is rapid for
all of the grids. Subsequently, there is a slow down in the convergence rate.
In order to accelerate convergence, we have tried using point SOR. A plot of
the residue versus iterations for various \omega is shown in Figure 3.10. The plot of the
terminal points is shown in Figure 3.11, where we see the residue after a hundred
iterations versus the \omega value. Clearly, the most rapid drop occurs near \omega = 1.8. How
does this seem from the perspective of the contraction mapping we looked at in
section 3.2? Are we looking for an \omega that will give us the best contraction?
Figure 3.10. Plot of the norm of the residue versus iterations for
the solution of Laplace's equation on a 41x41 grid using point SOR
for various \omega values (\omega = 1, 1.3, 1.6, 1.7, 1.8, 1.9).
There are two possible ways by which we can try to get at the optimal \omega. One is
to run the program till the residue drops by one order in magnitude. The optimal
\omega is the one that takes the lowest number of iterations to cause the prescribed
drop in residue. This works fine for Laplace's equation. In a general problem,
however, this may result in the code running for a very long time to converge by
the predetermined amount. Worse, it may never converge.
The other way is to run a set number of iterations and then decide based on
the one that has the greatest residual drop. This corresponds to the plot shown in
Figure 3.11. This figure illustrates how dramatic the improvement with SOR could
be.
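A sketch of the second approach in Python follows. It reuses the sor_sweep routine sketched earlier; make_initial_phi and residue are placeholder helpers you would supply yourself, the first setting up the boundary values and initial guess, the second measuring how well the current iterate satisfies equation (3.1.11). The range of omega values is just one reasonable choice.
\begin{verbatim}
# Hunt for the best omega: run a fixed number of SOR iterations for each
# omega and report the residue at the end.

def hunt_omega(make_initial_phi, residue, m, iterations=100):
    best = None
    for k in range(10, 20):              # omega = 1.0, 1.1, ..., 1.9
        omega = k / 10.0
        phi = make_initial_phi(m)
        for _ in range(iterations):
            sor_sweep(phi, m, omega)
        r = residue(phi, m)
        print(omega, r)
        if best is None or r < best[1]:
            best = (omega, r)
    return best
\end{verbatim}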
Figure 3.11. Plot of residue after 100 iterations versus \omega for the
solution of Laplace's equation on a 41x41 grid using point SOR.
Assignment 3.8
(1) Verify that the roots of the quadratic Q(x) = c are symmetric about the
minimum of Q(x). Hint: The minimum of the quadratic ax^2 + bx + c
occurs at x = -b/2a. Study the expression for the two roots.
(2) Repeat the computation involved in generating Figure 3.11 for 10, 50, 100,
200, 500, 1000 iterations. For each of them, once the \omega_{opt} is recovered
using \Delta\omega = 0.1, repeat the computations for \omega \in (\omega_{opt} - 0.1, \omega_{opt} + 0.1)
with a new value of \Delta\omega = 0.01.
Did you see a drift in the \omega_{opt} to the right with an increase in the number of
iterations?(1) This is of great concern if we do not have an expression for the optimal
value and we have to hunt for the \omega_{opt} through numerical experiment.
(1) fortunately, does not have uniform convergence. However, that makes finding the \omega_{opt} more
difficult.
(3.5.1)
\frac{\partial \phi}{\partial n} = 0
where n is along the normal to a no-penetration boundary. In our particular
problem we take the no-penetration boundary as the unit interval (0, 1) on the
x-axis. As a result, the no-penetration boundary condition turns out to be
(3.5.2)
\left. \frac{\partial \phi}{\partial y} \right|_{y=0} = 0
How is this implemented? Most of the code that you have written does not change.
After each sweep of the interior points, we need to add the boundary condition on
the bottom grid points.
(3.5.3)
\phi_{i,0} = \phi_{i,1}
is the simplest way to apply this boundary condition. It uses the first order approximation of the derivative and sets it equal to zero. Subsequently, we solve for
the \phi_{i,0} on the boundary. For the most part this should suffice. We could also use
a second order representation for the first derivative and derive an expression in a
similar fashion to get
(3.5.4)
\phi_{i,0} = \frac{4\phi_{i,1} - \phi_{i,2}}{3}
We will see, as we go along, that we apply these kinds of boundary conditions quite
often.
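A small sketch of how this boundary update might be coded follows. It assumes phi is stored as phi[i][j] with j = 0 on the x-axis, and the form of the second order update is the one derived above from the one-sided difference.
\begin{verbatim}
# Apply the Neumann condition on the bottom boundary after each interior sweep.

def apply_neumann_bottom(phi, m, second_order=False):
    for i in range(1, m - 1):
        if second_order:
            phi[i][0] = (4.0 * phi[i][1] - phi[i][2]) / 3.0   # equation (3.5.4)
        else:
            phi[i][0] = phi[i][1]                             # equation (3.5.3)
    return phi
\end{verbatim}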
Assignment 3.9
(1) Translate the equations (3.5.3), (3.5.4) to a form using one subscript and
a stride.
(2) Repeat the first assignment 3.1 and implement the Neumann condition
on the x-axis as indicated above.
(3) Does it make a difference if you first iterate in the usual fashion with
Dirichlet conditions and apply the Neumann condition in later iterations?
Start with the solution to the Dirichlet problem as the initial condition
for the Neumann problem.
(4) Plot contours and see what changes occur due to the change in boundary
conditions.
How does changing the boundary condition on one side affect the properties
of our solution? Well, the maximum principle does not change. How about the
uniqueness of the solution? It is outside the scope of our study here. We will just
state that the solution in this case is indeed unique.
3.6. First Order Wave Equation
Keep this picture in mind as you read the next few pages. There is a stream of
water flowing at a speed \lambda from your left to your right. You have some paper boats
that you have made and are placing these boats in the stream one after another
and they are carried away by the stream. Can we figure out where the boats are
at any given time after they are released?
Figure 3.12. A series of paper boats released in a stream of water
flowing along the positive x direction.
Figure 3.13. Hot water and cold water in the stream separated by a contact
surface, x measured along the stream.
Figure 3.14. x-t plots (panels a. to e.). Try to figure out what each curve means.
x is the length measured along a road in IIT Madras. t is time.
With these preliminaries behind us, let us look at an equation which is as
important to our study as Laplace's equation. The one-dimensional, first order,
linear wave equation is
(3.6.1)
\frac{\partial u}{\partial t} + \lambda \frac{\partial u}{\partial x} = 0
where \lambda is positive. u is just some property for now. We will try to understand the
behaviour of this equation and then tie some physical meaning to u.
Let us see what this equation does for us. If the unit vector in the t direction
were \hat{\jmath} and the unit vector in the x direction were \hat{\imath}, then the equation can be
rewritten in terms of the gradient operator as
(3.6.2)
(\hat{\jmath} + \lambda \hat{\imath}) \cdot \nabla u = 0
(3.6.4)
\begin{pmatrix} x \\ t \end{pmatrix} = \begin{pmatrix} x_0 \\ 0 \end{pmatrix} + s \begin{pmatrix} \lambda \\ 1 \end{pmatrix}
where (x_0, 0) is the point where the characteristic intersects the x-axis.
s is measured along the characteristic. It takes a value zero at x_0. We could write
out the two equations separately as
(3.6.5)
x = x_0 + \lambda s \;\Rightarrow\; dx = \lambda \, ds
(3.6.6)
t = s \;\Rightarrow\; dt = ds
boundary condition at x = 0, say, u(0, t) = g(t). These are the boundary conditions
as required by the physics of the problem.
Let us consider a few cases and see what we get. Try out the following problems.
Assignment 3.11
Given the initial conditions and a boundary condition find the solution to the
equation on the interval [0, 1].
(3.6.8)
\frac{\partial u}{\partial t} + \frac{\partial u}{\partial x} = 0
(1)
(2)
(3)
(4)
(5)
All of these clearly indicate (except the first one) that we are just shifting the
solution in time along x. In fact, given any function f(\xi) as the initial condition
at t = 0, f(x - \lambda t) should be a solution. This is easily shown as follows. With
\xi = x - \lambda t,
(3.6.9)
\frac{\partial f}{\partial t} = \frac{df}{d\xi} \frac{\partial \xi}{\partial t} = \frac{df}{d\xi} (-\lambda)
and,
(3.6.10)
\frac{\partial f}{\partial x} = \frac{df}{d\xi} \frac{\partial \xi}{\partial x} = \frac{df}{d\xi} (1)
so that
(3.6.11)
\frac{\partial f}{\partial t} + \lambda \frac{\partial f}{\partial x} = -\lambda \frac{df}{d\xi} + \lambda \frac{df}{d\xi} = 0
(3.6.13)
u(x, t) = \sum_n A_n e^{i n 2\pi (x - \lambda t)/L} = \sum_n u_n
This can be substituted back into equation (3.6.1) to see if it is satisfied. As the
differential equation is linear, we can test each term u_n individually. You can
check out uniform convergence. I am not going to bother about it here. The two
terms of equation (3.6.1) give
(3.6.14)
\frac{\partial u_n}{\partial t} = -i n \frac{2\pi}{L} \lambda A_n e^{i n 2\pi (x - \lambda t)/L} = -i n \frac{2\pi}{L} \lambda u_n
(3.6.15)
\frac{\partial u_n}{\partial x} = i n \frac{2\pi}{L} A_n e^{i n 2\pi (x - \lambda t)/L} = i n \frac{2\pi}{L} u_n
We can clearly see that u_n satisfies the wave equation. Remember, the use of
Fourier series presupposes a periodic solution.
Okay. Reviewing what we have seen on the wave equation, it is clear that the
discussion surrounding equations (3.6.1), (3.6.2), and (3.6.3) is pivotal to everything
accomplished so far. We also see that the argument does not require that \lambda is a
constant. What do we mean by this? We have nowhere made use of the fact that
\lambda is a constant. Or, so it seems. One way to find out is to see what happens if
\lambda is not a constant. Since we know now that \lambda is a propagation speed, we are aware
that a situation of varying \lambda can actually occur. The speed of sound in air, for
example, is not a constant. So, we could have
(3.6.16)
\frac{\partial u}{\partial t} + \lambda(x, t) \frac{\partial u}{\partial x} = 0
A particularly important case is the one where \lambda depends on u itself, that is
(3.6.17)
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0
This is the quasi-linear one-dimensional wave equation. You may also see it referred to as the inviscid Burgers' equation. We can go back to our discussion on
characteristics to see what it gets us. The characteristic equation (3.6.7) in this
case becomes
(3.6.18)
\frac{dx}{dt} = u(x, t)
characteristic originates from a point taken from a set of equally spaced points on
the x-axis. You can make out from the figure that, the characteristics emanating
from the interval [0, 1] of the x-axis are spreading out. This is called an expansion
fan. Inspection of the expansion fan shown in figure 3.18 should tell you that the
characteristics pass through (0, 0), (0.2, 0), (0.4, 0), (0.6, 0), (0.8, 0), and (1.0, 0).
For the characteristic i, we know the u_i and x_i at time t = 0. Now, at t = 0.2,
what is the corresponding x_i of this characteristic? Using the slope-intercept form
(3.6.19)
x_i(t) = x_i(0) + u_i t
and remembering that u is constant along the characteristic,
(3.6.20)
u(x_i(t), t) = u_i = u(x_i(0), 0)
Using this expression, we plot the solution for various times. Figure 3.19 shows the
initial condition at t = 0. Check that the characteristics in Figure 3.18 match this
function. Figure 3.20 shows the solution after one time unit. Let us compare the
Figure 3.19. The initial condition at t = 0.
two figures to see how the solution is evolving in time. Consider the line segment
AB as shown in Figure 3.19. It is moving at unit speed from left to right. In
Figure 3.20, we see that it has moved to the right by unit length in unit time and
only point A can be seen on the page. At the origin, the speed of propagation
is zero and hence that point does not move at all. In between, we see that we
have proportionate speed. Hence, the linearly increasing part of our function, line
segment OA, experiences a stretch. After two time units, the ramp is much less
steep. Since the point at the origin does not move,
and the end of the ramp is moving at unit speed away from the origin, the ramp
angle is going to keep decreasing.
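Here is a small sketch that traces the characteristics x_i(t) = x_i(0) + u_i t and samples the solution at a few times. The initial profile used (u = x on [0, 1] and u = 1 beyond) is an assumption meant to mimic the ramp of Figure 3.19; each point simply carries its initial value of u along its characteristic.
\begin{verbatim}
# Trace characteristics of the quasi-linear wave equation for a ramp profile.

def initial_u(x):
    return x if x < 1.0 else 1.0

def solution_on_characteristics(t, n=11):
    """Return (x, u) pairs obtained by moving n points from the interval [0, 2]."""
    points = []
    for i in range(n):
        x0 = 2.0 * i / (n - 1)
        u0 = initial_u(x0)
        points.append((x0 + u0 * t, u0))   # the characteristic through x0
    return points

for t in (0.0, 1.0, 2.0):
    print(t, solution_on_characteristics(t))
\end{verbatim}
Printing the pairs for increasing t shows the foot of the ramp staying put while its end runs away, exactly the stretching described above.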
Now consider the last problem. The characteristics corresponding to the initial
condition are shown in Figure 3.22. First the easy part, the characteristics coming
from the t-axis with arrowheads correspond to the boundary condition given at
x = 0. These are not extended all the way to the end of the figure to keep the figure
from becoming unreadable. The vertical characteristics on the right correspond to
the u = 0 part of the initial condition. Which leaves the characteristics emanating
from the unit interval at the origin. They intersect each other at x = 1, t = 1.
We have extended the lines beyond this point to show them intersecting other
characteristics. Is this a problem? Yes! This is a problem. ui is supposed to be
constant on the characteristic xi (t). If they intersect, what value does u take at
the point (1, 1)?
Let us go ahead and draw the other figures and see what we get. We have the
initial condition drawn in Figure 3.23. It is clear that we expect to have motion from
left to right for the point at the origin. This corresponds to our first characteristic
starting at the origin in Figure 3.22.
After a time of about t = 0.25 units we will get a graph of the new state of u
as shown in Figure 3.24. Here we notice, and it should not come as a surprise to
you, that the ramp is getting steeper. In fact, if you look at Figure 3.25, we see
that our solution continues to be continuous but the ramp is getting steeper and
at t = 1 we end up with the function shown in Figure 3.26. All the characteristics
intersect at (1, 1) and this results in this jump. How do we interpret this jump?
We will draw one more figure and see what we get for a time t > 1. This is shown
in Figure 3.27.
We will look at a discrete analogue first. A railway line is a good example of a
one-dimensional space. We imagine there is a railway station at x = 1 and that we
have a train stopped at that station. We also have at least five other trains speeding
along at different speeds. The positions of the trains are indicated in Figure 3.22.
Fortunately, at the station they have enough sidings and parallel through tracks
that the trains are able to overtake each other. The fastest express train gets ahead
of the others and yes, the function is multi-valued at the station at that time since
you have more than one train going through. Now, if the trains were not allowed
Figure 3.22. The characteristics corresponding to the initial condition in problem 3 of the assignment. u decreases from 1 to zero
and is then constant at zero.
Figure 3.28. The characteristics corresponding to the initial condition in problem 3 of the assignment. u decreases from 1 to zero
and is then constant at zero.
tube, if it results in an increase in the speed of sound behind it, the next wave that
comes along is going to be travelling faster than the first and so on. The problem
with this train of waves is that they cannot overtake each other. They are confined to
the pipe. In this context, Figure 3.27 makes no sense and does not represent the
physics of the problem. Instead, as shown in Figure 3.26, a discontinuity called a
shock is formed. This shock continues to propagate through the pipe beyond (1, 1).
So we have to redraw Figure 3.22 for the gas dynamic case as shown in Figure 3.28
Is there a way to find out how fast the shock propagates [Lax73]? We
will investigate this in greater detail in section 3.13. We can make the following
observations.
(1) The shock seems to consume all characteristics that intersect it.
(2) We started with a continuous function and the quasi-linear wave equation generated the discontinuity. Just because our initial and boundary
conditions are smooth does not mean our solution will be smooth.
We now return to our linear first order one-dimensional wave equation. How
do we solve this equation numerically? Considering our success with the Laplace
equation, we will just go ahead and perform the discretisation. We will do this in
a section on stability. Read on and you will understand why.
3.7. Numerical Solution to Wave Equation: Stability Analysis
This problem has a given initial value. With \lambda positive, we have seen that
the property u propagates from left to right.
Since we were successful with the Laplace equation, we repeat the same process
for the wave equation. Let's consider some general grid point indexed as p, q. The
index p is in space, that is, along x, and the index q is in time, that is, along t. The
equation can be discretised as
(3.7.1)
\underbrace{\frac{u_{p,q+1} - u_{p,q-1}}{2\Delta t}}_{\text{central difference in time}}
+ \lambda \underbrace{\frac{u_{p+1,q} - u_{p-1,q}}{2\Delta x}}_{\text{central difference in space}} = 0
This gives us an equation for u at the next time step given that we know its values
at prior time steps. This clearly will create a problem at t = \Delta t, since we are
only given values at t = 0. This problem can be fixed. However, for now, we
will get around this problem by using a forward difference in time. We will retain
the central difference in space so as not to lose the advantage of a second order
representation for the spatial derivative. This gives us
(3.7.2)
\underbrace{\frac{u_{p,q+1} - u_{p,q}}{\Delta t}}_{\text{forward difference in time}}
+ \lambda \underbrace{\frac{u_{p+1,q} - u_{p-1,q}}{2\Delta x}}_{\text{central difference in space}} = 0
Figure 3.29. The grid points involved in the solution to the wave
equation using Forward Time-Centred Space (FTCS) scheme. The
wave equation is approximated at the point p, q by the finite difference equation (3.7.2).
We can solve equation (3.7.2) for u_{p,q+1} as
(3.7.3)
u_{p,q+1} = u_{p,q} - \lambda \Delta t \, \frac{u_{p+1,q} - u_{p-1,q}}{2\Delta x}
It looks like we have an automaton that we can use to just march in time picking
up the solution as we go along. All the quantities on the right hand side are known
at the current time level q.
We will now look at this equation and ask the question: does the automaton
generate a sequence of us that represent the solution or does the sequence diverge.
Colloquially, does the solution blow up? This last question relates to stability
analysis. It can be restated as: Is the scheme stable?
If you are wondering: what's the point of the discussion, let's get coding. I
would suggest that you can indeed get coding and see what happens. Meanwhile, as
the wave equation that we are looking at (equation 3.6.11) is a linear homogeneous
equation, a perturbation to it would also be a linear homogeneous equation with
homogeneous boundary conditions. The discrete equation would be the same as
equation (3.7.3). We have seen earlier that a periodic solution to the equation can
be written in the form given by equation (3.6.13). This can be rewritten as
(3.7.4)
u(x, t) = \sum_n A_n e^{i n 2\pi (x - \lambda t)/L} = \sum_n u_n
For convenience we can take L = 2\pi. You can verify that it makes no difference to
the analysis that follows.
Both equation (3.6.1) and equation (3.7.3) are linear. Just to remind ourselves
what we mean when we say something is linear we look at the example of a linear
function L(x). L(x) is linear means that L(a_1 x_1 + a_2 x_2) = a_1 L(x_1) + a_2 L(x_2).
Of course, if instead of a sum consisting of two terms, \sum_{n=1,2} a_n x_n, we have an
infinite number of terms as in the Fourier series expansion, we do have concerns
infinite number of terms as in the Fourier series expansion, we do have concerns
about uniform convergence of the series. This is something that you can look up
in your calculus text. We will sidestep that issue here and get back to our stability
analysis.
As we are dealing with a linear equation and scheme, we need to look only at
one generic term, un , instead of the whole series. If the scheme is stable for any
arbitrary n then it is stable for all the n. The fact that un does not diverge for
all n does not mean that the series will converge. On the other hand, if un does
diverge for some n, the series is likely to diverge.
To answer the question as to what happens as we advance in time using our
numerical scheme / automaton, we rewrite un as
(3.7.5)
un = an (t)einx
A blowup, as we move forward in time with our automaton, would mean the an (t)
is growing. We are going to be having lots of subscripts and superscripts going
around. Since there is no chance of confusion, we know we are considering one
wave number n of the Fourier series, we will drop the subscript n from the equation
(3.7.5) to get
(3.7.6)
u = a(t)einx
up,q = aq einpx ,
xp = px
121
which tells us that the gain from the time step q to q + 1 can simply be written as
up,q+1
aq+1
(3.7.8)
g=
=
up,q
aq
For stability we will require that |g| < 1. In equation (3.7.3), not only do we have
up,q , we also have up1,q and up+1,q . In order to obtain a simple expression g, we
need to rid ourselves of the p 1 and p + 1 subscripts. To this end, we do the
following.
(3.7.9)
and
(3.7.10)
up,q+1
= 1 (cos + i sin cos() i sin())
(3.7.12)
g=
up,q
2
(3.7.13)
g = 1 i sin
For stability, we do not want the u to grow. So, the factor g by which u is multiplied
each time should have a magnitude less than one. We require
(3.7.14)
|g|2 = g
g = 1 + 2 sin2 < 1
p, q + 1
p + 1, q
p, q
Figure 3.30. The grid points involved in the solution to the wave
equation using Forward TimeForward Space (FTFS) scheme. The
wave equation is approximated at the point p, q by the finite difference equation (3.7.15).
equation at the point p, q now becomes
122
3. SIMPLE PROBLEMS
up,q+1 up
up+1,q up,q
+
=0
t
x
(3.7.15)
or
up+1,q up,q
x
Substituting from equation (3.7.9) into equation (3.7.16) we get
(3.7.17)
up,q+1 = up,q ei 1 up,q
up,q+1 = up,q t
(3.7.16)
|g|2 = g
g = (1 + cos )2 + 2 sin2
= 1 + 2 + 2 cos2 + 2 2 cos
This gives
(3.7.20)
< 0 and
> 1
We get a condition for stability, but what does it mean? Well, let us look at the
condition that < 0, the condition says that
t
< 0 = t < 0 or x < 0
x
We do not want to go back in time so the condition on the time step is really
not of interest right now. The other condition, x < 0, can be interpreted as giving
us a hint to fix this problem. It seems to tell us to use a backward difference in
space rather than a forward difference. How do we conclude this?
Well, we are assuming is positive. That is the waves are moving left to right.
How about if were negative. Then our current scheme would work since the wave
would be moving from right to left. Our forward differencing also has points that
are on the right of our current spatial location. So, to get a scheme that works for
positive, we use a forward difference in time and a backward difference in space.
We use the grid points shown in Figure 3.31 to get the following equation.
up,q up1,q
(3.7.22)
up,q+1 = up,q t
x
We repeat our stability analysis to get
(3.7.23)
up,q+1 = up,q 1 ei up,q
(3.7.21)
123
p, q + 1
p 1, q
p, q
Figure 3.31. The grid points involved in the solution to the wave
equation using Forward TimeBackward Space (FTBS) scheme.
The wave equation is approximated at the point p, q and results in
the automaton given in equation (3.7.22).
giving
g = 1 + cos i sin
(3.7.25)
|g|2 = g
g = (1 + cos )2 + 2 sin2 < 1
This gives
(3.7.28)
(3.7.30)
Which tells us
(3.7.31)
( 1) < 0
> 0 and
<1
124
3. SIMPLE PROBLEMS
p, q + 1
p 1, q + 1
p + 1, q + 1
p, q
Figure 3.32. The grid points involved in the solution to the wave
equation using Backward Time-Centred Space (BTCS) scheme.
Note that the wave equation is approximated at the point p, q + 1
by the finite difference equation (3.7.32).
discrete equation
(3.7.32)
up,q+1 +
g=
1
1 + i sin
g
g=
1
1+
sin2
Which is always true for n > 0. Aha! BTCS is STABLE. No conditions. Remember,
as the saying goes: You dont get nothin for nothin. The BTCS scheme on closer
inspection requires that we solve a system of equations. We will look at this in
greater detail later. The lesson to take away from here is
we cannot just discretize the equations and assume that
the resulting scheme will work.
This analysis is referred to as the von Neuman stability analysis. An important
requirement is that the equation that we analyse is linear. If the equation is not
linear, we will linearise it. For this reason, it is also called linearised stability
analysis.
3.7.1. Courant number or CFL number. From this analysis we see the
significance of the parameter . It is referred to as the Courant number or the CFL
number. What do these stability conditions mean? What does this number mean?
A closer inspection of reveals that it is non-dimensional. We have seen from
our initial study of the wave equation that is a propagation speed. x/t is
called the grid speed. In a sense, the grid speed is the speed at which our program
is propagating u. is the ratio of the physical speed to the grid speed.
What does it mean that FTBS is stable if 0 < 1? This says that the
physical speed needs to be less than the grid speed. Since we are choosing the
grids, it tells us to choose the grids so that the grid speed is greater than the
physical speed.
125
(3.8.1)
(3.8.2)
u t2 2 u t3 3 u t4 4 u
+
+
+
+
t 2! t2
3! t3
4! t4
u x2 2 u x3 3 u x4 4 u
t
up,q + x
+
+
+
= up,q
+
2x
x
2! x2
3! x3
4! x4
u x2 2 u x3 3 u x4 4 u
+
+
)
(up,q x
x
2! x2
3! x3
4! x4
up,q + t
+
+
=
t
x
3! x3
2! t2
3! t3
4! t4
We have isolated the expression for the original wave equation on the left hand
side. On the right hand side, we have terms which are higher order derivatives in
x and in t. Since at any time level, we have u for various x, we convert the time
derivatives to spatial derivatives. We do this by taking appropriate derivatives of
equation (3.8.3) and eliminating terms that have time derivatives in them. Again,
remember, we will keep terms only up to the fourth derivative. A rather tedious
derivation follows this paragraph. You can skip to the final result shown in equation
(3.8.24) if you wish. I would suggest that you try to derive the equation for yourself.
We take the time derivative of equation (3.8.3) to get the second derivative in
time.
2u
t 3 u t2 4 u
x2 4 u
2u
+
+
=
(3.8.4)
t2
tx
3! tx3
2! t3
3! t4
The third time derivative is
3u
t 4 u
3u
+ 2
+
=
(3.8.5)
3
t
t x
2! t4
(3.8.3)
126
3. SIMPLE PROBLEMS
+
t4
t3 x
To get rid of the mixed derivative on the right hand side of equation (3.8.6) we
differentiate equation (3.8.5) with respect to x and multiply by , to get
(3.8.6)
4
4u
2 u
=
+
xt3
t2 x2
Subtracting this from equation (3.8.6) we get
(3.8.7)
4
4u
2 u
=
+
t4
t2 x2
In order to eliminate the mixed derivative on the left hand side of equation (3.8.4)
differentiate equation (3.8.3) with respect to x and multiply by to get
(3.8.8)
2u
x2 4 u
t 3 u
t2 4 u
2u
+ 2 2 = 2
+
xt
x
3! x4
2! xt2
3! xt3
Subtracting equation (3.8.9) from equation (3.8.4) we get
(3.8.9)
2
2 4
t 3 u
t2 4 u
2u
2 u
2 x u
=
t2
x2
3! x4
2! xt2
3! xt3
(3.8.10)
2
4
3
2 4
x u
t u t u
+
3! tx3
2! t3
3! t4
If we were to differentiate equation (3.8.9) twice with respect to x we can eliminate
the mixed derivative term in equation (3.8.10) which has only one time derivative
in it. On performing the differentiation, we get
4
4u
2 u
+
=
x3 t
x4
Substituting we get the latest expression for the second time derivative as
(3.8.11)
2
2 4
2
4u
t 3 u
2u
2 u
2 x u
2 t
=
t2
x2
3! x4
2! xt2
3! x2 t2
(3.8.12)
3
2 4
2 4
x u t u t u
+2
+
3! x4
2! t3
3! t4
We now differentiate this equation with respect to x to get
3
4u
3u
2 u
=
+
t
+
xt2
x3
x2 t2
Eliminating the mixed third derivative and substituting for the fourth time derivative in equations (3.8.5) we get
(3.8.13)
4
3
t 4 u
t 2 4 u
3u
3 u
2 t u
=
+
t3
x3
2! x2 t2
2! xt3
2! t2 x2
This can be simplified to
(3.8.14)
3
3u
4u
t 4 u
3 u
2
=
t
+
+
t3
x3
x2 t2
2! xt3
We now differentiate this equation with respect to x
(3.8.15)
(3.8.16)
4u
4u
= 3 4 +
3
xt
x
127
+
t3
x3
x2 t2
2! x4
We have one last mixed derivative in this equation. Actually, there are an infinity
of these mixed derivatives. We have restricted ourselves to fourth derivatives. To
get rid of this last mixed derivative in the third derivative term, we rewrite the
equation for the second time derivative (3.8.12) as
(3.8.17)
3
2
2 4
2
2u
4u
2 u
2 x u
3 t u
2 t
=
t2
x2
3 x4
2! x3
3! x2 t2
(3.8.18)
t 3 u
+
2! t3
If we differentiate this equation twice with respect to x, we get
4
4u
2 u
=
+
x2 t2
x4
The third time derivative finally becomes
(3.8.19)
(3.8.20)
4
3
3u
3 u
4 3t u
=
+
t3
x3
2 x4
4
2
2
2
u
3u
2u
2 u
3
2 x
4 11t
=
t
+
+
2
2
3
t
x
x
3
12
x4
and for completeness
(3.8.21)
(3.8.22)
4
4u
4 u
=
+
t4
x4
u
t 2 u
u
+
= 2 2
t
x
2! x
4
2
2
x2
u
t2 3 u t
2 x
4 t
+ 3
+
3!
3
x3
2!
3
2
x4
We will consolidate coefficients so that they are in terms of x and .
(3.8.23)
u
u
x 2 u
+
=
t
x
2! x2
3u
4u
x2
x3
+
1 + 2 2
2 + 3 2
3
3!
x
12
x4
This equation is very often referred to as the modified equation. A scheme
is said to be consistent if in the limit of t and x going to zero, the modified
equation converges to the original equation.
For the case of the FTBS scheme, the modified equation becomes
(3.8.24)
u
x
x2
u
+
=
( 1)uxx +
( 1)(2 1)uxxx
t
x
2
3!
x3
( 1)(6 2 6 + 1)uxxxx + . . .
4!
clearly for the first order, one-dimensional, wave equation, both FTCS and FTBS
are consistent. It is interesting to note that the modified equation of FTBS is
(3.8.25)
128
3. SIMPLE PROBLEMS
identical to the wave equation for = 1. What does this mean? We are solving the
original equation (in this case the wave equation) and not some modified equation.
Does that mean we get the same answer? No! We are still representing our solution
on a discrete set of points. If you tried to represent a step function on eleven grid
points you would actually get a ramp. Try it. This ramp then will be propagated
by the modified equation. Even though the modified equation becomes the wave
equation when = 1, we cannot get away from the fact that we have only eleven grid
points. Our representation of the differential equation is accurate in its behaviour.
Our representation of the solution at any time is approximate. So, if that is the
problem, what happens in the limit to the solution of the discrete problem as t
and x going to zero? If the discrete solution goes to the solution of our original
equation we are said to have a scheme which is convergent.2
Consistency, Stability, Convergence: There is a theorem by P. D. Lax that
states that if you have the first two of these you will have the third.
Assignment 3.13
(1) Verify the modified equation (3.8.25) for the FTBS scheme applied to the
linear wave equation given in equation (3.6.1).
(2) Derive the modified equation for the FTFS scheme applied to the linear
wave equation.
(3) Verify that the modified equation for the BTCS scheme applied to the
linear wave equation is
(3.8.26)
u
u
x 2 u
+
=
2
t
x
2
x
3 u x3 2
4u
x2 2
+
+
2 + 1
3
+
2
3!
x3
12
x4
u
2u
u
+
= 2 2
t
x
x
(3.9.2)
We seek a solution in the form
(3.9.3)
2If you have a serious conversation with a mathematician they may want the derivatives also to
converge.
129
A nice way of saying, I am guessing the solution looks like this. We find the
derivatives
u
(3.9.4)
= an {in + b}ein(xt) ebt = {in + b}u
t
u
(3.9.5)
= an {in}ein(xt) ebt = inu
x
2u
(3.9.6)
2 2 = 2 n2 u
x
3u
(3.9.7)
3 3 = 3 in3 u
x
4u
(3.9.8)
4 4 = 4 n4 u
x
Substituting into the equation (3.9.2) we get
(3.9.9)
b = 2 n2 = u(x, t) = an ein(xt) e2 n
We observe the following. If 2 > 0 then the solution decays. Higher wave numbers
decay faster than lower wave numbers. If we look at our modified equation for the
FTCS technique, equation (3.8.24), we see that the coefficient of the second spatial
derivative is 2 = x
2! . As 2 is negative, the scheme is unstable.
Now let us see the effect of having a third derivative term by looking at the
equation
u
3u
u
+
= 3 3
t
x
x
(3.9.10)
we immediately see that
(3.9.11)
= an ein[x(+3 n
)]t
The third derivative contributes to the speed of propagation of the wave. The speed
depends on n2 , for positive 3 , higher wave numbers travel faster than lower wave
numbers. This effect is known as dispersion. Now, finally, let us consider the effect
of the fourth derivative in the equation.
(3.9.12)
u
u
4u
+
= 4 4
t
x
x
and
(3.9.13)
b = 4 n4 = u(x, t) = an ein(xt) e4 n
In this case, for stability we require 4 < 0. This term is clearly a very strong
damping mechanism. Also, as with the second derivative term, high wave numbers
are damped much more than lower wave numbers.
The effect of the extra terms in the modified equation and the coefficients
corresponding to FTCS and FTBS are summarised in table 3.9. We see that both
FTCS and FTBS have second derivative terms. However, the coefficient 2 for
FTCS is negative and we would conclude the scheme is unstable as the solution to
the modified equation is unstable. The FTFS scheme is also unstable for positive
values of . As we have seen earlier, it is stable only for 1 < 0. The FTBS
scheme, on the other hand, is stable for 0 < 1 as 2 is positive. It will also
capture the exact solution at = 1. For < 1 it is quite dissipative. It will
dissipate higher frequencies faster than lower frequencies.
130
3. SIMPLE PROBLEMS
Term
2 n2
i3 n3
4 n4
FTCS
2 (1 + 2 2 )
23 (2 + 3 2 )
FTFS
1 (1 + )
2 (1 + )(2 + 1)
3 (1 + )(6 2 + 6 + 1)
FTBS
1 (1 )
2 (1 )(2 1)
3 (1 )(6 2 6 + 1)
Table 3.1. These are the coefficients that occur in the modified
equation of FTBS, FTFS, and FTCS. Summary of the effect of
higher order terms on the solution to the one-dimensional first
order linear wave equation: 2 > 0 or 4 < 0 is dissipative, high
wave numbers decay faster than low ones, 3 6= 0 is dispersive,
3 > 0, high wave numbers travel faster than low ones. Here,
s = xs /(s + 1)!, s = 1, 2, 3
Both of the schemes have third derivative terms. Since dissipation is rather
rapid it may be difficult to observe the dispersion in the FTBS scheme. In the
FTCS scheme we have dispersion, but unfortunately the scheme is not stable. However, small CFL values will slow down the growth in FTCS. High frequencies are
also less unstable (meaning they do not grow as fast as low wave numbers). So
we can observe the dispersion. Also, in the FTCS scheme, the coefficient of the
third derivative term is negative for small . Therefore, low frequencies will travel
faster than high frequencies. Fine. We will demonstrate dispersion in the following
fashion.
(1) We will employ the FTCS scheme to demonstrate dispersion.
(2) We will use a unit length domain divided into a 100 intervals.
(3) We will use the differential equation
u u
+
=0
t
x
The equation is propagating u at unit speed.
(4) We start with an initial condition u(x, 0) = sin(2x) + 0.05 sin(50x).
This initial condition corresponds to a the two wave numbers n = 1 and
n = 25.
(5) We will use a CFL of 0.05.
The results of the computation for time steps t = 1, 201, 401, 601, 801, 1001 are
shown in Figure 3.33. With = 0.05, we expect that our solution should progress
through the unit computational domain in 2000 time steps. This is because our
equation is supposed to be propagating u at unit speed. We can clearly see that
the component of the solution that has wavenumber 25 seems to be stationary. The
wavenumber n = 1, on the other hand, is being propagated by the wave equation in
a routine fashion. In fact, at the times shown, the trailing edge of the low frequency
131
0.1
0.2
0.3
0.4
0.5
Figure 3.33. The wave equation, with wave speed=1, is discretised using FTCS. The solutions generated using 101 grid points,
with = 0.05 are shown at time-steps t = 1, 201, 401, 601, 801, and
1001. The amplitude of the high frequency component is growing
as FTCS is an unstable scheme. You can make out the low frequency component propagating from left to right faster than the
high frequency wave
part is at locations 0.0, 0.1, 0.2, 0.3, 0.4, and 0.5 as expected. You will also observe
that the amplitude of the high frequency component is increasing in time. If we
were to run this a little longer, the program would blowup.
Let us collect our thoughts at this point and see where this leads us. We
have seen that numerical schemes that are applied to the wave equation can be
unconditionally unstable, conditionally stable or unconditionally stable. We have
also seen that this instability, attributed earlier to the the unbounded growth in
the von Neuman stability analysis can now be tied to the sign of the second order
or fourth order diffusion term. Further, we can see that all frequencies may not
diverge at the same rate. Schemes may introduce dissipation that did not exist in
the original problem. Added to this is the fact that some schemes may introduce
dispersion that does not exist in the original problem. A close inspection of these
techniques indicates that the unstable schemes have the wrong kind of dissipation.
132
3. SIMPLE PROBLEMS
A thought: We can take the FTCS scheme and add some second order
dissipation to it. After all, we are actually solving the modified equation
anyway. Why not modify it, carefully, to suit our stability requirement.
To justify this thought, we ask the following question:
Question: What is the difference between the backward difference approximation of the first derivative and the central difference representations of
the same?
First, we squint at this question to get an idea of the answer. We are asking
for the difference of two first derivatives. Well, it is likely to look like a second
derivative. You can make this out from Figure 2.30.
Finally, we just get in there and subtract one from the other
up,q up1,q
up+1,q up1,q
x
2x
up+1,q + 2up,q up1,q
x 2 u
=
2x
2 x2
Just out of curiosity, what is the difference between forward difference and the
centred difference
up+1,q up,q
up+1,q up1,q
(3.9.15) diff =
x
2x
up+1,q 2up,q + up1,q
x 2 u
=
2x
2 x2
Only the sign of the quantity representing the second derivative is flipped. The
magnitude is the same.
Let us try to pull all of this together now. For positive, waves propagating
from left to right, FTBS will work. For negative, wave propagating right to left,
FTFS will work. So, the scheme we use should depend on the sign of .
We will work some magic now by defining two switches, watch.
(3.9.14)
diff =
|| +
2
||
s () =
(3.9.17)
2
So, a scheme which takes into account the direction of the wind or stream can
be constructed as
(3.9.18)
up,q+1 = up,q s+ (){up,q up1,q } + s (){up+1,q up,q }
(3.9.16)
s+ () =
Lo and behold, if is positive we get FTBS otherwise we get FTFS. The extra
term that has been added is called artificial dissipation. Now, one could frown
on adding this kind of artificial dissipation to stabilise the solver. After all, we are
133
deliberately changing the governing equation. On the other hand, we are always
solving something like the modified equation. In fact:
The choice of the discretisation scheme is essentially
choosing a modified equation.
Why not engineer the modified equation? At least, then, we will get it in the form
that we want.
What exactly do we have to add to eliminate the second derivative term all
together from the modified equation of the FTCS scheme? By looking at the
modified equation given in (3.8.24), you may think that adding the term
x 2 u
2 x2
would do the trick. If we could add the term exactly, it would. However, we are
forced to approximate the second derivative term in the expression (3.9.20) using
a central difference approximation. The resulting modified equation still ends up
having a second derivative term. It turns out what we need to add is
(3.9.20)
2
{up+1,q 2up,q + up1,q }
2
This eliminates the second derivative term from the modified equation.
(3.9.21)
Assignment 3.14
Verify that adding the expression (3.9.21) does indeed eliminate the second spatial
derivative from the modified equation of FTCS applied to the first order linear
one-dimensional wave equation.
If we were interested only in a steady state solution, one could add a mixed
derivative to get rid of the dissipation term as we got to a steady state solution.
For exmaple,
3u
u
u
+
= 3
t
x
t 2 x
We will get the mixed third derivative term by taking the time derivative of the
second spatial derivative in equation (3.9.6)
(3.9.22)
2u
u
= 3 n2
= 3 n2 {in + b}u
2
t x
t
Substituting back into the equation we get
3
(3.9.23)
(3.9.24)
Which gives b as
(3.9.25)
b = 3 n2 {in + b}
{1 + 3 n2 }b = i3 n3
b=
i3 n3
{1 + 3 n2 }
134
3. SIMPLE PROBLEMS
(3.9.26)
u(x, t) = an ein(xt) e
3
i {1+
n2 } t
3
= an exp in x
3 n2
{1 + 3 n2 }
t
Again, like the earlier spatial third derivative, this mixed third derivative is also
dispersive. Let us consider some of the words we have used.
dissipation: This term used here in the CFD context is not to be confused
with the term that appears in the energy equation of your fluid mechanics
course. It is used here to indicate that the amplitude of a certain wave is
decreasing.
decay: Used here synonymously with dissipation. I personally prefer this
term to dissipation.
dampen: Dampen again means reduction, attenuation...
However, CFD literature typically refers to the process as dissipation.
t
p, q + 1
t
t =
x
p 1, q
p, q
Figure 3.34. Yet another way to look at stability, FTBS has the
upstream influence. The arrow indicates flow of information (u).
This figure represents the case of = 1. The arrow, in fact, is a
characteristic. Since = 1, the zone of influence and the zone of
dependence happen to be grid points
Lets look at a different interpretation of the stability conditions that we have
derived for the wave equation. Figure 3.34 shows the three grid points that participate in the FTBS algorithm. The figure shows a setting where = 1. The arrow
indicates the direction of propagation of information or influence. We clearly see
that the upstream influences the downstream, which is good. The point (xp1 , tq )
is called the zone of dependence of the point (xp , tq+1 ). The point (xp , tq+1 ) is
called the zone of influence of the point (xp1 , tq ).
Now take a look at Figure 3.35. The physical line of influence is reproduced
from Figure 3.34 for reference and is shown by the solid arrow. The grid point
135
t
x
p, q + 1
x
t <
p, q
p 1, q
x
up,q+1 = up,q
up,q up1,q
= (1 )up,q + up1,q
x
We see that up,q+1 is a convex combination of up,q and up1,q . This should remind
you of the hat functions, Laplaces equation, and SOR. First, let us look at why this
is not a solution to Laplaces equation or SOR. To make the discussion interesting
look at what happens when = 0.5. We get the average of up,q and up1,q . The
reason why this turns out to be the wave equation and not the Laplace equation
is the fact that this average is taken to be the new value of u at the point xp . A
clear case of propagation to the right. However, as we have seen in the modified
equation for FTBS, the second derivative term is also present.
We are using hat functions to interpolate here. So, values in [0, 1] will give us
linearly interpolated values on the interval [xp1 , xp ]. Since we have right propagating waves, the points on the interval x shown determine / influence the points
on the t = /x interval shown in Figure 3.34. The t shown in this figure is
the maximum permissible value from the stability condition. We will call it tm
for this discussion. The theoretically correct value of u at tq + tm can be obtained
by setting = 1 in the equation (3.9.27). This is u(xp , tq + tm ) = up1,q . For
a t < tm , see Figure 3.35, that is < 1, a point on the interval [xp1 , xp ],
determines the value of up,q+1 . We can also take the equation (3.9.27) as saying
that for < 1, the value up,q+1 is the weighted average of u(xp , tq + tm ) and up,q .
That is, the convex combination of a future value and a past value. This process is
stable, as up,q+1 is bounded by u(xp , tq + tm ) = up1,q and up,q .
136
3. SIMPLE PROBLEMS
p, q + 1
x
t >
p 1, q
p, q
Now, for the case when > 1, see Figure 3.36, t > tm . The zone of
dependence of the point (xp , tq+1 ) is a point to the left of xp1 . Our scheme only
involves up,q and up1,q . As a consequence, we end up extrapolating to a point
outside the interval [xp1 , xp ]. This requires > 1. So all of this is consistent, why
does it not work? Remember our example at the beginning of the chapter. You
may be placing boats on a stream of water, or adding dye to the stream of water.
The dye is carried along with the water. We may start and stop adding dye as
an when we wish. Downstream, we only have the power to observe how much dye
there is in the interval [xp1 , xp ]. Just because we observe dye in that interval now,
does not mean that the interval [xp2 , xp1 ] has the same amount of dye. In fact,
it may have none at all. The stream is flowing at a speed ms1 in the positive x
direction. Only after a time t > tm will the actual status of the dye currently
in the interval [xp2 , xp1 ] be known at the point xp . Fiddling around with values
up,q and up1,q , of dye density, does not help since they have no causal relationship
to the density of dye in the interval [xp2 , xp1 ]. The fact that our scheme does
not allow the u in the interval [xp2 , xp1 ] to influence / cause the up,q+1 is the
reason for the instability. Or simply, if > 1, we can no longer guarantee that the
computed up,q+1 is bounded by up,q and up1,q . This means that when we view the
computation along the time axis, we are extrapolating from two past data points
into the future which they do not influence.
The conclusion we draw from here is that when computing a point in the future
we need to ensure that we determine the future value from regions on which that
future depends.
137
There is a final point that we will make here. This involves terms we used
earlier: high wave numbers, low wave numbers. High and low are comparative
terms. High compared to what? Low compared to what? These colloquially
posed questions need to be answered. In fact, we have answered them in an earlier
section 2.9. A uniform grid has associated with it a highest wave number that can
be represented by that grid. We found that with eleven grids we could represent
frequency information up to wave number four using hat functions. A wave number
four would be high frequency on that grid. Wave number one would be a low
frequency. On the other hand, on a grid with a hundred and one grid points, wave
number four would be a low frequency and a wave number forty would be a high
frequency.
(3.10.1)
(3.10.2)
uq+1
= uqp + t
p
Performing a stability analysis as we have done before we see that the gain is given
by
(3.10.3)
g = 1 + ei + ei 2 ,
s=
t
,
x2
= nx
0<
t
1
<
2
x
2
This is an interesting result. We saw in the earlier section that adding a second
order term in the form of artificial dissipation was stabilising as seen from the
perspective of the advective equation. We see now that adding a dissipation term
brings with it another constraint on the time step. So, there is an upper limit on
the time step. We want
(3.10.5)
t =
x2 1
x
=
2
We basically have a limit on the amount of artificial dissipation that we can add.
We now find the modified equation for FTCS applied to the heat equation
by substituting for the various terms in equation (3.10.2) with the Taylors series
expanded about the point (p, q).
138
3. SIMPLE PROBLEMS
(3.10.6)
q
q
q
t2 2 u
t3 3 u
u
+
+
+
t p
2! t2 p
3! t3 p
(
q
q
q
t
u
x2 2 u
x3 3 u
q
q
= up +
u
+
x
+
+
p
x2
x p
2! x2 p
3! x3 p
uq+1
= uqp + t
p
q
x4 4 u
+
4! x4 p
q
q
q
u
x2 2 u
x3 3 u
q
q
2up + up x
+
x p
2! x2 p
3! x3 p
)
q
x4 4 u
+
+
4! x4 p
+
(3.10.7)
q
q
q
u
t 2 u
t2 3 u
+
+
+ =
t p
2! t2 p
3! t3 p
(
)
q
q
q
2 u
x2 4 u
x4 6 u
+2
+2
+
x2 p
4! x4 p
6! x6 p
q
q
q
2 u
x2
6t 4 u
u
=
+
+
t p
x2 p
12
x2
x4 p
This modified equation does not help as much with the stability analysis as it did
with the wave equation. However, it is interesting to note that the higher order
term can be made zero by the proper choice of parameters x and t.
How does this stability analysis technique work in multiple dimensions? We
will do the analysis for the two-dimensional heat equation to find out. There is an
ulterior motive to study this now. The two-dimensional heat equation governing
the temperature distribution in a material with isotropic thermal conductivity is
given by
u
=
t
(3.10.9)
2u 2u
+ 2
x2
y
q
uq+1
p,r = up,r +
t q
u
2uqp,r + uqp+1,r
x2 p1,r
t q
u
2uqp,r + uqp,r+1
+
y 2 p,r1
where, p and r are grid indices along x and y grid lines respectively. We define
(3.10.11)
sx =
t
, and
x2
sy =
t
y 2
139
q
q
q
q
uq+1
p,r = up,r + sx up1,r 2up,r + up+1,r
+ sy uqp,r1 2uqp,r + uqp,r+1
Does not look familiar? Multiply and divide the second term on the left hand side
by four.
(3.10.14)
q
uq+1
p,r = (1 4s)up,r + 4s
Now for the ulterior motive. What happens if we substitute s = 0.25. With s = 0.25
we get
uq+1
p,r
(3.10.15)
This, we recognise as the equation we got when we were trying to solve the
Laplace equation using central differences (3.1.11) earlier in section 3.1. Which
resembles the Jacobi iterative solution to the Laplace equation, and q suddenly
becomes an iteration index instead of time index. We see that marching in time is
similar to sweeping in space. We may look at the iterations done in our relaxation
scheme as some kind of an advancing technique in time. In fact, this suggests to us
that if we only seek the steady state solution, we could either solve the steady state
equations and sweep in space or solve the unsteady equations and march in time.
Having two different views of the same process has the advantage that we have more
opportunities to understand what we are actually doing. In a general case, if we
have difficulties to analyse a scheme one way, say analysing the relaxation scheme,
we can switch our view point to the unsteady problem and see if that would shed
any new light on things. If s = 0.25 corresponds to our first attempt at solving
the Laplace equation, what happens for other s values? For instance, to what does
s = 0.125 correspond?
(3.10.16)
q
q
q
q
q
uq+1
p,r = 0.5up,r + 0.5 up1,r + up+1,r + up,r1 + up,r+1
(3.10.17)
q
uq+1
p,r = (1 4s)up,r +
4s q
up1,r + uqp+1,r + uqp,r1 + uqp,r+1
4
You may think this is the the same as equation (3.4.21), with = 4s. Not quite.
The superscripts on the right hand side of equation (3.4.14) are not all the same.
Why did we do over relaxation with Gauss-Seidel and not Jacobi iteration? We
will answer this question now.
140
3. SIMPLE PROBLEMS
Let us get on with the stability analysis now. We see that we get a simple
extension of the shift operator that we obtained earlier.
(3.10.18)
uqp1,r
einx uqp,r
(3.10.19)
uqp+1,r
einx uqp,r
(3.10.20)
uqp,r1
eimy uqp,r
(3.10.21)
uqp,r+1
eimy uqp,r
g=
uq+1
p,r
= 1 + 2sx (cos x 1) + 2sy (cos y 1)
uqp,r
(3.10.23)
This shows that FTCS applied to the two dimensional heat equation leads to a conditionally stable scheme with a stability condition given by 0 < s < 0.25. Looking
at equation (3.10.17) as representing Jacobi iteration with a relaxation scheme, the
stability analysis tells us that the would be limited to (0, 1). = 1 turns out to
be optimal. In fact, from the heat equation point of view, we would be advancing
the solution with the largest possible time step. We can use the over relaxation
parameter with Gauss-Seidel, which is something to be borne in mind if we are
advancing to a steady state solution using an unsteady equation.
We will use this opportunity to point out that one can use the unsteady equations to march in time towards a steady state solution. Such schemes are called
time-marching schemes, which we will distinguish from attempts at solving the
unsteady problem using the unsteady equations, in which case we are trying to
perform time-accurate calculations. From what we have seen so far of dissipative and dispersive effects of solvers, we conclude that time-marching to a steady
state solution represents an easier class of problems in comparison to time-accurate
computations.
Fine, we have an idea of how this stability analysis could be extended to multiple
dimensions. We are now in a position to consider the stability analysis of the FTCS
scheme applied to the linear Burgers Equation. The linear Burgers equation can
be written as
u
u
2u
+
= 2
t
x
x
Any of the explicit discretisation schemes that we have studied so far are likely to
give a discrete equation that can be written as
(3.10.24)
uq+1
= Auqp + Buqp+1 + Cuqp1
p
(3.10.25)
Now, the stability analysis we performed in the earlier sections tells us that the
gain per time step can be written as
(3.10.26)
g=
uq+1
p
= A + Bei + Cei ,
uqp
= nx
141
g
g = (A + Bei + Cei )(A + Bei + Cei ) 1
The left hand side of this equation is maximum when = 0, for A, B, C positive.
This tells us that
A+B+C 1
(3.10.29)
This stability analysis was performed with FTCS applied to Burgers equation
in mind. However, if we pay attention to equation (3.10.25), we see that any scheme
advancing in time in an explicit fashion employing the previous grid points will have
the same requirement on the stability condition. In that sense, this is a very general
result.
3.11. A Sampling of Techniques
We have seen that just using Taylors series and generating formulas for the approximation of derivatives, we are able to convert differential equations to difference
equations. These algebraic equations are then solved.
If we look back at how we manipulated the modified equations to convert
temporal derivatives to spatial derivative, we see that there is an opportunity to
develop a solution scheme employing Taylors series directly. The only difference is
that instead of using the modified equation to eliminate temporal differences, we
use the governing equation as follows
u
u
2u
2u
=
2 =
t
x
t
tx
2u
2u
=
) we get
Now, switching the mixed derivative around (assuming
tx
xt
2
2u
2 u
(3.11.2)
=
t2
x2
This process can be repeated to replace higher derivatives of u with respect to t.
We will now derive a solution technique using Taylors series. Expanding uq+1
we
p
get about the point (xp , tq ) we get
(3.11.1)
(3.11.3)
uq+1
= uqp + t
p
q
q
u
t2 2 u
+
+ R()
t p
2 t2 p
uq+1
= uqp t
p
q
2 2 q
u
2 t u
+
x p
2 x2 p
2 q
q
{up+1 uqp1 } +
{u
2uqp + uqp1 }
2
2 p+1
Does this look familiar. Look at the expression we added earlier to the FTCS
scheme given in (3.9.21). Clearly, we can add more terms to the series and evaluate
(3.11.5)
uq+1
= uqp
p
142
3. SIMPLE PROBLEMS
(3.11.6)
uq+1
+
p
q
q
uq+1
uq+1
+ {uq+1
p
p1 } = up (1 ) {up+1 up1 }
2 p+1
2
The particular case of the Crank-Nicholson would would correspond to = 1/2,
which, incidentally is second order accurate in time.
The use of an intermediate point at which we represent the differential equation
opens up other possibilities. We will take a second look at employing FTCS in some
fashion to get a scheme. At time level q, consider three points p 1, p, p + 1. We
could represent the wave equation at a point in between p 1 and p and identify
that point as p + 21 . We then approximate the wave equation at the point p + 12 , q
(3.11.7)
q+ 1
using FTCS. We would, however, take only half a time step so as to get up+ 21 . To
2
be precise we can write
q
q+ 1
up uqp1
(3.11.8)
up 21 = uqp 1
2
2
2
q
q+ 21
q
up+ 1 = up+ 1
(3.11.9)
up+1 uqp
2
2
2
We do not have the terms uqp 1 and uqp+ 1 on the right hand side of each of these
2
2
equations. What do we do? As always, we improvise by taking the average of the
two adjacent grid points to write
(3.11.10)
(3.11.11)
uqp + uqp1
q
=
u uqp1
2
2 p
uqp+1 + uqp
q
q+ 1
up+1 uqp
up+ 21 =
2
2
2
q+ 1
up 21
2
q+1
q+
143
p, q + 1
1
2
p+
1
2
1
2
q
p1
p+1
p, q
Figure 3.37. We can use an intermediate time step to use a combination of FTCS steps to give a new possibly improved scheme.
We could use the same grids and a combination of FTCS steps
and CTCS steps as an alternative. The rhombus indicates points
at which the differential equation is represented
Then, repeating this process to go from the intermediate time step q +
we can write
uq+1
p
(3.11.12)
q+ 1
q+ 1
up+ 21 + up 21
2
1
2
to q + 1
o
n q+ 21
q+ 1
up+ 1 up 21
2
2
2
Assignment 3.15
(1) Perform a stability analysis for this scheme (given by equations (3.11.10), (3.11.11),
and (3.11.12)) and the variant of taking CTCS in the second step. Remember the definition of the gain.
(2) Find the modified equation. Make sure to include the second, third and
fourth order terms. Comment on the scheme from the point of view of
dissipation and dispersion. Can you infer a stability condition?
How do we perform the stability analysis?
Scheme A: Let us do the obvious thing. We will combine the two steps into one
step as we have been doing so far. Substituting for
q+ 1
up+ 21 ,
2
and
q+ 1
up 21
(3.11.13)
uq+1
=
p
uqp+1 + 2uqp + uqp1
q
up+1 uqp1
4
2
2 q
up+1 2uqp + uqp1
+
4
144
3. SIMPLE PROBLEMS
We add and subtract uqp to the right hand side so that we can recognise FTCS and
see what extra terms have been added to it for this scheme.
1 + 2 q
q
(3.11.14)
uq+1
= uqp
up+1 uqp1 +
up+1 2uqp + uqp1
p
2
4
{z
}
|
identical to FTCS
g=
uq+1
1 + 2
p
{cos 1} ,
q = 1 i sin +
up
2
= nx
A little algebra and trigonometry will show that it has the same stability condition
0 < < 1.
Scheme B: If you struggled a little with the algebra, let us try something that
may be a little easier. We assume that gain in any half step is . Then for the first
half step
1
=
(3.11.16)
1 + einx (1 einx )
2
1
+
(3.11.17)
1 + einx (einx 1)
=
2
and consequently the gain would be
+
1 +
+
(3.11.18)
g=
2
2
The intermediate terms, setting = nx, are
+ +
+
(3.11.19)
(3.11.20)
=
=
1 + cos i sin
i sin {cos 1} .
g=
1
1 + cos + 2 {cos 1} i2 sin
2
g
g=
2
1 2
(1 cos ) + (1 + cos ) < 1
4
We observe from all the schemes that we have derived up to this point that we
can choose a spatial discretisation of some kind, for example central differencing,
and also pick a temporal discretisation independent of the choice of spatial discretisation chosen. In fact, if we pick a discretisation scheme in space for solving the
145
one-dimensional first order wave equation using central differences, we could write
at each point (xp , t), an ordinary differential equation given by
dup
up+1 up1
=
dt
2x
At any xp , up is a function of time. This technique is also referred to as the
method of lines[NA63]. In a more modern parlance and in CFD literature it is often
called semi-discretisation to capture the idea of only some of the derivatives being
discretised. With semi-discretisation we see that we get an ordinary differential
equation in time at each grid point along the space coordinate. A quick check on
solving ordinary differential equations shows us that we could indeed use a variety
of schemes to integrate in time. This includes the whole family of Runge-Kutta
schemes. Though we can mix any of the time derivative discretisation with the
space discretisation, it is clear that one would be judicious about the combination.
Some combinations may be just a waste of time. The key phrase to remember is
mix and match. Finally, the modified equation had terms that came from both the
spatial discretisation and the temporal discretisation. The combination determines
the behaviour of the solver.
(3.11.25)
146
3. SIMPLE PROBLEMS
bit vague, indicating, really, that our algorithm may require more conditions than
required by the constants of integration from the first point. Let us look now at
the specific case of the wave equation.
Let us consider the problem of a pipe filled with cold water. The pipe is
one meter long. Such a pipe is shown in Figure 3.13. The left end of the pipe
is connected to the exit of a water heater. On opening a valve at the right end
of the pipe, water starts to flow out of the right hand side of the pipe into the
open. Hot water starts to flow in from the water heater which is maintained at a
constant temperature of 65 degrees Celsius. A simple model that we can use to
track the hot water front through the pipe is the first order, one-dimensional linear
wave equation. In the case of the wave equation, the reason for the application
of boundary conditions are quite obvious. The equation has a first derivative in
space and a first derivative in time. We would expect from our experience with
ordinary differential equations that we will require one condition corresponding
to the time derivative and one corresponding to the spatial derivative.
The temporal derivative part is easy. We have some initial condition at time
t = 0 (cold water in the pipe) and we would like to use the equation to find out
what happens in the future ( How does the hot water-cold water interface or contact
surface move?). If is positive , then the wave equation is propagating left to right
(the water will move from the reservoir out into the open). Information is flowing
into our domain from the left and flowing out of our domain on the right. So, we
need to prescribe what is happening to u at the left end of the domain. If we do
this, we can draw characteristics and extend the solution in time.
For the problem at hand, an initial condition needs to be prescribed. This
means we need to have u(x, 0). We also need for all time, knowledge of what is
coming in at the left end of the problem, that is, at x = 0, t > 0, we need to
prescribe u. Let us say we were using FTCS to solve the problem. The grids
involved are at p 1, p, and p + 1. So, when we are applying FTCS at the last
grid point at the rightmost edge, what is p + 1? If we have N grid points in x, we
must necessarily stop using FTCS at the grid point N 1. How do we compute
the value at the N grid point? We need to apply a boundary condition that our
original problem did not require. That is, our algorithm seems to need a boundary
condition that our differential equation did not require. We could extrapolate from
the interior grid points to the boundary (effectively setting u/x = 0). There are
two ways to do this. We could compute all the values we are able to calculate at
time level q + 1 and then copy u from the last but one grid point to the last grid
point. This is illustrated in Figure (3.38) and can be written as an equation for the
last grid point as
(3.12.1)
uq+1
= uq+1
N
N 1
Or, one could perform all the computations as possible with FTCS and copy
the value from last-but-one grid point at the current time level to the last grid point
at the new time level as show in Figure (3.39). This can be written as
147
uN = uN 1
q+1
00
11 00
11
1
0
1
0
11
00
1
0
1 00
0
11
11
00
00
11
11
00
q
00
11
00
11
1
0
0
1
1 00
0
0 11
1
00
11
1
0
0
1
1
0
0
1
11
00
00
11
1
0
N 1 N
Figure 3.38. We could compute u at time level q + 1 up to grid
point N 1 using FTCS and copy uN 1 to uN
uq+1
= uqN 1
N
(3.12.3)
uq+1
= uqN 1
N
11 00
00
11 00
11 11
00
q+1
11
00 11
00
q00
11 00
11 11
00 11
11 00
00
1
0
1
0
1
0
1
0
1
0
1
0
1
0
N 1 N
Figure 3.39. We could compute u at time level q + 1 up to grid
point N 1 using FTCS and copy uN 1 from time level q to uN
at time level q + 1.
This effectively transports u from grid point N 1, q to N, q + 1 at the grid
speed. Which means that it is FTBS applied to equation
u u
+
=0
(3.12.4)
t
x
This gives us an idea. That is, we use FTCS for all the points that are possible to
0
q+1 1
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
11
0 00
1
00
11
11
00
11
00 00
11
00
11
11
00
00
11
1
q 0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
11
00
00
1
0
00 11
11 11
00
0 11
1
00
11
00
00
11
N 1 N
Figure 3.40. We could compute u at time level q + 1 up to grid
point N 1 using FTCS and uN at time level q + 1 using FTBS
employing the known values at the grids N 1, q and N, q.
calculate using FTCS and FTBS for the last grid point. This is indicated in Figure
(3.40).
148
3. SIMPLE PROBLEMS
Note: There are boundary conditions that are prescribed as part of the
problem definition and may be part of the mathematical theory of whether
a problem is well posed or not. The algorithm that you choose to solve the
problem may require more conditions to be prescribed. Sometimes this
results in a need to manufacture more boundary conditions. At times,
these conditions seem completely contrived. They are very much part of
the list of assumptions made and need to be verified with as much vigour
as we would our other assumptions. It cannot be emphasised enough,
that the need for the extra boundary conditions may arise because of
the solution technique chosen. If in the earlier discussion, we had chosen
FTBS instead of FTCS there would have been no need for the generation
of an extra boundary condition. You can verify what happens when FTFS
is applied to this problem.
Assignment 3.16
(1) Write a program to solve the first order one-dimensional wave equation
using FTCS, FTFS, FTBS and BTCS. I would suggest you develop the
code using FTBS and do the BTCS code last.
(2) Verify the stability conditions that have been derived.
(3) What happens to FTBS for different values of > 1?
(4) Try out the different boundary conditions.
(5) Plot your results. This is a lot more fun if you make your program interactive.
u f
+
= 0,
t
x
This equation is said to be in the divergence free form or the conservative
form. That it called the divergence free form is obvious from the fact that the
left hand side looks like the divergence of the vector (u, f ) and the right hand side
is zero. From application of the theorem of Gauss, this equation essentially states
there is no net accumulation of any property that is carried by the vector field
(u, f ). This is a statement of a conservation law as is shown in greater detail in
Chapter 5. This is the reason why equation (3.13.2) is said to be in conservative
form. Equation (3.13.1) is not in conservative form and is therefore said to be in a
non-conservative form. We saw earlier this particular form of equation (3.13.1) is
like a directional derivative. In CFD parlance, this form is often called the nonconservative form. f is called the flux term and in the case of equation (3.13.1),
(3.13.2)
149
f (u) = u2 /2. The fact that f is actually the flux can be seen, if we consider a
one-dimensional control volume. The extent or the volume of the control volume
is xi+ 21 xi 12 . The amount of the property u contained in the control volume is
(3.13.3)
U=
xi+ 1
2
u(, t)d
xi 1
2
The rate at which this changes in time depends on the rate at which it flows in and
out of the control volume.
U
=
t
t
(3.13.4)
xi+ 1
2
xi 1
2
(3.13.6)
u(, t)d =
=
(
uxi ) = (f (xi+ 12 ) f (xi 21 ))
t
t x 1
t
i
represents the fundamental idea behind a whole class of modern CFD techniques.
We have a process by which a system evolves in time. We could integrate equation
(3.13.4) in time to get the change in U over the time period [tq , tq+1 ] as
(3.13.7)
q+1
U =
u
q+1
i
xi+ 1
2
xi 1
u
qi
u(, tq+1 ) u(, tq ) d =
{xi+ 21 xi 21 } =
tq+1
tq
Equation (3.13.4) gives us the process. It tell us that what is important is the
boundary of the control volume. The evolutionary process of the system is at the
boundary. In fact, the resounding message from that equation is
May the flux be with you!
From this point of view we could reduce everything down to finding the correct flux
at the interface of two control volumes. The flux at the interface between volumes
is the process by which our system evolves. As a consequence, the role of grid point
i shown in Figure 3.41 is only a place holder for u
and its location is somewhere
in the interval. We may still choose to locate it at the middle.
If equation (3.13.4) is integrated in time one can generate a numerical scheme
to solve this equation. This class of schemes are called finite volume schemes. Note
that we would be solving for U in equation (3.13.4). We have assumed that this is
some mean value u
times the length of the interval. This mean value can then be
used to evaluate the fluxes. If we drop the tilde and label the mean value at ui , we
can then write the spatially discretised version of equation (3.13.4) as
150
3. SIMPLE PROBLEMS
fi 1
fi+ 1
i1
i+1
i
i
1
2
i+
1
2
(3.13.8)
ui i = (f (ui+ 12 ) f (ui 21 ))
t
where,
(3.13.9)
ui+ 21 =
ui + ui+1
2
If you look at equation (3.13.4) and equation (3.13.8), you will see that the
argument of the flux term is different. This drives the algorithm that is generated.
If we stick with equation (3.13.4), we see that the flux term is evaluated at the
point xi+ 12 . This flux can be obtained as
(3.13.10)
fi+ 12 =
fi + fi+1
2
This typically involves more calculations. Try it out and see how the two
behave. Can you make out that using equation (3.13.10) actually is equivalent to
using FTCS on the grid points. The key is the term 12 fi will cancel. Now it is clear
in this case that if we want to add artificial dissipation of any kind we need to add
it to the flux term.
We see that the solution to the equation depends rather critically on the way
fi+ 12 is obtained. There are a whole host of high resolution schemes which completely depend on the way in which this flux term is calculated. Clearly, if we want
to ensure we are up-winding we need to know the direction of propagation at the
point. The propagation vector is given by the Jacobian f /u. How do we find
this at the point i + 1/2? Do we take the average of the derivatives at each node?
If we take the derivative of the flux at the interface, which flux, meaning how do
we calculate it? All of these questions lead to realm of high-resolution schemes.
We saw earlier that the generalised wave equation could result in characteristics
intersecting and creating a solution which was multi valued. Let us take a closer
look at that situation now. We consider a region where the solution jumps in Figure
3.28. We can pick a convenient control volume as shown in Figure 3.42 It is clear
tq+1
151
up+ 21
up 12
tq
xp 21
xp+ 12
from the figure that the speed us at which the discontinuity propagates is
(3.13.11)
us = lim
BA
xp+ 21 xp 12
tq+1 tq
The control volume was deliberately chosen so that equation (3.13.11) holds. Now
let us look at the generalised wave equation as applied to this control volume. We
take it in the form given in equation (3.13.7) since it has no derivatives and we do
have a discontinuity that is in the region of interest. This equation when applied
to our control volume becomes
(3.13.12)
xi+ 1
2
xi 1
tq+1
tq
As indicated in the figure, the value of u to the left of the discontinuity is up 21 and
consequently, it is the value at time level tq+1 . At time level tq , the same argument
givens us up+ 12 . Dividing through by (tq+1 tq ) and substituting from equation
(3.13.11) we get
(3.13.13)
(up+ 21 up 12 )
xp+ 12 xp 21
tq+1 tq
= fp+ 21 fp 21
152
3. SIMPLE PROBLEMS
(3.13.14)
(3.13.15)
(3.13.16)
(3.13.17)
lim up+ 21
uR
lim up 21
uL
lim fp+ 21
fR
lim fp 12
fL
BA
BA
BA
BA
Where uR and fR correspond to the conditions on the right hand side of the discontinuity and uL and fL correspond to the conditions on the left hand side of the
discontinuity. Taking the limit B A of equation (3.13.12) shows us that the wave
speed of the discontinuity is given by
fR fL
(3.13.18)
us =
uR uL
This is called a jump condition in a general context. In the context of gas dynamics
it is called the Rankine-Hugoniot relation.
We will now look at rewriting the discrete / semi-discrete equations in a form
that gives us a different perspective on these numerical scheme. This is called the
Delta form of the equations.
3.14. The Delta form
So far we have looked at problems where we were simulating the evolution of a
system. Quite often, we may be interested only in the steady state. That is, we are
interested in the solution to Laplaces equation, we may, instead, choose to solve
the two-dimensional heat equation and march to the steady state solution in time.
We would like to do this efficiently. To lay the foundation for the study of these so
called time-marching schemes we will look at the Delta form of our equations.
Consider the implicit scheme again. We apply it to the general one-dimensional
wave equation given in equation (3.13.2), which can be discretised as follows
(3.14.1)
p+1
q+1
u
f
+
= 0.
t
x
(3.14.3)
f q+1 = f q + t
f q+1 = f q + t
f u
+ ... = f q + aq uq ,
u t
aq =
q
f
u
u q
[f q + aq uq ] = 0.
+
t
x
Now, if we were only seeking the steady state solution, we are actually trying to
solve the problem governed by the equation
(3.14.4)
153
f
=0
x
For any function u other than the steady state solution this would leave a
residue R(u). So at time-level q, we have a candidate solution uq and the corresponding residue
(3.14.5)
f q
f (uq )
= R(uq ) = Rq
=
x
x
We now rewrite equation (3.14.4) as
(3.14.6)
u q aq uq
= Rq
+
t
x
Multiplying through by t and factoring out the uq gives us the operator form
of the equation
(3.14.7)
(3.14.8)
It should be emphasised that the term in the braces on the left hand side is a
differential operator. This equation is said to be in the delta form. Since we have
taken the first terms from the Taylors series expansion of f q+1 , it is a linearised
delta form. We look at this equation carefully now to understand what it does for
us. If we are looking only for the steady state solution, with a given initial guess,
we can march this equation in time till we get convergence to a steady state (or so
we hope).
154
3. SIMPLE PROBLEMS
solution, as long as this discrete version is not singular, if u is zero, does it matter
what multiplies it? The answer is an emphatic: No! This leaves open the possibility
that we can accelerate convergence to the steady solution by an appropriate choice
of the operator M . What we are effectively saying is that if you are interested only
in the steady state solution why not compromise on the transient to get to that
solution as quickly as possible. We will see how all of this works. Let us first see
how the delta form can be used to obtain a solution.
A simple minded discretisation would be to use BTCS. That is, the spatial
derivative is discretised using central differences.
(3.14.9)
aqp+1 uqp+1 aqp1 uqp1
= tR(uq )
uqp + t
2x
We immediately see that this forms a system of equations, For the typical
equation not involving the boundary, we have a sub diagonal entry, a diagonal entry
and a super diagonal entry. That is we have a tridiagonal system of equations which
can be written as
(3.14.10)
AU = B,
u7 + u8 = 0 u8 = u7
1
3
0
0
0
1
0
2 1
4
0
0
0
0
3 1
5
0
(3.14.12)
0
0
0
4 1
0
0
0
0
5 1
0
0
0
0
0
6
0
0
0
0
0
7
1
155
1
u tR1 a0 u0
2
2
0
u
tR
3
3
0
u
tR
4
4
0 u
tR
u5
0
tR5
6
6
0
u
tR
8
7
7
u
tR
1 1
You will notice that the matrix A given in equation (3.14.12) is not symmetric.
This is because of the first derivative in our equation. Laplaces equation resulted
in a symmetric matrix because it involved second derivatives.
This system of equation can be solved using any number of schemes. You could
use Gauss-Seidel or Jacobi iterations to solve them. Since, in this case, the system
of equations is small, we could use a direct scheme like Gaussian elimination or LU
decomposition.
However, we will recall again that in equation (3.14.10), the right hand side B
determines when we have converged and the quality of the solution. At the solution
B = 0 as is U = 0. As long as the coefficient matrix A is not singular, the result
will only affect the transient and not the steady state.3
A simple possibility in fact is to replace A with I, the identity matrix. If we
discretise R using central differences, can you make out that this results in FTCS?
Now we will look at replacing A with something that is easy to compute and close
to A. Since we saw that one of the ways we could solve the system of equations
was by factoring A we will see if we can factor this a little more easily. From the
form of the operator in equation (3.14.8), we can write
=
+
.
x
x
x
(3.14.13)
+
a + t a u = tR(u(t))
1 + t
x
x
1 + t
a
x
+
1 + t a u = tR(u(t))
x
3This is not a mathematical statement. It is made here with a certain gay abandon so that we
156
3. SIMPLE PROBLEMS
+
1 + t
a
1 + t
a = 1 + t
a + t a
x
x
x
x
(3.14.16)
+
+ t2
a a
{z x }
| x
extra term
If the extra term in equation (3.14.16) did not occur when the product on the left
hand side is expanded, we would have an exact factorisation. However, we will have
to live with the approximate factorisation as it stands. We notice that the extra
term is of the order of t2 . We have already made an approximation of this order
when we linearised the flux term.
An example for the kind of partitioning indicated in equation (3.14.13) is
au =
x
2x
ai ui ai1 ui1
ai+1 ui+1 ai ui
=
+
2x
2x
We should remember that is not quite how it is applied. In order see this can write
equation (3.14.15) as
(3.14.17)
(3.14.18)
(3.14.19)
(3.14.20)
1 + t
a u
x
+
1 + t a u
x
tR(u(t))
The first equation gives us a lower triangular matrix equation. The second an upper
triangular matrix. It is easy to solve for the u and then follow up with a step of
solving for u.
In this case, the error due to the approximate factorisation is the term
(3.14.21)
2u
2u
a2 2 = 0
2
t
x
157
We are looking at this equation with out any reference to the domain or boundary
conditions. In operator form this can be factored as
(3.15.2)
u=0
+a
a
t
x
t
x
Inspection of the differential operator on the left hand side reveals that it has
been written as a product of two one-dimensional first order wave operators. The
first operator corresponds to a right moving wave and the second one to a left
moving one. The corresponding characteristics are x + at and x at. One can
solve this equation numerically by employing a CTCS scheme, that is, a central
difference in time and a centred difference in space.
Assignment 3.17
(1) Solve the second order wave equation on the interval (1, 1). The initial
condition is u(x, 0) = 0 for all x except u(0, 0) = 1.
(2) Check the stability condition for the resulting automaton.
We will take a small detour into the classification of differential equations. First
some observations on what we have so far.
(1) The first order linear one-dimensional wave equation had a single characteristic.
(2) The second order linear one-dimensional equation that we have seen just
now has a set of two characteristics that are distinct.
The second item in the list above makes an assumption. We often tend to get
prejudiced by our notation. For example, we identified a as the propagation speed
and that ties us down to the coordinates x and t. To see this let us write the
equation in terms the independent variable x and y The rewritten wave equation is
2u
2u
a2 2 = 0
2
y
x
158
3. SIMPLE PROBLEMS
Please note that this does not restrict us to second order equations.
You can ask the question: Whats the big deal? Why are you bothering to
define this at this point? After all we managed so far with out his classification.
This is true, we have used this classification without quite being aware of it.
Why did we specify the data on all sides for the Laplace equation? Can you solve
Laplaces equation as an initial value problem with data prescribed on one side.
How about the wave equation. Can you go to the beach and assert that the wave
shall be a certain height or that a stream of water shall have a paper boat in it
at any time that you insist, however, someone else is placing the boats somewhere
upstream of you. The nature of the equations is intimately tied to the physics of
the problem and to the way we apply/can apply boundary conditions.
Assignment 3.18
Classify the following equations as elliptic, parabolic, and hyperbolic.
u
(1) u
t a x = 0
u
(2) t u u
x = 0
(3) Discretise the second order linear wave equation given by
(3.15.4)
2
2u
2 u
= 0.
t2
x2
159
CHAPTER 4
161
be addressed. We will look at these issues in this chapter. In this context, the
one-dimensional equations are the best place to start.
This chapter will lay the foundation for working with a system of equations
using the one-dimensional Euler equations. Some of the things done here can be
extended to multiple dimensions in spirit and possibly form. We will see that the
process of applying boundary conditions is one of the things that will be effectively
extended to multiple dimensions.
4.1. What is one-dimensional flow?
First, one-dimensional flow should not be confused with uni-directional flow.
Uni-directional flow is flow where all the velocity vectors point in the same direction,
the uni-direction so to speak. If we consider a plane perpendicular to the unidirection, the magnitude of the velocities at points on the plane need not be the
same. Uni-directional flow with all the velocities at points on a plane perpendicular
to the uni-direction is shown in Figure 4.1 On the other hand, we would have a
162
High Pressure
Atmospheric pressure
Valve
Figure 4.3. A schematic diagram of a setup that can be modelled
as one-dimensional flow.
We should be clear as to what this diagram in figure 4.3 indicates. To this end
we will restate salient features of the problem.
The initial pressure on the left hand side of the pipe is higher than atmospheric pressure.
163
u
+
=0
t
x
u (u2 + p)
+
=0
t
x
and can also be called the one-dimensional balance of linear momentum equation.
Note that this equation has no viscous terms as the fluid is assumed to be inviscid.
164
Finally, the equation that captures the principle of conservation of energy is given
by
Et
(Et + p)u
+
=0
t
x
(4.1.3)
Again, it is clear that there are no terms related to viscosity or thermal conductivity
in the energy equation which can be called the equation of energy balance.
Take a look at the three equations. They are very similar in form. These
three equations can be consolidated into a single vector equation. The inviscid
equations are called the Euler equations and the viscous equations for a NavierStokes viscous model are called the Navier-Stokes equations. These equations are
derived in chapter 5. The one-dimensional Euler equation is
~
~
Q
E
+
= ~0
t
x
(4.1.4)
where
(4.1.5)
~ = u ,
Q
Et
u
0
~ = u2 + p , and ~0 = 0
E
(Et + p)u
0
where, Et is the specific total energy, and is given in terms of the specific internal
energy and the speed the of the fluid as
(4.1.6)
Et = e +
u2
2
For a calorically perfect gas the specific internal energy can be written in terms of
the the specific heat at constant volume and the temperature
(4.1.7)
e = Cv T
We had Et . We wrote that in terms of e, with which we added one more equation.
Now, we have added yet another unknown, temperature T and we need one more
corresponding equation. Fortunately for us, we can close this process by invoking
the equation of state
(4.1.8)
p = RT
We have added one more equation consisting only of variables defined so far and
we have closure. By closure, I mean we have enough equations to determine all the
parameters in the equations.
You will note that equation (4.1.4) is in the divergence free form or the conservative form. We have already seen this form in the case of the wave equation
in section 3.13. In CFD there is another reason attributed to the name conservative
form and that is the fact that for a steady flow across a discontinuity like a shock,
see [LR57], the quantity E is continuous across the discontinuity. For example,
there may be a jump in and a jump in u, but there is no jump in u across the
shock. The dependent variables Q result in the flux term E. Equation (4.1.4) is said
to be in conservative form and Q given in equation (4.1.5) are called conservative
variables.
165
Assignment 4.1
(1) Write a program that given N will create an array that contains N number
of Qs. Add a function to this program that will compute a Q given p, T ,
and u. Take p = 101325 N/m, T = 300 K and u = 0 m/s and create and
set the values in and array of size N = 11.
(2) Write functions that compute p, , T and u when given a Q. Call these
functions GetP, GetRho, GetT and GetU. Use these functions to retrieve
the values set in the previous exercise.
(3) Write a function, GetE, that returns E given Q.
(4) Test the code and set it aside. You will use it later.
The first order linear one-dimensional wave equation that we studied in the
previous chapter is in the non-conservative form. We will try to get these equations
that make up the one-dimensional Eulers equation into a similar form. We will do
this in the next section.
~
~
~
~
~
~
E
Q
Q
~ Q = Q + A Q = ~0
+
=
+ Q E
t
x
t
x
t
x
~ and A is called
Here, Q is the gradient operator with respect to the variables Q
a flux Jacobian. Rewriting the equation in the non-conservative form in terms of
the conservative variables as
(4.2.2)
~
~
Q
Q
+A
= ~0
t
x
The equation now looks like the one dimensional linear wave equation. It still
represents a coupled system which we can verify by actually figuring out the entries
~ as
in the matrix representation of A. If we were to refer to the components of Q
~ as ei , and the components of A as aij , then we get
qi , the components of E
(4.2.3)
aij =
ei
qj
166
e1 = u = q2
(4.2.5)
e2 = u2 + p =
(4.2.6)
q22
+p
q1
q2
q1
clearly we need to derive an expression for pressure p and one for Et . We have
already come up with the relevant relationships to close the system of equations.
From equations (4.1.6) and (4.1.7) we get
(4.2.7)
Et = e +
u2
u2
= Cv T +
2
2
Substituting for the temperature from the equation of state (4.1.8), we get
(4.2.8)
Et =
Cv p u2
+
R
2
Et =
u2
p
+
( 1)
2
u2
q2
p = ( 1) Et
= ( 1) q3 2
2
2q1
q2
q22
1
(
1)q
+
(3
3
~ =
2
E
q1
q22 q2
1
q3 2 ( 1)
q1 q1
(4.2.11)
(4.2.12)
q22
1
(
3)
2
A=
q12
3
( 1) q2 q q2
3 2
q13
q1
1
(3 )
q2
q1
q2
q3
32 ( 1) 22
q1
q1
q3 we get
( 1)
q2
q1
(4.2.13)
1
2
A=
2 ( 3)u
( 1)u3 E u
t
167
(3 )u
Et 32 ( 1)u2
( 1)
Assignment 4.2
(1) Write a function GetA to get the matrix A given Q.
(2) Analytically, verify that E = AQ. (Obviously, this need not be true for
all conservation equations)
(3) Having shown that E = AQ, use it to test your function GetA.
We have managed to get our one-dimensional equations to look like the wave
equation. We are disappointed, but not surprised that the matrix is not diagonal.
If it were a diagonal matrix, the system of equations represented by equation (4.2.2)
would be a system of independent equations, each one a wave equation of the kind
we have studied earlier. Unfortunately, A is not diagonal. Maybe if we changed the
dependent variables we could get a non-conservative form which is decoupled. This
derivation is a standard fluid mechanics procedure, we will do it as an illustration
given by
of the process. We choose the the dependent variable Q
= u
Q
p
(4.2.14)
We will now proceed to transform the equations governing balance of mass, mo~ to Q.
We start with equation of conservation of mass.
mentum and energy from Q
Let us expand the equation (4.1.1) using product rule to get
(4.2.15)
+
+u
=0
t
x
x
We now expand conservation of linear momentum, (4.1.2), and simplify using
balance of mass.
u times conservation of mass
(4.2.16)
u
+ u
t
t
+ u
u
+
x
u
x
u
u p
+ u
+
=0
t
x x
p
=0
x
168
Q
Q
+ A
= ~0
t
x
(4.2.18)
dividing through by gives
u
u 1 p
+u
+
=0
t
x x
Finally, we expand the equation of conservation of energy (4.1.3).
(4.2.19)
Et
+
t
Et
+ u
Et
+
x
Et
u
x
+u
p
u
+p
=0
x
x
p
p
u2
u2
+ u
+
+
(4.2.21)
t ( 1)
2
x ( 1)
2
+u
p
u
+p
=0
x
x
(4.2.22)
p
( 1)
u u
t
+ u
x
p
( 1)
u2 u
x
p
+ u x
+p
u
=0
x
This gives us
(4.2.23)
p
( 1)
+ u
x
p
( 1)
+p
u
=0
x
p
u p
p
u
1 p
u
+p
=0
1 t
( 1) t
1 x ( 1) x
x
( 1)
+u
t
x
p u
1 x
(4.2.26)
(4.2.27)
u + 0
t
p
0
u
p
169
0
1
u
0
=
x
p
0
u
0
Q
Q
+ A
= ~0
t
x
We got a different non-conservative form. The A is also not a diagonal matrix. The
equations are still coupled. It is clear that we cannot go about hunting at random
for the set of dependent variables that will give us a decoupled system of equations.
We have to use some strategy to find these variables.
We will now try to see what we can do to manipulate these equations to decouple
them. For anyone familiar with matrices, similarity transformations, eigenvalues,
left eigenvectors, right eigenvectors can read on. If you are not comfortable with
these terms, I would suggest that you take a detour to the appendix B.2 and then
come back to this point.
Back to the discussion at hand. If we want to develop a systematic process to
hunt for the diagonal form of our governing equation, we can look at the relationship
between the two non-conservative forms that we have derived so far and that may
give us a clue. What is the relationship between A and A ? If we answer that
question we can then figure out how to get the diagonal form. In order to determine
~ and Q
as
the answer we need to use chain rule to relate Q
(4.2.28)
~ = P dQ
dQ
(4.2.29)
In terms of components this is
(4.2.30)
dqi =
X
j
pij d
qj =
X qi
d
qj
qj
j
~ and Q
respectively. If we were to multiply
where qi and qj are components of Q
equation (4.2.1) by P 1 , we would have
!
~
~
~
~
Q
Q
Q
Q
1
(4.2.31)
P
= P 1
+A
+ P 1 AP P 1
= ~0
t
x
t
x
employing equation (4.2.29) we get
(4.2.32)
Q
Q
+ P 1 AP
= ~0
t
x
Q
Q
+ A
= ~0
t
x
A = P 1 AP
170
Assignment 4.3
(1) Determine the transformation matrix P using the definition given in equation (4.2.29).
What do you con(2) As a continuation from assignment 4.2, evaluate AQ.
~
~
Axk = k xk
or in terms of components
(4.2.36)
AX = X
X 1 AX = X 1 X =
Yes! We have a scheme to generate the similarity transformation that we need and
consequently, the new dependent variables that are governed by the corresponding
= X 1 dQ, Then we can pre-multiply the
equations. If we were to define dQ
equation (4.2.1) to get
)
(
~
~
~
~
Q
Q
Q
Q
1
= X 1
+A
+ X 1 AXX 1
(4.2.39)
X
t
x
t
x
Which reduces to
(4.2.40)
Q
Q
+
= ~0
t
x
(4.2.41)
q1
q1
+ 1
=0
t
x
(4.2.42)
q2
q2
+ 2
=0
t
x
171
q3
q3
+ 3
=0
t
x
As we can see the equations seem to be decoupled. The individual equations now
resemble the first order wave equation that we have analysed earlier. There is now
hope that some of that analysis will carry over to this coupled system of equations.
We have only shown that this decoupling, if possible, will be nice to do. So, how
do we find the eigenvalues. You can try to find the eigenvalues of A directly. Again,
prior experience with the problem shows a way to get an easier similar matrix. It
A and A are related through a
turns out, it is easier find the eigenvalues of A.
similarity transformation. A property of two matrices that are similar to each other
is that their eigenvalues are the same.
The eigenvalue problem corresponding to A is written as
= x
(4.2.44)
Ax
(4.2.43)
0
1
u
(4.2.45)
0
=0
0
p
u
(4.2.47)
u
X=
2
u
2
u+c
c2
u2
+
+ cu
1
2
2
uc
2
2
c
u
+
cu
1
2
2
c
+ u2 as the total enthalpy Ht = Cp To .
For a calorically perfect gas, we recognise 1
So, X can be written more compactly as
172
X=u
2
u
2
(4.2.48)
uc
Ht + cu Ht cu
u+c
For clarity we will indicate the eigenvalue as a subscript and write the corresponding entries of the eigenvector.
(4.2.49)
xu = u ,
2
u
2
xu+c
= u+c
,
Ht + cu
xuc
= uc
Ht cu
We should note that though the eigenvalues of A and A happen to be the same,
corresponding to A is
the eigenvectors are not. In fact, the modal matrix X
(4.2.50)
X = 0
c2
2
c
(4.2.51)
1
X
=
0
2c
2c
1
c2
2c2
2c2
q2
t
q3
t
q1
q1
+u
=0
t
x
q2
+ (u + c)
=0
x
q3
+ (u c)
=0
x
to dQ
as
We can relate dQ
d
q1
1
d
(4.2.55)
q2 = 0
0
d
q3
2c
2c
Which gives us
(4.2.56)
(4.2.57)
d
q1 = d
d
q2 =
173
1
d
c2
1
du
2
2c
1
dp
2c2
dp
c2
dp
du + 2
2c
2c
and
dp
du + 2
2c
2c
Consider a point (x0 , t0 ). The equation of the first characteristic through this point
is x x0 = u(t t0 ). Without loss of generality, we can assume x0 = 0 and
t0 = 0 for this discussion. Along x = ut, d
q1 = 0 since q1 is a constant. In this
fashion equations (4.2.56), (4.2.57), and (4.2.58) can be integrated along x = ut,
x = (u + c)t, and x = (u c)t respectively.
If we just consider a point in our one-dimensional flow where the flow is from
our left to right, for which we take u to be positive and further if we assume the flow
is supersonic, we have three equations that look very similar to the wave equation of
the earlier section. We have three different characteristics. One for each equation.
In the supersonic case they look as follows
(4.2.58)
d
q3 =
uc
u
u+c
174
uc
u
u+c
~ = u
(4.2.59)
Q
T
RT
u +
u
(4.2.60)
t
T
0
( 1)T
u = 0
R
x
T
0
u
(3) Find the eigenvalues of the for the flux Jacobian in this case. Show that
they are u, u + c, u c.
(4) Find the eigenvectors.
175
used to solve the flow through the pipe given in Figure 4.3. You will notice that
right now, we will only look at the discretisation of the governing equations and
that the actual problem to be solved will show up in the discussion only when we
want to figure out the boundary conditions that need to be applied[RS81].
This flow field is modelled using the one-dimensional Euler equations. FTCS
applied to these equations gives us
(4.3.1)
o
1 t n ~ q
q
q
~
~ q+1
~
E
E
Q
=
Q
p
p
p+1
p1
2 x
~ q+1
~ qp . Bear in mind that this
we can employ this automaton to obtain Q
given Q
p
scheme was unconditionally unstable for the first order linear wave equation.
Assignment 4.5
Write a function called FTCS that takes an array of Q and a parameter which
can be named DtByDx, which represents the ratio t/x, and takes one time step
for all the interior points (that is leaving out the first and the last grid points) with
the FTCS scheme as given by equation (4.3.1).
~ q einx einx
~ q+1
~ qp 1 t E
Q
=Q
p
2 x p
Using the result that you showed in the assignment we can write E = AQ
(4.3.3)
~ q+1 = Q
~ q t Aq Q
~ q {i sin nx}
Q
p
p
x p p
Keep in mind that we are dealing with matrices here. We factor out operator acting
~ qp to the left to get
on Q
(4.3.4)
~ q+1
Q
=
p
t q
~ qp
A sin nx Q
I i
x p
The matrices X and X 1 can be made normal, meaning they perform a rotation
~ q . Then X 1 is normalised means kS q k =
~ q = X 1 Q
but no stretch. We define S
p
p
p
~ q k. The eigenvectors are also independent of each other. We
~ q k = kQ
kX 1 kkQ
p
p
176
t q
~pq
p sin nx S
I i
x
sin
nx
I
i
p
p
pk
x
So, the relationship given by equation (4.3.5) and consequently the one given by
~pq+1 k < kS
~pq k, that is
equation (4.3.4) is a contraction mapping if kS
(4.3.7)
I i t qp sin nx
< 1
x
t q
Note that I i
sin nx is a diagonal matrix. We just require the magnitude
x p
of the largest entry on the diagonal to be less than one. We see again that FTCS
is unconditionally unstable. We would need to add artificial dissipation if we want
it to work.
4.4. Boundary Conditions
As we did in the wave equation, we will inspect the governing equations to
discover how many boundary conditions are required. We see that the vector equation has one time derivative and one spatial derivative. We expect to prescribe
one vector initial condition or three conditions at t = 0 and one vector boundary
condition or actually, three boundary conditions.
This much we get from our understanding of differential equations. Let us
look at it from the point of view of the physics of the problem as described at the
beginning of the chapter (see Figure 4.3). The pipe is open to the atmosphere on
the right hand side. When the valve is closed, the conditions in the pipe will be
the ambient conditions. At the instant when the valve is opened, loosely, we have
two conditions on the left in the form of P0 and T0 . P0 is the total pressure in the
reservoir and T0 is the total temperature in the reservoir. These are often called
reservoir conditions or upstream conditions since we expect the flow from that
direction. We have two conditions, we need one more. In gas dynamics this would
be the back pressure or the ambient pressure pa on the right hand side of the pipe.
If pa = P0 , then there is no flow. If pa < P0 we will have flow. Clearly, P0 and pa are
boundary conditions that influence the flow. T0 is a constraint on the magnitude
of the flow in the form of the total energy available. The one-dimensional, steady
state energy equation reduces to
u2
2
where T is the static temperature measured on the Kelvin scale. This relates the
total temperature at the inlet to the unknown static temperature and speed at the
inlet.
(4.4.1)
Cp T0 = Cp T +
177
Employing these boundary conditions, one can solve the one-dimensional gas
dynamical problem[Sha53]. At this point in the wave equation, we discovered that
the numerical scheme required more parameters to be specified than required by
the physical problem or the differential equation. How do the boundary conditions
that we have prescribed so far work with the numerical scheme? Let us see what
happens when we use FTCS.
00
11
q+1
00
11
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
00
q 11
00
11
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
p-1
p+1
Figure 4.6. Taking one time step with FTCS. The grid points
involved in the approximating the governing equation at each of
the interior grid points are shown by the bubble surrounding the
grid points.
From Figure 4.6 we can see that the Q at points interior to our domain at time
level q + 1 can be found from Q at all the points at time level q. Now, if we wish
to proceed to the next time level q + 2, we require Q at all the points at time level
q + 1. From FTCS we have calculated Q only at the interior points. How do we get
the values of Q at the boundaries? We will use the characteristics to answer this
question.
For the subsonic inlet case we clearly have at the entry point characteristics as
shown in Figures 4.5 and 4.7. We have something similar at the exit.
uc
u
u+c
11
00
00
11
1
0
00
0 11
1
00
11
11
00
1 00
0
11
1
0
0
1
uc
p,q+1
0
1
00
0 11
1
00
11
1
0
0
1
1
0
0
1
p,q
0
1
0 00
1
11
1
0
p-1,q
p+1,q
1
0
u+c
178
Cp T0 = Cp T +
u2
u2
T = T0
2
2Cp
Assignment 4.6
(1) Write a one-dimensional Euler equation solver using FTCS. Remember,
that you will need to add some artificial dissipation terms. While developing the code add something along the lines
(4.4.4)
(4.4.5)
4Q
2Q
4 x4 4
2
x
x
with 2 = 0.01 and 4 = 0.001. While developing the solver make sure
that use the following conditions as the test case.
(a) Inlet condition
2 x2
Po = 101325 Pascals,
To = 300 Kelvin
Pa = 84000 Pascals
(c) Set the initial conditions to be the exit condition. You can take
Ta = 300K.
(2) Run the above conditions for various grid sizes. Choose the time-step so
that
t
= 0.0001, 0.0005, 0.001, 0.002
(4.4.7)
x
(3) Run your code for Po = 121590 Pascals and Pa = 101325 Pascals
179
(4.5.1)
"
~
E
= E + t
t
~q
#q
(4.5.2)
~ q+1
"
~
~ Q
E
= E + t
~ t
Q
~q
#q
h
iq
~ q + AQ
~
=E
q+1
q+1
q
Q
Q
q+1
~q
~ = 0
E
+
A
Q
E
=
+
+
t
x
t
x
If we are interested only in the steady state solution, then that solution satisfies
the equation
E
(4.5.4)
=0
x
In equation (4.5.3) we can rearrange terms so that
(4.5.5)
q+1
q
~q
Q
~ = E
A
Q
+
t
x
x
q
~q
Q q
~ = E
+
AQ
t
x
x
I + t
~q
A
~ = t E
Q
x
x
Here is a new way to build a whole class of automatons to solve our equation.
~ Then use this to compute the right hand side of equation
We assume an initial Q.
(4.5.6). This should be zero if we had the steady state solution. Unfortunately, we
~
are left, typically, with a residual. This allows us to determine a correction Q
using equation (4.5.7).
(4.5.7)
~q
A
~ q = t E
Q
I + t
x
x
~ using the Q
~ that we just got using
We can then obtain a corrected Q
180
~0
Assume Q
Compute E and
E
x
Solve
~ = t E
I + t x
A Q
x
~
to get Q
~ q+1 = Q
~ q + Q
~q
Q
E
<
x
?
no
yes
Done
(4.5.8)
~ q+1 = Q
~ q + Q
~q
Q
181
in the RHS. The CFD of how we are going to get there is in LHS. On the LHS we
need to come up with something simple to compute which causes large decreases
in the magnitude of the residue in every step.
Let us now see how we would deal with this for BTCS. The x-derivative in
equation (4.5.7) is approximated using central differences. Then the full system of
equations reads as shown in equation (4.5.9).
(4.5.9)
q
1
.
.
.
.
.
.
q2
q3
q2 I
q4
..
..
..
.
..
qp1 I p+1
..
.
..
.
..
..
..
qN 1
~1
Q
(R+BC)
1
~2
R
Q
2
~3
Q
R
3
..
.
.
.
~ p1
Q
Rp1
~
Q p
Rp
Q
R
p+1
p+1
0
.
.
..
..
Q
R
N 1
N 1
I
Q
(R+BC)
~N
N
where qp = t/(2x)Aqp .
Please note that each of the entries in our N N matrix is itself a 3 3 matrix.
The corresponding entries in the two vectors are also vectors. These are called block
structured matrices. It should be pointed out that in CFD we NEVER assemble
this matrix as shown.
One could use a direct method like Gaussian elimination to solve the system
of equations. Or an iterative techniques of Gauss-Seidel or Gauss-Jordan. Another
possibility is that we perform an approximate factorisation. How does this work?
remember that we want a solution to the steady state equation. We will take the
equation in operator form and factor the derivative in terms of forward differences
and backward differences.
(4.5.10)
~q
+A
A
~ q = t E
I + t
Q
I + t
x
x
x
~q
A
~ q = t E
Q
I + t
x
x
+
A
~ q = Q
~ q
Q
I + t
x
182
(4.5.13)
~q
+A
A +A
A
~ q = t E
Q
+ t
+ t2
I + t
x
x
x x
x
(4.5.14)
~ q = t
{I} Q
~q
E
x
Assignment 4.7
(1) Re-do the previous assignment 4.6, using BTCS and the delta form.
(2) Try a range of CFL values from 0.001 to 100. or more.
(4.6.1)
~q
A
~ q = t E
I + t
Q
x
x
183
A
~q =
I + t
Q
x
1 q
1 q
q
1 q
~q =
(4.6.2)
(X )p + t(X )p AXp (X )p
Q
x
~q
E
t(X1 )qp
x
At the first grid point this becomes
(X1 )qp
(4.6.3)
q
~
q
q
1 q E
I + t1
Q = t(X )1
x
x
At the inlet, we can write the boundary conditions from gas dynamics as we did
before. Referring to figure 4.7, for a subsonic inlet, we have two characteristics along
which the characteristic variables q1 and q2 are propagated into the computational
domain. Corresponding to which we prescribed two conditions: Po and To . We
will need to take a closer look at this in a moment. q3 is being carried out of the
domain and governed by the third component in equation (4.6.3). To extract the
third component corresponding to the left running characteristic the matrix L is
defined as follows
0 0
L = 0 0
0 0
(4.6.4)
0
0
1
Corresponding to the out-going characteristics at the inlet we can now use the
equation
(4.6.5)
L I+
tq1
Q =
Lt(X1 )q1
q
~
E
x
Where do the two prescribed boundary conditions Po and To figure? Actually, we need to transform them to the characteristic coordinates (xu , xu+c , and
xuc )and pick only those parts of them that correspond to the process of propagation from the outside of the computational domain into the domain (components
along xu and xu+c ). We do this as follows. Define the vector Bo as
Po
B o = To
0
(4.6.6)
Bq+1
= Bqo + t
o
q
Bo
Bq+1
= Bqo + Do Q
o
t
Bo = Do Q
184
where
Bo
Q
Now, if Po and To do not vary in time, then equation (4.6.7) becomes
(4.6.9)
Do =
(4.6.10)
Do Q = 0
o Q
=0
X1 Do Q = X1 Do XX1 Q = D
The subscripts indicating spatial and temporal position have been dropped. Now
we use (I L) to extract the first two components and eliminate the q3 part(xuc
component) of equation (4.6.11).
o Q
=0
(4.6.12)
(I L)X1 Do Q = (I L)D
We combine this at the boundary with the equation governing q3 to get
q
~
q = Lt(X1 )q
o Q
q + L I + tq
Q
(4.6.13)
(I L)D
1
1
x
x
q . We can pre-multiply
This combined equation determines the full vector Q
1
q
the equation by X1 to get back to the Q system.
q =
o Q
q + L I + tq Q
(4.6.14) Xq1 (I L)D
1
x
q #
~
1 q E
Lt(X )1
x
1
At the exit, we have two characteristics propagating q1 and q2 out of the domain.
We have one physical condition pa prescribed. As was done at the inlet, we can
define the vector Ba .
(4.6.15)
0
Ba = 0
pa
Repeating the process followed at the inlet we write the condition that Ba as
(4.6.16)
Da Q = 0
where,
Ba
Q
We can then write for the last grid point
q
q
q
q =
Q
(4.6.18) XN LDa Q + (I L) I + tN
x
(4.6.17)
Da =
(I L)t(X1 )qN
q #
~
E
x
N
185
Cp To = Cp T +
u2
u2
q3
1 q22
Cv To = Cv T +
=
2
2
q1
2 q12
(4.6.21)
u2 1
Po = p 1 +
=
2e
1
2 2
q22
/q
q
1
2
1 +
( 1) q3
2q1
q3
q22
2
2
q1
2q1
(4.6.22)
q2
1+
Po = ( 1) q3 2
2q1
q22
2 q3 q1
q2
2
The entries of Do are given in component form as dij . We know that d3j = 0.
The rest of the entries are
Et
1 u2
(4.6.23)
d11 =
1u
d12 =
(4.6.24)
1
d13 =
(4.6.25)
(4.6.26)
(4.6.27)
(4.6.28)
Et
To
d21 = o u2 ( 1)
T
2e
Et
d22 = o u
C
e
u2
d23 = o C
2e
( 1) To
. So, Do is given by (Please note the matrix indicated is DoT
T
which is the transpose of Do )
where C =
186
(4.6.29)
Et
1 u2
1u
DoT =
To
Et
o u2 ( 1)
T
2e
o u
Et
C
e
u2
o C
2e
Assignment 4.8
Redo assignment 4.7 with the new improved boundary conditions.
where the subscripts indicate the grid point. We could either extrapolate 1 or use
the characteristic equation to extrapolate it as
q
q
q+1
q
(u c)|0 t q
(4.6.33)
0 = 0
1 0
x
Along with the extrapolated quantity we now have the pair of characteristic
variables at the inlet. We observe that this pair is from the new state and now
q+1
q+1
label them ( |
, + | ). we can then compute
(4.6.35)
uq+1 =
q+1
q+1
+ +|
2
We will drop the superscript for now since we know that we are computing quantities
at the new time level. The resulting expressions dont look as busy. At the inlet
we have the relation between the total temperature and the static temperature as
(4.6.34)
Cp To = Cp T +
u2
2
187
Using equation (4.6.34) and equation (4.6.35) we can find T q+1 . From which we
can find the static pressure as
T 1
(4.6.36)
p = P0
T0
Given the static pressure and temperature we can find the density using the equa we can obtain Q.
tion of state. We have Q
What do we do at the subsonic exit? In this case, again, we prescribe the
boundary condition pa and extrapolate the first two characteristic variables from
the penultimate grid point to the last grid point.
Assignment 4.9
Try out the new boundary condition and see if it makes a difference.
Now all this talk of applying the correct boundary condition by getting the right
flux at the boundary has to remind you of our earlier discussion on the finite volume
method - see section 3.13. It clear again that if we breakup our problem domain
into small control volumes and computed the correct fluxes at the boundaries we
can determine the rate of change of our dependent variables in the interior of these
small control volumes which are very often called finite volumes
q
q
Q
Q
Q E
=
=0
+
+A
(4.6.37)
t
x p
t
x p
We can pre-multiply it by (X1 )qp
q
q
q
q
Q
Q
Q
+ qp (X1 )qp
=
(4.6.38)
(X1 )qp
+ qp
=0
t p
x p
t
x
p
(4.6.39)
u
0
+ = 0 u + c
0
0
0
0 ;
0
0
= 0
0
0
0
0
0
0 uc
(4.6.40)
+ =
(4.6.42)
q Q
q Q
Q
+ p
=0
+ + p
t
x p
x p
188
Q
Q E + E
Q
Q
+
=0
+ A+
+ A
=
+
t
x
x
t
x
x
(4.6.43)
abs(DQ/Q)
1e-05
1e-10
1e-15
0
1000
2000
3000
Time Step n
4000
5000
6000
4.8. PRECONDITIONING
189
abs(DQ/Q) - CLM
0.0001
1e-08
1e-12
1e-16
1e-16
1e-12
1e-08
abs(DQ/Q) - CM
0.0001
abs(DQ/Q) - CE
0.0001
1e-08
1e-12
1e-16
1e-16
1e-12
1e-08
abs(DQ/Q) - CM
0.0001
190
1.6
1.5
1.4
1.3
1.2
1.1
H
200
150
100
50
0
200
400
600
800
1000
x-index
Figure 4.12. The density () and speed (u) distribution at three
different time steps n = 13000(solid line), n = 40000(short dashes),
and n = 59000(long dashes).
If you were to run this case, you could reproduce Figure 4.12 and observe the
following. First, a compression wave travels from the inlet to the exit. It takes about
26000 time steps. The figure shows the compression wave at point B, midway along
the pipe at n = 13000 time steps. To the left of the compression wave, the speed of
the fluid is uL is of the order of 60+ m/s. To the right is the initial condition uR = 0.
The wave itself is formed by the intersection of the characteristics corresponding
to u + c. For the given conditions (u + c)L 400 m/s and (u + c)R 340 m/s,
the compression wave will be travelling at the average of those speeds. Now, this
motion uL , setup by the compression wave, carries the mass of the fluid with it.
This may sound like a circular statement, the point is that we are talking about the
characteristics corresponding to u. You see that the higher density fluid from the
inlet, moving at this speed makes it to the point A after 13000 time steps. After all,
it is travelling only at u 60 m/s. This wave is a jump in density, not in pressure
4.8. PRECONDITIONING
191
or in speed.1 Both the compression wave and this density wave are smoothed out
a bit because of the numerical dissipation that we have added.
Once the compression wave reaches the end of the pipe, the whole pipe is at the
same pressure (it is interesting to check the relationship between Po , p, and 21 u2
as the flow field evolves.) This is greater than the exit pressure. An expansion wave
is reflected from the exit into the pipe. The location of the expansion wave after
about 14000 time steps after is it is formed is shown in the figure as being between
points D and E. This expansion corresponds to the the characteristic uc. Now as
it happens, the wave at point E is travelling at 60 346 286 m/s. On the other
hand, the wave at point D is travelling at 120 346 226 m/s. The negative
sign indicates that the wave is travelling right to left. The leading edge of the wave
is travelling faster than the trailing edge. This wave is going to fan out and get
broader at it travels. We have seen this earlier in Figure 3.22. This is very clear
19000 time steps later. The point corresponding to D has moved to G and that
corresponding to E has moved to H. Clearly, from the computations, as expected,
GH is longer than DE. On reaching the inlet, you will see a compression wave
again from the inlet and so on. You will notice that as the waves move back and
forth, the steady state flow field is setup.
The important point to note is that in the first pass of the waves, the compression wave took 26000 time steps to communicate the upstream conditions
to the exit. The return expansion wave took approximately 33000 time steps to
communicate the downstream conditions to the inlet. In this time the wave corresponding to u has not even made it to the exit! It is only at point F . Now, u
increases towards its steady state value. What happens to u, u + c, and u c? You
should notice that the speed at which the wave travels from the exit to the inlet
decreases in each pass of the wave. This is because u c decreases with each pass
of the waves. Try the following assignment and see what you can get from it.
Assignment 4.10
Do the following with your favourite version of the one-dimensional code.
(1) For a given set of conditions, run you code for different number of grid
points: 101, 201, 501, and 1001 and more if it makes sense to do so.
Observe the plot of the residues, the wave propagation times (number of
time steps) as a function of the number of grid points.
(2) The assignments so far were picked so that the Mach number for the
solution is approximately 0.5. Run your code for different values of the
stagnation pressure so that the Mach number takes a range of values
between (0.01, 2.0). Observe what happens near Mach number 0.01 and
1.0.
You should have noticed the following. You convergence plot shows a correlation with the wave motion back and forth. The larger the grid size the more
obvious is this behaviour. From the second problem, it is clear that we have a
severe problem near M = 0.01 and M = 1. All of these symptoms point to one
problem. For a given acoustic speed c, if u 0 then two of the characteristics are
1This is a contact surface just like the one in 3.6 between the hot and cold water. If you have
192
of the order of c where one of them, u is of the order of zero. The propagation
speeds in our problem are very disparate. Such problems are said to be stiff or
ill-conditioned. This situation also occurs when u is of the order of c. In this
case u c 0 while the other two are of the order of c. This explains the difficulty
that we have at the two Mach numbers. We conclude
Disparate propagation speeds has a strong effect on the
convergence to a steady state solution
Naturally we ask: if we are only interested in the steady state solution can
we do something to make the wave speeds the same order of magnitude and not
contaminate the steady state solution? Restated, if I do not care for the transient,
can I do something to get to the solution faster? Or a blunt statement of fact: I
am willing to live with a non-physical transient if I can get to the correct steady
state solution rapidly.
We have a clue as to how this can be done from our experience with the Delta
form. If some term in our equation is going to zero, we can multiply it with a
non-singular expression and not affect our solution. in the case of our governing
equations, when we get to the steady state, the time derivatives should be zero.
We can pre-multiply the time derivative term with a matrix to get an equation
Q E
+
=0
t
x
So, what is ? That is for us to decide. Let us now multiply equation (4.8.1)
through by 1 . We get
(4.8.1)
Q
Q
+ 1 A
=0
t
x
This is the equation written in non-conservative form. We see that we have indeed
got an opportunity to modify the physics of the problem. We need to choose in a
manner that 1 A has eigenvalues that are almost equal to each other. This process
of changing the eigenvalues of a problem of interest to make it more amenable to
solution is called preconditioning. In this context, since we achieve this end by
multiply only the transient term we call it preconditioning the unsteady term.
We want to keep it simple for the sake of this discussion. We will seek a
so that the eigenvalues of 1 A are (1, 1, 1) and the eigenvectors are the same.
We are propagating the same physical quantities. The only difference is that the
propagation speeds are equal. This tells that
(4.8.2)
(4.8.3)
1 AX = X1 ,
1 0
1 = 0 1
0 0
0
0
1
1 X = X1
If we post-multiply through by 1
1 , pre-multiply by , and rearrange the equation,
we get
|u|
0
0
0
|| = 0 |u + c|
(4.8.5)
X = X1
1 = X||,
0
0
|u c|
193
We choose the
= X||X 1
(4.8.6)
~ 1
E
i+
~ 1
E
i
2
i1
i+1
i
i
1
2
i+
1
2
Figure 4.13. A small control volume about the grid point i. The
boundaries of the volume are shown at xi 21 and xi+ 12 . The short
vectors are outward unit vectors normal to the boundary. The
~ i+ 1
~ i 1 and E
other two vectors are the flux terms E
2
2
~ in the volume is labelled notionally as Q
~ i . Integrating
xi+ 12 . The mean value of Q
equation (4.1.4) with respect to x from xi 21 to xi+ 21 gives us
Z
d xi+ 21 ~
Q(, t)d =
(4.9.1)
dt x 1
i
2
d ~
~ i+ 1 , t) E(x
~ i 1 , t)
Qi xi = E(x
2
2
dt
~ i,
where xi = xi+ 1 xi 1 . In order the get the time evolution of the parameter Q
2
we need to integrate this equation in time. Unfortunately, this means that we need
~ i+ 1 , t) and E(x
~ i 1 , t). We know that E
~ i 1 = E(Q
~ i 1 ). How do
to determine E(x
2
2
2
2
~ 1 and not just E.
~
~ i 1 ? Further, it is clear that we should use E
we determine Q
2
i 2
194
If the system of equations were such that the flux Jacobian A was a constant
A in the interval (xi , xi+1 ) over the time period t, we could use our characteristic
equations to propagate the characteristic variables in time over the period t. The
flux Jacobian A is not likely to be a constant. We will assume it is. What should
we take as the that constant value A? We want an average value of A. Again, we
~ i and Q
~ i+1 .
would like to determine A using the two states Q
From the two paragraphs can we summarise a question as follows: Can we find
~ i+ 1 such that
aQ
2
~
E
~ i+ 1 ) =
(4.9.2)
A = A(Q
?
2
~ 1
Q
i+
2
~ and Q
~ through their respective changes over
Now, we know that A is related to E
the interval. That is
~ =A Q
~ i+1 Q
~i
~ i+1 E
~ i = E
~ i+ 1 = A Q
(4.9.3)
E
2
We want the entries of A that are given by equation (4.2.13) and repeated here for
convenience.
(4.9.4)
1
2
A=
2 ( 3)uh
( 1)u3 E u
h h
h
(3 )uh
Eh 32 ( 1)u2h
( 1)
uh
(u)
~ = (u) ,
~ = (u2 + p)
(4.9.5)
Q
E
(Et )
{(Et + p)u}
We can use equation (4.9.3) to determine the entries in A. We see from equation
(4.9.4), that this should allow us to determine ui+ 21 and Et i+ 12 . The student is
encouraged to continue with this derivation to see where the difficulties arise. We
will follow standard procedure here and convert the various expressions containing
Et and p into terms using Ht . Ht is the total enthalpy.
(4.9.6)
Ht = Et + p
First we eliminate p.
We have already seen in equation (4.2.10) that
(4.9.7)
u2
p = ( 1) Et
2
195
(4.9.9)
u2
Ht
2
(4.9.10)
0
(u)
(u2 + p) = 21 ( 3)u2
{(H u}
Cu
t
1
2
(3 )u
C
subscript in A. We
( 1) (u)
u (Et )
2
where C = 1
2 u Ht
The second equation still has p on the left hand side and the last equation
still has a Et on the right hand side. We will substitute for them from equations
(4.9.8) and (4.9.9) as we multiply the matrix equation out to get the individual
equations. The first equation gives us nothing; just (u) = (u). The second
equation gives us
1 2
3 2
Ht
u + (3 )u(u) + ( 1)
+
u =
(4.9.11)
2
2
+1 2 1
u +
Ht
2
(4.9.12)
1
+1
( 1)
(u2 ) =
( 3)u2 + (3 )u(u) +
(u2 )
2
2
2
(4.9.13)
where
(4.9.14)
a =
(4.9.15)
b = 2(u)
c = (u2 )
(4.9.16)
ui+ 12 =
(u)
(u)2 (u2 )
We cannot simplify this any further without substituting for the terms on the right
hand side as
(4.9.18)
(4.9.19)
(4.9.20)
= i+1 i = R L
196
As in section 3.13, we have use the subscripts R and L to indicate states to the right
and left of the interface at xi+ 12 . When we substitute and simplify terms under the
square root we get
R uR L uL R L (uR uL )
(4.9.21)
ui+ 21 =
R L
By combining the two terms containing R uR and L uL and taking the negative
sign from the , we get
R uR ( R L ) + L uL ( R L )
(4.9.22)
ui+ 12 =
2
2
R
L
So,
(4.9.23)
ui+ 12 =
R uR + L uL
R + L
R
1
(4.9.24)
ui+ 2 = uR + (1 )uL , =
R + L
gives us confidence. Why did we not take the other root? You can followup on that
and figure it out. Now, repeat this derivation for Ht i+ 21 to get
R
(4.9.25)
Ht i+ 12 = Ht R + (1 )Ht L , =
R + L
Assignment 4.12
(1) Take the positive sign in equation (4.9.21) and see what is the consequence.
(2) Use the Roe average in your finite volume solver.
(4.10.1)
197
as the source term. In this case the source term is a consequence of writing the
quasi-one-dimensional equations in the conservative form and has components
dA
(4.10.2)
H = p
dx
CHAPTER 5
199
H
h
H
h
200
y2
x1
x2
y1
Figure 5.3. A grid point with neighbouring points placed at uneven distances.
H
h
= (x, y),
= (x, y),
x(, ),
(5.1.4)
y(, ).
We are in a position where, given the solution at a point in one coordinate system,
we can provide the solution at the corresponding point in the other coordinate
system. Let us step back for a minute to see where we are.
201
(5.1.6)
+
x x
| {z } | {z }
+
y
y
Since, the transformation is known, we can determine the partial derivative on the
right hand side of equation (5.1.5). How do we use the expression given by equation
(5.1.5)? We can take the corresponding partial derivative of . On doing this, we
get
(5.1.7)
(5.1.8)
+
x
x
+
y
y
=
=
So far it looks manageable. Since we want to solve Laplaces equation, we now look
at the second derivatives. The second xderivative is
(5.1.9)
=
2
x
x
+
x
x
2 2
2
2
+
=
+
2
2
x
x
x x
| {z } |
{z
}
A1
A2
2 2
2
+
+
+
x2 x x
x
2
| {z } |
{z
}
2
B1
B2
This is a little messy. To make sure we understand this clearly, the term A in
equation (5.1.5) results in the terms identified as A1 and A2 in equation (5.1.9).
The same is true of the terms marked B in the two equations. A1 and A2 are a
consequence of applying product rule. The two terms in A2 emerge from applying
equation (5.1.5) to obtain the derivative of the / term with respect to x. In a
similar fashion we can write the second derivative with respect y as
(5.1.10)
2
2
2
+
2
y y
2 2
2
2
+ 2
+
+
y
y y
y
2
2
2
=
+
2
y
y 2
202
+ (xx + yy )
+ (xx + yy )
=0
To keep things more compact, we decide to use the notation that the subscript
indicates differentiation with respect to that parameter. So,
(5.1.12)
x =
x
Using this notation uniformly, the Laplace equation in the plane is given by
x2 + y2 + 2 (x x + y y ) + x2 + y2
(5.1.13)
+ (xx + yy ) + (xx + yy ) = 0
(5.1.11)
The domain for the problem has become easier, the equation does not quite fit in
one line! Also, it is not in quite the right form. The coefficients are still expressed
in the x, y coordinate system. We make the following observations and see if we
can clear the air a bit.
We want to solve problems that involve complicated domains. There may
be many methods to handle complicated problems, performing transformation of coordinates is definitely one way to do it.
We do not want to have to re-derive our governing equation in every new
coordinate system that we encounter. We need a general frame work in
which we can derive our equations.
The introduction of the subscript notation gave some relief in handling
the equation. So, the proper choice of notation is going to make life easier
for us. Further, we can do work on more difficult / complex problems
with the effort that we are currently expending.
We observe that the only difference between equation (5.1.9) and (5.1.10)
is the replacement of x with y. Again, we need the notation that will
help us to abstract these kinds of patterns out, so that we do not have to
repeat the derivation for each coordinate.
We want to solve problems in three dimensions and not just one and two
dimensions. If we are going to perform transformations in three dimensions, we need to have some minimal understanding of geometry in three
dimensions.
We will address the last point here by looking at a little differential geometry.
Coordinate lines in three dimensions are curves in three dimensions and we will try
to get a handle on them. A region of interest in three dimensions will be a volume
and it is defined using surfaces. We will take a brief look at surfaces. Tensor
calculus is a tool to address the rest of the issues raised in our list of observations.
We will do a little tensor calculus and some geometry.
As further motivation as to why one needs tensor calculus, consider the following conundrum. If you have learnt only calculus, the sequence of courses typically
taught in an undergraduate curriculum, this is for you to puzzle over; to show you
203
there must be life beyond calculus. Consider a potential flow in two dimensions.
The velocity can be written in component form as (u, v) in Cartesian coordinates.
If we were to transform the velocity to some other coordinates (, ) we get
dx
u=
(5.1.14)
= x t = x t + x t = x U + x V
dt
dy
(5.1.15)
= y t = y t + y t = y U + y V
v=
dt
Where (U, V ) are the velocities in the coordinates. The matrix representation
of this transformation equation is
(5.1.16)
x
u
=
y
v
x
y
U
V
(5.1.17)
= x = x + x = x U + x V
u=
x
v=
(5.1.18)
= y = y + y = y U + y V
y
Which has a representation
(5.1.19)
u
= x
y
v
x
y
U
V
They contradict each other and
are wrong
Why are these equations, (5.1.16) and (5.1.19), different? How can the u and v
transform in two different ways? One immediate conclusion that the equations are
wrong. We should be able to figure out what is wrong with these equations since
there are basically three terms involved. The left hand side of these two equations
are clearly fine since they are the quantities with which we start and are a given.
The chain rule part follows from calculus. That procedure looked right. That leaves
the U and V and of course, the = symbol. We want the equation relating velocities
in the two coordinate systems. That means there is a problem with the assumption
that the U and V in equation (5.1.16) are the same as the U and V in equation
(5.1.19). So, there may be two different kinds of U and V . To clear up these issues
study tensor calculus.[You93][Ari89][SS82]! We will do a very quick overview
here.
5.2. Tensor Calculus
~ can be written in terms of a global basis
Very often, we assume that a vector V
vectors e1 , e2 , e3 as follows
(5.2.1)
~ = v 1 e1 + v 2 e2 + v 3 e3 =
V
3
X
v i ei
i=0
We will see what we mean by a global basis as we go along. For now, do not confuse
the superscript on v with exponentiation. We deliberately chose superscripts and
204
subscripts since we anticipate that we are going to encounter two different kinds
of entities. We will see that superscripted entities are said to be contravariant and
subscripted entities are covariant. So, v 1 may be different from v1 . We will see
what this means as we go along. If we agree that any time the index is repeated it
implies a summation, we can simply write
~ = v i ei
(5.2.2)
V
Now, THAT is compact. It is called Einsteins summation convention. It only
gets better. By itself, the equation does not even restrict us to three dimensions.
It is our assumption that we use three dimensions. In this book, we will restrict
ourselves to two / three dimensions. You should note that
~ = v i ei = v k ek
V
(5.2.3)
Since there is a summation over the index, the index itself does not survive the
summation operation. The choice of the index is left to us. It is called a dummy
index.
3
Q
~x(Q)
P
~x(P )
P~
1
Figure 5.5. A Cartesian coordinate system used to locate the
point P and Q. ~x(P ) gives the position vector of P in the Cartesian coordinate system. That is, P~ = ~x(P ). P Q forms a differential element.
We now define the notation with respect to coordinate systems. Consider
Figure 5.5. It indicates a differential line element with points P and Q at each
end of the element. We define ~x(.) as a coordinate function which returns the
coordinates of a point in the Cartesian coordinate system. That is, for a point P ,
~x(P ) gives us the corresponding coordinates. If we had another coordinate system
overlayed on the same region, the point P will have the corresponding coordinates
205
~x(P ) = x1 (P )
e1 + x2 (P )
e2 + x3 (P )
e3
Since we are dealing with Cartesian coordinates, xi and xi are the same. we have
already seen that if P~ is the position vector for P then
(5.2.5)
xi (P ) = P~ ei
~1
cos
=
sin
~2
sin
cos
~e1
~e2
~i = Aji ~ej
(5.2.8)
~s = si~ei = i~i
where i and j are dummy indices. Even though they are dummy indices, by the
proper choice of these dummy indices here we can conclude that
(5.2.10)
sj = i Aji = Aji i
Compare equations (5.2.7) and (5.2.10). The unit vectors transform one way,
the components transform the opposite [ or contra ] way. We see that they too show
the same behaviour we saw with the velocity potential. Vectors that transform like
each other are covariant with each other. Vectors that transform the opposite way
are contravariant to each other. This is too broad a scenario for us. We will stick
with something simpler. Covariant entities will be subscripted. Contravariant
entities will be superscripted.
An example where this will be obvious to you is the case of the rotation of
the Cartesian coordinate system. Again, we restrict ourselves to two dimensions.
If you rotate the standard Cartesian coordinate system counter-clockwise, you see
that the coordinate lines and the unit vectors ( as expected ) rotate in the same
direction. They are covariant. The actual coordinate values do not change in the
same fashion. In fact, the new values corresponding to a position vector look as
though the coordinate system was fixed and that the position vector was rotated in
a clockwise sense ( contra or opposite to the original coordinate rotation ). These
two rotations are in fact of equal magnitude and opposite in sense. They are, indeed,
inverses of each other. We will investigate covariant and contravariant quantities
more as we go along. Right now, we have assumed that we have a position vector.
Let us take a closer look at this.
206
2
1
x2
1
x2
1
e1
Figure 5.6. The basis vectors rotate with the coordinate axes
(only e1 and ~1 are shown). The coordinates of the point in the
new system are as though the point had moved clockwise and the
coordinate system was fixed. That is x2 < x2 in this particular
case.
We have made one assumption so far that the basis vector is global. We used
the term global basis in the beginning of this section. What do we mean by a
global basis? We want the basis to be the same, that is constant, at every point.
Such a set of basis vectors is also said to be homogeneous. For example, the basis
vectors in the standard Cartesian coordinate system do not depend on the (x, y)
coordinates of a point. Consider the trapezium in Figure (5.7) We see that the
tangent to the = constant coordinate lines change with . In general, the basis
vectors change from point to point. We do not have a global basis. Also, consider
the standard polar coordinate system (see Figure 5.8). The usual symbols for the
basis vectors are e and er . Both of these vectors depend on . Again, for the
familiar and useful polar coordinate system, we do not have a global basis. That is
the basis vectors are not constant. They are not homogeneous. In fact, in the case
of polar coordinates we have as the position vector at any point P~ = r
er . Does
the position vector not depend of at all? The fact of the matter is that the er
depends on , as the basis is not homogeneous. Fortunately, er and e depend only
on . So, we are still able to write P~ = r
er .
Another example of a coordinate system with which you are familiar and is
used for doing log-log plots is shown in figure 5.9. In this case, the basis vectors
seem to be oriented in the same fashion. However, the length of the vector seems
to change. It is clear that the notion of distance between points is an issue here.
Looking at these examples, we realise that we need to spend a little time trying
to understand generalised coordinate systems. Lets just consider one coordinate
line in some generalised coordinate system. We will see that in three dimensions,
it is a space curve. Let us first look at space curves and some of their properties.
We are especially interested in the tangent to these curves since the tangent to the
coordinate line is a part of our basis.
207
P~
Figure 5.7. The basis vectors at the origin and the basis vectors
at some other point are clearly not the same. The position vector,
P~ is geometrically very easy to draw. It cannot be written simply as a linear combination of some basis vector in the coordinate
system shown here.
Consider the coordinate line shown in Figure 5.10. The curve is determined by
a function
~ ( 1 ). It is a coordinate line in a coordinate system labelled ~ which has
three components ( 1 , 2 , 3 ). The figure shows the coordinate line corresponding
to 2 =constant, and 3 =constant. To belabour the point, it is the 1 coordinate
line since it is parametrised on 1 . The tangent to this line is given by
(5.2.11)
~1 =
d~
( 1 )
d 1
In fact, for any of the coordinate lines of i we have for the corresponding
~i
d~
i
, no summation on i
d i
This basis vector is called the covariant basis vector. We note the following
In equation (5.2.12), though the subscripts are repeated, there is no summation implied over i. The subscript on the
~ is there to conveniently
indicate three generic coordinate functions.
Some new tensor notation convention: ( see equation (5.2.12) ) the superscript of the derivative on the on the right hand side becomes a subscript
on the left.
We consider an example to understand this process better. Let us take a look at
polar coordinates in two dimensions ( 1 , 2 ). A 2 = coordinate line corresponds
to a 1 = r =constant line. For the given r, the curve is parametrised as
(5.2.12)
(5.2.13)
~i =
~ () =
~ ( 2 ) = r cos() + r sin()
= 1 cos( 2 )
e1 + 1 sin( 2 )
e2
As was seen earlier, e1 and e2 are the standard basis vectors in the Cartesian
coordinate system. You may be used to calling them and . The tangent to this
208
e
er
~P =
re
r
~2 = r sin()
e1 + r cos()
e2 = 1 sin( 2 )
e1 + 1 cos( 2 )
e2
~ (r) =
~ ( 1 ) = 1 cos( 2 )
e1 + 1 sin( 2 )
e2 ,
2 = constant
For constant 2 = , this will correspond to a radial line. The tangent vector to
this line is given by
~
e1 + sin( 2 )
e2
(5.2.16)
~1 = 1 = cos( 2 )
209
P~
x
Figure 5.9. A log-log coordinate system. We have a rectangular coordinate system, however the unit vectors are still a function
of position making it impossible to write the position vector drawn
P~ as a linear combination of the basis vectors.
~ ( 1 )
~ ( 1)
210
system
2
x3
3
d~ = ~(Q)
~
d~x = X(Q)
Q
P
x2
1
x1
X system
Figure 5.11. Figure 5.5 is redrawn and zoomed. The origin of
our Cartesian coordinate system is translated to the point P . The
differential element P Q is now represented in terms of the translated coordinate system and a similar system of the generalised
coordinates.
Lets pause and take stock of what we have and where we are. We have seen
that there are coordinate systems where the basis vectors are not homogeneous. So,
~ = v i ei , for a position vector V
~ may
just writing a relation like equation (5.2.2), V
not be possible. We will start dealing only with differentials. A differential element
P Q is shown in the Figure 5.11. It is represented in the X coordinate system
as d~x = dxi ei . The ei are the basis vectors in this coordinate system. We can
transform from the X coordinates to the coordinates where the basis vectors are
~i . The differential P Q can be written in the coordinates system as d~ = d i ~i .
How are the two representations for the given differential element at a given point
related? Clearly, the length of the element should not depend on our choice of
the coordinate system. Or, put another way, if two people choose two different
coordinate systems, the length of this particular element should work out to be the
same. As we had done earlier, here are the two equations that relate the Cartesian
211
xi = xi ( 1 , 2 , 3 )
and
(5.2.18)
i = i (x1 , x2 , x3 )
Remember that ei are the basis vectors of a Cartesian coordinate system and are
orthogonal to each other. Consequently, we can define a useful entity called the
Kronecker delta as
1, i = j
(5.2.20)
ij = ei ej =
0, i 6= j
With this new notation we can write
(5.2.21)
(dxi )2
Following the convention we have used so far (without actually mentioning it) we
see that
dxj ij = dxi
(5.2.22)
That is, j is a dummy index and disappears leaving i which is a subscript. For the
first time we have seen a contravariant quantity converted to a covariant quantity.
If you think of matrix algebra for a minute, you will see that ij is like an identity
matrix. The components dxi are the same as the components dxi in a Cartesian
coordinate system. Hence, equation (5.2.21) can be written as
X
(5.2.23)
(ds)2 = dxi dxi =
(dxi )2
i
The length of the element is invariant with transformation meaning the choice of
our coordinates should not change the length of the element. A change to the
coordinates should give us the same length for the differential element P Q. The
length in the coordinates is given by
(5.2.24)
(ds)2 = d~ d~ = d i ~i d j ~j = d i d j ~i ~j = d i d j gij = d i di
gij is called the metric. Following equation (5.2.22), we have defined di = gij d j .
Why did we get gij instead of ij ? We have seen in the case of the trapezium that
the basis vectors need not be orthogonal to each other since the coordinate lines
are not orthogonal to each other. So, the dot product of the basis vectors ~i and
~j gives us a gij with non-zero off-diagonal terms. It is still symmetric, though. In
this case, unlike the Cartesian situation, d i is different from di .
We can define another set of basis vectors which are orthogonal to the covariant
set as follows
~i ~ j = ij
(5.2.25)
where,
(5.2.26)
ij
1,
0,
i=j
i 6= j
212
~ 3
~3
~2
~1
Figure 5.12. The covariant basis vectors ~1 , ~2 , and ~3 are shown.
In general they may not be orthogonal to each other. ~ 3 is also
shown. It is orthogonal to ~1 and ~2 and ~3 ~ 3 = 1
This new basis, ~ i , is called the contravariant basis or a dual basis. This is
demonstrated graphically in figure 5.12. This basis can be used to define a metric
g ij = ~ i ~ j
(5.2.27)
Now, is the definition given for di consistent with this definition of the contravariant basis? Is d~ = di ~ i ? That is, if we take the dot product of a vector with a
basis vector, do we get the corresponding component? We have,
(5.2.28)
d~ = di ~ i d~ ~ j = di ~| i{z
~ }j = d j ,
g ij
and
(5.2.29)
and
(5.2.30)
and finally,
(5.2.31)
d~ = di ~ i
d~ ~j = di ~ i ~j = dj ,
d~ = d i ~i
d~ ~j = d i ~i ~j = dj ,
d~ = d i ~i
d~ ~ j = d i ~i ~ j = d j ,
So, to get the contravariant components of a tensor, dot it with the contravariant
basis vectors. Likewise, to get the covariant components of a tensor, dot it with
the covariant basis vectors. The effect of gij on a contravariant term is to lower the
index or convert it to a covariant term. Similarly, the effect of g ij on a covariant
term is to raise the index or convert it to a contravariant term. So, what is gij g jk ?
(5.2.32)
gij g jk = gik = ~i ~ k = ik
213
~1
~2
~ 2
214
Assignment 5.1
(1) Expand the following using the summation convention assuming that we
are working in three dimensions.
(a) ai b j ij , (b) jj , (c) ij ji , (d) ii jj
(2) Repeat the above problem assuming we are dealing with tensors in two
space dimensions.
(3) Find the covariant and contravariant bases vectors and the corresponding
metric tensors for the following coordinate systems. xi are the Cartesian
coordinates.
(a) Cylindrical coordinates. 1 = r, 2 = , and 3 = z in conventional
notation.
x1 = 1 cos 2 , x2 = 1 sin 2 , and x3 = 3 .
(b) Spherical coordinates. 1 = R, 2 = , and 3 = in conventional
notation.
x1 = 1 sin 2 cos 3 , x2 = 1 sin 2 sin 3 , and x3 = 1 cos 2
(c) Parabolic
coordinates.
n cylindrical
o 2
1
1
1 2
2 2
x =2
, x = 1 2 , and x3 = 3 .
(4) Compute the covariant and contravariant velocity components in the
above coordinate systems.
You have seen in multivariate calculus that given a smooth function , in a
region of interest, we can find the differential d as
(5.2.33)
d =
i
d
i
Now, we also know that this is a directional derivative and can be written as
(5.2.34)
d = d~ = i d i
where,
(5.2.35)
= ~ j
,
j
d~ = ~i d i
We managed to define the gradient operator . What happens when we take the
gradient of a vector? How about the divergence? We first write the gradients of a
scalar function and a vector function as
= ~ j j
(5.2.36)
~
~ = ~ j V
(5.2.37)
V
j
If we look carefully at the two equation above, we see that equation (5.2.37) is
different. It involves, due to the use of product rule, the derivatives of the basis
vectors. In fact, equation (5.2.37) can written as
i
~
v
i
j
i ~
j V
~
= ~
~i + v
(5.2.38)
V = ~
j
j
j
215
So, what is the nature of the derivative of the basis vector? For one thing, from
the definition of the covariant basis in equation (5.2.12) we have
(5.2.39)
~i
2
~
~j
=
=
j
j
i
k
The set of ij are called a Christoffel symbols of the second kind. We took
the dot product with ~ k so that equation (5.2.38) can be rewritten as
i
~ = ~ j v ~i + v i k ~k
(5.2.41)
V
ij
j
Since i and k are dummy indices (meaning we are going to sum over their values)
we swap them for a more convenient expression
i
v
i
j
k
~
(5.2.42)
V = ~
~i + v
~
kj i
j
This allows us to write
(5.2.43)
~
V
=
j
v i
i
k
~i
+
v
kj
j
i
v;j
=
v i
i
k
+
v
kj
j
This is called the covariant derivative of the contravariant vector v i . Staying with
our compact notation, the covariant derivative is indicated by the semi-colon in the
subscript. This is so that we do not confuse it with the plain derivative v i / j .
So, if we have Christoffel symbols of the second kind do we have any other
kind? Yes, there are Christoffel symbols of the first kind. They are written in a
compact form as [ij, k] and it are given by
~i
~i
l
(5.2.45)
[ij, k] =
glk = j ~ l glk = j ~k
ij
2 i
j
k
This can be verified by substituting for the definition of the metric tensor. The
peculiar notation with brackets and braces is used for the Christoffel symbols (and
they are called symbols) because, it turns out that they are not tensors. That is,
though they have indices, they do not transform the way tensors do when going from
one coordinate system to another. We are not going to show this here. However,
we should not be surprised that they are not tensors as the Christoffel symbols
216
encapsulate the relationship of the two coordinate systems and would necessarily
depend on the coordinates.
~ is defined as the trace of the gradient of V
~ . That is
The divergence of V
i
~ = ~ j v ~i + v k i ~i
(5.2.47)
divV
kj
j
Assignment 5.2
For the coordinate systems given in assignment 5.1,
(1) Find the Christoffel symbols of the first and second kind.
(2) Find the expression for the gradient of a scalar potential.
(3) Find the gradient of the velocity vector.
(4) Find the divergence of the velocity vector.
(5) Find the divergence of the gradient of the scalar potential that you just
found.
~ = we get,
In the case of the velocity potential V
~ = ~ j = ~k g kj = ~k g kj vj = v k ~k
V
j
j
If we now take the divergence of this vector using equation (5.2.47) we get
i
v
i
2
j
k
(5.2.49) = ~
~i + v
~
kj i
j
i
j
il
k
~
= ~
g
~i + v
kj i
j
l
Completing the dot product we get
i
2
il
k
(5.2.50)
=
g
+v
ki
i
l
(5.2.48)
i
il
kl
2
g
+g
(5.2.51)
=
i
l
l
ki
This much tensor calculus will suffice. A more in depth study can be made
using the numerous books that are available on the topic [You93], [SS82].
5.3. Equations of Fluid Motion
We have seen enough tensor calculus so that if we derive the governing equations
in some generic coordinate system, we can always transform the resulting equations
into any other coordinate system. In fact, as far as possible, we will derive the
equations in vector form so that we can pick the component form that we feel is
appropriate for us. We can conveniently use the Cartesian coordinate system for
the derivation with out loss of generality.
We will first derive the equations of motion in integral form. We will do this in
a general setting. Let us consider some fluid property Q, whose property density
is given by Q. In terms of thermodynamics, Q would be an extensive property and
217
V~
3
dS
d
~x
2
1
Figure 5.14. An arbitrary control volume chosen in some fluid
flow. An elemental area on the controls surface dS and and elemental volume d within the control volume are also shown. Note
that in most coordinate systems we may not be able to indicate a
position vector ~x.
We would like to write out the balance laws for a general property, Q. We
arbitrarily pick a control volume. One such volume is indicated in the Figure
5.14. For the sake of simplicity, we pick a control volume that does not change in
time. This control volume occupies a region of volume . This control volume has
a surface area S. It is located as shown in the figure and is immersed in a flow field.
Within this control volume, at an arbitrary point ~x, we pick a small elemental
region with volume d. From equation (5.3.1), the amount of the property of
interest at time t, dQ(~x, t), in the elemental control volume is Q(~x, t)d. Then the
total quantity contained in our control volume at any instant is
218
(5.3.2)
Q (t) =
Q(~x, t)d
(5.3.3)
d
dQ
=
dt
dt
Q(~x, t)d
Then we ask ourselves the question, why is there a rate of change? There is change
because the property Q is carried / transported in and out of the control volume by
the fluid. It is also possible, based on the nature of Q, that it is somehow created
or destroyed in the control volume. There may be many mechanisms by which Q
can be changed into some other property. Let us now look at the transport of Q
by the flow.
At any arbitrary point on the surface of the control volume that we have shown
in Figure 5.14, we can determine the unit surface normal vector. We can pick a
small elemental area dS at that point. The surface normal is perpendicular to this
element. By convention, we choose to pick a surface normal that points out of the
control volume. The rate at which our property Q flows out through this elemental
~ n
area is given by QV
dS. The total efflux (outflow) from the control volume is
Z
~ n
(5.3.4)
QV
dS
S
Since this is a net efflux, it would cause a decrease in the amount of Q contained
in the control volume. So, our balance law can be written as
Z
Z
d
~ n
Qd = QV
dS + any other mechanism to produce Q
(5.3.5)
dt
S
Before going on we will make the following observation. Though the control volume
can be picked arbitrarily, we will make sure that it is smooth enough to have surface
normals almost everywhere. Almost everywhere? If you think of a cube, we cannot
define surface normals at the edges and corners. We can break up the surface
integral in equation (5.3.5) into the sum of six integrals, one for each face of the
cube.
5.3.1. Conservation of Mass. Let us look at an example. If the property
we were considering was mass, Q (t) would be the mass of fluid in our control
volume at any given time. The corresponding Q would be mass density which we
routinely refer to as the density, . Ignoring mechanisms to create and destroy or
otherwise modify mass, we see that the production terms disappear, leaving only
the first term on the right hand side of equation (5.3.5). This gives us the equation
for balance of mass as
Z
Z
d
~ n
dS
d = V
(5.3.6)
dt
S
This equation is also called the conservation of mass equation.
219
Here, f~(~x) is the body force per unit volume at the point ~x within the control
volume. T~ (~x) is the traction force per unit area (often called traction force or
just traction) acting at some point ~x on the control surface. If we are willing or
able to ignore the body force, we are left with the traction force to be handled.
From fluid mechanics, you would have seen that we can associate at a point, a
linear transformation called the stress tensor, which relates the normal to a surface
element to the traction force on that element. That is
(5.3.8)
T~ = n
where, T~ = Ti ~ i , = ij ~ i ~ j , and n
= nk ~ k . This gives us the Cauchy equation
in component form as
Ti = ij nj
(5.3.9)
d
dt
~ d =
V
Z n
S
~V
~
V
n
dS
220
S
Z
~q n
dS
S
Here, ~q is the term quantifying heat. Again, if we are in a position to ignore body
forces we get
Z
Z
Z
Z
d
~ n
~ n
Et d = Et V
dS +
V
dS
~q n
dS
(5.3.14)
dt
S
S
S
~ ,
Q = V
Et
~
V
~ =
~V
~
F
V
~ V
~ + ~q
(Et )V
~ is the rate at which the traction force does work on the control volume.
where, V
This, gives us a consolidated statement for the balance (conservation) of mass,
linear momentum, and energy. The great thing about this equation is that it can
be cast in any three dimensional coordinate system to get the component form. It
is written in a coordinate free fashion. Though, it is good to admire, we finally
need to solve a specific problem, so we pick a coordinate system convenient for
the solution of our problem and express these equations in that coordinate system.
There is another problem. As things stand, there is some element of ambiguity in
~)n
the dot products of the form ( V
. These ambiguities are best resolved by
writing the expression in terms of components.
~ = Ti ~ i ~l V l = ij ~ i (~ j ~ k )nk ~l V l = ij nj V i
(5.3.17)
T~ V
The differential form of equation (5.3.15) can be obtained by applying the theorem
of Gauss to the right hand side of the equation and converting the surface integral
to a volume integral.
Z
Q
~
+ divF d = 0
(5.3.18)
t
221
required. If you have other properties that need to be tracked, the corresponding
equations can be incorporated. However, for our purpose, these equations are quite
general. We will start with a little specialisation and simplification.
We now decompose the stress tensor into a spherical part and a deviatoric
part. The spherical part we will assume is the same as the pressure we have in the
equation of state. The deviatoric part (or the deviation from the sphere) will show
up due to viscous effects. So, can be written as
= p1 +
(5.3.20)
1 is the unit tensor and is the deviatoric part. Do not confuse a tensor with
the control volume . Through thermodynamics, we have an equation of state /
constitutive model for p. Typically, we use something like p = RT , where T is the
temperature in Kelvin and R is the gas constant. We need to get a similar equation
of state / constitutive model for . Assuming the fluid is a Navier-Stokes fluid,
that is the fluid is Newtonian, isotropic and Stokes hypothesis holds we get
2
(5.3.21)
= trD + 2D, where
3
1
D =
(5.3.22)
(L + LT ), and
2
~
(5.3.23)
L = V
where is the coefficient of viscosity and trD is the trace of D, which is the sum
of the diagonals of the matrix representation of the tensor. LT is the transpose
~ and is called the
of L. Really, D is the symmetric part of the the gradient of V
deformation rate. Equation (5.3.19) with written in this fashion is called the
Navier-Stokes equation. Since, we are right now looking at inviscid flow, we can
ignore the viscous terms. So, for the Eulers equation we have
T~ = p1 n
(5.3.24)
where, 1 is the unit tensor. The Eulers momentum conservation equation can be
written as
Z
Z
Z
d
~ d = V
~V
~ n
(5.3.25)
V
dS
p1 n
dS
dt
S
S
d
dt
~ d =
V
Z n
S
o
~V
~ + p1 n
V
dS
where we have
(5.3.28)
~ ,
Q = V
Et
~
V
~ = V
~V
~ + p1
F
~
(Et + p)V
giving us a consolidated statement for the conservation (or balance) of mass, linear
momentum, and energy. These equations are collectively referred to as the Eulers
222
p = RT
and
(5.3.30)
(5.3.31)
Et = e +
~ V
~
V
2
e = Cv T
With these equations included, we have a closed set of equations that we should
be able to solve. The equations are in integral form. We can employ the theorem
of Gauss on the surface integral in equation (5.3.27) and convert it to a volume
integral like so
Z
Z
Z
d
~
~ n
dS = divFd
Qd = F
(5.3.32)
dt
which is valid for all possible control volumes on which we have surface normals
and can perform the necessary integration. Remember, this particular was
chosen arbitrarily. We conclude that the integral can be zero for any only if the
integrand is zero. The differential form of the Eulers equation can be written as
Q
~ =0
+ divF
t
~ in Cartesian coordinates as
If we use normal convention to write F
(5.3.34)
(5.3.35)
~ = E + F + Gk
F
Q E
F
G
+
+
+
=0
t
x
y
z
Clearly, given any other basis vector, metrics, Christoffel symbols, we can write the
governing equations in the corresponding coordinate system.
Assignment 5.3
(1) Given the concentration of ink at any point in a flow field is given by ci ,
derive the conservation equation in integral form for ink. The diffusivity
of ink is Di .
(2) From the integral from in the first problem, derive the differential form
(3) Specialise the equation for a two-dimensional problem.
(4) Derive the equation in polar coordinates.
223
(5.3.37)
p
T
, and p = , along with p = RT gives T =
,
r
pr
Tr
where,
(5.3.40)
Tr =
pr
,
r R
(5.3.41)
Consider the one-dimensional energy equation from gas dynamics. This relation
tells us that
V2
2
If we divide this equation through by Tr and nondimensionalise speed with a
reference speed ur we get
(5.3.42)
Cp To = Cp T +
Cp To = CP T +
(5.3.43)
V 2 u2r
2Tr
ur =
RTr
V 2
To =
T +
1
1
2
224
Now, consider the first equation, conservation of mass, from equations (5.3.37).
This becomes
r
r ur u
r ur v
(5.3.46)
+
+
=0
t
L x
L y
where is some characteristic time scale to be defined here. Dividing through by
r ur and multiplying through by L, we get
u
v
L
+
+
=0
ur t
x
y
(5.3.47)
Clearly, if we define the time scale = L/ur we get back our original equation.
I will leave it as an exercise in calculus for the student to show that given the
following summary
(5.3.48)
(5.3.49)
(5.3.50)
y
L
p
and
p = ,
= ,
r
pr
p
pr
,
and
ur = RTr
Tr =
r R
x =
x
,
L
and
y =
Q
E
F
+
+
=0
t
x
y
(5.3.51)
where
(5.3.52)
v
u
u u2 + p
u v
, F = 2
,E =
Q =
v +p
v
u v
[ Et + p ]v
[ Et + p ]u
Et
Very often for the sake of convenience the stars are dropped. One has to
remember that though these basic equations have not changed form. Others have
changed form. The equation of state becomes p = T and the one-dimensional
energy equation changes form. Any other auxiliary equation that you may use has
to be non-dimensionalised using the same reference quantities.
A careful study will show you that if L, pr and r are specified then all the other
reference quantities can be derived. In fact, we typically need to fix two reference
quantities along with a length scale and the others can be determined. The other
point to note is that we typically pick reference quantities based on the problem at
hand. A review of dimensional analysis at this point would be helpful.
Assignment 5.4
(1) Non-dimensionalise the Eulers equation in the differential form for threedimensional flows.
(2) Try to non-dimensionalise the Burgers equation
(5.3.53)
u
u
2u
+u
= 2
t
x
x
225
CHAPTER 6
This is an equation for flow in three dimensions. Of course, with the appropriate
interpretation of Q, , F , and S, it could also represent the flow in two spatial
dimensions. The basic idea behind the finite volume method is pretty straight
forward. We just repeatedly apply our conservation laws given in equation (6.1.1)
to a bunch of control volumes. We do this in a systematic fashion. We fill the region
of interest with polyhedra. We ensure that there are no gaps and no overlaps. No
gaps and no overlaps means that the sum of the volumes of the polyhedra is the
volume of our problem domain. This is called a tessellation. So we would say: We
tessellate the region of interest using polyhedra.
Consider two such volumes as shown in Figure 6.1. The volumes share a face
226
227
B
A
Qd
(6.1.2)
Q=
where is the volume of the cell. (Now you understand why we needed to
introduce the term cell. Volume of the volume does not sound great and if we
were discussing a two-dimensional problem, area of the volume sounds worse.)
We decide, for the sake of this discussion, to store the volume averaged properties
at the centre of a given cell, For this reason, the scheme will be called a cell
Q
centred scheme. The flux term at the face needs to be estimated in terms of
values. In a similar fashion for the volume ABCDE, one can
these cell centred Q
compute the total flux through all of the faces ABCD, BCE, ADE, and DCE. For
this volume, we can now compute the right hand side of equation (6.1.1). Including
the left hand side, this equation is in fact
d
Q = total flux through faces ABCD, BCE, ADE, and DCE
dt
In general, for a polyhedra with four or more faces, we can write
XZ
d
(6.1.4)
Q =
Fk n
k dSk
dt
Sk
(6.1.3)
228
i
Q
C
j
Q
D
Figure 6.2. The shared face ABCD has a flux F through it which
i and Q
j
may be determined from Q
the flux term F at each cell and find the flux on the face ABCD by taking the
average of these fluxes. That is
1
k)
(6.1.5)
F ij = (F i + F j ) , F k = F (Q
2
The subscripts ij indicate that we are talking about the face shared by cells i and
ij at the face and find the
j. The other possibility is to compute an average Q
corresponding F .
Assignment 6.1
(1) Write a function Get3dFlux( Q ), which will compute the inviscid flux
Use this function to evaluate fluxes as necessary in
given the variable Q.
the next two functions.
(2) Write a function GetAvgFaceFlux( Qi, Qj ), which will compute the fluxes
and return the average given in equation (6.1.5).
(3) Write a function GetAvgQ( Qi, Qj ) which will return the averaged value
Qij .
Let us look at the second problem in the context of equation (6.1.3). Since the
integral of the sum is the sum of the integrals, we can split the integral implied by
the right hand side of equation (6.1.3) into two parts. That is
Z
Z
1X
1X
(6.1.6)
Fi n
k dSk +
Fk n
k dSk
2
2
Sk
Sk
k
k is an index that runs over all cells neighbouring the cell i. Since, F i in the first
sum does not depend on k, it can be taken out and the resulting sum is zero. Only
the neighbours contribute to the net flux in this scheme. The resulting scheme will
look like FTCS and likely to require that we add some viscous dissipation. Worse,
the current cell seems to make no contribution to the net flux. You can try using
the fluxes calculated in both fashions as indicated in problem 2 and problem 3.
However, we suspect that using average Q and then computing the fluxes may be
better.
How do we calculate the derivatives required in the Navier-Stokes equations
and in the artificial dissipation terms that we may want to add?
229
where is the standard unit vector in the y direction. Again, if we want the second
derivative we would have
Z
Z 2
u
u
d
=
n
dS
(6.1.10)
2
x
S x
(6.1.11)
x
x
The right hand side of equation (6.1.8) looks like flux term and can be evaluated
in a similar fashion.
Let us consider an example at this point. We will consider the equation of
conservation of mass alone. The integral form given in equation (5.3.6) is rewritten
here for convenience. It reads
Z
Z
d
~ n
(6.1.12)
d = V
dS
dt
S
Now consider a situation where the flow field is irrotational and we can actually
~ = . This assumption gives us
find a so that, V
Z
dS = 0
(6.1.14)
n
S
230
~
V
~V
~ + p1
(6.1.15)
F = V
~
(Et + p)V
~ , for a solid wall
This is reproduced here from equation (5.3.28). The first term V
gives
(6.1.16)
~ n
V
=0
~V
~ + p1 consists of two terms.
The first entry in F n
is zero. The second term V
~
~
Again, due the solid wall V V n
= 0, leaving p1 n
= p
n. The final entry in
F n
can be verified to be zero. This gives the flux through the face of a cell
corresponding to a solid wall as
0
n
(6.1.17)
F n
= p
0
How do we find p on the wall? We will take a local coordinate system on the cell face
that corresponds to the solid wall. We take the x-coordinate direction perpendicular
to this wall. Lets look at the x-momentum equation in the differential form. This
turns out to be
uv uw
u
+
u2 + p +
+
=0
(6.1.18)
t
x
y
z
If we take find the limit of this equation as we approach the wall, that is as x 0.
We know that u 0 as x 0. As a consequence, at the wall
(6.1.19)
p
=0
x
231
Since we are talking only about the normal, it is usual to use a coordinate n instead
of x. We write
p
(6.1.20)
=0
n
The easiest way to impose this condition is to copy the cell centre value to the face.
x
y
Fi
n
Fj
F j = F i 2(F i n
)
n
We see immediately that taking the average flux across the face will result in zero
normal flux (check this for your self). This is a vector condition. It corresponds to
prescribing three quantities, u, v, and w. We need two more conditions, after
all, we require Q in the pseudo volume. We set
T
p
= 0, and
=0
n
n
We have already derived the first condition The second boundary condition in
equation (6.1.22) is the adiabatic wall condition. We are asserting that there is
(6.1.22)
232
no Fourier heat conduction through that boundary. These conditions in turn can
be written as
Et
(6.1.23)
= 0 and,
=0
n
n
The first condition comes from taking the derivative of the equation of state with
respect to n. The second follows from the definition of Et . These conditions can
be imposed by just copying the and Et from the interior volume to the pseudo
volume.
What happens in the viscous case? The first condition on mass flux changes.
The fluid adheres to solid walls and at these walls one needs to set the velocity to
zero.
Fi
n
Fj
Figure 6.4. F i and F j are mass fluxes in the interior and pseudo
cells. n
is a unit normal pointing out of the domain. For a viscous
flow, not only is the mass flux through the surface element zero,
the velocity at the boundary is zero
This is done by just setting the mass flux term in the pseudo volume to the
opposite of that in the interior cell. This is shown in Figure 6.4 That is
(6.1.24)
F j = F i
Again, we have prescribed three conditions on the boundary. The other two conditions on the pressure and temperature remain the same. In the case of pressure,
the boundary condition is one of the consequences of the viscous boundary layer
on the wall. The condition, however, is valid only within the boundary layer. This
requires that the first cell be completely immersed in the boundary layer. In fact, it
will turn out that we need at least three cells in the boundary layer as an absolute
minimum. I would recommend five to ten cells. We have not talked about grids
yet. We see that there is a constraint on the grids for viscous flow.
We will now look at an important assignment. The first two problems use the
heat equation written in the integral form. Use this as an exercise to understand
the structure of the program. Once these practice problems are coded, you can
move onto the next two problems which involve solutions to the Eulers equations
in two and three dimensions.
Assignment 6.2
233
B
x
234
F
z
x
Face ABCD is held at 400 K
Figure 6.6. A unit cube. The governing equation is given by
equation (6.1.25). The front and the rear faces are held at 400 K
and 300 K respectively. The rest of the faces are adiabatic
solve the one-dimensional Euler equations in chapter 4. We will now see how these
techniques extend to multi-dimensional flows. Lets start with two-dimensional
problems and then extend to three dimensions.
6.2.1. Two Dimensional Euler Equations. The equations governing two
dimensional inviscid flow can be written in differential form as
Q E
F
+
+
=0
t
x
y
(6.2.1)
Where
(6.2.2)
Q=
v ,
Et
u
u2 + p
,
E=
uv
(Et + p)u
and
uv
F =
v 2 + p
(Et + p)v
As we had done earlier, we now have a choice of explicit and implicit schemes
to solve these problems. The simplest explicit scheme that we can apply is the
FTCS scheme. In the two dimensional case we use central differences along each
coordinate line. That is
q
q
q
q
Ep+1,r Ep1,r
Fp,r+1
Fp,r1
q+1
q
(6.2.3)
Qp,r = Qp,r t
+
2x
2y
We may find that we have to add some second order and fourth order dissipation
terms to stabilise the scheme. There is a huge class of multi step methods to march
235
these equations in time. Then, again there are many implicit schemes that one can
employ too.
Implicit schemes typically involve solving a large system of equations. We
can use a whole plethora of numerical algorithms to solve this system of equations
[GL83]. This makes implicit schemes computational expensive. We now endeavour
to use of an idea called approximate factorisation to simplify the numerical solution
of these equations.
The basic idea of the approximate factorisation schemes is to factor the
implicit operator into two or more factors that are easier to invert. In order to ensure
that the factorisation cost is not too large, the factors or chosen with a structure
that is predetermined and the product of the factors results in an approximation
to the original operator.
6.2.2. Alternating Direction Implicit scheme. One of the earliest factorisation schemes is the ADI scheme [Jr55], [PJ55]. This is an acronym for
Alternating Direction Implicit scheme. The implicit operator is factored as
operators along the various coordinate directions. In the two dimensional case the
linearised block implicit scheme would give the equation in the delta form as
I + t A + t B Q = tR(Q)
x
y
This can be written as the product of two factors
(6.2.4)
(6.2.5)
I + t A
I + t B Q = tR(Q)
x
y
Each of the factors is an operator in one of the coordinate direction. In a sense
we are solving two one-dimensional problems.
(6.2.6)
I + t A Q = tR(Q)
x
I + t B Q = Q
(6.2.7)
y
We solve the first equation for Q and then the second equation for Q. The
individual factors result in an equation that can be solved using central differences.
This would then give as two separate tridiagonal systems to solve. The advantage
is that we have reduced the bandwidth of the system of equations. This is a
tremendous savings in computational effort. The natural question to ask is what
was the price.
Well, we can multiply the two factors and see the approximation involved.
Please remember that the individual factors are actually operators.
I + t A + t B + t2 A B Q = tR(Q)
x
y
x y
We have this extra term
(6.2.8)
(6.2.9)
t
x
2
A
(BQ)
y
236
fp+1 fp1
1 fp+1 fp
1 fp fp1
f
=
+
x
2x
2
x
2
x
or
=
+
x
x
x
(6.2.11)
(6.2.12)
+
+
A + t
B + t A + t B Q
I + t
x
y
x
y
= tR(Q)
(6.2.13)
+
+
I + t
A + t
B
I + t A + t B Q
x
y
x
y
= tR(Q)
(6.2.14)
(6.2.15)
A + t
B Q = tR(Q)
I + t
x
y
+
+
I + t A + t B Q = Q
x
y
The first equation is a lower triangular system. The second equation corresponds to an upper triangular system. They can be solved through a sequence of
forward substitution and back substitution respectively.
To see how this actually works, we write out the left hand side of the unfactored
equation in discretised form at the grid point (i, j). To keep the equations compact
we define
(6.2.16)
i =
t
Ai ,
2x
and
i =
t
Bi
2y
(6.2.17)
237
t x AQ
t y BQ
t x AQ
t y BQ
and
Qi + i+1 Qi+1 i Qi + i+N Qi+N i Qi = Qi
(6.2.19)
These equations are used to sweep through the domain. Equation (6.2.18) is used
M 1
i1
M 2
M 3
i+N
p2
p1
q2
q1
p=MN
q=M2N
M3N
i+1
iN
3N
2N
2N+1 2N+2
N+1
N+2
QN +1 + N +1 QN +1 N QN
+ N +1 QN +1 1 Q1 = tR(QN +1 )
238
Both grid points indexed as N and 1 are on the boundary. As a consequence these,
pending figuring out applications of boundary conditions, need to be known. They
can be taken over to the right hand side of the equation leaving
(6.2.21)
[I + N +1 + N +1 ]QN +1 = tR(QN +1 ) + N QN + 1 Q1
we can solve for the Qi as the Qi1 and QiN would have already been
calculated. In this manner the Q for all the is can be found. Now that Q
is known, we can sweep equation (6.2.19). This U-sweep goes from grid index
i = M N 1 = p 1 back to i = 0.
(6.2.23)
Qi = [I i i ]
We need to address the issue of applying boundary conditions. Like we did with
the finite volume method, we could create some pseudo grid points as the neighbours
of the points on the boundary and use them to impose the boundary conditions.
The other possibility is to apply boundary conditions in the same manner as the
one dimensional case. In either case, we will have quantities that are prescribed like the back pressure p at the exit of a pipe, and we will have quantities that are
taken from the domain to the boundary, again like the pressure towards a solid wall.
In the L-sweep, conditions that are prescribed at the lower and the left boundary
are applied. Conditions that are propagated from the domain to the boundary are
imposed on the top and right boundaries. Vice-versa for the U-sweep. This should
be apparent after inspecting equations (6.2.22) and (6.2.23).
6.2.4. Three-Dimensional Euler Equations. These equations look very
similar to the two dimensional counterpart. In fact the same nondimensionalisation
can be used to get
Q E
F
G
+
+
+
=0
t
x
y
z
(6.2.24)
Where
Q=
v
w
Et
(6.2.25)
and
(6.2.26)
w
v
u
uw
uv
u2 + p
E = uv , F = v + p , and G =
vw
w2 + p
vw
uw
[Et + p]w
[Et + p]v
[Et + p]u
We can apply ADI to this system of equations written in Delta form. The
point to be noted is that we will get three factors instead of two. One for each
spatial coordinate direction. Though this is a very large saving in computational
239
effort from the original equation, we will end up solving three tridiagonal systems
of equations.
In the three dimensional case the linearised block implicit scheme would give
the equation in the delta form as
(6.2.27)
I + t A + t B + t C Q = tR(Q)
x
y
z
(6.2.28)
I + t B
I + t C Q
I + t A
x
y
z
= tR(Q)
(6.2.29)
I + t
A + t
B + t
C
x
y
z
+
+
+
+t A + t B + t C Q = tR(Q)
x
y
z
(6.2.30)
A + t
B + t
C
I + t
x
y
z
+
+
+
I + t A + t B + t C Q = tR(Q)
x
y
z
I + t
A + t
B + t
C Q = tR(Q)
x
y
z
+
+
+
I + t A + t B + t C Q = Q
x
y
z
The first equation is a lower triangular system. The second equation corresponds to an upper triangular system. They can be solved through a sequence of
forward substitution and back substitution respectively.
We need to address the issue of applying boundary conditions. At the boundary, we consider the one-dimensional equations as applicable perpendicular to the
boundary. That is, we describe our equations normal to the boundary and tangential to the boundary and ignore the tangential equations. This would be akin
to taking only the first factor from the ADI scheme at a boundary parallel to the
YZ plane. Since this is a one-dimensional problem at this point, our procedure for
240
applying boundary conditions can be used here. The only difference is the we also
need to propagate v and w.
Again attention needs to be paid to how and when the boundary conditions are
applied. In the ADI case, the boundary condition is applied during the appropriate
spatial sweep. In the case of approximate LU decomposition, the boundary conditions also need to be factored into a lower component and an upper component
so that they can be applied at the appropriate time in the sweep. That is the
lower triangular part during the L-sweep and the upper triangular part during the
U-sweep .
6.2.5. Addition of Artificial Dissipation. As in the one-dimensional Euler
equations, it is possible, as need, to add both second and fourth order terms in order
to attenuate high frequency oscillations. We have a choice of adding these terms
either to the right hand side of equation (6.2.27) or to the left hand side. If we
add it to the right hand side it appears in terms of Q which is at the current time
step. Hence the term becomes and explicit term. On the other hand if we add
it to the left hand side it would be written in the delta form and would appear in
the system of equations to be solved. Consequently, the term would be implicit in
nature.
Assignment 6.3
(1) Use the differential form of the equations and redo assignment 6.2.
241
if there is a definite problem at hand to be solved, that problem and its solution
then determine the answer to this question.
The question of where things are located was addressed systematically by Rene
Descartes. He brought the power of abstraction of algebra to bear on geometry,
resulting in analytic geometry. The realisation that one could assign numbers in
the form of coordinates to points in space was a revolution. Cartesian coordinates
are named so in his honour.
The fundamental job of a coordinate system is to help us locate points. The
fundamental job of a grid is to help us locate points, determine neighbours and
distances to neighbours. Let us, therefore, start with Cartesian coordinates and
ask the question:
6.4. Why Grid Generation?
We will use a two-dimensional problem to motivate the study of grid generation.
We solved the Laplace equation on a unit square in section 3.1. After that, we have
used a unit square and a unit cube to solve the Eulers equation in many forms in
the current chapter. We were able to use Cartesian grids for the solution of this
problem. This was possible as the domain on which the problem was defined, a
unit square at the origin, conforms to, or is amenable to a Cartesian mesh. That
is the boundary of our problem domain is made up of coordinate lines. What do
we do if the problem domain were not this convenient? What if it was changed to
one with top boundary slanting? We have encountered this question in chapter 5.
This is shown in Figure 6.8. We considered a Cartesian mesh in that chapter and
H
h
242
(1) If we want to solve problems on a square or a rectangle we can use Cartesian coordinates.
(2) If we want to solve a problem on a circle we could use polar coordinates.
Notice that by simple geometry, we mean one which is made up of the coordinate
lines of, for example, the Cartesian mesh. In fact, we want to generate a coordinate
system where the boundaries are coordinate lines. Since our focus is on the problem
domain at hand, we say such a coordinate system is boundary conforming. We
will try to perform a transformation of coordinates of some sort, to be boundary
conforming. The point now is to find such a coordinate transformation.
We observe that, as we go from the origin to the other end of the trapezium
along the x-axis, the height of the trapezium increases in a linear fashion. Actually,
what is important is that it does so in a known fashion. Let us take M grid points
in the y direction. At any x location, we divide the local height of the trapezium
into M 1 equal intervals. We see that we can obtain a mesh, that looks ordered,
and the grid points are on the boundary. We now have grids on our boundary and
H
h
Figure 6.9. Grid generated by spacing grid points proportionately in the y-coordinate direction
will be able to approximate our equations on this grid. The details of this process
will be discussed in the section on algebraic grids.
Another possibility is that we give up the structure that we get from using a
coordinate system, retain a Cartesian mesh and decompose our domain into smaller
domains whose geometry is known in the Cartesian mesh. This mesh or tessellation
as it is sometimes called, can then be used to solve our differential equation. Figure
6.10 shows a triangulation of the given domain. We now solve our problem on these
triangles.
6.5. A Brief Introduction to Geometry
We have already seen that to obtain a coordinate transform is to have knowledge
of the coordinate lines. We plan to do a quick review of the geometry of space curves.
Along the way, we will also do a review of the geometry of surfaces. Before that, let
243
H
h
244
either. So, the track forms a space curve in a coordinate system at the centre of the
earth and fixed to the earth. We can measure distance along the track to identify
a point on the track or we could use time, especially if you are on a train without
any intermediate stops. Since, a single parameter is enough to locate ourselves on
the track, it is one-dimensional in nature.
A space curve can be parametrised by one parameter, say t. We have,
(6.5.1)
where, e1 ,e2 , and e3 are the standard basis. We restrict ourselves to curves in a
two dimensions / three dimensions. So a curve in two dimensions, R2 , would be a
map from the real line into a plane. Formally,
(t) : {U R R2 }
(6.5.2)
As an example, consider the mapping (t) given by the components (cos t, sin t).
What does the graph of this map look like? If e1 and e2 are the basis along the 1
and 2 coordinate directions, then,
(6.5.3)
t
1
Assignment 6.4
(1) How would the graph of the curve given by the following equation look?
(6.5.4)
What changes?
(2) How about
(6.5.5)
For both the curves given in the two problems, find the tangent vector at any
point. What happens to the magnitude of the tangent vector? What happens to
245
the unit vector along the tangent vector? You should find that the magnitude or
the speed of the curves changes, the unit vector does not.
In general, the space curve in three dimensions may be as shown in Figure
6.12. Normally, the tangent to the curve or the velocity as it is often called in
(t)
2
t = x t e 1 + y t e 2 + zt e 3
We can get this from our tensor calculus. Given the map from the Cartesian
coordinate system to t we can define a differential length for a line segment in t as
(6.5.8)
ds2 = t t dt2
246
Since we are mapping to one dimension the metric tensor has only one component,
g11 .
Sometimes, we are lucky and we have a parametrisation s, where s is the length
along the space curve. For example, this occurs in our railroad track. We could use
the milestones or the kilometre markers as a measure along the length of the rails
to indicate position. That is, we are given a situation where
(6.5.9)
where,
(6.5.10)
and
(6.5.11)
s=
s ds,
s = 1
Any curve that is parametrised so that equation (6.5.11) is true, is said to have a
unit speed parametrisation or the curve is just called a unit speed curve.
If we were to generate a grid on our railroad line at uniform intervals along
the length, we could identify these by an index. If we put a grid point at every
railway station on the route, we could number them starting with zero for Chennai.
We know then that the fifth railway station comes on the railroad after the fourth
and before the sixth. Given a railway station i, we know it is preceded by i 1
(excepting Chennai) and succeeded by i + 1 (except at Delhi). Thanks to this
underlying structure, such a grid is called a structured grid.
On the other hand, someone can decide to sort the railway station according
to alphabetical order and number in that order. In that case, we could not be
sure that of the neighbour of i without looking at the sorted list. Since this does
not have an immediate structure from which the predecessor and successor can be
found, this is called an unstructured grid.
It may seem that some rearrangement will allow a structured grid to be converted to an unstructured grid. This is so in one-dimensional problems. However,
if one looks at Figure 6.10, it is clear that this is not possible in this case. In the
case of an unstructured grid, we are forced to retain information on neighbours.
Assignment 6.5
(1) Find the curvature and torsion of the curves given in assignment 6.4.
(2) Find the tangent t, normal n, and binormal b for the curve given by
(a cos t, b sin t, ct). Specialise the problem to a = b. With
a = b, what are
t, n, and b if we change the parametrisation to t = s/ a2 + c2 ?
6.5.1. Properties of Space Curves. We will now look a little more closely
at properties of space curves. We call the tangent vector to a curve with unit speed
parametrisation as t. Now, the question is what is the derivative of t? We will
answer this in two parts. Clearly, as the curve is a unit speed curve, there is, by
definition, no change in the magnitude or the speed. The only other property of a
247
vector that can change is its direction. First, for any unit vector
d
(6.5.12)
tt=1
{t t} = ts t + t ts = 2ts t = 0
ds
We will assume that ts is not the zero vector. This means that the derivative is
orthogonal to the original vector. For the unit speed curve, ts is also the rate at
which the curve moves away from the tangent and hence represents the curvature
of the curve. We can see this from the example shown in Figure 6.13. For the
A
B
C
A
C
n
Figure 6.13. A planar curve is shown with tangents at three different points. These tangent vectors are shown located at a single
point. The arrow heads lie on a unit circle since the tangent vectors
in this case are unit vectors
sake of simplicity, a planar curve is chosen. As was pointed out in the assignment,
the direction of the tangent to a curve at a point is a property of that curve at
that point and does not really depend on the parametrisation. The tangents are
shown in this figure as position vectors of points on a unit circle. Since the curve is
smooth, the tangent varies in a smooth fashion. It is clear then that the derivative
of the tangent vector tA at the point A is perpendicular to tA .
So, if ts is along a direction called n, then
(6.5.13)
ts = n
where, is the curvature at that point. We get back to our space curve now.
We have the unit vector t which is the tangent vector. We also have the vector n
which is normal to the tangent vector and we know that ts is proportional to n.
We have two orthogonal vectors in three dimensions at a given point on the curve.
We could define a third vector b, orthogonal to t and n, to complete the triad.
(6.5.14)
b=tn
It would be interesting, and useful, to see how this triad evolves along the curve.
We already have ts we can obtain equation for the other two rates. Since b = t n,
we can take a derivative to find
248
bs = ts n + t ns = t ns = n
(6.5.15)
By convention the sign on the right hand side of equation (6.5.15) is taken as
negative. Since bs is orthogonal to b and t, it must be along n. bs is the rate
at which b is twisting away from the plane formed by t and n. It represents the
torsion of the curve.
In a similar fashion, we find ns using n = b t to be b t. We can write
these three equations as one matrix equation for the triad t, n, and b as
t
0
0 t
n =
0
n
(6.5.16)
s
b
b
0
0
which gives us the evolution of the triad with the parameter s. It looks like this
system of equations tells us that if we are given the curvature and torsion of a curve
and a point on that curve we can reconstruct the curve.
Of course, (a, v) is a coordinate line on this surface as is (u, b). These two
lines pass through the point (a, b). Clearly (a, v) is a space curve and its tangent
at the point (a, b) is given by v (a, v). The subscript v indicates differentiation
with respect to v. Likewise, we have u (u, b) a tangent to (u, b). Here, the
subscript u indicates differentiation with respect to u. For a regular surface,
| u v | 6= 0. In fact, the unit normal to the surface is defined as
(6.5.18)
N=
u v
| u v |
If (t) is a curve on the surface, its tangent at any point, t , lies in the plane
spanned by u and v and it is normal to N . The length along this curve is given
by equation (6.5.7). Using chain rule we can write the expression for | t | as
(6.5.19)
| t |2 = u u u2t + 2 u v ut vt + v v vt2
Recognising that
(6.5.20)
where,
1He is intelligent, but not experienced. His pattern indicates two-dimensional thinking.Spock,
Star Trek: Wrath of Khan
E = u u ,
(6.5.21)
F = u v , and
249
G = v v
(6.5.22)
We could ask the question as to how much the surface curves away from the
tangent plane. This question is critical for us since it will determine, ultimately, the
size of our grid spacing. From the point (u, v), how much does (u + u, v + v)
deviate? This answered simply by looking at
((u + u, v + v) (u, v)) N
(6.5.23)
(6.5.24)
( u u + v v ) +
{z
}
|
in the tangent plane
uu u + 2 uv uv + vv v
+ ... N
Since the N is orthogonal to the tangent plane the first two terms do not
contribute to the curvature. In fact, we have taken the dot product to identify and
eliminate these terms. This leaves an expression that we identify as the Second
Fundamental Form.
Ldu2 + 2M dudv + N dv 2
(6.5.25)
where,
(6.5.26)
L = uu N ,
M = uv N , and
N = vv N
Assignment 6.6
(1) Compute the expression for the unit surface normals to the following surfaces:
(a) (u, v) = (a cos u, a sin u, bv),
(b) (u, v) = (u, v, u2 v 2 ),
(c) (u, v) = (u, v, 2uv).
(2) Compute the first and second fundamental forms for b) and c)
Now that we have some idea of the geometry of curves and surfaces, we can start
looking at the grid generation. We have already seen that we can have structured
and unstructured grids. Let us start by looking at ways to generate structured
grids.
250
x(, )
(6.6.4)
y(, )
(h + (H h))
Since we are able to go back and forth from the two coordinate systems, we will
now transform our governing equation, say Laplaces equation to the new coordinate
system. Employing chain rule we get for the first derivative
(6.6.5)
(6.6.6)
=
+
x
x x
=
+
y
y
y
(6.6.7)
This will be written in a compact notation as
(6.6.8)
(6.6.9)
= x = x + x
x
= y = y + y
y
(6.6.10)
Then for the second derivative we have
(6.6.11)
= xx =
(x ) =
( x + x )
2
x
x
x
( ) x + xx +
( ) x + xx
=
x
x
2
2
= (x ) + 2 x x + xx + (x ) + xx
(6.6.12)
251
= yy =
(y ) =
( y + y )
y 2
y
y
=
( ) y + yy +
( ) y + yy
y
y
2
= (y ) + 2 y y + yy + (y ) + yy
So, Laplaces equation can be rewritten as
(6.6.13)
2 2
+ 2 = xx + yy = x2 + y2 + x2 + y2
2
x
y
+ 2 (x x + y y ) + (xx + yy ) + (xx + yy ) = 0
Clearly, we have a grid that conforms to our domain, however, the governing equations, even if it is just Laplaces equation can become quite complex.
6.6.2. Elliptic Grids. A grid generation scheme which employs elliptic differential equations to determine the transformation is called an elliptic grid generation
scheme.
If you look back at our solution to differential equation, you will see we have reduced the differential equations to algebraic equations. Recall that in the potential
flow problem, stream lines and potential lines are orthogonal to each other. They
in fact form a mesh. They also satisfy Laplace equation [Win67]. We have already
seen that we can solve Laplace equation on a square. Here is our plan of action:
(1) Assume (, ) satisfy Laplaces equation
(2) Transform these equations into the (, ) coordinates.
(3) solve for (x, y) in the (, ) coordinates.
We already have the transformed equations (6.6.13). Since the coordinates satisfy
Laplaces equation we can simplify to get
(6.6.14)
x2 + y2 + 2 (x x + y y ) + x2 + y2 = 0
Let us recall how the chain rule works for the transformation of differentials
from one coordinate system to another.
(6.6.15)
x
dx
=
y
dy
x
y
d
= x
x
d
y
y
x
y
(6.6.17)
x
y
x
x
d
.
d
dx
.
dy
y
y
1
252
(6.6.19)
x y y x
x y x y
y
J
y =
x = J
y =
x
J
x
J
and
x =
(6.6.21)
x = y
y = x
y =
Why would we do this? Well, we know that we want the transformed coordinate
system to be a unit square, say in a Cartesian coordinate system. Which means,
we actually know the (, ). In order to have a transformation, we need to find
for a given (, ) the corresponding (x, y). So, we really need differential equations
for (x, y) in terms of (, ). To this end we divide equation (6.6.14) by 2 and
substitute from equations (6.6.21) to get
(6.6.22)
x2 + y2 2 (x x + y y ) + x2 + y2 = 0
We have the equations that need to be solved in the rectangular domain in the
(, ) plane. We now need the boundary conditions to actually solve the equation.
The rectangle in the (, ) plane has four sides. To each of these sides we need
to map the sides of our original problem domain from the physical domain. For
instance if the original problem were a unit circle (a circle with radius one) centred
at the origin as shown in the Figure 6.14.
Let us look at the equations that we are about to solve.
(6.6.23)
(6.6.24)
ax 2bx + cx = 0
ay 2by + cy = 0
where
(6.6.25)
a = x2 + y2 ,
(6.6.26)
b = x x + y y
(6.6.27)
c=
x2
and,
y2
n
+cn (xn+1
i,j1 + xi,j+1 )
253
C
1
45
1
(6.6.29)
n+1
n+1
n
+ yi+1,j
)
= dn an (yi1,j
yi,j
n+1
n+1
n+1
n
+ yi+1,j+1
)
yi+1,j1
yi1,j+1
0.5bn (yi1,j1
n+1
n
+ yi,j+1
)
+cn (yi,j1
where
(6.6.30)
(6.6.31)
(6.6.32)
(6.6.33)
and
(6.6.34)
n 2
an = 0.25 (j xni,j )2 + (j yi,j
)
n
n
bn = 0.25 (i xni,j )(j xni,j ) + (i yi,j
)(j yi,j
)
n 2
cn = 0.25 (i xni,j + (i yi,j
)
1
dn =
2[an + cn ]
is defined such that
S = S+1 S1 ,
S = x, y,
= i, j.
Here, n is the iteration level. The coefficients are not changed during the current
iteration.
As we had indicated at the beginning of this section, we decided to use Laplaces
equation to determine (, ) so that the resulting coordinate lines are orthogonal
to each other. This we obtained from our analogy with potential lines and stream
lines. Now we know that streamlines are drawn towards sources and pushed away
254
0.5
-0.5
-1
-1
-0.5
0.5
from sinks. So, if we want our grid lines to be clustered in a region due to the
solution having large gradients in that region, then one can add a sink appropriately
and sources appropriately so that such a clustering occurs. Unlike in case of the
unstructured grids, it is difficult to perform very localised grid refinement without
affecting a reasonably large region around the area of interest.
6.6.3. Parabolic and Hyperbolic Grids. Similar to the last section if the
nature of the differential equations employed to generate the grid are hyperbolic
or parabolic then the grid generation scheme is called a hyperbolic grid generation
scheme or a parabolic grid generation scheme.
We will look at hyperbolic grid generators first. These schemes can be used
generate grid by themselves. However, they are increasingly used to generate a
hybrid grid where hyperbolic grids are used near solid boundaries and and any
other kind of grid generation scheme including unstructured grids are used away
from the boundary.
One simple way to generate a hyperbolic grid is to start at the surface where
the grid is required and to travel along a perpendicular to the surface by a set
distance, . The end points of all of these normals now defines a constant
plane. This should remind the reader of two things. First the wave equation and
its properties we studies earlier. The second is algebraic grid generation. Once we
have offset by we are now in a position to determine the new normal directions
and then take another step. In this fashion, one can generate grid lines and the
corresponding planes. As in the wave equation of course, one has to be careful
that the -grid lines do not intersect. One solution is to add some dissipation terms
which will convert the equations to parabolic equations, resulting in parabolic grid
generation.
255
256
111111111111111
000000000000000
000000000000000
111111111111111
000000000000000
111111111111111
000000000000000
111111111111111
0000
1111
000000000000000
111111111111111
0000
1111
000000000000000
111111111111111
000000000000000
111111111111111
000000000000000
111111111111111
000000000000000
111111111111111
000000000000000
111111111111111
000000000000000
111111111111111
Step Two: While the set of edges, E, is not empty, pick an edge e.
Step Three: Find the set, L, of all nodes that are to be triangulated and
the to left our edge e.
Given the edge e, we need to find a node to form a triangle. Clearly any node that
is to the right of the edge will form a triangle that is outside the domain. So, we
form L from which to pick a node. We now proceed to find a candidate node from
the set L to form a triangle.
Step Four: Find the nearest node from the set L and check to see if a
triangle can be formed
Check One: Is the area of the triangle too small?
Check Two: Do the new Edges formed intersect any of the existing
edges?
The first check to to make sure that we do not pick three collinear points. The
second check is to make sure that the triangles do not overlap. Remember, in our
chapter on representations, we saw that non-overlapping domains gave us functions
that are independent of each other. So from the list L of the nodes we will find a
node, n, that allows us to form a triangle.
Step Five: Form the triangle with the node n. Check whether new edges
have to be added. If a new edge is added, its orientation is such that the
interior of the triangle is to the left of the edge. This can be achieved
by ensuring that the StartNode of the new edge is the EndNode of the
existing edge or vice-versa.
Step Six: We remove edge e from the set E. We check to see if any of the
edges formed employing the end points of e and the node n already exist
in the set E. Any such edge is removed from the set E.
Step Seven: The loop was opened out when edges were dropped. The new
edges that were created to form the triangle are reversed in orientation
and added to E. This results in the loop being closed again.
Step Eight: Go back to Step Two and form another triangle
257
Figure 6.17. This is the initial domain given as a counterclockwise loop of nodes labeled 1, 2, 3, 4, 5, 6, 7, 8, 1
N = {(1), (2), (3), (4), (5), (6), (7), (8)} is the set of nodes from which we get
the set of edges to be triangulated. This set is E = {1 2, 2 3, 3 4, 4 5, 5
6, 6 7, 7 8, 8 1}.
Pick an edge: The set E is not empty, we pick the first edge from it: e =
1 2.
Form the left list: The list of nodes, L, which are to the left of e is formed:
L = {(4), (5), (6), (7), (8)}
Pick nearest node: The node closest to e is (8).
Triangle? : With this it is possible for us to form the triangle made
up of the edges 1 2, 2 8, 8 1.
Old Edges? : of these besides e, 8-1 is an existing edge from E.
Intersect: The new edge 2 8 does not intersect any existing edges.
Triangle Size: Triangle is not approaching zero area.
Form Triangle: We can form the triangle {1 2, 2 8, 8 1}.
Housekeeping:
first: We will remove edges 1 2 and 8 1 from the list of edges E.
second: The new edge in the triangle is 2 8. We add the opposite of
this edge 8 2 to E.
third: We remove the node (1) from N as it has no edge connected to
it.
258
State 1: E = {23, 34, 45, 56, 67, 78, 82}, N = {(2), (3), (4), (5), (6), (7), (8)},
and T = {(1 2, 2 8, 8 1)}
5
8
259
2
8
2
1
260
4
4
2
2
1
261
4
8
2
1
Figure 6.22. All the triangles are formed. 8→4 was the last new
edge to be added
For an equilateral triangle of side a, the area is
\[ A = \frac{1}{2}\, a \cdot \frac{\sqrt{3}}{2} a = \frac{\sqrt{3}}{4} a^2 . \]
The perimeter is S = 3a for the equilateral triangle. We can define a non-dimensional parameter Q representing the quality of the triangle as
(6.7.1) \[ Q = 12\sqrt{3}\, \frac{A}{S^2} \]
Q would be 1 for an equilateral triangle and less than 1 for other triangles. For
three collinear points Q = 0.
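As a quick check of equation (6.7.1), a few lines of code suffice. The function name and the test points below are illustrative choices, not from the text.

from math import sqrt

def quality(p1, p2, p3):
    """Q = 12*sqrt(3)*A/S**2 for the triangle p1-p2-p3, equation (6.7.1)."""
    a = sqrt((p2[0]-p1[0])**2 + (p2[1]-p1[1])**2)
    b = sqrt((p3[0]-p2[0])**2 + (p3[1]-p2[1])**2)
    c = sqrt((p1[0]-p3[0])**2 + (p1[1]-p3[1])**2)
    S = a + b + c                                             # perimeter
    A = 0.5*abs((p2[0]-p1[0])*(p3[1]-p1[1]) - (p2[1]-p1[1])*(p3[0]-p1[0]))
    return 12.0*sqrt(3.0)*A/S**2

print(quality((0, 0), (1, 0), (0.5, sqrt(3)/2)))   # equilateral triangle: 1.0
print(quality((0, 0), (1, 0), (2, 0)))             # collinear points: 0.0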
Now that we have figured out a way to generate grids employing the boundary
nodes and we are able to assess the quality of a triangle, we are in a position to see
if it is possible to improve the overall quality of the triangulation.
Without disturbing the nodes, if we want to try to improve the triangulation,
we can only move the edges around using the same nodes.
(6.7.2)
(3) Consider a triangle T. Pick the triangle opposite to the given triangle.
(4) Apply the swap criterion and consider whether the edge should be swapped.
Repeat this for all the triangles shared by the node.
(5) Repeat this for all the nodes.
(6) Repeat this whole process till no swap occurs.
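The test at the heart of step (4) can be sketched as a small function. The swap criterion used in the text is not reproduced above; as a stand-in, the sketch below swaps the shared edge whenever the worse of the two triangle qualities Q improves, which is one plausible criterion and not necessarily the author's. It uses the quality() function from the previous sketch.

def should_swap(a, b, c, d, nodes):
    """Triangles a-b-c and b-a-d share the edge a-b; c and d are the
    vertices opposite that edge.  Swap a-b for c-d when the worse of
    the two triangle qualities improves.  (The check that the
    quadrilateral a-d-b-c is convex, which is needed before a swap is
    legal, is omitted from this sketch.)"""
    q_before = min(quality(nodes[a], nodes[b], nodes[c]),
                   quality(nodes[b], nodes[a], nodes[d]))
    q_after  = min(quality(nodes[c], nodes[d], nodes[a]),
                   quality(nodes[d], nodes[c], nodes[b]))
    return q_after > q_before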
We will now look at a method to insert nodes into our triangulation to increase
the richness of the triangulation.
Figure 6.26. Triangle 1-2-3 is refined as shown in Figure 6.25.
An existing triangle on edge 2-3 becomes a quadrilateral. This is
split into two triangles by adding an edge from the new node. The
objective is to conform without the introduction of any new nodes
This can be done quite easily as shown in Figure 6.26. Due to the refinement
process if the adjacent triangle has a node on its edge, it is split into two by joining
the node opposite to that edge to the new node as shown by the thicker line in the
figure. If two of the edges of the triangle happen to have nodes on them due to two
neighbours being refined then that triangle is also refined by splitting into four.
It is clear that if we have an initial coarse mesh generated using boundary
triangulation, then one can generate a hierarchical mesh by refining each triangle
into four as required. This process can be represented by a quad tree since each
triangle typically has four children. This can in fact be used for adaptive gridding,
where the decision to refine may be based on a computed solution. The advantage
with this unstructured grid is that the refinement can be performed at the place
where it is required instead of having to refine the whole grid.
It should be clear that this kind of refinement of breaking the geometrical entity
into four children is also possible for quadrilaterals. In particular, quadrilaterals
that are part of a Cartesian mesh.
6.7.2. Cartesian Meshes. Figure 6.27 shows an unstructured Cartesian mesh
on the trapezoidal domain. Figure 6.28 shows the zoomed-in part of the mesh. The
strategy here is quite clear. Any cell that is not completely inside our domain, in
this case the trapezium, is subdivided. In the two dimensional case, it is subdivided
into four parts. The same test is applied to the new cells that are formed. This
process is continued till the geometry is captured to our satisfaction. Clearly, we
will have to place a lower limit on the size of any cell. Of course, we have a limit on
how many cells we are able to handle with the computational resources available.
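The subdivision strategy just described can be written as a short recursive routine. The cell representation, the corner-based classification and the choice of the circle of Figure 6.29 as the test geometry are assumptions made here for illustration.

# A sketch of the recursive subdivision described above.  A cell is
# (x, y, size) with (x, y) its lower-left corner.  inside() is the
# geometry test; here it is the circle x^2 + y^2 = 1.

def inside(x, y):
    return x*x + y*y <= 1.0

def cell_status(x, y, s):
    """Classify a cell by testing its four corners (a simplification)."""
    corners = [inside(x, y), inside(x+s, y), inside(x, y+s), inside(x+s, y+s)]
    if all(corners):
        return "in"
    if not any(corners):
        return "out"
    return "cut"

def refine(x, y, s, level, max_level, cells):
    """Subdivide any cell that is not completely inside or outside."""
    status = cell_status(x, y, s)
    if status != "cut" or level == max_level:   # lower limit on the cell size
        cells.append((x, y, s, status))
        return
    h = s/2.0
    for dx in (0.0, h):
        for dy in (0.0, h):
            refine(x+dx, y+dy, h, level+1, max_level, cells)

cells = []
refine(-2.0, -2.0, 4.0, 0, 10, cells)           # ten levels, as in Figure 6.29
print(len(cells), "cells")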
Figure 6.29. An unstructured Cartesian mesh generated to capture the circle x² + y² = 1; the mesh goes down ten levels
Figure 6.29 shows another example of the unstructured Cartesian mesh. Figure
6.30 shows a zoomed in version of the same figure. As you can see, the circle has
not been drawn, though from the mesh it (the circle that is) is apparent.
Figure 6.32. A 7 × 7 elliptic grid generated in a unit circle in the x-y plane is scaled and stacked along the helix
(0.2t, 0.1 cos t, 0.1 sin t) to generate a three-dimensional grid in a
helical tube. There are twenty eight grid planes stacked along the helix
Figure 6.33. A 7 × 7 elliptic grid generated in a unit circle in the x-y plane is scaled and stacked along the helix
(0.2t, 0.1 cos t, 0.1 sin t) to generate a three-dimensional grid in a
helical tube. There are twenty eight grid planes stacked along the
helix. Only grids on the inlet, exit, bottom, and side coordinate
surfaces are shown for clarity
Now consider a different extension of the same cylinder problem. The cylindrical surface of the cylinder is a surface of revolution. How would we handle other
objects of revolution? One way would be to generate the grid on a circle as done
before; as the grid is stacked, it can also be scaled. This would also work for a
conical nozzle as shown in Figure 6.34. In this case, not only do we stack the grid
used the Rivara's algorithm to refine the grid in a boundary layer, one would halve
the characteristic size in the flow direction every time one halved it in
the transverse direction. This results in a very rapid increase in the number of
grids that are required. On the other hand if one had a structured grid near the
surface, the grid can very easily be stretched perpendicular to the wall and left
alone parallel to the wall. The grid in the general case can be generated using a
hyperbolic/parabolic grid generator. This then provides the boundary on which a
conforming unstructured mesh can be generated. Now, the grids in the boundary
layer can truly be refined and stretched as required.
Figure 6.37. A two-dimensional hybrid grid made up of a structured quadrilateral mesh and an unstructured conforming triangular mesh
6.10. Overset grids
There are lots of situations where we may just decide to create grids around
various parts of the domain separately. In the process, we attempt to stitch the
various grids together having generated them in a fashion that they conform to the
geometries around them and each other.
The other option is to abandon the attempt to have the grids conform to each
other and just focus on what the grid is meant to do. This of course will result in
an overlap of grids. The following points need to be borne in mind.
(1) Always solve the governing equations on the finest of the overlapping grids
(2) Interpolate to get the data on the other grids
(3) Transfer of data from one grid to the other needs to conform to the propagation due to the underlying flow.
Let us consider an example where this may work well. Consider the flow
over any rotating machinery, a helicopter for example. One could consider the
possibility that a grid is generated about the helicopter and that another grid is
generated about each of the blades of the helicopter main rotor. Now, the grid
around a blade is likely to be completely embedded in the grid about the fuselage.
This has the advantage that the fuselage grid is generated to conform to the needs
of the flow about the fuselage. The rotor can independently rotate, the blades can
go through a flapping motion or a lead lag motion with the blade grid following the
motion and conforming to the blade surface.
It is clear that transferring data from one grid to another is the most important piece of technology required. This can be achieved again by generating the
appropriate unstructured Cartesian mesh to find out for a given point in the coarse
grid, which are all the grid points on the fine grid that are likely to contribute
to the value on the coarse grid. Once these points are identified, the data can
be appropriately transferred.
6.11. Important ideas from this chapter
The finite volume method derives directly from the conservation equations
derived in integral form.
The accurate evaluation of the flux at the boundary determines the quality
of the solution that we obtain.
Boundary conditions can be applied either directly as conditions on the
flux, or by employing ghost volumes.
Unstructured grids are easy to generate about complex geometries.
Structured grids work well in simple geometries. One could decompose a
complex geometry into many simple geometries and employ a structured
grid.
Structured grids tend to be better in capturing flow features near boundaries. A hybrid grid consisting of a structured grid near the boundary
and an unstructured grid elsewhere often may be a better solution rather
than using an unstructured grid everywhere.
The preceding two points essentially partition the domain. Once a decision
is made to partition the domain, one can consider different ways to do it.
Overset grids are easy to generate, but require a little effort to use. They can
easily handle moving/deforming geometries.
CHAPTER 7
Advanced Topics
This chapter is a combination of many topics. Each topic is worthy of a book
in its own right. Such books are available. Given that the philosophy of this book
is to give you the essential elements of CFD, these topics are grouped together
here. The chapter is essentially in three parts. The first two sections deal with a
different view and techniques for solving our equations. The second part focuses
on various acceleration techniques. Finally we will close this chapter with a look
at some special techniques to solve unsteady problems.
Most of the material presented so far has been based on finite difference methods
and finite volume methods. The finite volume method essentially solves the integral
form of the conservation equations. We will look at other techniques to do the same.
7.1. Variational Techniques
Variational methods form the basis of a much more specialised and popular class
of methods called finite element methods (FEM). I am not going to dwell on FEM
here since there is a whole machinery of jargon and techniques that have been
developed over the years. However, this section on variational techniques should
lay the foundation for any future study of FEM pursued by you.
7.1.1. Three Lemmas and a Theorem. [GF82] As the section title indicates, we are going to look at three Lemmas that will allow us to derive the
Euler-Lagrange equation.
Here is the motivation for the mathematics that follows. Let's say that you
wanted to find out whether the route you normally take from the library to your
class room is the shortest path. You could perturb the current path to see if the
length decreases. The change h(x) that you make to the path y(x) has to ensure
that the new path y(x) + h(x) starts at the library and arrives at your classroom.
So we see that h(library) = h(class room) = 0. Since you are not likely to teleport
from one point to another point, the path needs to be a continuous path. If we
identify the library as a and the class room as b then we are looking for the
shortest continuous path from amongst all the continuous paths going from a to
b. We will call the set containing all these paths C[a, b]. We can go further
and define the subset of C[a, b] which has zero end points as C₀[a, b].
The first Lemma: For α(x), a continuous function on the interval [a, b], that
is α(x) ∈ C[a, b], if we can say that
(7.1.1) \[ \int_a^b \alpha(x)\, h(x)\, dx = 0 \]
for any h(x) which is continuous and h(a) = h(b) = 0, that is h(x) ∈ C₀[a, b], then
(7.1.2) \[ \alpha(x) \equiv 0 \]
How can we prove this to be true? Can we find an α(x) ∈ C[a, b] which satisfies
(7.1.1) but is not zero? Let us assume we managed to find such an α(x). Now this
α(x) must be non-zero for some value of x, say it is positive. In that case since it
is continuous it must be positive in a neighbourhood, say (x₁, x₂). Since (7.1.1) is
supposed to work with all h(x) we need only find one case where it fails. Consider
the function
(7.1.3) \[ h(x) = (x - x_1)(x_2 - x) \text{ for } x \in (x_1, x_2), \qquad h(x) = 0 \text{ otherwise} \]
For this choice
(7.1.4) \[ \int_a^b \alpha(x)\, h(x)\, dx = \int_{x_1}^{x_2} \alpha(x)\,(x - x_1)(x_2 - x)\, dx > 0 \]
which contradicts the assertion that the integral is zero. So, the α(x) could not
have been non-zero as we imagined. We could have chosen h(x) to be a hat function
on the interval [x₁, x₂] taking unit value at the midpoint of that interval.
Just as we had the set of functions that were continuous labelled C[a, b], we
call the set of functions that have continuous derivatives C¹[a, b], and if we have
a function which is zero at the end points that belongs to C¹[a, b], we say it belongs
to C₀¹[a, b].
The second Lemma: For α(x) ∈ C[a, b], if we have
(7.1.5) \[ \int_a^b \alpha(x)\, h'(x)\, dx = 0 \]
for every h(x) ∈ C₀¹[a, b], then
\[ \alpha(x) = \text{constant} \]
How can we prove this to be true? Can we find an α(x) ∈ C[a, b] which satisfies
(7.1.5) but is not a constant? Let us assume we managed to find such an α(x).
What worked effectively last time was to find an h(x) that violated the given integral
condition. We need to construct such an h(x). Let us first define a constant c as
(7.1.7) \[ c = \frac{1}{b - a} \int_a^b \alpha(\xi)\, d\xi \]
or more useful to us would be to take the c into the integral and write it as
(7.1.8) \[ \int_a^b \left( \alpha(\xi) - c \right) d\xi = 0 \]
We are essentially taking c to be the mean value of α(x) on the interval [a, b]. We
then define our h(x) to be
(7.1.9) \[ h(x) = \int_a^x \left( \alpha(\xi) - c \right) d\xi \]
What we have achieved is that h(a) = h(b) = 0 and the derivative h'(x) exists! We
now look at the integral.
(7.1.10) \[ \int_a^b \left( \alpha(x) - c \right) h'(x)\, dx = \int_a^b \alpha(x)\, h'(x)\, dx - c \int_a^b h'(x)\, dx = 0 \]
The first integral on the right hand side is zero by (7.1.5) and the second is zero since h(a) = h(b) = 0.
On the other hand, from the definition (7.1.9) we have h'(x) = α(x) − c, so that
(7.1.11) \[ 0 = \int_a^b \left( \alpha(x) - c \right) h'(x)\, dx = \int_a^b \left( \alpha(x) - c \right)^2 dx \]
The integrand is non-negative, so α(x) = c, a constant, which contradicts our assumption.
The third Lemma: For α(x), β(x) ∈ C[a, b], if
(7.1.12) \[ \int_a^b \left( \alpha(x)\, h(x) + \beta(x)\, h'(x) \right) dx = 0 \]
for all h(x) ∈ C₀¹[a, b], then β(x) is differentiable and β'(x) = α(x). To see this, define
(7.1.13) \[ A(x) = \int_a^x \alpha(\xi)\, d\xi \]
and integrate the first term of (7.1.12) by parts:
(7.1.15) \[ \int_a^b \alpha(\xi)\, h(\xi)\, d\xi = A(\xi) h(\xi) \Big|_a^b - \int_a^b A(\xi)\, h'(\xi)\, d\xi \]
Since h(a) = h(b) = 0, the first term on the right hand side of equation (7.1.15) is
zero. Substituting back into equation (7.1.12) we get \( \int_a^b (\beta(x) - A(x)) h'(x)\, dx = 0 \), so by the second lemma β(x) − A(x) is a constant, and therefore
(7.1.16) \[ \beta'(x) = \alpha(x) \]
Now for the theorem. We consider functionals of the form
\[ J(y) = \int_a^b F(x, y, y')\, dx \]
and seek the y(x), with prescribed end values, that makes J an extremum. Perturbing y by an h(x) ∈ C₀¹[a, b], the change in the functional is
\[ \Delta J = J(y + h) - J(y) = \int_a^b \left( F(x, y + h, y' + h') - F(x, y, y') \right) dx \]
Expanding F to first order in h and h' we get
(7.1.21) \[ \delta J(y) = \int_a^b \left( F(x, y, y') + \frac{\partial F}{\partial y} h + \frac{\partial F}{\partial y'} h' - F(x, y, y') \right) dx = \int_a^b \left( \frac{\partial F}{\partial y} h + \frac{\partial F}{\partial y'} h' \right) dx \]
So, if y were an extremum, then δJ(y) would be zero for any perturbation h.
We immediately see that the last lemma is applicable and as a consequence ∂F/∂y' is differentiable with respect to x and that
(7.1.22) \[ \frac{\partial}{\partial x} \frac{\partial F}{\partial y'} = \frac{\partial F}{\partial y} \]
This is called the Euler-Lagrange equation. So how do we use it? Consider the
following problem. In two dimensions, we want to find the shortest path between
two points A and B as shown in Figure 7.1.
Figure 7.1. What is the shortest path between A and B?
So what is the length of the path that is shown in the figure? From calculus we
know that the length of the curve y(x) from x = a to x = b is given by
(7.1.23) \[ J(y) = \int_a^b \sqrt{1 + y'^2}\, dx \]
We want to find the path, y(x), which has the shortest length. We need to find the
Euler-Lagrange equation for equation (7.1.23):
(7.1.24) \[ F(x, y, y') = \sqrt{1 + y'^2}; \qquad F_y = 0, \quad F_{y'} = \frac{y'}{\sqrt{1 + y'^2}} \]
so,
(7.1.25) \[ \frac{d}{dx} \left\{ \frac{y'}{\sqrt{1 + y'^2}} \right\} = 0 \;\Rightarrow\; \frac{y'}{\sqrt{1 + y'^2}} = c \;\Rightarrow\; y'^2 = c^2 \left( 1 + y'^2 \right) \]
Squaring both sides and subtracting one from both sides we find that 1/(1 + y'²) is constant along the curve, so
(7.1.27) \[ \frac{1}{\sqrt{1 + y'^2}} = \frac{1}{\sqrt{1 + y'^2(a)}} \quad \Rightarrow \quad y'(x) = y'(a) \]
which tells us that the slope of the curve is a constant along the curve, which gives
us a straight line. In fact, integrating one more time from a to x we get
(7.1.28) \[ y(x) = y'(a) \left[ x - a \right] + y(a) \]
Consider the problem of the shortest distance between two points on the surface
of a sphere of radius R. Without loss of generality we can assume that the radius
of the sphere R is one. The coordinate map on the sphere can be taken in terms of
θ and φ, which are the standard coordinates used in the spherical polar coordinate
system. Since we are given two points on the sphere, taken along with the origin
this gives us three distinct points in space through which we can find a unique
plane. We will align our coordinate system so that the φ value of the two points is
the same. That is, the x-z plane of our coordinate system passes through these
points. The x-z plane can be placed in such a fashion that the z-axis passes
through the initial point and consequently θ of the start point is zero. The initial
and final points have coordinates (0, 0) and (θ_f, 0). The length of a differential line
element on the surface of the sphere can be obtained as
(7.1.29) \[ ds^2 = d\theta^2 + \sin^2\theta\, d\phi^2 \]
so the length of a path φ(θ) between the two points is \( \int_0^{\theta_f} \sqrt{1 + (\phi')^2 \sin^2\theta}\, d\theta \), where the prime denotes differentiation with respect to θ. The Euler-Lagrange equation for this functional is
(7.1.31) \[ \frac{d}{d\theta} \left\{ \frac{\phi' \sin^2\theta}{\sqrt{1 + (\phi')^2 \sin^2\theta}} \right\} = 0 \]
Integrating with respect to θ once we get
(7.1.32) \[ \frac{\phi' \sin^2\theta}{\sqrt{1 + (\phi')^2 \sin^2\theta}} = \left. \frac{\phi' \sin^2\theta}{\sqrt{1 + (\phi')^2 \sin^2\theta}} \right|_{\theta = 0} = c \]
Rearranging terms and integrating one more time with respect to θ we get
(7.1.33) \[ \phi(\theta) = \phi(0) + \int_0^\theta \frac{c \sqrt{1 + (\phi')^2 \sin^2\theta}}{\sin^2\theta}\, d\theta \]
At this point we do not need to evaluate the integral. We know that φ(0) = φ(θ_f).
So the above equation gives, when integrating to θ_f,
(7.1.34) \[ \int_0^{\theta_f} \frac{c \sqrt{1 + (\phi')^2 \sin^2\theta}}{\sin^2\theta}\, d\theta = 0 \]
Now, c is a constant and the rest of the integrand is always positive. This leaves us
with a situation where equation (7.1.34) is satisfied only when c = 0. This tells us
that φ(θ) = φ(0). The shortest distance between two points is a segment of a great
circle on the surface of the sphere passing between those points.
This is all very nice. How do we solve differential equations using these methods? Look at the variational problem of minimising
(7.1.35) \[ J(u) = \frac{1}{2} \int_a^b u_x^2\, dx \]
with u(a) = u_a and u(b) = u_b. The Euler-Lagrange equation for this is given by
(7.1.36) \[ u_{xx} = 0 \]
We are then free to choose which of these two mathematical models we use
to obtain the solution. The variational problem does not seek a solution which
has second derivatives, however we do have the minimisation to perform. The
differential equation on the other hand has its own set of problems as we have seen
so far and requires that the solution have second derivatives in this case. Let us
look at how we would solve the variational problem numerically.
If we represent the solution in terms of the hat functions from section 2.2.5,
we can write it as
(7.1.37) \[ u = \sum_{i=0}^N u_i N_i \]
(7.1.38) \[ u' = \sum_{i=0}^N u_i N_i' \]
where the prime denotes differentiation with respect to x. We see that the integral
in equation (7.1.35) can be interpreted as the dot product. So, this equation can
be rewritten as
(7.1.39) \[ J^h(u) = \frac{1}{2} \langle u', u' \rangle = \frac{1}{2} \left\langle \sum_{i=0}^N u_i N_i', \; \sum_{j=0}^N u_j N_j' \right\rangle \]
The superscript h on the J is to indicate it is a discrete version of the one given in
equation (7.1.35). If we assume that we are going to take grid points at equal intervals h apart, we can then get an expression for the derivative of the hat function.
(7.1.40) \[ N_i(x) = \begin{cases} 0 & x < x_{i-1} \\[2pt] \dfrac{x - x_{i-1}}{x_i - x_{i-1}} & x \in (x_{i-1}, x_i] \\[4pt] \dfrac{x - x_{i+1}}{x_i - x_{i+1}} & x \in (x_i, x_{i+1}] \\[2pt] 0 & x > x_{i+1} \end{cases} \]
Now, the derivative is found to be
(7.1.41) \[ N_i'(x) = \begin{cases} 0 & x < x_{i-1} \\[2pt] \dfrac{1}{h} & x \in (x_{i-1}, x_i) \\[4pt] -\dfrac{1}{h} & x \in (x_i, x_{i+1}) \\[2pt] 0 & x > x_{i+1} \end{cases} \]
This is nothing but the Haar function. We can now find the dot product between
these functions. It is clear that the only non-zero terms for the dot product are
going to be for the terms corresponding to j = i − 1, j = i, and j = i + 1.
(7.1.42) \[ \langle N_i'(x), N_j'(x) \rangle = \begin{cases} 0 & j < i-1 \\[2pt] -\dfrac{1}{h} & j = i-1 \\[4pt] \dfrac{2}{h} & j = i \\[4pt] -\dfrac{1}{h} & j = i+1 \\[2pt] 0 & j > i+1 \end{cases} \]
Substituting back into equation (7.1.39) we get
(7.1.43) \[ J^h(u) = \frac{u_a^2}{2h} + \frac{u_b^2}{2h} + \frac{1}{2h} \sum_{i=1}^{N-1} \left( -u_i u_{i-1} + 2 u_i^2 - u_i u_{i+1} \right) \]
To find the extremum of this functional we differentiate it and set it equal to zero.
Differentiating equation (7.1.43) with respect to u_j we get
(7.1.44) \[ \frac{\partial J^h}{\partial u_j}(u) = \frac{1}{2h} \sum_{i=1}^{N-1} \left\{ -\frac{\partial u_i}{\partial u_j} u_{i-1} - u_i \frac{\partial u_{i-1}}{\partial u_j} + 4 u_i \frac{\partial u_i}{\partial u_j} - \frac{\partial u_i}{\partial u_j} u_{i+1} - u_i \frac{\partial u_{i+1}}{\partial u_j} \right\} = 0 \]
The five terms in the expression on the right hand side of this equation are non-zero for a specific value of i for each of those terms. Meaning, this is not the time
to factor and simplify the expression. We need to inspect each one and determine
the value of i in terms of j so that the derivative is non-zero. The first, third and fourth
terms contribute when i = j, the second when i − 1 = j, that is i = j + 1, and the fifth when
i + 1 = j, that is i = j − 1. Collecting these contributions gives
(7.1.46) \[ \frac{\partial J^h}{\partial u_j}(u) = \frac{1}{2h} \left\{ -2 u_{j-1} + 4 u_j - 2 u_{j+1} \right\} = 0, \qquad j = 1, \ldots, N-1 \]
Again, in order to get the extremum, we differentiate equation (7.1.47) with respect
to u_i and set the resulting expression to zero:
(7.1.48) \[ \frac{\partial J^h}{\partial u_i}(u) = 2\, \frac{u_i - u_{i-1}}{h} - 2\, \frac{u_{i+1} - u_i}{h} \]
Consider next the problem
(7.1.49) \[ u_{xx} = p(x), \qquad u(a) = u_a, \quad u(b) = u_b \]
or not. Earlier we had written u in terms of the hat function. In a similar fashion
we need to project p on to the hat functions so that we can write it as
(7.1.51) \[ p = \sum_{i=0}^N p_i N_i \]
The discrete functional now becomes
(7.1.52) \[ J^h(u) = \frac{1}{2} \left\langle \sum_{i=0}^N u_i N_i', \; \sum_{j=0}^N u_j N_j' \right\rangle + \left\langle \sum_{i=0}^N p_i N_i, \; \sum_{j=0}^N u_j N_j \right\rangle \]
We have a new set of terms here from the pu term.
(7.1.53) \[ \langle N_i(x), N_j(x) \rangle = \begin{cases} 0 & j < i-1 \\[2pt] \dfrac{h}{6} & j = i-1 \\[4pt] \dfrac{2h}{3} & j = i \\[4pt] \dfrac{h}{6} & j = i+1 \\[2pt] 0 & j > i+1 \end{cases} \]
We differentiate the discrete functional with respect to u_i and set it equal to zero.
(7.1.54) \[ \frac{\partial J^h}{\partial u_i}(u) = \frac{1}{2h} \left\{ -2 u_{i-1} + 4 u_i - 2 u_{i+1} \right\} + h\, \frac{p_{i-1} + 4 p_i + p_{i+1}}{6} = 0, \qquad i = 1, \ldots, N-1 \]
Interestingly, we see that the function p is also averaged over three neighbouring
points. This term appears because the integrals which have a u_i in them correspond
to p_{i−1} with a part overlap, p_i with a full overlap and p_{i+1} with a part overlap.
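Equation (7.1.54) gives, for each interior node, a tridiagonal relation between u_{i−1}, u_i, u_{i+1} and the averaged p values. A minimal sketch of assembling and solving that system is given below; the interval, the right hand side, the boundary values and the use of numpy are assumptions made purely for illustration, not a program from the text.

import numpy as np

# Solve u_xx = p(x) on [a, b] via the hat-function functional,
# i.e. the tridiagonal system that follows from equation (7.1.54).
a, b, N = 0.0, 1.0, 10
h = (b - a)/N
x = np.linspace(a, b, N + 1)
p = np.sin(np.pi*x)                      # an arbitrary right hand side
ua, ub = 0.0, 0.0                        # boundary values u(a), u(b)

A = np.zeros((N - 1, N - 1))
rhs = np.zeros(N - 1)
for i in range(1, N):
    k = i - 1
    A[k, k] = 2.0/h                      # from (-2u_{i-1} + 4u_i - 2u_{i+1})/(2h)
    if k > 0:
        A[k, k - 1] = -1.0/h
    if k < N - 2:
        A[k, k + 1] = -1.0/h
    # the averaged source term h (p_{i-1} + 4 p_i + p_{i+1}) / 6
    rhs[k] = -h*(p[i - 1] + 4.0*p[i] + p[i + 1])/6.0
# the known boundary values enter the first and last equations
rhs[0]  += ua/h
rhs[-1] += ub/h

u = np.zeros(N + 1)
u[0], u[-1] = ua, ub
u[1:-1] = np.linalg.solve(A, rhs)
print(u)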
Assignment 7.1
(1) Since the problem stated in equation (7.1.39) is in one spatial dimension,
another way to determine the functional J^h is to actually enumerate the
inner product term by term. Show that it can also be written as
\[ J^h(u) = \frac{u_a^2}{2h} - \frac{u_b^2}{2h} + \frac{1}{h} \sum_{i=1}^{N} \left( u_i^2 - u_i u_{i-1} \right) \]
(7.1.57) \[ E(a_i, b_i) = \int_{x_i}^{x_{i+1}} \left( F(x) - f(x, a_i, b_i) \right)^2 dx \]
We need to find the pair ai , bi for which this error is minimised. It should be noted
here that we are not going to insist that the representation is continuous at the
end points just like in the box functions and the Haar functions. This allows us
to solve for one interval without bothering about the neighbouring intervals. So to
get the ai , bi pair we differentiate equation (7.1.57) with respect to ai and set the
derivative to zero and repeat the process with bi . This will give us two equations
in two unknowns which we solve for the ai and bi .
I used the package wxmaxima to perform the symbolic integration and differentiation to find the general expression for ai and bi for F (x) = sin x. I obtained
the following expressions.
(7.1.58)
ai =
and
(7.1.59)
bi =
where dx_i = x_{i+1} − x_i.¹ We use eleven grid points as we did earlier and graph
these line segments in Figure 7.2 to see what the representation looks like. This representation looks quite good. It is shown in the same scale as the graphs in section
2.9. It almost looks as though the lines segments meet and that the representation
is continuous everywhere. This is not really so. We look at a zoomed in view of
the above graph at the point x = π. This is shown in Figure 7.3. We can clearly
see the jump in the representation. What do we do at the point where the jump
occurs? Take the average of the two values. We can do this for the function and
the derivative at the nodal points.
In fact the general expression for sin nx turns out to be ( thanks again to
wxmaxima, though in view of the footnote we could have actually guessed this
¹I want to point this out just as a curiosity. If you squint at the expressions for a_i and b_i, can you
see something that looks like the finite difference approximation to the second derivative of sin x?
[Figures 7.2 and 7.3: the piecewise linear representation of sin x, and a zoomed-in view near x = π (between 3.1 and 3.2) showing the small jump in the representation.]
ai =
and
(7.1.61)
bi =
Assignment 7.2
(1) As we did in section 2.9, plot these graphs for various values of the wave
number n.
(2) Find this kind of a representation for F(x) = x² on the interval [−1, 1].
How does that compare to the representation employing hat functions?
(3) Repeat this process for a cubic, that is for F(x) = x³. Check the values
at the nodes. What can you conclude?
(4) Repeat the exercise with a different number of intervals. How about eleven
intervals?
I have included plots for four cases of n = 5, 10, 11, 12 here from the first
question in the assignment. We are looking at these cases since we expect the
representation to be poor for these cases. You can look at these graphs and compare
them to the ones we plotted using hat functions. In the case of the hat functions
[Figures 7.4, 7.5, and 7.6: the representation for sin 5x, sin 10x, and sin 11x on eleven grid points.]
Figures 7.6 and 7.7 are given here just to show the degeneration in the representation if
the grid is too coarse. There is loss of both amplitude and frequency information
in these two cases.
[Figure 7.7: the representation for sin 12x.]
Consider, for example, the functional
(7.1.62) \[ J(u) = \frac{1}{2} \int_A \left( u_x^2 + u_y^2 \right) dA \]
where u_x and u_y are the derivatives of u with respect to x and y, and A is the region over
which the integration is performed. The corresponding Euler-Lagrange equation
would be something like this
(7.1.63) \[ \frac{\partial}{\partial x} \frac{\partial F}{\partial u_x} + \frac{\partial}{\partial y} \frac{\partial F}{\partial u_y} = \frac{\partial F}{\partial u} \]
We see that the Euler-Lagrange equation is nothing but Laplace's equation.
Try it. We can define hat functions in two dimensions and in fact in multiple
dimensions in exactly the same manner in which we derived them in one dimension.
We define the hat function at a grid point as taking a value one at that grid point
and dropping off to zero toward the adjacent grid points and the edges connecting
those grid points.
We have seen that quite often, problems have a differential representation, an
integral representation or a variational representation. Now looking at the example
above for Laplace's equation it may not be obvious that one representation is better
than the other. The example is chosen simply to demonstrate that there are various
ways of representing the same problem. We will now consider a different problem
which is initially posed as a variational problem and from which we subsequently derive the
partial differential equation.
7.1.4. The Soap Film Problem. This problem is also called the Plateau
problem. Simply, it is this. If we loop a wire around and immerse it into soap
water so that on removing it from the water a film of soap is formed on the loop, what
is the shape of the soap film? Again, it is clear that we are seeking a function which
satisfies certain boundary conditions. Namely, the film is supported by the loop.
There are many functions that satisfy this requirement. We look for the one which
has the minimum area and for a thin film this turns out to be a minimum energy
state. If D defines the region enclosed by the loop we ask that
(7.1.65) \[ J(u) = \int_D \sqrt{1 + u_x^2 + u_y^2}\, dA, \]
be a minimum. The corresponding Euler-Lagrange equation is
\[ \left( 1 + u_y^2 \right) u_{xx} - 2 u_x u_y u_{xy} + \left( 1 + u_x^2 \right) u_{yy} = 0 \]
with u = f(x, y), with (x, y) on the boundary of D as the boundary condition.
In this case, posing the problem as a variational problem definitely looks more
attractive than the corresponding partial differential equation.
7.2. Random Walk
The idea of random walk will first be introduced in an intuitive fashion[Fel68].
We will start with a probabilistic view of a game played by two gamblers. Let us
say that two gamblers, A & B, start playing a game. Let us say at the beginning
they have 100 chips between them [ a chip is a unit of currency ]. One of them, A,
has x chips the other, B, has 100 x chips. The game played by the gamblers goes
in steps called rounds. At the end of each round one of them gains a chip which
the other loses. So clearly, we need only look at A, the gambler with x chips. We
can infer the fate of B from the story of A.
So, our gambler has x chips. What is the probability that A is ruined? Let us
call this probability qx . The game may not be a fair game, B may be cheating. Let
us say that the probability that A wins a round is p and the probability of a loss is
q. Then, given x chips, the probability that A will have x + 1 chips at the end of a
round is p and the probability that A will have x − 1 chips at the end of the same
round is q.
Therefore, the probability of ruin with x chips can be written as
(7.2.1) \[ q_x = q\, q_{x-1} + p\, q_{x+1} \]
That is, there is a probability q that A will end up with x − 1 chips and the probability
of ruin with x − 1 chips is, of course, q_{x−1}. Similarly, with a probability p, A will
have x + 1 chips given that A has x chips before the round and the probability of
ruin with x + 1 chips is q_{x+1}. Note that here, p + q = 1.
This equation is valid for all values of x except for x = 0 and x = 100. In the
first case A is ruined and hence q0 = 1. In the second case B has been ruined and
the probability of A's ruin is zero. That is q_100 = 0. So, given p and q one can find
the probability of ruin for x = 1 to x = 99 by iterating on the equations given by
(7.2.1). When it has converged, we will have the q_x for all the x.
Take a good look at equation (7.2.1), with p = q = 1/2. We see that it becomes
(7.2.2) \[ q_x = \frac{q_{x-1} + q_{x+1}}{2} \]
It is possible to use the prescribed boundary conditions to iterate this equation
to get the probability distribution q(x). The iterates are related by
(7.2.3) \[ q_x^n = \frac{q_{x-1}^{\,n-1} + q_{x+1}^{\,n-1}}{2} \]
Subtracting q_x^{n−1} from both sides we get
(7.2.4) \[ q_x^n - q_x^{\,n-1} = \frac{q_{x-1}^{\,n-1} - 2 q_x^{\,n-1} + q_{x+1}^{\,n-1}}{2} \]
which is nothing but our solution to the one-dimensional Laplace equation or
the heat equation. So, it looks like we could somehow use this problem of the ruin
of a gambler to study the solution to the heat equation.
Since we need to generate a random sequence of +1 and −1, we will take a
small detour and look at how to write a simple random number generator.
Any player who ends up in the bin at the far right is removed as a victor from
the game. You can now run this simulation and see what happens to the players
as a function of time and how many players there are in each bin.
for each marble in a bin generate a random number which has two possible
states: (+, −). If it is a plus, move the chip into the bin on the right. If
it is a minus, move the chip to the left. The chips in the first bin do not
move left and the chips in the last bin do not move to the right.
make sure there are 100 chips in the first bin and zero in the last.
This process can be repeated.
Assignment 7.3 Implement the above algorithm. How does the solution evolve?
What happens if you generate ten solutions with ten different seeds and then take
the average of the ten solutions to get one solution? Try this for 4, 16, and 64 sets
of solutions. What can you say about the quality of the solution? If you let the
solution evolve for a hundred time steps, what does the average of the last ten
steps look like? Do this for the average solution obtained earlier.
(7.2.5) \[ df = \mu\, dt + \sigma\, dz \]
Here, μ dt is called the drift term and σ² dt is the variance. To ascribe some physical
meaning to these terms, the first term has μ in it, which is basically the mean
velocity of the flow, and the second term is usually captured by temperature in gas
dynamics.
Assignment 7.4
Do the following.
(1) Generate random numbers that are normally distributed with mean zero
and variance one, that is N(0, 1).
(2) These random numbers that you generate at each time step are the dz
values. Use these dz values along with the drift term to integrate the
stochastic differential equation (7.2.5). Use the Euler explicit scheme for
the deterministic part. The resulting scheme is called the Euler scheme
for the stochastic differential equation. For one set of random numbers you
generate one path. You can repeat this trial many times.
(3) Plot a few sample paths and see what they look like. What is the expectation of these paths?
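A sketch of generating such paths, using numpy's normal random number generator and the Euler scheme for equation (7.2.5) with μ = 0 and σ = 1 as in Figure 7.8, might look like this; the seed and the array layout are arbitrary choices.

import numpy as np

mu, sigma = 0.0, 1.0            # drift and standard deviation, as in Figure 7.8
dt, nsteps, npaths = 1.0, 100, 10

rng = np.random.default_rng(42)
f = np.zeros((npaths, nsteps + 1))
for n in range(nsteps):
    dz = rng.normal(0.0, np.sqrt(dt), npaths)   # dz drawn from N(0, dt)
    f[:, n + 1] = f[:, n] + mu*dt + sigma*dz     # Euler step for df = mu dt + sigma dz
print(f[:, -1].mean())          # sample mean; the expectation of the paths is 0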
Figure 7.8. Ten paths with μ = 0 and σ = 1 obtained by integrating equation (7.2.5) over a hundred time steps using the explicit
Euler scheme and a Δt = 1
It is also important to see the relationship between the solution to the heat
equation and this random process. Any bin in the heat equation solution contains
chips corresponding to the number of paths that pass through that bin at that time.
7.3. Multi-grid techniques
Multi grid schemes, somehow, are not as popular as they should be. An excellent start can be made by reading Briggs [Bri87]. Let's look at a simple introduction
to the multi grid technique.
The basic objective of the multi-grid scheme is to accelerate a given numerical method. We saw earlier that we could get an improvement in convergence by
employing the successive over relaxation scheme. However, we also saw that there
was a shortcoming, in the sense that one had to perform an exploratory study
to obtain the optimal relaxation parameter and that as the grid became finer the
relaxation parameter got closer to 2.0, rendering the scheme less useful. Here accelerated convergence is obtained using a multiplicity of grids rather than just one
grid.
Why use multiple grids to solve the same problem? High frequency is defined
with reference to the grid. A function is said to have high frequency content if
it is composed of frequencies comparable to the highest frequency that can be
represented on that grid. As was seen in the earlier demonstration, for a given
grid, high frequencies are damped out faster than low frequencies. This is especially
true if some kind of smoothing is involved either in the process represented by the
equation being solved or the smoothing is done deliberately to eliminate the high
frequency component of the function.
These two features are used to develop the multi grid method. We will use the
Laplace equation on a unit square as an example.
We have seen that the Laplace equation on a unit square when discretised results
in a system of equations given by
(7.3.1) \[ A^h \phi^h = f^h \]
where the superscript h indicates the grid size. Remember now, that if
we solve the equation and obtain a φ^h, it should satisfy the equation (7.3.1) and
approximate the solution to the Laplace equation.
In the process of trying to obtain φ^h, we have a candidate approximation φ̃^h.
Now this candidate solution to equation (7.3.1) differs from the actual solution.
This difference is defined simply as
(7.3.2) \[ e^h = \phi^h - \tilde\phi^h \]
so that the corrected solution is
(7.3.3) \[ \phi^h = \tilde\phi^h + e^h \]
The residue left when the candidate solution is substituted into equation (7.3.1) is
(7.3.4) \[ r^h = f^h - A^h \tilde\phi^h \]
Since
(7.3.5) \[ A^h e^h = A^h \left( \phi^h - \tilde\phi^h \right) = A^h \phi^h - A^h \tilde\phi^h = f^h - A^h \tilde\phi^h \]
the error satisfies the correction equation
(7.3.6) \[ A^h e^h = r^h \]
So, what happens when the iterations in step (5) wipe out the high frequencies with reference to grid 2h and are now slogging along slowly with the lower
frequencies? We have a solution.
This statement of the modified algorithm was made deliberately vague on the
steps (4) and (5). Remember again, that the equation for the correction [often
referred to as the correction equation] looks the same as the original equation.
So, when we transfer the residue, r^h, from grid h to 2h, we choose the name
f^{2h}. Then the algorithm is
(1) Assume a φ̃^h
(2) Compute the residue r^h = f^h − A^h φ̃^h
(3) Perform relaxation sweeps on (7.3.6) to eliminate the high frequencies
(4) Transfer the residue r^h on the grid h to the grid 2h to obtain f^{2h}.
(5) With this residue, we can rewrite (7.3.6) on the grid 2h as A^{2h} φ^{2h} = f^{2h}
(6) Compute the residue r^{2h} = f^{2h} − A^{2h} φ̃^{2h}
(7) Perform relaxation sweeps on A^{2h} e^{2h} = r^{2h} and get e^{2h}
(8) Obtain the corrected solution as φ^{2h} = φ̃^{2h} + e^{2h}
(9) Transfer φ^{2h} back to the grid h to get e^h.
(10) Obtain the corrected solution from (7.3.3)
(11) Go back to step (1) till the required convergence on the fine grid is achieved.
Form A^h, f^h
Guess a φ̃^h
invoke the function ComputePhi(A^h, f^h, φ̃^h) to obtain e^h
find the new improved φ^h from φ^h = φ̃^h + e^h
find r^h and transfer it to f^{2h}
transfer e^h to the grid 2h and form A^{2h}
invoke the function ComputePhi(A^{2h}, f^{2h}, φ̃^{2h}) to obtain e^{2h}
find r^{2h} and transfer it to f^{4h}
transfer e^{2h} to the grid 4h and form A^{4h}
invoke the function ComputePhi(A^{4h}, f^{4h}, φ̃^{4h}) to obtain e^{4h}
This can be taken to as many levels as required
transfer e^{4h} back to e^{2h}.
This is called the V-cycle. This is because we start at the finest grid, make our
way down to the coarsest grid with the residuals and work our way back to the
finest grid with the corrections.
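The recursive structure of the V-cycle can be written quite compactly. The sketch below is for the one-dimensional model problem −φ'' = f with a weighted-Jacobi smoother; the smoother, the injection restriction and the linear prolongation are my own illustrative choices and are not prescribed by the text.

import numpy as np

def smooth(phi, f, h, sweeps=3, w=2.0/3.0):
    """A few weighted-Jacobi sweeps on -phi'' = f (endpoints held fixed)."""
    for _ in range(sweeps):
        phi[1:-1] = (1-w)*phi[1:-1] + w*0.5*(phi[:-2] + phi[2:] + h*h*f[1:-1])
    return phi

def residual(phi, f, h):
    r = np.zeros_like(phi)
    r[1:-1] = f[1:-1] - (2*phi[1:-1] - phi[:-2] - phi[2:])/(h*h)
    return r

def restrict(v):               # fine -> coarse: take every other point
    return v[::2].copy()

def prolong(v):                # coarse -> fine: linear interpolation
    fine = np.zeros(2*len(v) - 1)
    fine[::2] = v
    fine[1::2] = 0.5*(v[:-1] + v[1:])
    return fine

def vcycle(phi, f, h):
    """One V-cycle: smooth, correct on the 2h grid, smooth again."""
    phi = smooth(phi, f, h)
    if len(phi) <= 3:
        return phi
    r2h = restrict(residual(phi, f, h))           # transfer the residue down
    e2h = vcycle(np.zeros_like(r2h), r2h, 2*h)    # solve A^{2h} e^{2h} = r^{2h}
    phi += prolong(e2h)                           # transfer the correction up
    return smooth(phi, f, h)

n = 129
x = np.linspace(0.0, 1.0, n)
f = np.pi**2*np.sin(np.pi*x)             # exact solution is sin(pi x)
phi = np.zeros(n)
for cycle in range(10):
    phi = vcycle(phi, f, 1.0/(n - 1))
print(np.max(np.abs(phi - np.sin(np.pi*x))))   # error near the discretisation level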
There are other cycles or grid sequences that can be used. To motivate this let
us consider the number of operations performed in order to get to the solution. We
can then see if we can try to reduce the total number of operations to get down to
converge to the solution by changing the grid sequencing.
The number of grid points at the finest grid is n. The number of operations
per grid point for one iteration is, say, α. Then the number of operations at the finest
grid h for one iteration is αn. This is termed one work unit. At the next coarsest
level there are approximately n/4 grid points, which means that an iteration at
the second level is only 1/4 of a work unit. Again, an iteration at the third level
corresponds to 1/16 of a work unit. As we would expect, the number of work units
on a coarse grid is less per iteration than on a fine grid. That is, it is cheaper
iterating on a coarse grid rather than on a fine grid. So on the way back from a
coarse grid to the fine grid we may be tempted to go back to the coarse grid once
more before returning to the fine grid. This is called the W-cycle. The two cycles
are illustrated in the figure.
[Figure: the V-cycle and the W-cycle through the grid levels h, 2h, 4h, 8h, and 16h; the residuals r are carried down to the coarser grids and the corrections e are carried back up to the finer grids.]
grid algorithm. We just look at or plot the error on the finest grid versus the
number of work units.
Finally of course, the easiest and most obvious thing to do is to actually start
the iterations with the coarse grid rather than the fine grid.
How do we now apply this to the one-dimensional Euler equations? The first
point to bear in mind is that we always linearise the equations. This multi grid
algorithm can be applied almost directly to the linearised equation. The second
point to bear in mind is that we always transfer the residue from the finest grid to
the coarsest grid and the corrections/solutions from the coarsest grid to the finest
grid. The equations are written in delta form and the test for convergence is always
made on the fine grid, meaning the residue is evaluated on the fine grid.
Assignment 7.5 First redo the Laplace equation solver you have written already
to accelerate convergence using the various multi grid methods.
(7.3.7) \[ S\, \Delta u = \Delta t\, R(t) \]
We will now apply the multi-grid scheme to this equation. The point to be borne
in mind is that we always transfer the residual from the fine grid to the coarse grid
and transfer the correction / solution from the coarse grid to the fine grid.
So given the discrete equation (7.3.7) and an initial condition, we do the following to obtain the solution.
(1) Take a few time steps with the equation (7.3.7).
(2) Compute the residual Rh . The residual can be smoothed if required especially if we have taken only one or two time steps.
(3) Transfer Rh and uh to the grid 2h to get R2h and u2h .
(4) Take time steps on the grid 2h.
(5) Transfer to the grid 4h and so on.
(6) transfer the u back to the fine grid.
(7) repeat this process.
Assignment 7.6 Do the same with the Euler equation solver. Again try out
different levels of grids. Also find the effect of the number of time steps taken on a
given grid level.
For extremely small Reynolds number value flows, the flow is laminar and turns out
to be dominated by viscous effects. The governing equations for a two dimensional
incompressible flow at these Reynolds numbers in fact degenerates to the Laplace
equation in the velocity components. Creep flow, as it is called, throws up a flow
governed by some kind of a potential. As we keep increasing the Reynolds number,
the inertial effects start to grow and the viscous effects slowly start to get important near the surface of the cylinder. Put another way, as the Reynolds number
increases, the viscous effects become less important away from the cylinder. The
perceptible and increasingly strong viscous effects are confined to a region near the
cylinder called the boundary layer. It turns out that this boundary layer can in fact
separate from the cylinder causing a recirculation region to form. So far everything
seems fine. The problem is that the boundary layer and the recirculation region affect the pressure field around the cylinder which in turn affects the boundary layer.
We could hope that all of this settles down to an equilibrium situation giving a
steady state solution. However, in real life the recirculation region may slosh back
and forth responding to the pressure field that it creates, in a sense the system hunting for that equilibrium and never finding it. Worse still, the recirculation region
may break off from the cylinder and head out into the flow. From fluid mechanics
we recognise this as vortex shedding and then the problem is anything but steady.
Look what this vortex shedding did to the pressure field of the Tacoma Narrows
bridge with catastrophic consequences. The bridge problem was complicated by
the fact that the bridge also deformed in response to the pressure distribution.
The conclusion: There may be problems that are inherently unsteady.
How do we compute flows that vary in time? Or, how do we calculate the
transient to a flow that eventually makes it to the steady state. That is, we are
interested in the variation of our flow parameters in time.
The flow past a cylinder example given above is one where we have unsteady
flow without a geometrical change. The bridge example is a case where the unsteady flow entails changes in geometry. We will not address the geometry change
issue here. The unsteady flow problem is one on which whole tomes can actually
be written. When we sought the steady state solution with no reference to the
transient, we were interested in getting to the steady state as quickly as possible.
If the flow had different time scales we did not bother to capture all of them accurately as long as the transients with those scales were quickly eliminated. We
went out of our way to get the time scales comparable to each other so that they
297
could be efficiently eliminated. Now we want to pick up the transients. There may
be many different time scales that are important. This creates an interesting problem as we have already seen that dissipation in our numerical scheme can change
amplitudes and that dispersion can change speeds of propagation. The dissipation
and dispersion can depend on the time scales. On the other hand, the physical
system that we are trying to model may have its own dissipation and dispersion
characteristics. That is, the behaviour of decay and dispersion that we have seen
are not only properties of numerical schemes, they are also exhibited by real life.
Just look at a bag of potato chips: you open a bag, offer your friends some and
then find that all the pieces left are smaller and tending towards crumbs. An initial
even distribution of various length scales of potato chips will on transit from the
manufacturer to your hands be shaken enough times to cause the smallest length
scales to end up at the bottom of the bag. This process is dispersion. It exists.
Clearly, we need a numerical scheme that is accurate and can incorporate/capture
the correct dissipation and dispersion.
Why are these parameters critical? Imagine trying to model a tsunami in the
ocean after an earthquake has occurred. It is important to be able to predict the
magnitude of the tsunami and the time of landfall. Dissipation and dispersion will
introduce errors in both. Consequently, we may give a false alarm which will cause
people to ignore future alarms or we may decide that there is no danger and not
give an alarm when we should.
7.5. Standard Schemes?
Can we use the schemes that we have employed so far with some success to get
time accurate solutions? If you go back to the section on the modified equation,
you will remember that FTBS gave the exact solution when σ = 1. What can
we do to make sure that we get the best possible solution within the constraint of
resources available?
If we were to try setting σ = 1 for the one-dimensional Euler equations, which σ
would we use? Clearly this is not going to be easy. This is not getting us anywhere,
so let us just jump and look at a scheme with which we are already familiar: BTCS.
BTCS has the following advantages
It is unconditionally stable
...
So, clearly, we would like to get a scheme that is second order accurate in time.
This can be done by employing a central difference representation in time. The
Crank-Nicholson method does exactly this.
It should be pointed out here that the BTCS scheme can be viewed as a central
difference scheme for the spatial coordinate with a backward Euler scheme or a
rectangle rule for quadrature in time. Similarly, the Crank-Nicholson scheme is
equivalent to a trapezoidal quadrature rule employed in time, again in conjunction
with a central difference scheme in space. They are also identified as first and
second order Runge-Kutta schemes.
We could go to higher order schemes. There are a whole slew of higher order
Runge-Kutta schemes. There are numerous predictor corrector schemes that are
second order accurate in time.
7.6. Pseudo Time stepping
We have seen that we can march to a steady state solution by marching time
from some guessed initial condition. We have accumulated a lot of analysis tools
and know how for this class of time marching schemes. In fact, we would go as far
as adding a time derivative term of our choice to an equation that is formulated
to be unsteady, just to employ our time marching technology. We will try to bring
that to bear on the problem of solving unsteady flows.
Consider the first order, linear, one-dimensional wave equation again.
\[ \frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} = 0 \]
We will create an extra coordinate called pseudo time, τ, and include a
derivative with respect to τ in our equation.
(7.6.1) \[ \frac{\partial u}{\partial \tau} + \frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} = 0 \]
Now, we can take a second order backward difference in time t and use some
time marching scheme to proceed in τ. In theory, each time we reach some kind of
a steady state in τ, we should have satisfied our unsteady equations.
Let us discretise this equation using central differences for the spatial derivatives.
We will take a three point backward difference for the time derivative. For the sake
of simplicity, we will use a forward difference in pseudo time, τ. This gives us
(7.6.3) \[ \frac{u^{r+1}_{pq} - u^r_{pq}}{\Delta\tau} + \frac{3 u^r_{pq} - 4 u_{p(q-1)} + u_{p(q-2)}}{2\Delta t} + a\, \frac{u^r_{(p+1)q} - u^r_{(p-1)q}}{2\Delta x} = 0 \]
where p, q, and r index the grid point, the physical time level, and the pseudo time level respectively.
We are now able to march in pseudo time using the automaton
(7.6.4) \[ u^{r+1}_{pq} = u^r_{pq} - \Delta\tau \left\{ \frac{3 u^r_{pq} - 4 u_{p(q-1)} + u_{p(q-2)}}{2\Delta t} + a\, \frac{u^r_{(p+1)q} - u^r_{(p-1)q}}{2\Delta x} \right\} \]
which can be rearranged as
(7.6.5) \[ u^{r+1}_{pq} = \frac{\sigma}{2} u^r_{(p-1)q} + \left( 1 - \frac{3}{2}\lambda \right) u^r_{pq} - \frac{\sigma}{2} u^r_{(p+1)q} + D \]
where,
(7.6.6) \[ \sigma = a \frac{\Delta\tau}{\Delta x}, \qquad \lambda = \frac{\Delta\tau}{\Delta t}, \qquad D = \frac{\lambda}{2} \left( 4 u_{p(q-1)} - u_{p(q-2)} \right) \]
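A sketch of the automaton (7.6.5), for a single physical time step, is given below; the periodic boundary treatment, the initial data and the parameter values are assumptions made here purely for illustration.

import numpy as np

a, dx, dt = 1.0, 0.01, 0.01            # wave speed and grid spacings (illustrative)
dtau = 0.5*dt                          # pseudo time step, inside the bound (7.6.18)
sigma, lam = a*dtau/dx, dtau/dt        # the sigma and lambda of equation (7.6.6)

x = np.arange(0.0, 1.0, dx)
u_nm1 = np.sin(2*np.pi*x)              # u at time level q-2
u_n   = np.sin(2*np.pi*(x - a*dt))     # u at time level q-1
u     = u_n.copy()                     # initial guess for level q:  u^0 = u_{q-1}

D = 0.5*lam*(4.0*u_n - u_nm1)          # the D of equation (7.6.6)
for r in range(200):                   # march in pseudo time to a steady state in tau
    up = np.roll(u, -1)                # u_{p+1}, periodic in x
    um = np.roll(u, +1)                # u_{p-1}
    u = 0.5*sigma*um + (1.0 - 1.5*lam)*u - 0.5*sigma*up + D
print(np.max(np.abs(u - np.sin(2*np.pi*(x - 2*a*dt)))))   # compare with the exact profile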
At some intermediate iteration we define the error in our current pseudo time step
solution as
(7.6.8) \[ \epsilon^r_{pq} = u_{pq} - u^r_{pq} \]
where u_{pq} is the solution to which the pseudo time iterations converge. This converged solution satisfies
(7.6.10) \[ \frac{3 u_{pq} - 4 u_{p(q-1)} + u_{p(q-2)}}{2\Delta t} + a\, \frac{u_{(p+1)q} - u_{(p-1)q}}{2\Delta x} = 0 \]
Multiplying through by Δτ, we substitute the resulting equation back into equation
(7.6.9) to get an equation in terms of ε alone.
(7.6.12) \[ \epsilon^{r+1}_{pq} = \epsilon^r_{pq} - \Delta\tau \left\{ \frac{3 \epsilon^r_{pq}}{2\Delta t} + a\, \frac{\epsilon^r_{(p+1)q} - \epsilon^r_{(p-1)q}}{2\Delta x} \right\} \]
Performing the usual stability analysis on this automaton, with ε^r_{pq} = g^r e^{ipθ}, we require
(7.6.13) \[ |g|^2 = \left( 1 - \frac{3}{2}\lambda \right)^2 + \sigma^2 \sin^2\theta < 1 \]
g takes its maximum value for θ = π/2. From here it is clear that
(7.6.14) \[ -\frac{3}{2}\lambda \left( 2 - \frac{3}{2}\lambda \right) + \sigma^2 < 0 \]
would ensure that the automaton is stable.
To see this in terms of our original time step we expand out all the terms to get
(7.6.15) \[ -3 \frac{\Delta\tau}{\Delta t} + \frac{9}{4} \left( \frac{\Delta\tau}{\Delta t} \right)^2 + a^2 \left( \frac{\Delta\tau}{\Delta x} \right)^2 < 0 \]
that is
(7.6.16) \[ \Delta\tau \left( \frac{9}{4\Delta t^2} + \frac{a^2}{\Delta x^2} \right) < \frac{3}{\Delta t} \]
which can be rearranged as
(7.6.17) \[ \frac{\Delta\tau}{\Delta t} < \frac{12}{9 + 4 \left( a \Delta t / \Delta x \right)^2} \]
We really need to derive the modified equation for the discretisation given in equation
(7.6.10) to find out when, if at all, all the dispersive and dissipative terms disappear.
We will for now look at the case when aΔt/Δx = 1. This tells us that we will have a
stable iteration for the pseudo time marching if
(7.6.18) \[ \Delta\tau < \frac{12}{13} \Delta t \]
Just for fun, we could ask the question: when can we take Δτ = Δt? Substituting
back into equation (7.6.17) we get
(7.6.19) \[ \left( a \frac{\Delta t}{\Delta x} \right)^2 < \frac{3}{4} \]
7.6.2. One-Dimensional Euler's Equation. Let us find out how it works
with the one-dimensional Euler's equation
\[ \frac{\partial Q}{\partial \tau} + \frac{\partial Q}{\partial t} + \frac{\partial E}{\partial x} = 0. \]
The naive thing to do is to just add the unsteady term in pseudo time to the
Euler equation as we have just done. This gives us an automaton to generate a
sequence of Q in pseudo time. Every pseudo time step, one needs to then extract
the flow parameters as required to calculate the other terms of the equation in
order to take the next pseudo time step. Since most of our work here is in taking
pseudo time steps and the pseudo time derivative goes to zero anyway, why not add
a term that makes the computation easier? We could instead look at the equation
(7.6.20) \[ \frac{\partial W}{\partial \tau} + \frac{\partial Q}{\partial t} + \frac{\partial E}{\partial x} = 0. \]
where W is some convenient set of state variables like Q, Q̃, and so on.
Let us consider different ways that one could proceed to discretise these equations. First, we can repeat the process used for the linear wave equation. We can
use an implicit scheme in time and an explicit scheme in pseudo time so as to get
an explicit scheme. Again, to illustrate, we will use a second order backward-time
representation of the time derivative to get the following equation.
(7.6.22) \[ \frac{W^{r+1}_{pq} - W^r_{pq}}{\Delta\tau} + \frac{3 Q^r_{pq} - 4 Q_{p(q-1)} + Q_{p(q-2)}}{2\Delta t} + \frac{E^r_{(p+1)q} - E^r_{(p-1)q}}{2\Delta x} = 0. \]
The pseudo time term is explicit in pseudo time, while the physical time derivative
and the flux terms are implicit in physical time but are evaluated at the known
pseudo time level r.
Solving for W^{r+1}_{pq} we get
(7.6.23) \[ W^{r+1}_{pq} = W^r_{pq} - \Delta\tau \left\{ \frac{3 Q^r_{pq} - 4 Q_{p(q-1)} + Q_{p(q-2)}}{2\Delta t} + \frac{E^r_{(p+1)q} - E^r_{(p-1)q}}{2\Delta x} \right\} \]
We can now march in τ, that is, advance in index r till we are satisfied that we
have a steady state in τ. A good initial condition in τ (r = 0) is Q⁰_{pq} = Q_{p(q−1)}.
Or, an explicit step can be taken in physical time to get an initial condition for
time-marching in pseudo time.
A second possible mechanism is to use an implicit scheme in pseudo time. In
this case we rewrite equation (7.6.22) as
(7.6.24) \[ \frac{W^{r+1}_{pq} - W^r_{pq}}{\Delta\tau} + \frac{3 Q^{r+1}_{pq} - 4 Q_{p(q-1)} + Q_{p(q-2)}}{2\Delta t} + \frac{E^{r+1}_{(p+1)q} - E^{r+1}_{(p-1)q}}{2\Delta x} = 0. \]
The flux at the new pseudo time level is linearised about the current level,
(7.6.25) \[ E^{r+1}_{pq} = E^r_{pq} + A_W \Delta W, \qquad A_W = \left. \frac{\partial E}{\partial W} \right|^r_{pq} \]
where ΔW = W^{r+1}_{pq} − W^r_{pq}.
Substituting back into equation (7.6.24) and adding and subtracting 3Q^r_{pq} from the
physical time derivative term we get
\[ \frac{\Delta W^r_{pq}}{\Delta\tau} + \frac{3}{2\Delta t} \left( Q^{r+1}_{pq} - Q^r_{pq} \right) + \frac{3 Q^r_{pq} - 4 Q_{p(q-1)} + Q_{p(q-2)}}{2\Delta t} + \frac{\partial}{\partial x} \left( A_W \Delta W \right) + \frac{\partial E}{\partial x} = 0. \]
If we define the Jacobian P_W = ∂Q/∂W and take the second order time derivative to
the right hand side we get
(7.6.26) \[ \frac{\Delta W^r_{pq}}{\Delta\tau} + \frac{3}{2\Delta t} P_W\, \Delta W^r_{pq} + \frac{\partial}{\partial x} \left( A_W \Delta W \right) = - \left\{ \frac{3 Q^r_{pq} - 4 Q_{p(q-1)} + Q_{p(q-2)}}{2\Delta t} + \frac{\partial E}{\partial x} \right\}. \]
The right hand side is our residue,
(7.6.27) \[ R = - \left\{ \frac{3 Q^r_{pq} - 4 Q_{p(q-1)} + Q_{p(q-2)}}{2\Delta t} + \frac{\partial E}{\partial x} \right\} \]
Multiplying through by Δτ and factoring out the ΔW^r_{pq} we get
(7.6.28) \[ \left\{ I + \frac{3\Delta\tau}{2\Delta t} P_W + \Delta\tau \frac{\partial}{\partial x} A_W \right\} \Delta W = \Delta\tau\, R^r_{pq}. \]
Clearly, most schemes to accelerate convergence, whether it is local time stepping in
τ, residual smoothing, or preconditioning the unsteady term, can all now be included.
For example, preconditioning the unsteady term would simply require the change
of one term in equation (7.6.28) to get
(7.6.29) \[ \left\{ \Gamma_W + \frac{3\Delta\tau}{2\Delta t} P_W + \Delta\tau \frac{\partial}{\partial x} A_W \right\} \Delta W = \Delta\tau\, R^r_{pq}. \]
CHAPTER 8
Closure
If you are right 95% of the time, there is no sense in worrying
about the remaining 3% - Anonymous
Faith is a fine invention
For gentlemen who see;
But microscopes are prudent
In an emergency!
-Emily Dickinson
We have looked at various techniques to represent entities of interest to us. We
have seen how to get estimates of the error in the representation of mathematical
entities. We will now look at this whole process in a larger context.
8.1. Validating Results
How good are the approximations that we generate? How do we know they are
correct? Remember where we started this discussion back in chapter 1. Also bear
in mind what we want to do - answer the two questions that we just asked. We will
recast the discussion from the first chapter in a broader setting.
We have the world around us and in an attempt to understand and predict
this complex world we try to model it. Figure 8.1 tries to capture the processes
involved. The figure consists of arrows going between various shapes. The shapes
have explanatory labels on them. One of the points that I am trying to make
through the figure is that we have only a faint idea of what the real world is like.
The arrow or limb marked as A is the process of perception. Take vision for example.
Our vision is limited to a range of frequencies of electro-magnetic radiation in which
we can see. We call this range that we perceive light. We can't see x-rays, for
example. However, by accident we may discover their existence and try to come up
with an arrangement to see them. Here, once we have a theory for light as electro
magnetic radiation, we can predict the existence of radiation not visible to us. The
fact of the matter is that we usually cannot test limb A in the figure. We may not
know what we do not know. In this case, we do not see the grey shade in the figure,
in fact no one sees it or perceives it and that is that.
We would then take what ever we see and try to model it. If we look at what
we perceive, we see that it is quite complex. We try to model this by throwing
away non-essentials and typically create a mathematical model. In the figure we
choose to represent the mathematical model as a circle. This is to convey as clearly
as possible, the loss of detail and our attempts, very often, to keep the model as
simple as possible. Sometimes, we may be tempted and fascinated by the symmetries exhibited by our mathematical model. The first question to ask is does the
Figure 8.1. How CFD works. The polygon at the end is our
approximation to the circle. The process marked A is perception.
The process marked B may be abstraction. The process C is
the CFD. Finally, the process marked D would be validation
mathematical model represent the reality that we choose to represent? That is,
how do we test limb B of the figure? Normally, we can't answer this question unless
we use it to solve / model a problem that can then be used in an attempt to answer
this question. We usually find that this mathematical model is in itself complicated
and difficult to solve. By making appropriate assumptions and approximations we
end up modelling the mathematical model on the computer. In our case we assume
the Navier-Stokes equations are correct and then proceed to approximate these
equations on the computer using the techniques of CFD to get an algorithm and
techniques of programming to get an actual implementation. The discretisation
process we have studied so far in this book is indicated by the polygon that is used
to approximate the circle. Quite often, we can not get an exact representation of
our mathematical model on the computer and hence we end up approximating the
mathematical model. We ask, how well does the computer model represent the
original mathematical model. This is an important question when we are trying to
develop techniques to solve equations that arise in the modelling of nature. This
also questions the competence of doing limb C. It is complicated by the fact that
one needs to ensure that the program captures the algorithm appropriately.
The other question that is often asked, especially by someone who is interested
in the application of the model to solve problems is: How well does the computer
model represent reality? Here we are short circuiting all of the previous questions
asked before and instead say, Just tell me if the answer mirrors reality? This
would check whether limb D exists or not. This is not always as easy to answer. A
short cut would be to perform experiments and compare the results with the output
of our program. However, always bear in mind that the errors in the experiment
and the errors in the computer model may be such that the results agree. The
issue then is: what confidence do we have that they will agree when we do not have
experimental data?
This is the problem of validation[Roa00]. We now address different ways to
evaluate or validate our programs.
Consider, for example, Laplace's equation. We know from complex analysis
[Ahl79][Chu77] that any analytic function is a solution to Laplace's equation. We
pick an analytic function (say z²) and use the real part as our solution. We
evaluate the solution that we have picked on the boundaries and use this as the
boundary condition for Laplace's equation. We now act as though we did not know
the solution and solve the problem using any of our favourite techniques. Since, we
actually know the solution we can figure out how each solution technique is doing.
What if we did not know any solution? Well, make up a function, say
(8.1.1) \[ u(x, y) = x^2 + y^2 \]
As before one can evaluate this function on the boundaries and obtain the boundary
conditions for the problem. If we were to substitute equation (8.1.1) into Laplace's
equation it will leave a residue. You can check for yourself that the residue is 4.
This tells us that u given by equation (8.1.1) is a solution to
(8.1.2) \[ \nabla^2 u = 4 \]
This does not change the essential nature of our equation. We now have a problem to which we know the solution. We can now compare the computed solution
to the actual solution of this equation.
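As an illustration of the procedure, here is a short sketch: solve equation (8.1.2) on the unit square by point Jacobi iteration, with boundary values taken from u = x² + y², and then measure how far the computed solution is from the function we started with. The grid size, iteration count and use of numpy are arbitrary choices for this sketch.

import numpy as np

n = 41
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
exact = X**2 + Y**2                      # the manufactured solution, equation (8.1.1)

u = np.zeros((n, n))
# boundary conditions come from the chosen solution
u[0, :], u[-1, :], u[:, 0], u[:, -1] = exact[0, :], exact[-1, :], exact[:, 0], exact[:, -1]
h = x[1] - x[0]

for it in range(20000):                  # Jacobi iteration for del^2 u = 4
    u[1:-1, 1:-1] = 0.25*(u[2:, 1:-1] + u[:-2, 1:-1]
                          + u[1:-1, 2:] + u[1:-1, :-2] - 4.0*h*h)
print(np.max(np.abs(u - exact)))         # how well did we recover x^2 + y^2 ?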
How does one pick the solution? Looking at the differential equation, one knows
that the solution to Laplace equation will have second derivatives in the various
coordinate direction. So, we choose functions of our independent variables that
have this property. If we know more, for example, that the equation or the original
problem will allow for some discontinuity. We can then pick such problems. We did
this for the wave equation, when we took the step function as the initial condition.
The choice of functions for the solution can be made due to some very sophisticated reasons. One may want to test the ability of the computer model to pick
up some aspect. As has already been noted, discontinuities are one such artefact.
One could test the ability to pick maxima, minima, the correct dispersion and so
on.
8.2. Computation, Experiment, Theory
The first part of this chapter spoke in generalities. We now look at this triad
that makes up current scientific endeavour. Theory is the abstraction of exploratory
experiment and computation. Abstraction is the process of extracting the general
APPENDIX A
Computers
There are numerous books out there to explain the workings of computers with
various degrees of detail. We will look at an analogy to understand the issues
involved when it comes to high performance computing which is the realm in which
CFD resides.
We will use the student registration system as it was run the first time it
was started at Indian Institute of Technology Madras (IITM). For the sake of the
analogy I have tweaked a few of the details.
The Layout: IITM is located in a 650 acre campus. At the centre is a five
storied central administrative building (adblock). On the fifth floor of
this building is the senate room where students are to be registered for
the new semester.
Cast of Characters: The students are to be registered by the Professor
in Charge of the Unregistered. We will call this person the CPU. The
CPU is a busy person and does not like to be kept idling. The students are
located about a kilometre away in ten different hostels (or dormitories).
Each hostel houses 192 students.
The Program: Each student
(1) needs some discussion with the CPU. (about 4 minutes)
(2) will fill out a registration form. (about 4 minutes)
(3) will hand the form to the CPU who will verify that everything is fine
with the form and sign it. (about 4 minutes)
The Time Element: Each student takes ten minutes to bicycle down to the adblock. It takes a couple of minutes up the elevator into the senate room and then 12 minutes (3 × 4 minutes, one for each of the above steps) to finish registering.
This can be implemented in various ways.
(1) The first simple minded solution is to have the CPU call the hostel. The
student bicycles down and registers after which the next student is called.
Now while the student is bicycling down and taking the elevator the CPU
is idling. Each student takes about 25 minutes to process and the CPU is
going to be stuck there for days.
(2) As it turns out, the lobby of adblock can comfortably handle 32 students.
We could use the institute minibus to fetch 32 people from the hostel to
adblock. The bus has a capacity of 16 students. Corralling the 16 and
bringing them in would take about ten minutes. Now all that the CPU
has to do is call down to the lobby and ask for the next student to be sent
up. The student takes a couple of minutes up the elevator and can then
proceed with the registration. From the CPU's perspective, each student takes about 15 minutes to process. There is still idle time present. When sixteen of the students are done in the lobby they get bussed back to the hostel and the next sixteen replace them.
(3) We can now add a wrinkle to the above scenario. Outside the senate room there is an ante-room which can accommodate four students. Now the CPU only rings a bell, which is a signal to the next student to enter the senate room. In this fashion we have eliminated the ride up the elevator showing up on the CPU's clock.
(4) If we pay attention to the CPU, he/she talks to the student and then waits while the student fills up the form. If, instead of waiting, the CPU gets another student in and talks to the second student while the first is filling up forms, that idle time is eliminated. The CPU can verify the first student's registration form while the second student is busy filling out his/her form. The first student can then be replaced by another student. So, the CPU has a pipeline with at most two students in it at any given time. With this structure or architecture we have reduced the time between students leaving the senate room on completion of registration to about 8 minutes per student instead of 12. Can we do better?
(5) We could have two professors in the senate room. One advises one student after another. The students move over one seat and fill out the registration form. The second professor verifies the forms and signs them. The pipeline has three stages and we get a student out every 4 minutes.
(6) We could organise another pipeline in the conference room adjacent to the senate room, and process 30 students per hour instead of 15 students per hour.
The last is very clearly a great improvement over the first version of about two students per hour. Now the analogy. In the computer, the CPU would be the Central Processing Unit. The hostels would be the computer memory. The minibus would be the memory bus connecting the CPU to the memory. Fetching students sixteen at a time and keeping them waiting in the lobby is called caching. Having a smaller cache on the fifth floor basically means we have a two level cache. Typically in computers the cache on the top floor is called the level one cache or the L1 cache. The cache on the ground floor would be the L2 cache. The cost of not having a data item in the L1 cache is high. The cost of not finding it in the L2 cache is very high.
Another point to consider: can we go to four rooms with four CPUs or eight rooms with eight CPUs? Remember, the elevator can only handle 120 students per hour and the bus can handle about 96 students per hour. There are two lessons we get from this: try to access the data in the order in which it is available in the cache, and remember that the rate at which data can come into the CPU is a strong determinant of the performance of the computer. We cannot just keep on increasing the number of CPUs.
A final note of caution here. The example given above falls into the category of an embarrassingly parallel problem. That is, the registration of one student did not depend on the other. The students could be registered in any order. We have not looked at the flow of the data relevant to each of the students to the CPU. Nor have we looked at the nature of the courses. If courses had constraints on the number of students that can register for a given course, then we start to see complications. How does one deal with two students in two different rooms contending for the last seat in a given course? How does one know the state of the course at any time? For example, how do you know there is only one seat left?
The objective here was to give a reasonable picture of the computer so that
the student can endeavour to write good programs. Warning: If you reached this
appendix because of a reference from chapter 2 you may wish to go back now.
You could continue reading. However, you may encounter terms that are defined
in chapters that follow chapter 2. We can now see how we can go about writing
programs on computers.
A.1. How do we actually write these programs?
As I had indicated in the introduction, we need to bring together different skills for this material: fluid mechanics, mathematics and programming. In an introductory book, the fluid mechanics may be restricted and the mathematics may dominate the discussion. However, you really should code if you want to really understand the process.
So, how does one go about developing these codes? My intent is not to introduce serious design methodologies here. However, using a little common sense, one can easily create functionally correct code, meaning code that works.
I would classify people who write code into five different categories based on an aerospace engineering analogy. Let's look at programming equivalents of designing and building an airplane.
Duh: Code? I know section 212 of the motor vehicles act. Nothing wrong with being in this state. If you want to CFD though, you will need to put in the minimal amount of effort required to switch to the next level. I would suggest that you pick up a relatively high level programming language. My personal preference right now is Python.
Novice: Code is like the Wright flyer: it works. This is fine. You can get by with a little help from your friends. You can learn to CFD, which is one of the aims of this book. On the other hand, any time we start working on something, by the time we are done we usually know a better way to do it, having now had the experience of doing it the first time. I have noticed that people in the novice state (remember, no one was born programming) tend to cling to code that they should discard, since they barely got it working. Practice and paying attention to the process of programming will get you to the next level.
Proficient: I can fly it; not sure whether anyone else can operate it. At this point code writing should not be such a hassle that you cling to any particular piece of code. At this point you can not only learn to CFD, you can actually do decent research in CFD without the need of a collaborator to take care of the coding. This level of proficiency gives you the confidence to check out new algorithms. Now, if you paid attention, you would realise that your confidence may be justified by the fact that you are able to capture many of the mistakes that you make. It will help if you recognise two things:
a. Everyone makes mistakes, no one is infallible.
b. Everyone acquires habits and quirks. If you pay attention while you are learning something (and here I will admit it helps if you have someone to help you out), you can make sure you acquire habits that make you less likely to make an error and, consequently, more efficient.
Reeeeealy Good: My programs are robust. They can be used for scheduled flights of an airline; however, they need to be flown by experienced pilots. You have the right habits. You know that errors occur because of bad programming practice and that you, the individual, have only one responsibility: to learn from every mistake.
Super Programmer: Someone who is absolutely not tech-savvy can use this code; the average person can fly to work. Well, you have the gift, enjoy. Incidentally, I have this little program that I want written... Finally, in this stage it is very easy to stop learning. The process never ends; the day you cease learning is the day you no longer are.
So, where should you be as far as your programming skills are concerned? You really have to be at least at the novice level. You have to be able to create functionally correct code.
Here are some simple rules.
First: Don't panic. Don't look at the whole problem and wonder: How am I ever going to get this done? Where do I begin?
Second: We begin, as always, with the analysis of the problem at hand. This is because we are attempting the creation of something that does not exist till we write it - a program. This is done by synthesis - or putting the parts that make up our program together. To this end, try to see if you can identify what parts and steps make up your problem. For example, if you are looking at solving the one-dimensional first order Euler's equation, we see that
- We need to decide the scheme to be used: say FTCS.
- FTCS requires data at the current time step and will generate data at the next time step. [I need to have an array in which to store my current data and one in which to store the new data.]
- My algorithm will use Q and E, whereas I am interested in Q̃. All of these may have to be stored at each grid point.
- Need to apply boundary conditions on the left and right.
- Need to be able to take one FTCS step for all the interior grid points.
- Need to take a lot of time steps and then decide when we are done.
- Need to visualise results or get some idea as to the state of the code.
- Clearly, we need to interact with the user.
- What programming language should I use? [Anything in which you are proficient will do for now; remember the Wright flyer. You want to develop a functionally correct program, not some super-duper code.]
The actual process is a little more involved. This will suffice for now.
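To tie the checklist above together, here is one possible minimal sketch in numpy. The flat-array layout, the grid size and the time step are assumptions made only for this illustration; the classes used later in this appendix store the values point by point instead.

import numpy

Gamma = 1.4
Gm1 = Gamma - 1.

N  = 101                       # number of grid points (illustrative)
dx = 1.0 / (N - 1)
dt = 1.0e-4                    # a small, fixed time step (illustrative)

Q    = numpy.zeros((N, 3))     # conservative variables at time level n
Qnew = numpy.zeros((N, 3))     # conservative variables at time level n+1
E    = numpy.zeros((N, 3))     # flux at time level n
# Q must be initialised to the flow at t = 0 before taking any steps.

def SetFlux(Q, E):
    """Fill E from Q: E = (rho u, rho u^2 + p, (rho Et + p) u)."""
    u = Q[:, 1] / Q[:, 0]
    p = Gm1 * (Q[:, 2] - 0.5 * Q[:, 1] * u)
    E[:, 0] = Q[:, 1]
    E[:, 1] = Q[:, 1] * u + p
    E[:, 2] = (Q[:, 2] + p) * u

def FTCSStep(Q, Qnew, E):
    """One FTCS step on the interior points for Q_t + E_x = 0."""
    SetFlux(Q, E)
    Qnew[1:-1] = Q[1:-1] - 0.5 * dt / dx * (E[2:] - E[:-2])
    # Boundary conditions on Qnew[0] and Qnew[-1] go here.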
Third: Extract out the stuff to be done. Sort it out so that you can identify things that can be done directly.
Fourth: You need to decide how you will handle Q and Q̃. In Python they could be implemented as classes as follows:
(1) ConsFlowParm represents Q,
import numpy

Gamma = 1.4 # Cp / Cv
Gm1 = Gamma - 1.

class ConsFlowParm:
    """ Manages the conservative flow parameters"""
    def __init__( self ):
        self.Q = numpy.zeros(3)
    def Rho( self ):
        return self.Q[0]
    def RhoU( self ):
        return self.Q[1]
    def RhoEt( self ):
        return self.Q[2]
    def U( self ):
        return self.Q[1] / self.Q[0]
    def P( self ):
        # p = (gamma - 1)(rho Et - rho u^2 / 2)
        return Gm1*(self.Q[2]-0.5*self.Q[1]*self.U())

class NonConsFlowParm:
    """Manages non-conservative flow parameters"""
    def __init__( self ):
        self.Q_ = numpy.zeros(3)
    def Rho( self ):
        return self.Q_[0]
    def U( self ):
        return self.Q_[1]
    def P( self ):
        return self.Q_[2]

class FluxE:
    """Manages the x-component of the flux"""
    def __init__( self ):
        self.E = numpy.zeros(3)
    def MassFlux( self ):
        return self.E[0]
    def MomFlux( self ):
        return self.E[1]
    def EnergyFlux( self ):
        return self.E[2]
    def SetFlux( self, Q ):
        self.E[0] = Q.RhoU()
        self.E[1] = Q.RhoU() * Q.U() + Q.P()
        self.E[2] = ( Q.RhoEt() + Q.P() ) * Q.U()
    def GetFlux( self ):
        return self.E
What do all of these items have in common? They have nothing to do with grids
or the kind of solver one is going to use. In fact they form a part of a core structure
that one should implement. Each of these functions should be tested. Once you
know they work, set them aside to be used by any of your solvers. Make sure you
test them thoroughly.
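One way to test them is with a few hand-picked values, assuming the classes and Gm1 defined above are available in the same file or session; the numbers below are arbitrary.

# A quick check of ConsFlowParm and FluxE with hand-picked values.
rho, u, p = 1.0, 2.0, 1.4

q = ConsFlowParm()
q.Q[0] = rho
q.Q[1] = rho * u
q.Q[2] = p / Gm1 + 0.5 * rho * u * u   # rho Et built from p, rho and u

assert abs(q.U() - u) < 1e-12
assert abs(q.P() - p) < 1e-12

flux = FluxE()
flux.SetFlux(q)
assert abs(flux.MassFlux() - rho * u) < 1e-12
assert abs(flux.MomFlux() - (rho * u * u + p)) < 1e-12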
Sixth: We will now need to create two arrays of Q, E and Q̃. Where are these arrays created? Right now that can be in your main program. What is the size of these arrays? We may need to interact with the user to find out.
Seventh: Determine all the interactions with the user. Try to keep all of this in one place in your program. Typical interactions for the one-dimensional Euler's equation solver would be
- Physical extent of the problem?
- Inlet condition: Subsonic? Data?
- Exit condition: Subsonic? Data?
- How many grid points?
- Convergence criterion: How many time steps?
This interaction is simple enough that it may be done through one function.
Eighth: Go ahead and create all the arrays. The ones at the current time step need to be initialised. The creation and initialisation can be done in a function.
Ninth: Now we can write the main part of the code. Write a function to find the time step, given Q. Write a function that, given Q at the current time step, takes a step using our chosen scheme.
Tenth: Code the program that will call all of these functions to achieve our
goal.
class FlowManager:
    """Manages the full problem...uses some solver"""
    def __init__( self ):
        self.N = input( " How many grid points \
including the boundary? " )
        # Should really generalise this later
        # to take care of supersonic and other
        # SubSonicInlet and SubSonicExit are boundary condition
        # classes assumed to be defined elsewhere.
        self.InletBC = SubSonicInlet()
        self.ExitBC = SubSonicExit()
        self.x = []
        self.dx = []
        self.Qlist = []
        self.Q_list = []
        self.E_list = []
(A.1.1)
\frac{d}{dt}\int_V Q \, dV = -\oint_S \vec{f} \cdot \hat{n} \, dS

If we were to discretise our domain into small volumes, each with volume V and m faces with area \vec{A}_i = s_i \hat{n}_i [no sum over i], we could write for a volume

(A.1.2)
V \frac{dQ}{dt} = -\sum_{i=1}^{m} \vec{f}_i \cdot \vec{A}_i = -F(Q)

where Q now represents the mean values in the volume or cell and F(Q) represents the net efflux from the volume.
We are given the initial condition Q^0. In order to take one time step of magnitude ∆t we do the following. Note: read the symbol ∀ as "for all".

Given Q^0, compute \vec{f}^0 and the artificial dissipation D^0.
(1) ∀ volumes, use the given Q^n, \vec{f}^n and D^n:
    (a) Compute G(Q^n) = F(Q^n) + D^n.
    (b) Set RQ = G.
    (c) Compute Q^* = Q^n - 0.5 ∆t G(Q^n).
    (d) Compute \vec{f}^*.
(2) ∀ volumes:
    (a) Compute G(Q^*) = F(Q^*) + D^n.
    (b) Set RQ = RQ + 2G.
    (c) Compute Q^{**} = Q^n - 0.5 ∆t G(Q^*).
    (d) Compute \vec{f}^{**}.
(3) ∀ volumes:
    (a) Compute G(Q^{**}) = F(Q^{**}) + D^n.
    (b) Set RQ = RQ + 2G.
    (c) Compute Q^{***} = Q^n - ∆t G(Q^{**}).
    (d) Compute \vec{f}^{***}.
(4) ∀ volumes:
    (a) Compute G(Q^{***}) = F(Q^{***}) + D^n.
    (b) Set RQ = RQ + G.
    (c) Compute Q^{n+1} = Q^n - ∆t RQ / 6.
    (d) Compute \vec{f}^{n+1}.
    (e) Compute D^{n+1}.
(5) ∀ volumes, compute the square of the error contributed by that volume to the whole as

(A.1.3)
E_{sqr} = \sum_i \left( \frac{RQ_i}{Q_i} \right)^2
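The accumulation of RQ with weights 1, 2, 2, 1 followed by the division by six is the classical fourth order Runge-Kutta combination. A minimal sketch on a scalar model equation dq/dt = -G(q), my own illustration with the dissipation left out, shows the same pattern in a few lines; the RK4FlowManager class that follows applies the idea to the full set of volumes.

def rk4_step(q, dt, G):
    """One step of the 1, 2, 2, 1 accumulation above, for dq/dt = -G(q)."""
    RQ = 0.0
    g = G(q);   RQ += g;        qs = q - 0.5 * dt * g
    g = G(qs);  RQ += 2.0 * g;  qs = q - 0.5 * dt * g
    g = G(qs);  RQ += 2.0 * g;  qs = q - dt * g
    g = G(qs);  RQ += g
    return q - dt * RQ / 6.0

# Example: dq/dt = -q, whose exact solution is q(t) = exp(-t).
q, dt = 1.0, 0.1
for n in range(10):
    q = rk4_step(q, dt, lambda q: q)
print(q)          # about exp(-1) = 0.367879...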
import copy

OneSixth = 1. / 6.

class RK4FlowManager(FlowManager):
    """
    A flow manager that uses the four step
    Runge-Kutta method to march in time.
    """
    def __init__( self ):
        if hasattr( FlowManager, "__init__" ):
            FlowManager.__init__( self )
        self.Qnew = copy.deepcopy( self.Qlist )

    def IntermediateStep( self, L, alpha ):
        # SingleStep and the dissipation functions are defined elsewhere.
        dQ = self.SingleStep( self.Qnew, L )
        dQ2 = SecondOrderDissipation( self.Qnew )
        dQ4 = FourthOrderDissipation( self.Qnew )
        for i in range( len( self.Qlist )):
            self.Qnew[i].Q = self.Qlist[i].Q -\
                ( dQ[i] - 0.01*dQ2[i] + 0.001 * dQ4[i] ) * alpha
        self.Qnew = self.InletBC.ApplyBC( self.Qnew )
        self.Qnew = self.ExitBC.ApplyBC( self.Qnew )
A.2. Programming
I will just collect things here as I remember them and then collate later. As computers evolve, some of these guidelines will change, even the ones that start with "Never".
- Try not to subtract [I should say: try not to add things of opposite sign].
- Add in preference to multiplying.
- Multiply in preference to dividing.
- Square roots are expensive; transcendentals are worse.
- Store multidimensional arrays as vectors.
- If you insist on using a multidimensional array, at least access it the same way that your computer stores it.
- Readability matters. This means the structure of your code is transparent, the choice of names (variables or otherwise) is logical and informative, and all of these do what they claim to do and nothing else.
- Document! When an arbitrary choice is made (example: store two dimensional arrays row wise) - document! Do not drown the code and logic in documentation.
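As a small illustration of the two points about array storage, the following compares walking a two-dimensional numpy array in its storage order with walking it across the stride; the array size and the timing method are arbitrary choices for this demonstration.

import time
import numpy

n = 4000
a = numpy.zeros((n, n))          # stored row by row (C order)

t0 = time.time()
s = 0.0
for i in range(n):               # row-wise: walks memory in storage order
    s += a[i, :].sum()
t1 = time.time()
for j in range(n):               # column-wise: strided access, cache unfriendly
    s += a[:, j].sum()
t2 = time.time()

print("row-wise   :", t1 - t0)
print("column-wise:", t2 - t1)   # usually noticeably slower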
A.3. Parallel Programming
(A.3.1)
\phi^{n+1}_i = \frac{\phi^{n+1}_{i+1} + \phi^{n+1}_{i-1} + \phi^{n+1}_{i+N} + \phi^{n+1}_{i-N}}{4}
We have captured an inherent parallelism in the Gauss-Seidel algorithm for
this equation. On the other hand, if we would like to keep it simple, we could use
Gauss-Jordan instead. Since the values are updated only after the full sweep, there
is no coupling between the computation of the steps.
If you pay attention to what we are doing (see equations (A.3.1), (A.3.2)), we are performing the same operations or instructions. The data or domain is divided to provide the parallelism. This clearly falls into the SIMD [Single Instruction Multiple Data] or SPMD [Single Program Multiple Data] category. The process we are following is called domain decomposition. Once we realise that we are doing domain decomposition we can ask: are there other ways of decomposing the domain to get our parallelism?¹
(A.3.2)
\phi^{n+1}_i = \frac{\phi^{n+1}_{i+1} + \phi^{n+1}_{i-1} + \phi^{n+1}_{i+N} + \phi^{n+1}_{i-N}}{4}

¹I hope this situation here looks familiar. The domain is like the support of a function.
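To picture the decomposition in code, here is a sketch of the simpler full-sweep update mentioned above, in which the new values are computed entirely from the old ones, so the two halves of the domain could be handed to two different processors. The grid size, the split into a left and a right half, and the names are assumptions made only for this illustration.

import numpy

N = 36
phi = numpy.zeros((N, N))
phi[0, :] = 1.0                      # some boundary condition, for illustration

def sweep(old, j0, j1):
    """Update interior columns j0..j1-1, reading only the old values."""
    return 0.25 * (old[:-2, j0:j1] + old[2:, j0:j1]
                   + old[1:-1, j0 - 1:j1 - 1] + old[1:-1, j0 + 1:j1 + 1])

# The two halves are independent and could run on two processors.
left  = sweep(phi, 1, N // 2)
right = sweep(phi, N // 2, N - 1)
phi[1:-1, 1:N // 2]     = left
phi[1:-1, N // 2:N - 1] = right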
APPENDIX B
z = x + iy, where i = \sqrt{-1},
z_1 = x_1 + iy_1, \quad z_2 = x_2 + iy_2,
|z|^2 = z\bar{z} = x^2 + y^2

|z| is our idea of the Euclidean distance of the point (x, y) from the origin. Since it is the distance from the origin we are talking about, the position vector \vec{r} has magnitude |\vec{r}| = r = |z|. This suggests that we can write the complex number z in polar coordinates as

(B.1.4)
z = x + iy = re^{i\theta}, \qquad \bar{z} = x - iy = re^{-i\theta}

Figure B.1. The complex plane or the Argand diagram. An arbitrary point z = x + iy and its conjugate are shown. The polar form is also indicated.
Here \theta is the angle measured from the positive x-axis to the position vector. This is called the polar form of the complex number. Both forms of representing a complex number are useful. When adding or subtracting, the standard form is useful. When one needs to multiply complex numbers, the polar form is very often easier to handle. The elementary operations in the standard form are summarised below.

operation      result
z_1 + z_2      (x_1 + x_2) + i(y_1 + y_2)
z_1 - z_2      (x_1 - x_2) + i(y_1 - y_2)
z_1 z_2        (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + y_1 x_2)
z_1 / z_2      ((x_1 x_2 + y_1 y_2) + i(x_2 y_1 - x_1 y_2)) / (x_2^2 + y_2^2)

The identity

(B.1.5)
e^{i\theta} = \cos\theta + i\sin\theta

is called Euler's formula.
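A quick numerical check of the polar form can be done with Python's cmath module; the two numbers below are arbitrary.

import cmath

z1 = 3 + 4j
z2 = 1 - 2j

r1, theta1 = cmath.polar(z1)
r2, theta2 = cmath.polar(z2)

# Multiplication: magnitudes multiply, angles add.
w = cmath.rect(r1 * r2, theta1 + theta2)
print(w, z1 * z2)        # both are (11-2j), up to round off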
B.2. Matrices
A matrix is a mathematical mechanism to organise and operate upon n-tuples.
For example, the complex numbers we have just encountered consist of a real part
and an imaginary part which is basically two numbers or a double. A position
vector in three spatial dimensions would consist of three numbers or a triple.
Along these lines we could represent n distinct numbers or an n-tuple. Mathematically, if an entity can be represented as a matrix, it is said to have a representation.
You have encountered many entities made up of n-tuples just like the examples
given above. We will now look at how matrices organise and operate upon this data.
A triple, as we encounter in vector algebra, can be organised in two possible ways in the matrix world: either a column matrix (column vector) or a row matrix (row vector). This is shown in the following equation

(B.2.1)
Row Matrix: (x \; y \; z), \qquad Column Matrix: \begin{pmatrix} x \\ y \\ z \end{pmatrix}
The row matrix is said to be the transpose of the column matrix and vice-versa. This is indicated as follows

(B.2.2)
(x \; y \; z)^T = \begin{pmatrix} x \\ y \\ z \end{pmatrix}, \qquad \begin{pmatrix} x \\ y \\ z \end{pmatrix}^T = (x \; y \; z)

The superscript T is the transpose operator and its action is to generate the transpose of a matrix. If we have two vectors \vec{a} and \vec{b} represented by the matrices (a_1 \; a_2 \; a_3) and (b_1 \; b_2 \; b_3), our conventional operation of the dot product of two vectors defines the matrix product between a row matrix and a column matrix in that order as

(B.2.3)
a_1 b_1 + a_2 b_2 + a_3 b_3 = (a_1 \; a_2 \; a_3) \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} = (b_1 \; b_2 \; b_3) \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}

Something that we would write as \vec{a} \cdot \vec{b} we would write in matrix form as \vec{a}\vec{b}^T. Bear in mind that I have defined the vectors as row vectors. If I had defined them as column vectors then we would have \vec{a}^T\vec{b}. Simply put, we take each element of the row and the corresponding element of the column, multiply them, and accumulate the products to get the product of a row matrix and a column matrix.
We can build other matrices from our row matrices (all of the same size, of course) by stacking them up. If you stack up as many row matrices as there are elements in the row, we get a square matrix. When we use the term matrix in this book, we will be referring to a square matrix. We will use two subscripts to index the matrix, for example a_{ij}. The first subscript, i, indicates which row vector we are referring to; the second one, j, picks the element out of that row vector.
(B.2.4)
\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,j-1} & a_{1,j} & a_{1,j+1} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} &         &        &           & a_{2,j} &           &        & a_{2,n} \\
a_{3,1} &         & a_{3,3} &        &           & a_{3,j} &           &        & a_{3,n} \\
\vdots  &         &         & \ddots &           & \vdots  &           &        & \vdots  \\
a_{i-1,1} &       &         &        &           & a_{i-1,j} &         &        & a_{i-1,n} \\
a_{i,1} & a_{i,2} & a_{i,3} & \cdots & a_{i,j-1} & a_{i,j} & a_{i,j+1} & \cdots & a_{i,n} \\
a_{i+1,1} &       &         &        &           & a_{i+1,j} &         &        & a_{i+1,n} \\
\vdots  &         &         &        &           & \vdots  &           & \ddots & \vdots
\end{pmatrix}
If the row vectors that make up the matrix are independent of each other, we have a matrix that is not singular. If we are given a vector \vec{x} we can project it onto these independent vectors to get a kind of a projection \vec{b} onto the space spanned by these vectors. If we labelled this matrix as A then we could write

(B.2.5)
A\vec{x} = \vec{b}
For example, given the matrices

(B.2.6)
C = \begin{pmatrix} 5 & 2 & 7 \\ 0 & 11 & 1 \\ 3 & 8 & 10 \end{pmatrix}, \qquad \vec{x} = \begin{pmatrix} 4 \\ 6 \\ 9 \end{pmatrix}

how would we find \vec{y} = C\vec{x}?

(B.2.7)    y_1 = 5 \cdot 4 + 2 \cdot 6 + 7 \cdot 9 = 95
(B.2.8)    y_2 = 0 \cdot 4 + 11 \cdot 6 + 1 \cdot 9 = 75
(B.2.9)    y_3 = 3 \cdot 4 + 8 \cdot 6 + 10 \cdot 9 = 150

Inspection of equations (B.2.7)-(B.2.9) shows us that they could also be rewritten as

(B.2.10)
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \\ 3 \end{pmatrix} 4 + \begin{pmatrix} 2 \\ 11 \\ 8 \end{pmatrix} 6 + \begin{pmatrix} 7 \\ 1 \\ 10 \end{pmatrix} 9.
That is, \vec{y} can also be thought of as a linear combination of the vectors formed by the columns of the matrix. A given square matrix can be viewed as a set of row vectors stacked on top of each other, which span a space called the row space of that matrix, or as a set of column vectors placed side-by-side, which span a space called the column space of that matrix. Now that we see that the square matrix can also be treated as a composition of column vectors, we can write, analogous to our previous operation, \vec{z} = \vec{x}^T C. You can verify that

(B.2.11)
(4 \; 6 \; 9) \begin{pmatrix} 5 & 2 & 7 \\ 0 & 11 & 1 \\ 3 & 8 & 10 \end{pmatrix} = (47 \; 146 \; 124)
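The two products can be checked with a couple of lines of numpy, purely as a convenience; the numbers are the ones used above.

import numpy

C = numpy.array([[5., 2., 7.],
                 [0., 11., 1.],
                 [3., 8., 10.]])
x = numpy.array([4., 6., 9.])

print(C @ x)        # [ 95.  75. 150.]
print(x @ C)        # [ 47. 146. 124.]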
We have so far applied the transpose operation to vectors. We can also perform this operation on any general matrix. To take the transpose of a square matrix A with entries a_{ij}, we just swap the element a_{ij} with the element a_{ji} for all i < j.
Assignment 2.2
(1) Given the matrix

(B.2.12)
A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}

find \vec{y} using the equation \vec{y} = A\vec{x} for various values of \vec{x} lying on a unit circle centred at the origin. The easiest way to pick various values of \vec{x} is to take its components to be \cos\theta and \sin\theta for various values of \theta. Graph \vec{x} and the corresponding \vec{y}. What can you say about the relationship of \vec{x} to \vec{y}? Do the same for \vec{z} = \vec{x}^T A.
(2) Repeat problem one with points taken from a circle of radius two instead of a unit circle.
(3) Given the matrix

(B.2.13)
B = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}

repeat the exercise in the first problem. What is the difference between the two matrices?
(4) Finally, repeat these exercises for the matrices

(B.2.14)
C = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}, \qquad D = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}
You should have seen from the assignment that, for a unit vector \vec{x}, the effect of A is to rotate the vector and to scale it to obtain the vector \vec{y}. This is true even in the case of \vec{z}. We will discuss this assignment, starting with the last problem.
The matrix C is a bit strange. It rotated every vector enough to get it pointed in the same direction and scaled it in that direction. Effectively, all the points on the plane collapse onto a line. Taking two independent vectors \vec{x} no longer gives you two independent \vec{y}s. The matrix is said to be singular. This happens as the two vectors that are stacked to form the matrix are linearly dependent on each other. In fact, in this case they were chosen so that one has components which are twice the other. You can confirm this. The easiest way to find out if two vectors are parallel to each other is to take the cross product. For a 2 \times 2 matrix this is called the determinant of the matrix. In general, for a matrix A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, the determinant can be written as

(B.2.15)
\det A = \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc
You will need to know how to find determinants of larger matrices; you can follow up on references [Str06], [Kum00], [BW92], [Hil88]. If we have three independent vectors, we could find the volume of the parallelepiped that has those three vectors as edges. The volume of the parallelepiped is the determinant of the matrix made up of those vectors. So, we have

(B.2.16)
\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
Now, let us get back to the fourth problem of the assignment. The matrix D did stretch every vector. It did not rotate any vector. Its action is said to be spherical in nature. The matrix is variously referred to as an isotropic matrix or a spherical matrix. We have employed the last problem to define determinants and spherical matrices. Let's see what we can get from the first two problems of the assignment.
In the first problem, before we look at the action of the matrix, let us make an observation about its structure. You will notice that A = A^T. Such a matrix is said to be a symmetric matrix. Just for completeness, a matrix which is identical to the negative of its transpose, that is A = -A^T, is called a skew-symmetric matrix.
Now let us look at the action of the matrix on various vectors. What was the effect or action of A on the unit vector at an angle of 45 degrees from the x-axis? It went through a pure stretch and no rotation. The effect of multiplying by a matrix in that direction turned out to be the same as multiplying by a scalar. To check this we need to look at the second problem. We see that the magnitude in this direction again doubled. Along \vec{x}_{(45)}, matrix multiplication is like scalar multiplication. That is

(B.2.17)
A\vec{x}_{(45)} = 4\,\vec{x}_{(45)}, \qquad \vec{x}_{(45)} = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}
Is there another direction like this where the action of the matrix A is a pure stretch? If you haven't found it, check out the vector along the line through the origin at an angle of 135 degrees from the x-axis. Such a direction is called a characteristic direction of A or an eigenvector of A. Corresponding to the direction is the amount of stretch; did you check the stretch for the 135 degree direction? The amount of the stretch is called a characteristic value or an eigenvalue. The two matrices A, B given above each have two eigenvectors, and each eigenvector has a corresponding eigenvalue. The student should recollect that we have encountered the idea of a characteristic direction before. In that case the partial differential equation reduced to an ordinary differential equation along the characteristic direction.
In general, equation (B.2.17) gives us the clue to finding these directions. We rewrite it, since we do not know the characteristic directions and would like to find out what they are. So, in general, we want to find a number \lambda_i and a corresponding direction \vec{x}_i such that

(B.2.18)
A\vec{x}_i = \lambda_i \vec{x}_i

The subscript i is used to remind ourselves of the correspondence between the stretch \lambda_i and the eigenvector \vec{x}_i. We can rearrange this equation as

(B.2.19)
(A - \lambda_i I)\vec{x}_i = \vec{0}
Now, one obvious \vec{x} that satisfies this equation is the zero vector. This is of no use to us. We could hope for a non-zero \vec{x} if the matrix multiplying it were singular. The matrix is singular when

(B.2.20)
|A - \lambda I| = 0

This gives us the so-called characteristic polynomial. We will work it out for the matrix from the first problem in assignment B-2. To find the eigenvalues and the corresponding eigenvectors of the matrix A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}, solve the following equation for \lambda:

(B.2.21)
\begin{vmatrix} 3-\lambda & 1 \\ 1 & 3-\lambda \end{vmatrix} = 0

(B.2.22)
\lambda^2 - 6\lambda + 8 = 0

Fortunately, the left hand side of equation (B.2.22) can be factored as (\lambda - 2)(\lambda - 4), which gives us two eigenvalues

(B.2.23)
\lambda_1 = 2 \quad \text{and} \quad \lambda_2 = 4

Now we need to find the corresponding eigenvectors \vec{x}_1 and \vec{x}_2. Let us find \vec{x}_1 using \lambda_1. We can do this from equation (B.2.19). If we were to substitute \lambda_1 = 2, we would make the matrix singular. As a consequence we cannot get both the components of \vec{x}_1 independently. We rewrite equation (B.2.19) here for convenience after the substitution:

(B.2.24)
\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_{1,1} \\ x_{2,1} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}

where x_{1,1} is the first component of the first eigenvector and x_{2,1} is the second component of the first eigenvector. Solving any one of the equations gives x_{1,1} = -x_{2,1} (normally I solve both equations; I check my algebra by solving the second equation and making sure that I get the same answer as from the first). In this case, if x_{1,1} = a, then an eigenvector corresponding to \lambda = 2 is (a, -a). We can set the vector \vec{x}_1 to the unit vector in that direction as

(B.2.25)
\vec{x}_1 = \begin{pmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \end{pmatrix}
Assignment 2.3
It is advisable to do these calculations employing a calculator instead of programming them or using a package to find the eigenvalues and eigenvectors. Pay
attention to the relationship between the matrices in the two problems and the way
they relate to the eigenvalues and eigenvectors.
(1) Find the eigenvalues and eigenvectors for the matrices given in the previous assignment B-2.
(2) Repeat problem one for the transpose of the matrices from the previous
assignment.
Now, for each of the matrices, you have a matrix X whose entries x_{i,j} provide the i-th component of the j-th eigenvector corresponding to the eigenvalue \lambda_j. Notice that the eigenvectors are being stacked as columns, side-by-side. From the definition given in equation (B.2.18) we can write

(B.2.26)
AX = X\Lambda

where \Lambda is a diagonal matrix having the \lambda_j on the diagonal. Make sure you understand the right hand side of the equation; X\Lambda \neq \Lambda X. If we pre-multiply equation (B.2.26) by X^{-1}, we get

(B.2.27)
X^{-1}AX = \Lambda, \qquad X^{-1}A = \Lambda X^{-1}
This equation looks something like equation (B.2.26), but not quite. In general, if we have a matrix R_e whose columns are made up of vectors \vec{r}_e such that

(B.2.29)
AR_e = R_e\Lambda,

the vectors \vec{r}_e are called the right eigenvectors of A. In a similar fashion, if the matrix L_e is a matrix whose rows are made up of vectors \vec{l}_e such that

(B.2.30)
L_eA = \Lambda L_e,

the vectors \vec{l}_e are called left eigenvectors. Using equations (B.2.29) and (B.2.30) we can now write

(B.2.31)
A = R_e \Lambda R_e^{-1} = L_e^{-1} \Lambda L_e

which also tells us that L_e = R_e^{-1}.
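Once the hand calculations are done, relations like these are easy to confirm with numpy. A minimal sketch, using the matrix A from the worked example above:

import numpy

A = numpy.array([[3., 1.],
                 [1., 3.]])

lam, Re = numpy.linalg.eig(A)        # eigenvalues and right eigenvectors (columns)
Lam = numpy.diag(lam)

print(lam)                                                   # 4 and 2 (order may differ)
print(numpy.allclose(A @ Re, Re @ Lam))                      # A Re = Re Lambda
print(numpy.allclose(A, Re @ Lam @ numpy.linalg.inv(Re)))    # A = Re Lambda Re^-1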
Assignment 2.4
(1) For the matrices A and B given in the earlier assignment find the left and
right eigenvectors.
Is there a way for us to get an estimate of the magnitude of the eigenvalues? Yes, there is. There is an extremely useful result called Gershgorin's circle theorem. If we have a matrix A and we partition it into two as follows,

(B.2.33)
A = D + F,

where D is a diagonal matrix made up of the diagonal entries d_i of A and F contains the off-diagonal entries, then every eigenvalue of A lies within at least one of the circles in the complex plane centred at d_i, with radius given by the sum of the magnitudes of the off-diagonal entries of the corresponding row; all the eigenvalues are contained in the union of those discs. Of course, if d_i = 0 then the circle is centred at the origin. To make sure you understand the meaning of this theorem, try out the following exercise.
Assignment 2.5
Find / draw the circles as indicated by Gershgorin's theorem for the matrices given below. Find the eigenvalues.

(1) \begin{pmatrix} 3 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 3 \end{pmatrix}

(2) \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & a & 0 \end{pmatrix}, \quad a = 1, 2, 3
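A small numerical illustration of the theorem follows; the matrix here is my own choice, not one of the assignment matrices.

import numpy

M = numpy.array([[ 4.0, 1.0,  0.0],
                 [-1.0, 6.0,  2.0],
                 [ 0.0, 0.5, -3.0]])   # an arbitrary example matrix

centres = numpy.diag(M)
radii   = numpy.abs(M).sum(axis=1) - numpy.abs(centres)   # row sums of |off-diagonals|

for lam in numpy.linalg.eigvals(M):
    in_a_disc = (numpy.abs(lam - centres) <= radii).any()
    print(lam, "inside a Gershgorin disc:", in_a_disc)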
Matrices are a popular mode of representation for mathematical entities. This is so extensive that a mathematical entity is said to have a representation if it has a matrix representation. You will see in chapter 5 that tensors can be represented by matrices. As an example for now, consider the matrices

(B.2.35)
1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad i = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}

What do these matrices represent? Evaluate the product ii and find out. Did you get it to be -1? So, what does 5 \cdot 1 + 2i represent? Check out appendix B.1. A note of caution here. A mathematical entity may have a representation. That does not mean that every matrix is a representation. Specifically, a complex number can be written in terms of 1 and i, but not every matrix represents a complex number; \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} is an obvious example. In a similar fashion, every tensor may have a matrix representation, but not every matrix represents a tensor.
Since we have so many other applications of matrices, we will look at some useful decompositions of matrices. Any square matrix A can be written as the sum of
(1) a symmetric matrix and a skew-symmetric matrix,

(B.2.36)
A = \frac{1}{2}\left( A + A^T \right) + \frac{1}{2}\left( A - A^T \right)

(2) a spherical and a deviatoric matrix,

(B.2.37)
A = \frac{1}{3}\,tr(A)\, I + \left( A - \frac{1}{3}\,tr(A)\, I \right)
Assignment 2.6
Perform the decomposition on a few of the matrices given in the previous assignments.
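A hand computation can be confirmed with a few lines of numpy; the matrix below is an arbitrary choice for the illustration.

import numpy

A = numpy.array([[1., 2., 0.],
                 [4., 3., 7.],
                 [5., 6., 9.]])                # any square matrix will do

sym  = 0.5 * (A + A.T)                         # symmetric part
skew = 0.5 * (A - A.T)                         # skew-symmetric part

sph  = numpy.trace(A) / 3.0 * numpy.eye(3)     # spherical part
dev  = A - sph                                 # deviatoric part

print(numpy.allclose(A, sym + skew))           # True
print(numpy.allclose(A, sph + dev))            # True
print(numpy.trace(dev))                        # essentially zero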
(B.3.2)
\langle \sin x, \sin x \rangle = \int_0^{2\pi} \sin^2 x \, dx = \int_0^{2\pi} \frac{1 - \cos 2x}{2} \, dx = \pi

So, the norm of sin on the interval (0, 2\pi) is \sqrt{\pi}. Is it obvious that this is the same for cos? Check it out. What about the norm for an arbitrary wave number n?

(B.3.3)
\langle \sin nx, \sin nx \rangle = \int_0^{2\pi} \sin^2 nx \, dx = \int_0^{2\pi} \frac{1 - \cos 2nx}{2} \, dx = \pi
Since the norms of all of these functions are the same, we have an interesting possibility here. We could redefine our dot product so that the functions have unit norm. We redefine the dot product as

(B.3.4)
\langle f, g \rangle = \frac{1}{\pi}\int_0^{2\pi} f g \, dx

I have boxed the definition so that it is clear that this is the one that we will use. Now, check whether sin and cos are orthogonal to each other:

(B.3.5)
\langle \sin x, \cos x \rangle = \frac{1}{\pi}\int_0^{2\pi} \sin x \cos x \, dx = \frac{1}{\pi}\int_0^{2\pi} \frac{\sin 2x}{2} \, dx = 0

You can verify that the sets \{\sin nx\} and \{\cos nx\} are orthonormal both within each set and between the sets.
What happens to the constant function, which we have tentatively indicated as 1?

(B.3.6)
\langle 1, 1 \rangle = \frac{1}{\pi}\int_0^{2\pi} dx = 2

To normalise the constant function under the new dot product, we define C_0(x) = 1/\sqrt{2}. So, the full orthonormal set is

(B.3.7)
S_0(x) = 0, \; C_0(x) = \frac{1}{\sqrt{2}}, \; S_1(x) = \sin x, \; C_1(x) = \cos x, \; S_2(x) = \sin 2x, \; C_2(x) = \cos 2x, \ldots
Now, a periodic function f(x) can be represented using the Fourier series by projecting it onto this basis. By taking the dot product, we will find the components along the basis vectors, or the Fourier coefficients as they are called, as follows:

(B.3.8)
a_n = \frac{1}{\pi}\int_0^{2\pi} f(x) C_n(x) \, dx

(B.3.9)
b_n = \frac{1}{\pi}\int_0^{2\pi} f(x) S_n(x) \, dx

We then try to reconstruct the function using the Fourier components. The representation \hat{f} is given by

(B.3.10)
\hat{f}(x) = \sum_{n=0}^{\infty} \left( a_n C_n(x) + b_n S_n(x) \right)

It will be identical to f if f is smooth. \hat{f}(x) will differ from f when f has a finite number of discontinuities in the function or the derivative. We will just use f(x) instead of \hat{f}(x), as is usually done, and write

(B.3.11)
f(x) = \sum_{n=0}^{\infty} \left( a_n C_n(x) + b_n S_n(x) \right)
A more convenient form of the Fourier series is the complex Fourier series. We can use de Moivre's formula² given by

(B.3.12)
e^{inx} = \cos nx + i \sin nx

(B.3.13)
e^{-inx} = \cos nx - i \sin nx

from which we can see that

(B.3.14)
\cos nx = \frac{e^{inx} + e^{-inx}}{2}

(B.3.15)
\sin nx = \frac{e^{inx} - e^{-inx}}{2i}

Substituting back into equation (B.3.11) we get

(B.3.16)
f(x) = \sum_{n=0}^{\infty} \left( a_n \frac{e^{inx} + e^{-inx}}{2} + b_n \frac{e^{inx} - e^{-inx}}{2i} \right)
Collecting the coefficients of e^{inx} and e^{-inx}, this is

(B.3.17)
f(x) = \sum_{n=0}^{\infty} \left( \frac{a_n - i b_n}{2} e^{inx} + \frac{a_n + i b_n}{2} e^{-inx} \right)

If we define

(B.3.18)
c_n = \frac{a_n - i b_n}{2}

(B.3.19)
c_{-n} = \frac{a_n + i b_n}{2} = \bar{c}_n

then the Fourier series expansion can be written as

(B.3.20)
f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx}
Since e^{inx} is complex, we need to define the dot product appropriately so as to get a real magnitude. The dot product now is redefined as follows:

(B.3.21)
\langle f, g \rangle = \frac{1}{2\pi}\int_0^{2\pi} f \bar{g} \, dx

where \bar{g} is the complex conjugate of g. Note that \langle f, f \rangle is real. If g is real then the dot product degenerates to our original definition (apart from the normalising factor).
Assignment 2.7
(1) In the case where f and g are complex, what is the relation between \langle f, g \rangle and \langle g, f \rangle?
(2) Verify that the set \{e^{inx}\} is orthonormal using the norm given in equation (B.3.21).
(3) Find the Fourier components of the function f(x) = x(x - 2\pi).

We see that

(B.3.22)
c_n = \frac{1}{2\pi}\int_0^{2\pi} f e^{-inx} \, dx

We can recover our original a_n and b_n using equations (B.3.18) and (B.3.19) as

(B.3.23)
a_n = c_n + c_{-n} = c_n + \bar{c}_n

(B.3.24)
b_n = i(c_n - c_{-n}) = i(c_n - \bar{c}_n)

It is clear that, given a function f(x) on an interval [0, 2\pi], we can find the corresponding Fourier coefficients.
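As a numerical illustration, the complex coefficients of equation (B.3.22) can be approximated by sampling the function uniformly on [0, 2π); the test function below is my own choice, picked so that its coefficients are known in advance.

import numpy

N = 1024
x = 2.0 * numpy.pi * numpy.arange(N) / N

# A test function with known components (chosen for this illustration only).
f = 3.0 + 2.0 * numpy.cos(x) + 5.0 * numpy.sin(2.0 * x)

def c(n):
    """c_n = (1 / 2 pi) integral of f exp(-i n x) dx, by the rectangle rule."""
    return (f * numpy.exp(-1j * n * x)).mean()

print(c(0))                  # about 3
print(c(1), c(-1))           # about 1 and 1        (so a_1 = 2, b_1 = 0)
print(c(2), c(-2))           # about -2.5j and 2.5j (so a_2 = 0, b_2 = 5)
print(c(1) + c(-1))          # recovers a_1
print(1j * (c(2) - c(-2)))   # recovers b_2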
Bibliography
[Act90]
[Act96]
[Ahl79]
[Ame77]
[Ari89]
[Arn04]
[Bha03]
[BM80]
[Boo60]
[Bra00]
[Bri87]
[BW92]
[CFL67]
[Cha90]
[Chu77]
[CJ04]
[Ded63]
[ea99]
[Fel68]
[FH74]
[Gea71]
[GF82]
[GL83]
[Gol91]
[GU72]
[Ham73]
[Hil88]
[HP03]
[HY81]
[Jr55] J. Douglas Jr, On the numerical integration of ∂²u/∂x² + ∂²u/∂y² = ∂u/∂t by implicit methods, Journal of the Society for Industrial and Applied Mathematics 3 (1955), no. 1, 42–65.
[Knu81] D. E. Knuth, The art of computer programming - seminumerical algorithms, vol. II, Addison-Wesley, 1981.
[KPS97] P. E. Kloeden, E. Platen, and H. Schurz, Numerical solution of SDE through computer experiments, Springer, 1997.
[Kre89] E. Kreysig, Introductory functional analysis with applications, Wiley, 1989.
[Kum00] S. Kumaresan, Linear algebra: A geometric approach, Prentice-Hall of India Private Limited, 2000.
[Lat04] B. P. Lathi, Linear systems and signals, Oxford University Press, 2004.
[Lax73] P. D. Lax, Hyperbolic systems of conservation laws and the mathematical theory of shock waves, SIAM, 1973.
[LR57] H. W. Liepmann and A. Roshko, Elements of gasdynamics, John Wiley & Sons, 1957.
[LW60] P. D. Lax and B. Wendroff, Systems of conservation laws, Communications on Pure and Applied Mathematics 13 (1960), no. 2, 217–237.
[Moo66] R. E. Moore, Interval analysis, Prentice-Hall, 1966.
[Moo85] R. E. Moore, Computational functional analysis, Ellis-Horwood Limited, 1985.
[NA63] E. N. Sarmin and L. A. Chudov, On the stability of the numerical integration of systems of ordinary differential equations arising in the use of the straight line method, USSR Computational Mathematics and Mathematical Physics 3 (1963), no. 6, 1537–1543.
[PJ55] D. W. Peaceman and H. H. Rachford Jr, The numerical solution of parabolic and elliptic differential equations, Journal of the Society for Industrial and Applied Mathematics 3 (1955), no. 1, 28–41.
[Roa00] P. Roache, Validation, verification, certification in CFD, ??, 2000.
[Roe81] P. L. Roe, Approximate Riemann solvers, parameter vectors, and difference schemes, Journal of Computational Physics 43 (1981), 357–372.
[RS81] D. H. Rudy and J. C. Strikwerda, Boundary conditions for subsonic compressible Navier-Stokes calculations, Computers and Fluids 9 (1981), 327–338.
[Sam03] H. Samet, Spatial data structures, Morgan-Kaufmann, 2003.
[Sha53] A. H. Shapiro, The dynamics and thermodynamics of compressible fluid flow, The Ronald Press Company, 1953.
[Smi78] G. D. Smith, Numerical solution of partial differential equations: Finite difference methods, Clarendon Press, 1978.
[Sne64] I. Sneddon, Elements of partial differential equations, McGraw-Hill, 1964.
[SS82] J. L. Synge and A. Schild, Tensor calculus, Dover Publications, Inc, 1982.
[Str86] G. Strang, Introduction to applied mathematics, Wellesley-Cambridge Press, 1986.
[Str06] G. Strang, Linear algebra and its applications, 4th ed., Thompson Brooks/Cole, 2006.
[Stu66] R. D. Stuart, An introduction to Fourier analysis, Chapman and Hall, 1966.
[Tol76] G. P. Tolstov, Fourier series, Dover Publications, 1976.
[Var00] R. S. Varga, Matrix iterative analysis, 2nd ed., Springer-Verlag, 2000.
[Wes04] P. Wesseling, Principles of computational fluid dynamics, Springer, 2004.
[Win67] A. M. Winslow, Numerical solution of the quasi-linear Poisson equation in a nonuniform triangle mesh, Journal of Computational Physics 1 (1967), 149–172.
[You93] E. C. Young, Vector and tensor analysis, Marcel-Dekker Ltd, 1993.
[ZT86] E. C. Zachmanoglou and D. W. Thoe, Introduction to partial differential equations with applications, Dover Publications, 1986.