IImanual 2024
Computational Projects
2023-24
CATAM
Mathematical Tripos Part II
Astrophysics Tripos Part II
Computational Projects
July 2023
Faculty of Mathematics
University of Cambridge
Contents
Units
Introduction
1 Numerical Methods
1.1 Fourier Transforms of Bessel Functions 6
1.6 Multigrid Methods 10
2 Waves
2.2 Dispersion 7
2.11 Fisher’s Equation for Population Dispersal Problems 9
4 Dynamics
4.5 Euler’s Equations 8
5 Quantum Mechanics
5.2 S-Wave Scattering 7
5.3 Bound State Energies for One-Dimensional Potentials 9
7 Mathematical Methods
7.3 Minimisation Methods 8
7.4 Airy Functions and Stokes’ Phenomenon 9
9 Operational Research
9.1 Policy Improvement for a Markov Decision Process 4
9.4 Option Pricing in Mathematical Finance 6
10 Statistics
10.9 Markov Chain Monte Carlo 6
10.16 The Tennis Modelling Challenge 8
11 Statistical Physics
11.3 Classical gases with a microscopic thermometer 8
14 General Relativity
14.5 Cosmological Distances 8
14.6 Isolating Integrals for Geodesic Motion 8
15 Number Theory
15.1 Primality Tests 9
15.10 The Continued Fraction Method for Factorisation 8
16 Algebra
16.1 The Galois Group of a Polynomial 7
16.5 Permutation Groups 7
17 Combinatorics
17.1 Graph Colouring 7
17.3 Hamiltonian Cycles 5
19 Communication Theory
19.1 Random Codes 5
20 Probability
20.5 Percolation and the Invasion Process 9
20.6 Loss Networks 9
23 Astrophysics
23.5 Ionization of the Interstellar Gas near a Star 8
23.6 Accretion Discs 8
You may choose freely from the projects above, independently of whether you are studying the
examinable courses with which your chosen projects are connected. For up-to-date information
on the maximum credit for the Computational Projects in Part II of the Mathematical Tripos,
and the total number of units required to achieve that maximum, please consult both the
Undergraduate Schedules of the Mathematical Tripos and § 2.1 of the Introduction. Please also
see § 2.1 of the Introduction for more information on how credit is awarded.
1 General
Please read the whole of this introductory chapter before beginning work on the projects. It
contains important information that you should know as you plan your approach to
the course.
1.1 Introduction
The course is a continuation of the Part IB Computational Projects course. The aim is to
continue your study of the techniques of solving problems in mathematics using computational
methods.
As in Part IB, the course is examined entirely through the submission of project
reports; there are no questions on the course in the written examination papers. The definitive
source for up-to-date information on the examination credit for the course is the Faculty of
Mathematics Schedules booklet for the academic year 2023-24. At the time of writing (July
2023) the booklet for the academic year 2022-23 states that
No questions on the Computational Projects are set on the written examination papers,
credit for examination purposes being gained by the submission of reports. The
maximum credit obtainable is 150 marks and there are no alpha or beta quality marks.
Credit obtained is added directly to the credit gained on the written papers. The maximum
contribution to the final merit mark is thus 150, which is the same as the
maximum for a 16-lecture course. The Computational Projects are considered to be
a single piece of work within the Mathematical Tripos.
CATAM projects are intended to be exercises in independent investigation somewhat like those
a mathematician might be asked to undertake in the ‘real world’. They are well regarded by
external examiners, employers and researchers (and you might view them as a useful item of
your curriculum vitae).
The questions posed in the projects are more open-ended than standard Tripos questions: there
is not always a single ‘correct’ response, and often the method of investigation is not fully
specified. This is deliberate. Such an approach allows you both to demonstrate your ability
to use your own judgement in such matters, and also to produce mathematically intelligent,
relevant responses to imprecise questions. You will also gain credit for posing, and responding
to, further questions of your own that are suggested by your initial observations. You are allowed
and encouraged to use published literature (but it must be referenced, see also §5) to substantiate
your arguments, or support your methodology.
1.3 Timetable
You should work at your own speed on the projects contained in this booklet, which cover a
wide range of mathematical topics.
• You are strongly advised to complete all your computing work by the end of the Easter
vacation if at all possible, since the submission deadline is early in the Easter Term.
• Do not leave writing up your projects until the last minute. When you are writing
up it is highly likely that you will discover mistakes in your programming, want to
refine your code, or both. This will take time. If you wish to maximise your marks, the
process of programming and writing-up is likely to be iterative, ideally with at least a
week or so between iterations.
• It is a good idea to write up each project as you go along, rather than to write all the
programs first and only then to write up the reports; each year several students make this
mistake and lose credit in consequence (in particular note that a program listing without
a write-up, or vice versa, gains no credit). You can, indeed should, review your write-ups
in the final week before the relevant submission date.
As was the case last year, the Faculty of Mathematics is supporting Matlab for Part II. During
your time in Cambridge the University will provide you with a free copy of Matlab for your
computer. Alternatively you can use the version of Matlab on the Managed Cluster
Service (MCS), which is available at a number of UIS and institutional sites around the
Collegiate University.
All undergraduate students at the University are entitled to download and install Matlab on
their own computer that is running Windows, MacOS or Linux; your copy should be used for
non-commercial University use only. The files for download, and installation instructions, are
available at
http://www.maths.cam.ac.uk/undergrad/catam/software/matlabinstall/matlab-personal.htm.
This link is Raven protected. Several versions of Matlab may be available; if you are downloading
Matlab for the first time it is recommended that you choose the latest version.
The Faculty of Mathematics has produced a booklet, Learning to use Matlab for CATAM
project work, that provides a step-by-step introduction to Matlab suitable for beginners. This
is available on-line at
However, this short guide can only cover a small subset of the Matlab language. There are
many other guides available on the net and in book form that cover Matlab in far more depth.
In addition:
• The MathWorks also provide the introductory guide Getting Started with Matlab. You
can access this by ‘left-clicking’ on the Getting Started link at the top of a Matlab
‘Command Window’. Alternatively there is an on-line version available at
http://uk.mathworks.com/help/matlab/getting-started-with-matlab.html
• Further, The MathWorks provide links to a whole raft of other tutorials; see
https://uk.mathworks.com/support/learn-with-matlab-tutorials.html
In addition their Matlab documentation page gives more details on maths, graphics,
object-oriented programming etc.; see
http://uk.mathworks.com/help/matlab/index.html
(a) Numerical Computing with Matlab by Cleve Moler (SIAM, Second Edition, 2008,
ISBN 978-0-898716-60-3). This book can be downloaded for free from
http://uk.mathworks.com/moler/chapters.html
(b) Matlab Guide by D.J. Higham & N.J. Higham (SIAM, Second Edition, 2005, ISBN
0-89871-578-4).
You may be spoilt for choice: Google returns about 100,000,000 hits for the search
‘Matlab introduction’, and about 11,000,000 hits for the search ‘Matlab introduction
tutorial’.
• The Engineering Department has a webpage that lists a number of helpful articles; see
http://www.eng.cam.ac.uk/help/tpl/programs/matlab.html
Use of Matlab is recommended,1 but you are free to write your programs in any computing
language whatsoever. Python, Julia,2 R,3 C, C++, Mathematica,4 Maple 5 and Haskell have been
used by several students in the past, and Excel has been used for plotting graphs of computed
results. The choice is your own, provided your system can produce results and program listings
for inclusion in your report.6
However, you should bear in mind the following points.
• The Faculty does not promise to help you with programming problems if you use a language
other than Matlab.
• Not all languages have the breadth of mathematical routines that come with the Matlab
package. You may discover either that you have to find reliable replacements, or that you
have to write your own versions of mathematical library routines that are pre-supplied in
Matlab (this can involve a fair amount of effort). To this end you may find reference
books, such as Numerical Recipes by W. H. Press et al. (CUP), useful. You may use
equivalent routines to those in Matlab from such works so long as you acknowledge
them, and reference them, in your write-ups.
• If you choose a high-level programming language that can perform advanced mathematical
operations automatically, then you should check whether use of such commands is permitted
in a particular project. As a rule of thumb, do not use a built-in function if there is
no equivalent Matlab routine that has been approved for use in the project description,
or if use of the built-in function would make the programming considerably easier than
intended. For example, use of a command to test whether an integer is prime would not
be allowed in a project which required you to write a program to find prime numbers. The
CATAM Helpline (see §4 below) can give clarification in specific cases.
• Subject to the aforementioned limited exceptions, you must write your own computer
programs. Downloading computer code, e.g. from the internet, that you are asked to write
yourself counts as plagiarism (see §5).
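To illustrate the distinction drawn above between calling a built-in command and writing the routine yourself, the following is a minimal Python sketch of a hand-written primality test by trial division. The function names is_prime and primes_up_to are hypothetical and chosen only for this illustration. It is not the method required or approved by any particular project, and whether a given built-in is permitted is still decided by the project description.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Return True if n is prime, using hand-written trial division."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Any composite n has a divisor no larger than floor(sqrt(n)),
    # so only odd candidates up to that bound need checking.
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def primes_up_to(limit: int) -> list[int]:
    """List all primes p with 2 <= p <= limit."""
    return [n for n in range(2, limit + 1) if is_prime(n)]


print(primes_up_to(30))
```

The only library call used is math.isqrt, an integer square root, which does not itself test primality.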
Some projects require the use of a Computer Algebra System (CAS). At present none is specifically
recommended but possible choices include the Symbolic Math Toolbox in Matlab, Mathematica
and Maple.
1 Except where an alternative is explicitly stated, e.g. see footnotes 3 and 5.
2 Julia is a high-level open source language well suited to numerical computation. An Introduction to Julia
for CATAM under ongoing development is available from https://sje30.github.io/catam-julia/.
3 R is a programming language and software environment for statistical and numerical computing, as well as
visualisation. It is the recommended language for some Part II projects. R is available for free download for the
Linux, MacOS and Windows operating systems from http://www.r-project.org/.
4 Mathematica is a software package that supports symbolic computations and arbitrary precision numerical
calculations, as well as visualisation. At the time of writing Mathematica is also available for free to mathematics
students, but the agreement is subject to renewal. You can download versions of Mathematica for the Linux,
MacOS and Windows operating systems from
https://www.maths.cam.ac.uk/computing/software/mathematica/
5 Maple is a mathematics software package that supports symbolic computations and arbitrary precision
numerical calculations, as well as visualisation. It is the recommended language for some Part II projects.
6 There is no need to consult the CATAM Helpline as to your choice of language.
• For intensive numerical calculations Maple should be told to use the hardware floating-
point unit (see help on evalhf).
• If you choose to use Maple, Mathematica, or any other CAS to do a project for which a
CAS is not specifically required, you should bear in mind that you may not be allowed to
use some of the built-in functions (see §1.4.3).
2 Project Reports
For each project, 40% of the marks available are awarded for writing programs that work and for
producing correct graphs, tables of results and so on. A further 50% of the marks are awarded
for answering mathematical questions in the project and for making appropriate mathematical
observations about your results.
The final 10% of marks are awarded for the overall ‘excellence’ of the write-up. Half of these
‘excellence’ marks may be awarded for presentation, that is for producing good clear output
(graphs, tables, etc.) which is easy to understand and interpret, and for the mathematical
clarity of your report.
The assessors may penalise a write-up that contains an excessive quantity of irrelevant material
(see below). In such cases, the ‘excellence’ mark may be reduced and could even become negative,
as low as -10%.
Unless the project specifies a way in which an algorithm should be implemented, marks are, in
general, not awarded for programming style, good or bad. Conversely, if your output is poorly
presented — for example, if your graphs are too small to be readable or are not annotated —
then you may lose marks.
No marks are given for the submission of program code without a report, or vice versa.
The marks for each project are scaled so that a possible maximum of 150 marks are available
for the Part II Computational Projects course. No quality marks (i.e. αs or βs) are awarded.
The maximum contribution to the final merit mark is thus 150 and the same as the maximum
for a 16-lecture course.
2.1.1 Examination credit: algorithm applied to the mark awarded for each project
Each project has a unit allocation. The mark awarded for each project is weighted according
to the unit allocation, with each project unit equating to a maximum of 5 Tripos marks. The
weighted marks for each project are summed to obtain a candidate’s total Tripos mark.
To obtain maximum credit, you should submit projects with unit allocations that sum to 30 units.
If you submit N units, where N > 30 (i.e. if you submit more than the maximum number of
units), then the following algorithm applies:
This algorithm ensures that you can score no more than the overall maximum available, i.e. 150
Tripos marks, by reducing the mark only on your weakest project. There is no expectation that
you submit more than 30 units; this algorithm is simply a way to calculate your mark if you
do.7
A fractional total Tripos mark resulting from the weighting/scaling process is rounded up or
down to the nearest integer, with an exact half being rounded up.
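The weighting and rounding rules stated above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Faculty's own procedure. It assumes each project score is held as a whole-number percentage of the marks available for that project. The name tripos_total and the example scores are hypothetical. The reduction applied when more than 30 units are submitted is not reproduced here, so the sketch simply refuses that case.

```python
from fractions import Fraction
from math import floor

MARKS_PER_UNIT = 5  # from the text: each unit is worth at most 5 Tripos marks
MAX_UNITS = 30      # from the text: 30 units give the maximum of 150 marks


def tripos_total(projects):
    """Sum the unit-weighted marks and round to the nearest integer, halves up.

    projects is a list of (units, percent) pairs, where percent is an
    integer score out of 100 (a hypothetical representation of the mark).
    """
    total_units = sum(units for units, _ in projects)
    if total_units > MAX_UNITS:
        # The algorithm for N > 30 units is not given in this sketch.
        raise ValueError("more than 30 units submitted")
    weighted = sum(
        Fraction(percent, 100) * units * MARKS_PER_UNIT
        for units, percent in projects
    )
    # floor(x + 1/2) rounds an exact half up, as the text requires.
    # Python's built-in round() would round halves to the nearest even integer.
    return floor(weighted + Fraction(1, 2))


# Hypothetical example: four projects totalling 30 units.
print(tripos_total([(8, 75), (7, 90), (9, 60), (6, 85)]))
```

Exact rational arithmetic is used so that a total such as 30.5 is recognised as an exact half rather than being perturbed by floating-point error.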
Your record of the work done on each project should contain all the results asked for and your
comments on these results, together with any graphs or tables asked for, clearly labelled and
referred to in the report. However, it is important to remember that the project is set as a piece
of mathematics, rather than an exercise in computer programming; thus the most important
aspect of the write-up is the mathematical content. For instance:
• Your comments on the results of the programs should go beyond a rehearsal of the program
output and show an understanding of the mathematical and, if relevant, physical points
involved. The write-up should demonstrate that you have noticed the most important
features of your results, and understood the relevant mathematical background.
• When discussing the computational method you have used, you should distinguish between
points of interest in the algorithm itself, and details of your own particular implementation.
Discussion of the latter is usually unnecessary, but if there is some reason for including it,
please set it aside in your report under a special heading: it is rare for the assessors to be
interested in the details of how your programs work.
• Your comments should be pertinent and concise. Brief notes are perfectly satisfactory,
provided that you cover the salient points and make your meaning precise and
unambiguous; indeed, students who keep their comments concise can get better marks. An
over-long report may well lead an assessor to the conclusion that the candidate is unsure
of the essentials of a project and is using quantity in an attempt to hide the lack of quality.
Do not copy out chunks of the text of the projects themselves: you may assume that the
assessor is familiar with the background to each project and all the relevant equations.
• Similarly you should not reproduce large chunks of your lecture notes; you will not gain
credit for doing so (and indeed may lose credit as detailed in §2.1). However, you will be
expected to reference results from theory, and show that you understand how they relate to
your results. If you quote a theoretical result from a textbook, or from your notes, or from
the WWW, you should give both a brief justification of the result and a full reference.8 If
you are actually asked to prove a result, you should do so concisely.
• Graphs will sometimes be required, for instance to reveal some qualitative features of your
results. Such graphs, including labels, annotations, etc., need to be computer-generated
7 Needless to say, if you submit fewer than 30 units no upwards scaling applies.
8 See also the paragraph on Citations in §5.
• You should take care to ensure that the assessor sees evidence that your programs do
indeed perform the tasks you claim they do. In most cases, this can be achieved by
including a sample output from the program. If a question asks you to write a program
to perform a task but doesn’t specify explicitly that you should use it on any particular
data, you should provide some ‘test’ data to run it on and include sample output in your
write-up. Similarly, if a project asks you to ‘print’ or ‘display’ a numerical result, you
should demonstrate that your program does indeed do this by including the output.
• Above all, make sure you comment where the manual specifically asks you to.
It also helps the assessors if you answer the questions in the order that they appear in
the manual and, if applicable, number your answers using the same numbering scheme
as that used by the project. Make clear which outputs, tables and graphs correspond to
which questions and programs.
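As an illustration of providing ‘test’ data and sample output, the sketch below runs a small root-finding routine on a problem whose answer is known and prints a short table of iterates that could be pasted into a write-up. The choice of bisection, the test function x^2 - 2 and the name bisect are assumptions made for this example only and are not taken from any project.

```python
def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Find a root of f in [a, b] by bisection, recording every iterate."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a, []
    if fb == 0:
        return b, []
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    history = []
    for k in range(1, max_iter + 1):
        m = 0.5 * (a + b)
        fm = f(m)
        history.append((k, m, fm))
        # Stop when the midpoint is a root or the bracket is small enough.
        if fm == 0 or 0.5 * (b - a) < tol:
            break
        if fa * fm < 0:
            b, fb = m, fm
        else:
            a, fa = m, fm
    return m, history


# Test data with a known answer: the positive root of x^2 - 2 is sqrt(2).
root, history = bisect(lambda x: x * x - 2.0, 1.0, 2.0)

print(f"{'iter':>4} {'midpoint':>14} {'f(midpoint)':>14}")
for k, m, fm in history[:5]:
    print(f"{k:>4d} {m:>14.10f} {fm:>14.3e}")
print(f"final root estimate {root:.10f} after {len(history)} iterations")
```

Printing only the first few iterates and the final value keeps the sample output short while still showing that the program does what is claimed.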
The following are indicative of some points that might be addressed in the report; they are not
exhaustive and, of course, not all will be appropriate for every project. In particular, some are
more relevant to pure mathematical projects, and others to applied ones.
• Does the algorithm or method always work? Have you tested it?
• What is the theoretical running time, or complexity, of the algorithm? Note that this
should be measured by the number of simple operations required, expressed in the usual
O(f(n)) or Ω(f(n)) notation, where n is some reasonable measure of the size of the input
(say the number of vertices of a graph) and f is a reasonably simple function. Examples
of simple operations are the addition or multiplication of two numbers, or the checking of
the (p, q) entry of a matrix to see if it is non-zero; with this definition finding the scalar
product of two vectors of length n takes order n operations. Note that this measure of
complexity can differ from the number of Matlab commands/‘operations’, e.g. there is a
single Matlab command to find a scalar product of two vectors of length n.
• What is the accuracy of the numerical method? Is it particularly appropriate for the
problem in question and, if so, why? How did you choose the step-size (if relevant), and
how did you confirm that your numerical results are reliably accurate for all calculations
performed?
• How do the numerical answers you obtain relate to the mathematical or physical system
being modelled? What conjectures or conclusions, if any, can you make from your results
about the physical system or abstract mathematical object under consideration?
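One common way to address the question of accuracy and step-size is a step-halving study: repeat the calculation with successively halved steps and check that the error shrinks at the rate the theory predicts. The Python sketch below does this for the forward Euler method on the test problem y' = -y, y(0) = 1, whose exact solution is known. The method, the test problem and the function names are assumptions chosen for illustration and do not come from any project.

```python
import math


def euler(f, y0, t_end, n_steps):
    """Integrate y' = f(t, y) from t = 0 to t_end using n_steps Euler steps."""
    h = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        y += h * f(t, y)
        t += h
    return y


def observed_orders(f, y0, t_end, exact, n_values):
    """Return the errors and the observed convergence orders.

    n_values must double from one entry to the next, so that each
    consecutive pair of errors corresponds to halving the step size.
    """
    errors = [abs(euler(f, y0, t_end, n) - exact) for n in n_values]
    orders = [math.log2(e1 / e2) for e1, e2 in zip(errors, errors[1:])]
    return errors, orders


errors, orders = observed_orders(
    lambda t, y: -y, 1.0, 1.0, math.exp(-1.0), [10, 20, 40, 80]
)
for n, err in zip([10, 20, 40, 80], errors):
    print(f"steps = {n:3d}   error = {err:.3e}")
print("observed orders:", [round(p, 3) for p in orders])
```

Observed orders close to 1 are consistent with the first-order accuracy of forward Euler. When no exact solution is available, the same idea can be applied to differences between successive approximations.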
The word brief peppers the last few paragraphs. To emphasise this point, in general eight sides
of A4 of text, excluding in-line graphs, tables, etc., should be plenty for a clear concise report
of a seven or eight unit project.9 Indeed, the best reports are sometimes shorter than this.
To this total you will of course need to add tables, graphs etc. However, do not include every
single piece of output you generate: include a selection of the output that is a representative
sample of graphs and tables. It is up to you to choose a selection which demonstrates all the
important features but is reasonably concise. Presenting mathematical results in a clear and
concise way is an important skill and one that you will be evaluated upon in CATAM. Twenty
pages of graphs would be excessive for most projects, even if the graphs were one to a page.10
Remember that the assessors will be allowed to deduct up to 10% of marks for any project
containing an excessive quantity of irrelevant material. Typically, such a project might be
long-winded, be very poorly structured, or contain long sections of prose that are not pertinent.
Moreover, if your answer to the question posed is buried within a lot of irrelevant material then
it may not receive credit, even if it is correct.
Choice of Word Processor. As to the choice of word processor, there is no definitive answer.
Many mathematicians use LaTeX (or, if they are of an older generation, TeX), e.g. this
document is written in LaTeX. However, please note that although LaTeX is well suited for
mathematical typesetting, it is absolutely acceptable to write reports using other
word-processing software, e.g. Microsoft Word or LibreOffice.
• Microsoft Word is commercial, but is available free while you are a student at
Cambridge: see
https://help.uis.cam.ac.uk/service/collaboration/office365.
• LibreOffice can be installed for free for, inter alia, the Windows, MacOS and Linux
operating systems from
9 Reports of projects with fewer/more units might be slightly shorter/longer.
10 Recall that graphs should not as a rule be printed one to a page.
LaTeX. If you decide to use LaTeX, you will probably want to install it on your own personal
computer. This can be done for free. For recommendations of TeX distributions and
associated packages see
• http://www.tug.org/begin.html and
• http://www.tug.org/interest.html.
Front end. In addition to a TeX distribution you will also need a front-end (i.e. a ‘clever
editor’). A comparison of TeX editors is available on Wikipedia; below we list a
few of the more popular TeX editors.
TeXstudio. For Windows, Mac and Linux users, there is TeXstudio. The proTeXt
distribution, based on MiKTeX, includes the TeXstudio front end.
TeXworks. Again for Windows, Mac and Linux users, there is TeXworks. The
MiKTeX distribution includes TeXworks.
TeXShop. Many Mac aficionados use TeXShop. To obtain TeXShop and the TeX Live
distribution see http://pages.uoregon.edu/koch/texshop/obtaining.html.
TeXnicCenter. TeXnicCenter is another [older] front end for Windows users.
LyX. LyX is not strictly a front end, but has been recommended by some previous
students. LyX is available from
http://www.lyx.org/.
However, note that LyX uses its own internal file format, which it converts to
LaTeX as necessary.
Learning LaTeX. A Brief LaTeX Guide for CATAM is available for download from
http://www.maths.cam.ac.uk/undergrad/catam/files/Brief-Guide.pdf .
• The LaTeX source file (which may be helpful as a template), and supporting files,
are available for download as a zip file from
http://www.maths.cam.ac.uk/undergrad/catam/files/Guide.zip .
Mac, Unix and most Windows users should already have an unzip utility. Windows
users can download 7-Zip if they have not.
Other sources of help. A welter of useful links have been collated by the Engineering
Department on their Text Processing using LaTeX page; see
http://www.eng.cam.ac.uk/help/tpl/textprocessing/LaTeX_intro.html.
Layout of the first page. The first page of your report should include the project name and
project number.
Your script is marked anonymously. Hence, your name or user identifier should not appear
anywhere in the write-up (including any output).
Further technicalities. Please do not use red or green for text (although red and/or green lines
on plots are acceptable). Please leave a margin at least 2 cm wide at the left, and number
each page, table and graph.
Program listings. At the end of each report you should include complete listings (i.e. printout
of source code) of every major program used to generate your results. You do not need to
include a listing of a program which is essentially a minor revision of another which you
3 Computing Facilities
You may write and run your programs on any computer you wish, whether it belongs to you
personally, to your College, or to the University.
When permitted by COVID protocols, you can also use other computing facilities around the
University; for further information (including which Colleges are linked to the MCS network)
see11
https://help.uis.cam.ac.uk/service/desktop-services/mcs/mcs-sites
At most MCS locations you can access the Matlab software, and any files you store on the MCS
from one location should be accessible from any other MCS location.
If you believe that you do not have access to an adequate computer to complete the CATAM projects,
you should contact your Director of Studies and/or the CATAM helpline well in advance of any
project deadlines.
3.1 Backups
Whatever computing facilities you use, make sure you make regular (electronic and paper)
backups of your work in case of disaster! Remember that occasionally systems go down
or disks crash or computers are stolen. Malfunctions of your own equipment or the MCS
are not an excuse for late submissions: leave yourself enough time before the deadline.
Possibly one of the easiest ways to ensure that your work is backed up is to use an online ‘cloud’
service; many of these services offer some free space. Wikipedia has a fairly comprehensive list
at http://en.wikipedia.org/wiki/Comparison_of_online_backup_services. In particular note
that eligible students have 5TB of OneDrive personal storage space via their University Microsoft
account under a University agreement and unlimited storage via Google Drive (see
https://help.uis.cam.ac.uk/individual-storage).
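As a complement to a cloud service, a dated local archive of each project directory is easy to automate. The following Python sketch uses only the standard library. The function name backup_project and the directory names in the usage note are hypothetical, and a copy kept on the same disk as the original does not protect against that disk failing, so the archive should still be copied elsewhere.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_project(project_dir, backup_dir):
    """Write a timestamped zip archive of project_dir into backup_dir.

    Returns the path of the archive that was created.
    """
    project_dir = Path(project_dir).resolve()
    backup_dir = Path(backup_dir).resolve()
    backup_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp in the file name keeps earlier backups from being overwritten.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base_name = backup_dir / f"{project_dir.name}-{stamp}"

    # make_archive appends ".zip" itself and returns the archive's file name.
    archive = shutil.make_archive(
        str(base_name),
        "zip",
        root_dir=str(project_dir.parent),
        base_dir=project_dir.name,
    )
    return Path(archive)
```

For example, backup_project("project_1_1", "backups") would create a file such as backups/project_1_1-20240115-093000.zip, given a directory named project_1_1 in the current folder.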
4 Information Sources
contains much useful information relating to CATAM. There are on-line, and up-to-date,
copies of the projects, and any data files required by the projects can be downloaded.
There is also the booklet Learning to use Matlab for CATAM project work.
11 Note that the Phoenix Teaching Rooms and the Titan Room are used during term-times for practical classes
by other Departments, but a list of these classes is posted at each site at the start of each term so that you can
check the availability in advance (see Opening Hours).
The CATAM Helpline. If you need help (e.g. if you need clarification about the wording of a
project, or if you have queries about programming and/or Matlab), you can email a query
to the CATAM Helpline: [email protected]. Almost all queries may be sent to the
Helpline, and it is particularly useful to report potential errors in projects. However the
Helpline cannot answer detailed mathematical questions about particular projects. Indeed
if your query directly addresses a question in a project you may receive a standard reply
indicating that the Helpline cannot add anything more.
In order to help us manage the emails that we receive,
• please use an email address ending in cam.ac.uk (rather than a Gmail, etc. address)
both so that we may identify you and also so that your email is not identified as
spam;
• please specify, in the subject line of your email, ‘Part II’ as well as the project number
and title or other topic, such as ‘Matlab query’, to which your email relates;
• please also restrict each email to one question or comment (use multiple emails
if you have more than one question or comment).
The Helpline is available during Full Term and one week either side. Queries sent outside
these dates will be answered subject to personnel availability. We will endeavour (but
do not guarantee) to provide a response from the Helpline within three working days.
However, if the query has to be referred to an assessor, then it may take longer to receive
a reply. Please do not send emails to any other address.
The CATAM FAQ Web Pages. Before asking the Helpline about a particular project, please
check the CATAM FAQ web pages (accessible from the main CATAM web page). These
list questions which students regularly ask, and you may find that your query has already
been addressed.
Advice from Supervisors and Directors of Studies. The general rule is that advice must be
general in nature. You should not have supervisions on any work that is yet to be submitted
for examination.
The objective of CATAM is for you to learn computational methods, mathematics and written
presentation skills. To achieve these objectives, you must work independently on the projects,
both on the programming and on the write-ups.
The work that you turn in must be your own. This applies equally to the source code and the
write-ups, i.e. you must write and test all programs yourself, and all reports must be written
independently.
Any attempt to gain an unfair advantage, for example by copying computer code, mathematics,
or written text, is not acceptable and will be subject to serious sanctions.
If you have any questions about what constitutes unfair means, you should seek advice from the
CATAM helpline.
Citations. It is, of course, perfectly permissible to use reference books, journals, reference articles
on the WWW or other similar material: indeed, you are encouraged to do this. You may
quote directly from reference works so long as you acknowledge the source (WWW pages
should be acknowledged by a full URL). There is no need to quote lengthy proofs in full,
but you should at least include your own brief summary of the material, together with a
full reference (including, if appropriate, the page number) of the proof.
Programs. You must write your own computer programs. Downloading computer code, e.g.
from the internet, that you are asked to write yourself counts as plagiarism even if cited.
Acceptable collaboration. It is recognised that some candidates may occasionally wish to discuss
their work with others doing similar projects. This can be educationally beneficial and is
accepted provided that it remains within reasonable bounds. Acceptable collaboration may
include an occasional general discussion of the approach to a project and of the numerical
algorithms needed to solve it. Small hints on debugging code (note the small), as might
be provided by an adviser, are also acceptable.
Generative AI. Using generative AI (e.g. ChatGPT, Bing, Bard and similar) to produce all
or part of the submitted write-up or source code would not be original work and hence
is considered a form of academic misconduct. This interpretation is consistent with
University guidelines. We use software that is capable of detecting AI-generated content, and
where a case of unfair means is suspected, the Examiners may, at their discretion, examine
a candidate by means of an Oral Examination.
• using someone else’s program or any part of it as a model, or working from a jointly
produced detailed program outline;
• turning in output from a generative AI either in the report or in the source code.
These comments apply just as much to copying from the work of previous Part II students, or
another third party (including any code, etc. you find on the internet), as they do to copying
from the work of students in your own year. Asking anyone for help that goes past the limits of
acceptable collaboration as outlined above, and this includes posting questions on the internet
(e.g. StackExchange), constitutes unfair means.
Further, you should not allow any present or future Part II student access to the work you have
undertaken for your own CATAM projects, even after you have submitted your write-ups. If
you knowingly give another student access to your CATAM work you are in breach of these
guidelines and may be charged with assisting another candidate to make use of unfair means.
5.1 Further information about policies regarding plagiarism and other forms
of unfair means
University-wide Statement on Plagiarism. You should familiarise yourself with the University’s
Statement on Plagiarism.
There is a link to this statement from the University’s Good academic practice and plagiarism
website
http://www.plagiarism.admin.cam.ac.uk/,
which also features links to other useful resources, information and guidance.
Faculty Guidelines on Plagiarism. You should also be familiar with the Faculty of Mathematics
Guidelines on Plagiarism. These guidelines, which include advice on quoting, paraphrasing,
referencing, general indebtedness, and the use of web sources, are posted on the Faculty’s
website at
http://www.maths.cam.ac.uk/facultyboard/plagiarism/.
In order to preserve the academic integrity of the Computational Projects component of the
Mathematical Tripos, the following procedures have been adopted.
Declarations. To certify that you have read and understood these guidelines, you will be asked
to sign an electronic declaration. Further instructions will be given during Michaelmas
Term.
In order to certify that you have observed these guidelines, you will be required to sign
an electronic submission form provided when you submit your write-ups, and you are
advised to read it carefully; it will be similar to that reproduced (subject to revision) as
Appendix A. You must list on the form anybody (students, supervisors and Directors of
Studies alike) with whom you have exchanged information (e.g. by talking to them, or by
electronic means) about the projects at any more than a trivial level: any discussions that
affected your approach to the projects to any extent must be listed. Failure to include on
your submission form any discussion you may have had is a breach of these guidelines.
Checks on submitted program code. The Faculty of Mathematics uses (and has used for
many years) specialised software, including that of external service providers, which
automatically checks whether your programs either have been copied or have un-
acceptable overlaps (e.g. the software can spot changes of notation). All programs
submitted are screened.
The code that you submit, and the code that your predecessors submitted, is kept in
anonymised form to check against code submitted in subsequent years.
Checks on electronically submitted reports. In addition, the Faculty of Mathematics will
screen your electronically submitted reports using the Turnitin UK text-matching
software. Further information will be sent to you before the submission date. The
electronic declaration which you will be asked to complete at the start of the Michael-
mas term will, inter alia, cover the use of Turnitin UK.
Your electronically submitted write-ups will be kept in anonymised form to check
against write-ups submitted in subsequent years.
Sanctions. If plagiarism, collusion or any other method of unfair means is suspected in the
Computational Projects, normally the Chair of Examiners will convene an Investigative
Meeting (see §5.2). If the Chair of Examiners deems that unfair means were used, the case
may be brought to the University courts. According to the Statutes and Ordinances of the
University 12
suspected cases of the use of unfair means (of which plagiarism is one form) will
be investigated and may be brought to one of the University courts or disci-
plinary panels. The University courts and disciplinary panels have wide powers
to discipline those found to have used unfair means in an examination, including
depriving such persons of membership of the University, and deprivation of a
degree.
The Faculty of Mathematics wishes to make it clear that any breach of these
guidelines will be treated very seriously.
However, we also wish to emphasise that the great majority of candidates have, in the past,
had no difficulty in keeping to these guidelines. Unfortunately there have been a small number
12
From https://www.admin.cam.ac.uk/univ/so/.
Viva Voce Examinations. A number of candidates may be selected, either randomly or formu-
laically, for a Viva Voce Examination after submission of either the core or the additional
projects. This is a matter of routine, and therefore a summons to a Viva Voce Examina-
tion should not be taken to indicate that there is anything amiss. You will be asked some
straightforward questions on your project work, and may be asked to elaborate on the ex-
tent of discussions you may have had with other students. So long as you can demonstrate
that your write-ups are indeed your own, your answers will not alter your project marks.
Examination Interviews. Additionally, the Chair of Examiners may summon a particular can-
didate or particular candidates for interview on any aspect of the written work of the
candidate or candidates not produced in an examination room which in the opinion of
the Examiners requires elucidation. If plagiarism or other unfair means is suspected, an
Investigative Meeting will be convened (see below).
Investigative Meetings. When plagiarism, collusion or other unfair means are suspected the
Chair of Examiners may summon a candidate to an Investigative Meeting.13 If this happens,
you have the right to be accompanied by your Tutor (or another representative at
your request). The reasons for the meeting, together with copies of supporting evidence
and other relevant documentation, will be given to your Tutor (or other representative).
One possible outcome is that the case is brought to the University courts where serious
penalties can be imposed (see Sanctions above).
Timing. Viva Voce Examinations, Examination Interviews and Investigative Meetings are a for-
mal part of the Tripos examination, and if you are summoned then you must attend. These
will usually take place during the last week of Easter Full Term. Viva Voce Examinations
are likely to take place on the Monday of the last week (i.e. Monday 10th June 2024),
while Examination Interviews and Investigative Meetings may take place any time that
week. If you are required to attend a Viva Voce Examination, an Examination Interview
and/or an Investigative Meeting you will be informed in writing just after the end of the
written examinations. You must be available in the last week of Easter Full Term in
case you are summoned.
In order to gain examination credit for the work that you do on this course, you must write
reports on each of the projects that you have done. As emphasised earlier it is the quality (not
quantity) of your written report which is the most important factor in determining the marks
that you will be awarded.
13
For more information see
https://www.plagiarism.admin.cam.ac.uk/files/investigative_2016.pdf.
When you submit your project reports you will be required to complete and upload the submis-
sion form provided, detailing which projects you have attempted and listing all discussions you
have had concerning CATAM (see §5, Unfair Means, Plagiarism and Guidelines for Collabora-
tion, and Appendix A). Further details, including the definitive submission form, will be made
available when the arrangements for electronic submission of reports and programs (see below)
are announced.
• complete and submit your submission form listing each project for which you wish to gain
credit.
Further details about submission arrangements will be announced via CATAM News and email
closer to the time.
The submission deadline is
After this time, projects may be submitted only under exceptional circumstances. If an extension
is likely to be needed, a letter of application and explanation is required from your Director of
Studies. The application should be sent to the CATAM Director by the submission date as
detailed above.14
• Applications must demonstrate that there has been an unexpected development in the
student’s circumstances.
• Extensions are not normally granted past the Friday of submission week.
A student who is dissatisfied with the CATAM Director’s decision can request, within 7 days of
the decision or by the submission date (extended or otherwise), whichever is earlier, that the
Chair of the Faculty review the decision.
The Computational Projects Assessors Committee reserves the right to reduce the marks
awarded for any projects (including reports and source code) which are submitted late.
You will be required to submit electronically copies of both your reports and your program source
files. Electronic submission enables the Faculty to run automatic checks on the independence
of your work, and also allows your programs to be inspected in depth (and if necessary run) by
the assessors.
14
Alternatively, the University’s procedure can be invoked via the Examination Access and Mitigation Com-
mittee; see the Guidance Notes for Dissertation and Coursework extensions.
After the submission deadline the electronic files will be taken offline and you will not be able
to download your submitted work from the submission site. We recommend that you keep
electronic copies of your work.
Since the manuals will be taken off-line after the close of submission, you might also like to save
a copy of the projects you have attempted.
It is critical that you do not make your reports or source code available to any
present or future students. This includes posting to publicly accessible repositories
such as GitHub.
Please note that all material that you submit electronically is kept in anonymised form to check
against write-ups and program code submitted in subsequent years.
If a student is returning from intermission that began in an academic year during which they
submitted some or all of the CATAM projects, then in certain circumstances it is possible to
carry forward some or all of their CATAM marks from that year. Action is required by the
Director of Studies. Hence, before attempting any further CATAM work, the student should
discuss the options available with their Director of Studies and decide on their intended strategy.
The following general policies have been approved by the Faculty Board. If there are exceptional
circumstances in which these seem inappropriate, the Director of Studies should discuss these
with the CATAM Director: [email protected].
In the unlikely event that a Part II student submits some CATAM projects in the Easter Term,
intermits, and is then allowed to repeat the entire year starting in Michaelmas Term, they should
normally be expected to start CATAM afresh as a logical part of repeating the year.
On the other hand, if a Part II student submits some CATAM projects in the Easter Term,
then intermits, and then returns at the start of either the Lent Term or the Easter Term, then
any marks on projects submitted should be carried over. In addition, the student may submit
as many new projects as they wish in the Easter Term of the year they return. If the total
number of units submitted is greater than 30, then they will receive credit for their best 30 units,
as defined by the standard algorithm. The Director of Studies must notify the Undergraduate
Office and the CATAM Director about any marks that are to be carried forward.
15
See https://password.csx.cam.ac.uk/forgotten-passwd.
COMPUTATIONAL PROJECTS
STATEMENT OF PROJECTS SUBMITTED FOR EXAMINATION CREDIT
1. Your name, College or CRSid User Identifier must not appear anywhere in the submitted
work.
2. Complete this declaration form and submit it electronically with your reports.
3. The Moodle submission site will close at 4pm on submission day and it is likely to be
slow immediately prior to the deadline. Please turn in your work earlier if possible and be
prepared for delays in the website on submission day.
IMPORTANT
Candidates are reminded that Discipline Regulation 7 reads:
No candidate shall make use of unfair means in any University examination. Unfair
means shall include plagiarism16 and, unless such possession is specifically autho-
rized, the possession of any book, paper or other material relevant to the examina-
tion. No member of the University shall assist a candidate to make use of such unfair
means.
To confirm that you are aware of this, you must check and sign the declaration below and
include it with your work when it is submitted for credit.
The Faculty of Mathematics wishes to make it clear that failure to comply with this
requirement is a serious matter that could render you liable to sanctions imposed by
the University courts.
16
Plagiarism is defined as submitting as one’s own work, irrespective of intent to deceive, that which derives
in part or in its entirety from the work of others without due acknowledgement.
July 2023/Part II/Introduction Page 18 of 19 © University of Cambridge
DECLARATION BY CANDIDATE
I hereby submit my reports on the following projects and wish them to be assessed for exami-
nation credit:
I certify that I have read and understood the section Unfair Means, Plagiarism and Guidelines
for Collaboration in the Projects Manual (including the references therein), and that I have
conformed with the guidelines given there as regards any work submitted for assessment at the
University. I understand that the penalties may be severe if I am found to have not kept to the
guidelines in the section Unfair Means, Plagiarism and Guidelines for Collaboration. I agree to
the Faculty of Mathematics using specialised software, including Turnitin UK, to automatically
check whether my submitted work has been copied or plagiarised and, in particular, I certify
that
• the composing and writing of these project reports is my own unaided work and no part
of it is a copy or paraphrase of work of anyone other than myself;
• the computer programs and listings and results were not copied from anyone or from
anywhere (apart from the course material provided);
• I have not shown my programs or written work to any other candidate or allowed anyone
else to have access to them;
• I have listed below anybody, other than the CATAM Helpline or CATAM advisers, with
whom I have had discussions or exchanged information at any more than a trivial level
about the CATAM projects, together with the nature of those discussions and/or ex-
changes.
Signed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Date . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1 Introduction
Bessel’s equation of order n is the linear second-order equation
$$x^2 y'' + x y' + (x^2 - n^2)\,y = 0. \tag{1}$$
Bessel functions of the first kind are solutions of (1) which are finite at x = 0. They are usually
written $J_n(x)$.
Write a program to sum a truncation of this series. Plot $J_n(x)$ for n = 0, 1, 4 for a range
of x, e.g., for $0 \le x \le 100$. Discuss your choice of truncation, and identify a range of x
for which this summation method is not accurate and explain why.
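As an illustration only, the following sketch sums a truncated series. It assumes that the series referred to is the standard power series $J_n(x) = \sum_{m \ge 0} (-1)^m (x/2)^{2m+n} / (m!\,(m+n)!)$; the function name `bessel_j_series` and the default number of terms are hypothetical choices, not part of the project specification.

```python
import math


def bessel_j_series(n, x, terms=60):
    """Partial sum of the (assumed) power series for J_n(x), integer n >= 0."""
    half_x = 0.5 * x
    # Leading term of the series: (x/2)^n / n!
    term = half_x ** n / math.factorial(n)
    total = term
    for m in range(1, terms):
        # Ratio of consecutive terms is -(x/2)^2 / (m (m + n)).
        term *= -(half_x * half_x) / (m * (m + n))
        total += term
    return total


if __name__ == "__main__":
    for n in (0, 1, 4):
        print(n, bessel_j_series(n, 1.0))
```

Because the terms alternate in sign and can become very large before they decay, a sum of this kind is likely to lose accuracy through cancellation once x is large, whatever truncation is chosen.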
If F (x) is a function which is only appreciably non-zero over a limited range of x, say 0 < x < X,
then it is possible to approximate F̂ (k) by means of finite sums. Suppose
In order to deduce the relationship between the F̂s and F̂ (k), we first note from (5) that F̂s
represents values of the Fourier Transform spaced by the “wavenumber” interval ∆k, where
∆k = 1/X . (7)
Now it is to be expected that (5) will fail to approximate to (3) when the exponential function
oscillates significantly between sample points, that is when
$$|k| \gtrsim \frac{1}{2\Delta x} = \tfrac{1}{2} K. \tag{9}$$
This, together with its periodicity, suggests that $\hat F_s$ will be related to $\hat F(k)$ by
$$\hat F_s \cong \begin{cases} \hat F(s\,\Delta k), & s = 0, \dots, \tfrac{1}{2}N - 1, \\ \hat F(s\,\Delta k - K), & s = \tfrac{1}{2}N, \dots, N - 1. \end{cases} \tag{10}$$
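The index-to-wavenumber correspondence in (10) can be sketched as a small helper. This is a minimal illustration that assumes N is even and that the sample spacing is ∆x = X/N, so that K = 1/∆x = N∆k; the function name `dft_wavenumbers` is hypothetical.

```python
def dft_wavenumbers(N, X):
    """Wavenumber associated with each DFT index s = 0, ..., N-1, following (10).

    Assumes N is even and dx = X / N, so that K = 1/dx = N * dk.
    """
    dk = 1.0 / X          # spacing between transform values, equation (7)
    K = N * dk            # period of the DFT in wavenumber (assumed K = 1/dx)
    ks = []
    for s in range(N):
        if s < N // 2:
            ks.append(s * dk)        # lower half: non-negative wavenumbers
        else:
            ks.append(s * dk - K)    # upper half: re-positioned to negative k
    return ks


if __name__ == "__main__":
    print(dft_wavenumbers(8, 2.0))
```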
Because of the periodicity, the F̂s are usually thought of as a series with s = 0, . . . , N − 1, the
upper half being mentally re-positioned to correspond to negative “wavenumber”. Note that if
F (x) is real, and ∗ denotes a complex conjugate, then
Question 3 Discuss carefully the limiting conditions on both N and X (possibly after
a suitable change of origin in x) under which the DFT tends to the Fourier Transform.
The Fast Fourier Transform (FFT) method provides an efficient way to evaluate the DFT. This
method involves efficient evaluation of sums of the form
$$\lambda_s = \sum_{r=0}^{N-1} \mu_r\, \omega_N^{\sigma r s}, \qquad s = 0, \dots, N-1, \quad \sigma = \pm 1, \tag{13}$$
where N is an integer and the $\mu_r$ are a known sequence. The “fast” in FFT requires N to be
a power of a small prime, or combination of small primes; for simplicity we will assume that
$N = 2^M$.
A brief outline of the FFT method is given in the Appendix. However, it is not necessary to
understand the implementation details, since you may use the Matlab one-dimensional Fast
then
Im(I2 ) = 0 , Re(I2 ) = 2Re(I1 ). (14b)
With the definitions of §2 and §3, the FFT algorithm is ideally suited to approximating
I1 rather than I2 . Hence if an approximation to I2 is desired, an approximation to I1
could first be calculated, and then the relations (14b) could be used. If this procedure for
calculating $I_2$ is adopted, and $F_N \ne F_0$, explain why $F_0$ should be replaced by $\tfrac{1}{2}(F_0 + F_N)$
before calculating the DFT. What is the equivalent result to (14b) if F (x) is a real odd
function?
Question 5 Using a FFT code, and the results of question 4, find numerically the
Fourier Transform of $J_n(x)$:
$$\hat J_n(k) = \int_{-\infty}^{+\infty} J_n(x) \exp(-2\pi i k x)\, dx. \tag{15}$$
To obtain Jn (x), you may either devise a method of your own (e.g., a combination of
questions 1 and 2), or you may use the Matlab procedure besselj.
You should obtain results for n = 0, 1, 2, 4, and 8. Choose sufficient points in the
transform to adequately resolve the functions.
Plots of Jn (x) for a few representative values of n should be included in your write-up.
You should also include plots of Jˆn and Jˆn on the same graph. Choose a range of k which
allows you to see the detailed behaviour in the interval $-1 \le \pi k \le 1$.
Comment on your results and discuss their accuracy. Discuss how the FFT deals with
any values of k which might be expected from the theoretical result to give problems. You
∗
You will not receive credit if you do not use the FFT method.
The Fast Fourier Transform (FFT) technique is a quick method of evaluating sums of the form
$$\lambda_r = \sum_{s=0}^{N-1} \mu_s\, \omega_N^{\sigma r s}, \qquad r = 0, \dots, N-1, \quad \sigma = \pm 1, \tag{18}$$
where N is an integer, $\mu_s$ is a known sequence and $\omega_N = e^{2\pi i/N}$. The “fast” in FFT depends
on N being a power of a small prime, or combination of small primes; for simplicity we will
assume that $N = 2^M$. Write
$$\lambda_r \longleftrightarrow \mu_s, \qquad r, s = 0, \dots, N-1. \tag{19}$$
Hence if the half-length transforms are known, it costs $\tfrac{1}{2}N$ products to evaluate the $\lambda_r$.
To execute an FFT, start from N vectors of unit length (i.e., the original $\mu_s$). At the sth stage,
s = 1, 2, . . . , M, assemble $2^{M-s}$ vectors of length $2^s$ from vectors of length $2^{s-1}$; this “costs”
$2^{M-s} \times \tfrac{1}{2}(2^s) = 2^{M-1} = \tfrac{1}{2}N$ products for each stage. The complete discrete Fourier transform
has been formed after M stages, i.e., after $O(\tfrac{1}{2}N \log_2 N)$ products. For $N = 1024 = 2^{10}$, say,
the cost is $\approx 5 \times 10^3$ products, compared to $\approx 10^6$ products in naive matrix multiplication.
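The product count quoted above can be checked with a short recursive sketch. This is an illustrative radix-2 implementation, not the library routine the project recommends; the names `fft_radix2`, `naive_dft` and the `counter` argument are hypothetical, and only the multiplication of each odd-indexed half-transform value by its phase factor is counted as a “product”.

```python
import cmath


def fft_radix2(mu, sigma=1, counter=None):
    """Evaluate lambda_r = sum_s mu_s * omega_N**(sigma*r*s), omega_N = exp(2*pi*i/N).

    N = len(mu) must be a power of two. If counter is a one-element list,
    counter[0] is incremented once per twiddle-factor product.
    """
    N = len(mu)
    if N == 1:
        return list(mu)
    if N % 2:
        raise ValueError("length must be a power of two")
    even = fft_radix2(mu[0::2], sigma, counter)   # half-length transform, even s
    odd = fft_radix2(mu[1::2], sigma, counter)    # half-length transform, odd s
    half = N // 2
    out = [0j] * N
    for r in range(half):
        twiddle = cmath.exp(sigma * 2j * cmath.pi * r / N) * odd[r]
        if counter is not None:
            counter[0] += 1
        out[r] = even[r] + twiddle
        out[r + half] = even[r] - twiddle         # uses omega_N**(N/2) = -1
    return out


def naive_dft(mu, sigma=1):
    """Direct O(N^2) evaluation of the same sum, for comparison."""
    N = len(mu)
    return [sum(mu[s] * cmath.exp(sigma * 2j * cmath.pi * r * s / N)
                for s in range(N)) for r in range(N)]


if __name__ == "__main__":
    count = [0]
    fft_radix2([complex(s) for s in range(1024)], 1, count)
    print("products for N = 1024:", count[0])     # (N/2) * log2(N)
```

With this counting convention the recursion costs ½N products per level over log₂N levels, which is the estimate given in the text.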
A description and short history of the FFT are given in Chapter 12 of the book Numerical
Recipes by Press et al.
We consider the problem of solving Poisson’s equation in a square domain with homogeneous
Dirichlet boundary conditions
$$\nabla^2 u = f(x, y), \tag{1}$$
with u = 0 on x = 0, x = 1, y = 0 and y = 1.
A numerical solution is attempted by finding values for u at grid points in a square N × N
mesh. The (i, j)th point is given by $(x_i, y_j) = (ih, jh)$ where h = 1/(N − 1). The value of $\nabla^2 u$
is approximated at each of the interior points by a finite-difference formula
$$(\nabla^2 u)_{i,j} \simeq \frac{1}{h^2}\left[u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} - 4u_{i,j}\right]. \tag{2}$$
By requiring that $(\nabla^2 u)_{i,j}$ is equal to $f(x_i, y_j)$ at each of the interior points, we obtain $(N-2)^2$
linear equations for the $(N-2)^2$ unknowns $u_{i,j}$ ($1 \le i \le N-2$, $1 \le j \le N-2$), of the form
$$\frac{1}{h^2}\left[u_{i+1,j} + u_{i-1,j} + u_{i,j+1} + u_{i,j-1} - 4u_{i,j}\right] = f(x_i, y_j). \tag{3}$$
The values of ui,j at the boundary points are set by the boundary conditions. Here ui,j is equal
to zero at each boundary point.
We now have to solve these linear equations as quickly and as accurately as possible. Note that
even if the solution of the linear equations were obtained with perfect accuracy, it would still
be only an approximate solution to the original partial differential equation, since (2) is only
an approximation to equation (1).
For larger values of N it is impractical to solve the $(N-2)^2$ linear equations by direct methods,
such as Gaussian elimination, because of storage limitations. An alternative approach is to
use an iterative “relaxation” method. Equation (3) may be reordered to suggest the iteration
scheme
$$u^{n+1}_{i,j} = \frac{1}{4}\left[u^{n+1}_{i-1,j} + u^{n+1}_{i,j-1} + u^{n}_{i+1,j} + u^{n}_{i,j+1} - h^2 f(x_i, y_j)\right] \quad \text{for } i, j = 1, \dots, N-2, \tag{4}$$
where the superscripts denote the number of the iteration; this is conventionally called the
Gauss-Seidel scheme. Note the appearance of (n + 1)th iterates on the right-hand side. The
calculation works through the grid with i and j increasing, and updated values are used as soon
as they become available.
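A minimal sketch of one sweep of scheme (4) is given below, using nested lists so that updated values are used as soon as they are available. The helper names (`gauss_seidel_sweep`, `max_residual`, `make_problem`) are hypothetical, and the residual is taken here to be f minus the discrete Laplacian of u, which is an assumption about how the error is measured rather than the project's own definition.

```python
def gauss_seidel_sweep(u, f, h):
    """One in-place Gauss-Seidel sweep of scheme (4); boundary values are untouched."""
    n = len(u)
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # u[i-1][j] and u[i][j-1] already hold (n+1)th iterates at this point.
            u[i][j] = 0.25 * (u[i - 1][j] + u[i][j - 1]
                              + u[i + 1][j] + u[i][j + 1]
                              - h * h * f[i][j])


def max_residual(u, f, h):
    """Largest |f - (discrete Laplacian of u)| over interior points (assumed measure)."""
    n = len(u)
    worst = 0.0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            lap = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1]
                   - 4.0 * u[i][j]) / (h * h)
            worst = max(worst, abs(f[i][j] - lap))
    return worst


def make_problem(N):
    """Zero initial guess and a sample right-hand side on an N x N mesh."""
    h = 1.0 / (N - 1)
    f = [[(i * h) ** 2 * (1 - i * h) * (j * h) ** 3 * (1 - j * h)
          for j in range(N)] for i in range(N)]
    u = [[0.0] * N for _ in range(N)]
    return u, f, h


if __name__ == "__main__":
    u, f, h = make_problem(9)
    previous = max_residual(u, f, h)
    for sweep in range(1, 41):
        gauss_seidel_sweep(u, f, h)
        current = max_residual(u, f, h)
        if sweep % 10 == 0:
            print(sweep, current, current / previous)
        previous = current
```

Printing the ratio of successive residual norms gives a rough feel for how the reduction per sweep settles down; how that ratio depends on N is the subject of the question that follows.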
Question 1 Take $f(x, y) = x^2(1 - x)\,y^3(1 - y)$. Write a program to apply the Gauss-Seidel
scheme (4) to solve (3) on an arbitrary sized (N × N) mesh. Use your program to
investigate the convergence properties of the scheme as N varies. In particular, after a
reasonably large number of iterations you should calculate:
$$r_\infty = -\log\left( \lim_{n \to \infty} \frac{\max_{i,j} |\epsilon^{n+1}_{i,j}|}{\max_{i,j} |\epsilon^{n}_{i,j}|} \right). \tag{6}$$
What do you conclude about the number of iterations needed for convergence to a spec-
ified accuracy (e.g. for the magnitude of residual error to be less than a given tolerance
at each point)? Estimate as a power of N the number of operations (i.e. additions, mul-
tiplications and divisions) needed for such convergence. Check your answer by measuring
the computational time in different cases.∗ Suggested values for N that you might try are
9, 17, 33, 65, etc. Also estimate as a power of N the number of operations needed for
convergence to an accuracy consistent with the truncation error of the discretisation (3)
of equation (1).
Your calculations should show that the part of the error that decays slowest for each N (and
therefore that which dominates after a large number of iterations) has a form very similar to
the lowest Fourier mode that will fit into the domain. The convergence is thus limited by large
scales, not by small scales.
This motivates the multigrid method described below. The basic idea is that the error left
after a few iterations is on scales much larger than the grid scale. The correction needed to
the approximate solution to remove this error may therefore be determined more efficiently
by transferring the error to a coarser grid, iterating on the coarser grid where convergence is
more rapid, then transferring the calculated correction back to the finer grid, updating the
approximate solution, and iterating on the finer grid again. The whole procedure is then
repeated until the required convergence is achieved. Furthermore the procedure need not be
confined to two grids. It is natural to improve the convergence of the coarse grid problem by
transferring the error in that to a coarser grid still, and so on.
The multigrid procedure may be defined more exactly as follows. Assume that we have a
sequence of K grids, labelled by J = 1, . . . , K in increasing order of fineness, the Jth grid
having size NJ × NJ . It is convenient to take the mesh spacing of the (J − 1)th grid to be twice
that of the Jth grid, i.e. NJ = 2NJ−1 − 1.
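The relation $N_J = 2N_{J-1} - 1$ fixes the whole grid hierarchy once the finest grid is chosen. The sketch below lists the sizes from coarsest to finest; the function name `grid_sizes` and the default coarsest size of 3 are hypothetical choices for illustration.

```python
def grid_sizes(finest, coarsest=3):
    """Grid sizes N_1, ..., N_K (coarsest first) obeying N_J = 2 * N_{J-1} - 1.

    Coarsening stops once the size is no larger than `coarsest`.
    """
    sizes = [finest]
    while sizes[-1] > coarsest:
        n = sizes[-1]
        if (n - 1) % 2:
            raise ValueError("grid size %d cannot be halved" % n)
        sizes.append((n + 1) // 2)   # invert N_J = 2 * N_{J-1} - 1
    return sizes[::-1]


if __name__ == "__main__":
    print(grid_sizes(65))
    print(grid_sizes(65, coarsest=9))
```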
On the Jth grid we wish to solve the linear system
where the operator $L_J$ corresponds to that acting on the left-hand side of (3), if $N_J = N$. Note
that it is important when writing down the form of L for arbitrary J to remember that h in (3)
must be replaced by $1/(N_J - 1)$.
∗
Hint: given the speed of current computers, the timing of a single run of your program might be dominated
by start/end overheads.
(A) Apply the Gauss-Seidel iteration scheme (hereafter G-S) ν1 times to obtain an approximate
solution ũ(J) . The error v(J) in this solution therefore satisfies
(B) Transfer the problem of determining v(J) to the coarser (J − 1)th grid as
Coarsest grid
(C) On the coarsest grid apply G-S ν2 times to obtain an approximate solution ũ(1) .
(D) Transfer the approximate solution on the (J − 1)th grid to the Jth grid to give a new
approximation to the solution to the problem on that grid
$$\tilde u^{(J)}_{\text{new}} = \tilde u^{(J)}_{\text{old}} + P\,\tilde u^{(J-1)}, \tag{10}$$
(E) Apply G-S ν3 times on the Jth grid to improve the approximation ũ(J) .
The ascending part of the cycle repeats (D) and (E), starting with J = 2 and ending with
J = K to leave an approximate solution to the full problem.
Note that within each multigrid cycle, the approximate solution ũ(J) and the right-hand side
r(J) are generated from the problem on the (J + 1)th grid during the descending part of the
cycle and must be stored for use again at the Jth level during the ascending part of the cycle.
Each r(J) changes from cycle to cycle, except r(K) which is always equal to f (K) (i.e. the vector
whose elements are f evaluated at each internal point of the Kth grid).
It remains to specify the restriction and prolongation operators R and P that you should use.
It is natural to take both to be linear operators. Consider the following two sets of points.
part of Jth grid                  part of (J − 1)th grid
centred on (i, j)                 centred on (k, l)
• • • • •                         • . • . •
• • • • •           P             . . . . .
• • • • •          ←−             • . • . •
• • • • •           R             . . . . .
• • • • •          −→             • . • . •
That on the left is a set of points in the Jth grid with the centre point labelled (i, j). That on
the right is the same region in the (J − 1)th grid. In the latter only those points marked with
a • are included in the grid, with the centre point now labelled (k, l) say.
The prolongation operator P maps a function defined on points in the (J − 1)th grid onto the
points in the Jth grid. Similarly the restriction operator R maps a function defined on points
That for P means that if f = 0 for all points in the (J − 1)th grid except that labelled (k, l),
then Pf will be zero at all points on the Jth grid except for the square of nine points centred
on (i, j) where it will take the values in the “mask”. Pf may be evaluated for general functions
f by linearity. Each number in the mask for R represents the contribution from a point in the
Jth grid, e.g. the square of nine points centred on (i, j) to the point (k, l) in the (J − 1)th grid;
note that points outside this square make no contribution.
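To make the idea of a nine-point “mask” concrete, the sketch below implements one common pair of operators: full-weighting restriction with weights (1/16)[1 2 1; 2 4 2; 1 2 1] and bilinear prolongation with weights (1/4)[1 2 1; 2 4 2; 1 2 1]. These particular masks are an assumption made for illustration and may differ from those specified for the project; the function names `restrict` and `prolong` are hypothetical.

```python
def restrict(fine):
    """Full-weighting restriction from a (2n-1) x (2n-1) grid to an n x n grid.

    Coarse boundary values are left at zero (homogeneous Dirichlet conditions).
    """
    nf = len(fine)
    nc = (nf + 1) // 2
    coarse = [[0.0] * nc for _ in range(nc)]
    for k in range(1, nc - 1):
        for l in range(1, nc - 1):
            i, j = 2 * k, 2 * l      # fine-grid point coincident with (k, l)
            coarse[k][l] = (
                4.0 * fine[i][j]
                + 2.0 * (fine[i - 1][j] + fine[i + 1][j]
                         + fine[i][j - 1] + fine[i][j + 1])
                + fine[i - 1][j - 1] + fine[i - 1][j + 1]
                + fine[i + 1][j - 1] + fine[i + 1][j + 1]
            ) / 16.0
    return coarse


def prolong(coarse):
    """Bilinear prolongation from an n x n grid to a (2n-1) x (2n-1) grid."""
    nc = len(coarse)
    nf = 2 * nc - 1
    fine = [[0.0] * nf for _ in range(nf)]
    for i in range(nf):
        for j in range(nf):
            k0, l0 = i // 2, j // 2
            k1, l1 = k0 + i % 2, l0 + j % 2   # equal to k0, l0 at coincident points
            fine[i][j] = 0.25 * (coarse[k0][l0] + coarse[k1][l0]
                                 + coarse[k0][l1] + coarse[k1][l1])
    return fine


if __name__ == "__main__":
    delta = [[0.0] * 3 for _ in range(3)]
    delta[1][1] = 1.0
    for row in prolong(delta):
        print(row)
```

Applying `prolong` to a function that is zero except at one coarse point reproduces the prolongation mask on the surrounding square of nine fine-grid points, which is the interpretation of the mask described above.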
Question 2 Write a program to apply the multigrid method as specified above. You
will probably find it useful to have separate procedures/subprograms, working on grids of
arbitrary resolution, to carry out each of the operations of prolongation and restriction, to
calculate the residual in the difference equations and to apply the Gauss-Seidel iteration
(exploit your existing program from question 1 here).
Apply the multigrid method to the solution of the same equation as in question 1. Investi-
gate the rate of convergence associated with a single multigrid cycle for a fixed resolution
of the finest grid, particularly its dependence on
To start with, a suggested value for NK is 65, for NK−1 is 33, etc. In each case estimate
the total number of operations in a complete cycle, and give a measure of the numerical
efficiency. Justify carefully the measure of efficiency that you are using (e.g.
remember to include the cost of all operations within a cycle).
What are your conclusions about the best choices for the resolution of the coarsest grid,
and for the numbers ν1 , ν2 and ν3 ? Next, choose suitable values of N1 , ν1 , ν2 and ν3 ,
and investigate the dependence of the rate of convergence on NK . Finally, discuss the
improvement in efficiency of multigrid over the simple Gauss-Seidel iteration in question 1
when the aim is convergence to an accuracy consistent with the truncation error of the
discretisation (3) of equation (1).
References
[1] Briggs, William L., Henson, Van Emden, McCormick, Steve F. (2000) A Multigrid Tutorial,
SIAM (ISBN 0-89871-462-1).
1 Introduction
This project illustrates the way in which a disturbance in a ‘dispersive-wave’ system can change
shape as it travels. In order to fix ideas we shall consider one-dimensional waves, depending
on a single spatial coordinate x and time t, which are modelled by a system of linear constant-
coefficient partial differential equations that is (i) second-order in time and (ii) time-reversible.
Such a system has single-Fourier-mode (aka ‘plane-harmonic-wave’) solutions proportional to
$$e^{ikx \mp i\omega(k)t} \tag{1}$$
for any real ‘[angular] wavenumber’ k, where the ‘[angular] frequency’ ω is real and related to k
by a system-dependent ‘dispersion relation’. The waves are ‘dispersive’ if ω is not proportional
to k (and so ‘group velocity’ dω/dk and ‘phase velocity’ ω/k vary with k, and are unequal).
As an example, one-dimensional ‘capillary-gravity’ waves on the free surface of incompressible
fluid of uniform depth h have dispersion relation
where g is gravitational acceleration, ρ the fluid density and γ the coefficient of surface tension.
If the disturbance is described by a function F (x, t), representing say the [non-dimensionalised]
vertical displacement of the fluid surface, the general solution for F will be a superposition of
all Fourier modes of the form (1):
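$$F(x, t) = \int_{-\infty}^{\infty} \left[ a_+(k)\, e^{ikx - i\omega(k)t} + a_-(k)\, e^{ikx + i\omega(k)t} \right] dk, \tag{3}$$
where the amplitudes $a_+(k)$ and $a_-(k)$ are fixed by the initial conditions. For simplicity we
shall take these to be
$$F(x, 0) = \exp\left(-\frac{x^2}{\sigma^2}\right) \cos(k_0 x) \quad \text{and} \quad \frac{\partial F}{\partial t}(x, 0) = 0. \tag{4}$$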
In order to plot the solution some method is needed for evaluating the Fourier integral (5).
with inverse
$$G(x) = \int_{-\infty}^{\infty} \hat G(k)\, e^{ikx}\, dk. \tag{7}$$
provided that ∆k is small enough to resolve the variation of the integrand with k, and that
$\hat G(k)$ is only significant for $|k| < \tfrac{1}{2}N\Delta k$. With ∆k = 2π/L and ∆x = L/N, this approximates
G(m∆x) by
$$g_m \equiv \frac{2\pi}{L} \sum_{n=-N/2+1}^{N/2} \hat G_n\, e^{2\pi i m n/N} \quad \text{for } -N/2 + 1 \le m \le N/2 \tag{9}$$
[note that $g_m$ is periodic in m with period N, and cannot be expected to give a useful approximation
to G(m∆x) for |m| > N/2, i.e. for |x| > L/2, since the $e^{ikx}$ factor in the integrand
would be chronically under-resolved].
(9) is the exact inverse of
$$\hat G_n = \frac{L}{2\pi N} \sum_{m=-N/2+1}^{N/2} g_m\, e^{-2\pi i m n/N} \quad \text{for } -N/2 + 1 \le n \le N/2; \tag{10}$$
the right-hand side is a discretisation of the integral (6) with k = n∆k, but that will not
be required in this project. The so-called Discrete Fourier Transform (10) and its inverse (9)
converge to the Fourier Transform (6) and its inverse (7) in the double limit L → ∞, N/L → ∞.
The Fast Fourier Transform (FFT) technique is a quick method of evaluating sums of the form
$$\lambda_m = \sum_{n=0}^{N-1} \mu_n\, (\zeta_N)^{smn}, \qquad m = 0, \dots, N-1, \quad \zeta_N = e^{2\pi i/N}, \quad s = \pm 1, \tag{11}$$
where the µn are a known sequence, and N is a product of small primes, preferably a power
of 2. A brief outline of the FFT is given in the appendix for reference, but it is not necessary
to understand the details of the algorithm in order to complete the project – indeed, you are
strongly advised to use a black-box FFT procedure such as Matlab’s fft/ifft. Note that since
4 No Dispersion
Question 2
In the limit of ‘shallow water’ ($|k|h \ll 1 \Rightarrow \tanh(kh) \approx kh$) and negligible surface tension
($\rho^{-1}\gamma|k|^3 \ll g|k|$), the dispersion relation (2) can be approximated by the ‘dispersionless’
$$\omega^2 = c_0^2 k^2 \tag{13}$$
with $c_0 = \sqrt{gh}$. The integral (5) can then be evaluated analytically.
Use this to test the program for t up to 10 s, taking σ = 0.5 m, k0 = 0 m−1 and c0 = 1 m s−1
[so h ≈ 0.1 m if g = 9.81 m s−2 ]. Choose appropriate values for the parameters L and N
so that your plots are correct to ‘graphical accuracy’; present evidence of this accuracy in
your write-up. Comment on your results [e.g. on the appropriateness of the ‘shallow-water’
approximation for these parameter values].
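One way to obtain a reference curve for this test is the d'Alembert form of the solution: with ω² = c₀²k² and zero initial velocity, half of the initial profile travels in each direction. The sketch below assumes that form and uses the initial condition (4); the function names are hypothetical and the default parameters are those quoted in Question 2.

```python
import math

def f0(x, sigma, k0):
    # Initial profile from (4).
    return math.exp(-(x / sigma) ** 2) * math.cos(k0 * x)

def dispersionless(x, t, sigma=0.5, k0=0.0, c0=1.0):
    # d'Alembert form for omega^2 = c0^2 k^2 with dF/dt = 0 at t = 0:
    # half the profile travels right and half travels left, unchanged in shape.
    return 0.5 * (f0(x - c0 * t, sigma, k0) + f0(x + c0 * t, sigma, k0))

for x in (0.0, 5.0, 10.0):
    print(x, dispersionless(x, 10.0))
```

Comparing a numerical plot against this expression at several times gives a direct measure of whether L and N are adequate.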
5 Gravity Waves
The ‘deep-water’ ($|k|h \gg 1 \Rightarrow \tanh(kh) \approx \mathrm{sign}(k)$) and negligible-surface-tension limit of the
dispersion relation (2) is
\[ \omega^2 = g|k| . \tag{14} \]
Question 3 Take g = 9.81 m s$^{-2}$ and in the first instance use initial condition (4) with
σ = 1 m, $k_0 = 0$ m$^{-1}$.
• For t = 2 s investigate the effects of changing the values of L and N (maybe start
with L = 32 m and N = 32). Report the results of this investigation in your write-
up, especially with regard to the errors in the solution, using both numerical values
and plots.
Note: The behaviour of the solution for large |x| can be understood asymptotically
by performing integrations-by-parts on (5), but is not of primary interest here [and
does not apply for waves on fluid of finite depth]; the main concern should be locating
the crests and troughs with reasonable accuracy.
• Display graphical results to illustrate how the solution for this initial condition evolves
for t up to at least 6 s, giving justification for your choices of L and N . Do likewise
for the initial condition (4) with σ = 6 m and $k_0 = 1$ m$^{-1}$, for t up to at least 20 s.
Comment on the solutions, particularly in the light of group and phase velocity.
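For the comment on group and phase velocity, it may help to have the two speeds for (14) to hand. The sketch below computes the phase velocity ω/k and a finite-difference estimate of the group velocity dω/dk; the function names and the step `dk` are illustrative assumptions.

```python
import math

G = 9.81  # m s^-2

def omega(k):
    # Deep-water gravity-wave dispersion relation (14), positive root.
    return math.sqrt(G * abs(k))

def phase_velocity(k):
    return omega(k) / k

def group_velocity(k, dk=1e-6):
    # Centred difference approximation to d(omega)/dk.
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)

k0 = 1.0
print(phase_velocity(k0), group_velocity(k0))
```

For this dispersion relation the group velocity is half the phase velocity, so individual crests are expected to move through a wave packet from back to front.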
Consider now the dispersion relation for ‘deep-water’ surface waves when surface-tension effects
dominate over gravitational:
\[ \omega^2 = \rho^{-1}\gamma|k|^3 . \tag{15} \]
Question 4 Perform similar calculations to those in Q3 for water with $\rho = 10^3$ kg m$^{-3}$
and γ = 0.074 kg s$^{-2}$, using the initial condition (4) with σ = 0.002 m, $k_0 = 0$ m$^{-1}$ and
with σ = 0.005 m, $k_0 = 1250$ m$^{-1}$, for t up to at least 0.1 s. Compare and contrast your
results with those in Q3. You will want to use different value(s) for L (and maybe N ):
can the concept of group velocity help in choosing a suitable L for given time?
How much difference would it make to these results if the exact ‘deep-water’ dispersion
relation
\[ \omega^2 = g|k| + \rho^{-1}\gamma|k|^3 \tag{16} \]
were used, with g = 9.81 m s$^{-2}$?
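A rough way to approach this comparison is to evaluate the two terms of (16) separately at the wavenumbers of interest. The sketch below does this with the parameter values quoted in Question 4, and also reports the wavenumber at which the two terms are equal; the names are hypothetical.

```python
import math

G = 9.81        # m s^-2
RHO = 1.0e3     # kg m^-3
GAMMA = 0.074   # kg s^-2

def gravity_term(k):
    return G * abs(k)

def capillary_term(k):
    return GAMMA * abs(k) ** 3 / RHO

# Wavenumber at which the two terms in (16) are equal: g k = gamma k^3 / rho.
k_cross = math.sqrt(RHO * G / GAMMA)

for k in (1.0, k_cross, 1250.0):
    print(k, gravity_term(k), capillary_term(k))
```

Which term dominates for a given initial condition depends on where its spectrum sits relative to this crossover.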
References
Billingham, J. & King, A. C., Wave Motion: Theory and Applications, CUP.
Lighthill, M. J., Waves in Fluids, CUP.
Whitham, G. B., Linear and Nonlinear Waves, Wiley.
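For simplicity restrict to the optimal case $N = 2^M$. Then the DFT (11) can be split into its
even and odd terms
\[ \lambda_m = \underbrace{\sum_{n'=0}^{N/2-1} \mu_{2n'}\, (\zeta_{N/2})^{smn'}}_{\lambda^E_m} + (\zeta_N)^{sm} \underbrace{\sum_{n'=0}^{N/2-1} \mu_{2n'+1}\, (\zeta_{N/2})^{smn'}}_{\lambda^O_m} \tag{17} \]
and since $\lambda^E_m$ and $\lambda^O_m$ are periodic in m with period N/2, and $(\zeta_N)^{sN/2} = -1$,
\[ \lambda_{m+N/2} = \lambda^E_m - (\zeta_N)^{sm} \lambda^O_m . \tag{18} \]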
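The split (17) and the relation (18) translate almost line for line into a recursive routine. The following is a minimal sketch for illustration only (a library FFT remains the sensible choice in practice); `fft_recursive` and `dft_direct` are hypothetical names, and the direct O(N²) sum is included purely as a check.

```python
import cmath

def fft_recursive(mu, s=1):
    # Evaluates lambda_m = sum_n mu_n * zeta_N^(s*m*n), zeta_N = exp(2 pi i / N),
    # for len(mu) a power of 2, using the even/odd split (17) and relation (18).
    N = len(mu)
    if N == 1:
        return [complex(mu[0])]
    even = fft_recursive(mu[0::2], s)   # lambda^E, period N/2
    odd = fft_recursive(mu[1::2], s)    # lambda^O, period N/2
    out = [0j] * N
    for m in range(N // 2):
        twiddle = cmath.exp(2j * cmath.pi * s * m / N) * odd[m]
        out[m] = even[m] + twiddle             # (17)
        out[m + N // 2] = even[m] - twiddle    # (18)
    return out

def dft_direct(mu, s=1):
    # O(N^2) evaluation of (11), used only as a check.
    N = len(mu)
    return [sum(mu[n] * cmath.exp(2j * cmath.pi * s * m * n / N) for n in range(N))
            for m in range(N)]

data = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, 1.5]
fast = fft_recursive(data, s=-1)
slow = dft_direct(data, s=-1)
print(max(abs(a - b) for a, b in zip(fast, slow)))
```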
Problem Formulation
An equation commonly encountered in population genetics is the one-dimensional diffusion
equation
\[ \frac{\partial\hat\rho}{\partial\hat t} = -\frac{\partial j}{\partial\hat x} + F(\hat\rho) . \tag{1} \]
Here, x̂ denotes the spatial position, t̂ the time, ρ̂(x̂, t̂) the population density, j the population
flux, and F (ρ̂) is a local source term that describes the net rate of growth in the population
density.
A typical model for local population growth is given by the Pearl-Verhulst law
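\[ F(\hat\rho) = \begin{cases} \gamma\hat\rho\,(1 - \hat\rho/\hat\rho_s), & 0 < \hat\rho < \hat\rho_s ; \\ 0, & \hat\rho \le 0 \text{ or } \hat\rho \ge \hat\rho_s . \end{cases} \tag{2} \]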
This describes how a homogeneous population would grow, initially in an exponential manner,
until the population saturated at some density ρ̂s .
The flux j is the source of the diffusive behaviour and is given by
\[ j = -D\,\frac{\partial\hat\rho}{\partial\hat x} . \tag{3} \]
If it is assumed that dispersal is due to random motion of individuals, then the diffusion coeffi-
cient D is constant and Fisher’s equation is obtained. However, as a remedy to overcrowding,
dispersal would be much more effective if the diffusion coefficient were population density-
dependent. In fact this has been observed in populations of small animals. Here we consider
the case D = D0 ρ̂. With suitable non-dimensionalisation, we obtain the modified Fisher equa-
tion,
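\[ \frac{\partial\rho}{\partial t} = \frac{\partial}{\partial x}\left(\rho\,\frac{\partial\rho}{\partial x}\right) + \rho(1 - \rho) . \tag{4} \]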
A similar equation also arises in combustion dynamics.
Travelling wave solutions to this equation are the subject of project 2.11(a). A situation of
more practical interest is when the population density is known at some initial time, and the
subsequent evolution of the population is required. In projects 2.11(b) and 2.11(c) the expansion
of a population which is initially limited to a finite spatial range is considered. Thus solutions
to (4) are required subject to the following boundary conditions:
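\[ \rho(x,0) = \begin{cases} \rho_0(x), & 0 \le x \le 1, \\ 0, & x > 1, \end{cases} \tag{5} \]
\[ \frac{\partial\rho}{\partial x}(0,t) = 0, \quad t > 0, \]
\[ \rho(x,t) \to 0 \quad\text{as } x \to \infty . \]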
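For $0 \le x \le s(t)$:
\[ \begin{aligned} \rho_t &= (\rho\rho_x)_x + \rho(1 - \rho), \\ \rho(x,0) &= \rho_0(x), \\ \rho_x(0,t) &= 0, \\ \rho(s(t),t) &= 0, \quad \rho_x(s(t),t) = -\dot{s}(t) . \end{aligned} \tag{6} \]
For $s(t) < x$:
\[ \rho(x,t) \equiv 0 . \]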
We refer to x = s(t) as the population front. From the initial population distribution (5) we
see that s(0) = 1.
In project 2.11(b), solutions of (4) are obtained for a particular initial distribution, and the
behaviour as t → ∞ is examined. In project 2.11(c), the code developed in project 2.11(b) is
used to examine the propagation of the population front.
\[ \phi \sim 1 - A e^{\lambda\xi} \tag{8} \]
for arbitrary ξ0 .
Question 3 Obtain solutions for c = 2, 1.5, 1 and 0.8 using a suitable integration
method. These travelling wave solutions should be plotted on the same graph on axes
with the origin chosen such that $\phi(\xi = 0) = \tfrac12$ (to within graphical accuracy).
Investigate the change in the wave profile as the wave speed c is decreased still further.
The equation (11) is to be solved subject to the boundary conditions (12), and initial conditions
ρ(y, 0) = ρ0 (y). (13)
The final condition in (12) then determines the motion of the population front, with s(0) = 1.
Many methods exist for the numerical solution of parabolic equations. Here we consider a very
simple finite-difference method, where spatial derivatives are expressed using centred differences
and the solution is advanced in time using forward Euler. Writing $t_j = j\,\Delta t$, $y_n = n/N$
($n = 0, 1, \dots, N$), and using the notation $\rho_{j,n} \equiv \rho(t_j, y_n)$, $s_j \equiv s(t_j)$, we discretise (11) in the
form
\[ \frac{\rho_{j+1,n} - \rho_{j,n}}{\Delta t} = \frac{1}{2 s_j^2}\,\frac{\rho_{j,n+1}^2 - 2\rho_{j,n}^2 + \rho_{j,n-1}^2}{(\Delta y)^2} + \frac{\dot{s}_j\, y_n}{s_j}\,\frac{\rho_{j,n+1} - \rho_{j,n-1}}{2\,\Delta y} + \rho_{j,n}(1 - \rho_{j,n}), \quad n = 1, 2, \dots, N-1, \]
\[ \rho_{j,0} = \rho_{j,1}, \qquad \rho_{j,N} = 0, \]
\[ \frac{s_{j+1} - s_j}{\Delta t} = \dot{s}_j = -s_j^{-1}\,\frac{\rho_{j,N-2} - 4\rho_{j,N-1}}{2\,\Delta y}, \]
where ∆y = 1/N. The expression for $\dot{s}_j$ is obtained by using the final condition in (12) with a
three-point backward difference expression for $\rho_y(y = 1)$.
There are several more sophisticated numerical methods of solving this system of equations, but
the method described is very simple to implement and proves sufficient for the current purposes.
The main drawback is that ∆t must be chosen very small to ensure numerical stability.
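A minimal sketch of one forward Euler step of the scheme described above is given here, in plain Python for transparency rather than speed. The function name `euler_step` and the list-based storage are illustrative assumptions, and the profile 0.3eˣ(1 − x) is used only as a test case; this is not intended as a complete solution.

```python
import math

def euler_step(rho, s, dt):
    # rho[n] holds rho_{j,n} for n = 0..N on the fixed grid y_n = n/N.
    N = len(rho) - 1
    dy = 1.0 / N
    # Front speed from the three-point backward difference with rho_N = 0.
    sdot = -(rho[N - 2] - 4.0 * rho[N - 1]) / (2.0 * dy * s)
    new = rho[:]
    for n in range(1, N):
        diffusion = (rho[n + 1] ** 2 - 2.0 * rho[n] ** 2 + rho[n - 1] ** 2) / (2.0 * s * s * dy * dy)
        advection = sdot * (n * dy) / s * (rho[n + 1] - rho[n - 1]) / (2.0 * dy)
        growth = rho[n] * (1.0 - rho[n])
        new[n] = rho[n] + dt * (diffusion + advection + growth)
    new[0] = new[1]   # discrete zero-gradient condition at y = 0
    new[N] = 0.0      # density vanishes at the front
    return new, s + dt * sdot, sdot

N = 100
rho0 = [0.3 * math.exp(n / N) * (1.0 - n / N) for n in range(N + 1)]
rho1, s1, sdot0 = euler_step(rho0, 1.0, 1.0e-4)
print(s1, sdot0)
```

For this test profile the initial front speed should be close to −ρ₀′(1) = 0.3e, which gives a quick consistency check on the backward difference.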
Question 6 Write your own program to solve (11) using the discretisation given above.
Obtain solutions for initial distribution $\rho_0(x) = 0.3\,e^{x}(1 - x)$. Start with N = 100 and
∆t = 0.0001, but confirm that your code produces solutions that are independent of mesh-
size. Plot the solution as a function of the original spatial variable x at t = 0, 2, 4, 6, 8
and 10. Also plot the velocity of the wave front, ṡ(t), as a function of time. Compare the
large-time wave profile with the travelling wave solutions obtained in project 2.11(a).
Question 7 Using the same mesh-size as above, obtain solutions for $0 < t \le 0.5$,
for initial distribution (i). Consider various values of the coefficient $A_1$, in the range
$0.1 \le A_1 \le 0.9$. For the larger values of $A_1$ it may be necessary to reduce the time
step-size. Do not include plots of ρ(x, t) in your report, but concentrate on the motion of
the wave front. Write down a relationship between the initial velocity of the wave front
and the initial profile and show that this is in agreement with your numerical results.
Question 8 Calculate solutions for initial distribution (ii) with $A_2 = 0.2$ for $0 < t \le 1$.
As before plot $\dot{s}(t)$ as a function of time. Repeat these calculations with the spatial mesh-size
reduced to ∆y = 0.002 and then ∆y = 0.001, adjusting ∆t as necessary. Describe the
movement of the wave front. Repeat these calculations with $A_2 = 0.05$, for $0 < t \le 0.75$.
Analysis suggests that for some classes of initial distributions, the population front is fixed until
a certain waiting time tw has elapsed, after which the population expands.
Question 9 For initial distributions which are locally quadratic in the vicinity of the
wavefront, it can be shown that the waiting time is given by
\[ t_w = \log\left(1 + \frac{1}{6 g_2}\right) \tag{14} \]
where $\rho_0(x) \sim g_2 (1 - x)^2$ as $x \to 1$. Are the numerical results you have obtained in broad
agreement with this result? Discuss why such a phenomenon may occur in the evolution
of a population.
Question 10 Calculate the motion of the population front for initial distribution (iii)
with A3 = 0.2. As with case (ii), reduce the mesh-size. Compare your results with the
results of (ii).
References
A background to the biological models underlying these equations can be found in Some exact
solutions to a nonlinear diffusion problem in population genetics and combustion by Newman
(J. Theoretical Biology (1980) 85, 325–334).
An in-depth analysis of equations of this form is presented in The effects of variable diffusivity
on the development of travelling waves in a class of reaction-diffusion equations by King &
Needham (Phil. Trans. Roy. Soc. Lond. A (1994) 348, 229–260). This contains derivation of
the results for waiting times, but reference to this paper is not necessary for the purposes of
this project.
A one-dimensional periodic flow in a fluid has velocity u in the x-direction only, given by
A material fluid element subject to this motion will have trajectory X(t) satisfying
\[ \frac{dX}{dt} = \alpha \cos k\bigl(X(t) - ct\bigr) . \tag{2} \]
Question 1 Explain why, without loss of generality, distance and time units may be
chosen so that k = 2π and c = 1, giving
\[ \frac{dX}{dt} = a \cos 2\pi\bigl(X(t) - t\bigr) . \tag{3} \]
How is a related to α?
Question 3 Verify from your numerical results that for |a| sufficiently small, there is
a time-averaged mean ‘drift’ velocity of $\tfrac12 a^2$. Include details of your method.
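One possible way to estimate the drift is to integrate (3) over many wave periods and divide the net displacement by the elapsed time. The sketch below does this with a fixed-step fourth-order Runge-Kutta scheme; the names `velocity` and `mean_drift`, the step size and the integration time are all illustrative assumptions, and an adaptive library integrator would serve equally well.

```python
import math

def velocity(x, t, a):
    # Right-hand side of (3).
    return a * math.cos(2.0 * math.pi * (x - t))

def mean_drift(a, t_end=400.0, dt=0.01, x0=0.0):
    # Classical fourth-order Runge-Kutta with a fixed step.
    x, t = x0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        k1 = velocity(x, t, a)
        k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt, a)
        k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt, a)
        k4 = velocity(x + dt * k3, t + dt, a)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    # Net displacement divided by elapsed time.
    return (x - x0) / (steps * dt)

a = 0.1
print(mean_drift(a), 0.5 * a * a)
```

Because the displacement also contains a bounded oscillatory part, the estimate improves only slowly with the integration time unless the average is taken over a whole number of periods of the motion.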
Question 4 Give a physical interpretation of the interaction between the flow and the
material element. Do not confine your answer only to small |a|.
Hint: You may wish to consider a graph of $dX/dt$ against X.
Question 5 Analyse mathematically the above system, using any approach you see
fit, e.g. in the case of question 3 you might seek an approximate solution for small |a|.
1 Introduction
The tubes that carry fluid around the body (such as veins, arteries, lung airways, the urethra,
etc.) have deformable walls. The shape of such a tube is strongly coupled to the flow within it
through the internal pressure distribution. This nonlinear flow-structure interaction imparts to
such systems unusual but biologically significant properties, notably “flow limitation” (so that
airway flexibility limits the rate at which you can expel air from your lungs, for example). To
explore such interactions, one can consider a simple model system in which an incompressible
fluid flows steadily through a two-dimensional channel, one wall of which is formed by a mem-
brane under longitudinal tension. Assuming that the channel is long and thin, and that the
fluid is sufficiently viscous, lubrication theory can be used to describe the flow.
Suppose the channel lies in $0 \le y \le h(x)$, $0 \le x \le L$, where $L \gg h$. Applying no-slip and
no-penetration conditions along the rigid wall y = 0 and the membrane y = h, the relationship
between the steady, uniform flux q of fluid along the channel and the local pressure gradient
$p_x$ is approximately $q = -h^3 p_x/(12\mu)$, where µ is the fluid’s viscosity, assumed constant. The
fluid pressure distribution p(x) is controlled by the shape of the channel wall according to
$p = -T h_{xx}$, where T is the tension in the membrane, assumed constant; the pressure outside
the membrane is taken to be zero. We assume that the membrane is fixed at either end, so
that h(0) = h(L) = h0 , for some constant h0 . The flow is controlled by the upstream and
downstream pressures p(0) = pu and p(L) = pd , and characterised by the relationship between
the flux q and the pressure drop along the channel, pu − pd , holding either pu or pd constant.
The problem can be simplified by nondimensionalisation. Let
where $p_0 = T h_0/L^2$. This yields nondimensional parameters $Q = 12\mu L^3 q/(T h_0^4)$, $P_u = p_u/p_0$,
$P_d = p_d/p_0$ and governing equations
\[ Q = -H^3 P_X , \qquad P = -H_{XX} \qquad (0 \le X \le 1) \tag{1} \]
subject to
\[ H(0) = 1, \quad H(1) = 1, \quad P(0) = P_u, \quad P(1) = P_d . \tag{2} \]
We seek graphs of $\Delta P = P_u - P_d > 0$ as a function of Q, for fixed values of $P_u$ or $P_d$. (So only 3
of the 4 boundary conditions in (2) are relevant.)
This is a two-point, third-order, boundary-value problem. It can be solved by two different
methods: shooting, which is relatively easy to program but which cannot normally be extended
to problems in higher dimensions; and a direct finite-difference method, which is more com-
plicated to set up but adaptable to more complex situations. The relatively straightforward
problem given by (1) and (2) can be used to explore the relative merits of each method; both
methods can be used to explore the fluid mechanics of collapsible channel flow.
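To make the shooting idea concrete, note that eliminating P from (1) gives the third-order equation H_XXX = Q/H³. For fixed P_u one can then integrate from X = 0 with H(0) = 1 and H_XX(0) = −P_u, and adjust the unknown slope H_X(0) until H(1) = 1. The sketch below is one hedged way to do this, with a fixed-step Runge-Kutta integrator and a secant iteration; the function names, step count and starting slopes are assumptions made for illustration.

```python
def integrate(Q, Pu, slope, steps=200):
    # RK4 for H''' = Q / H^3 on 0 <= X <= 1 with
    # H(0) = 1, H'(0) = slope, H''(0) = -Pu.  Returns (H, H', H'') at X = 1.
    def rhs(y):
        return (y[1], y[2], Q / y[0] ** 3)
    y = (1.0, slope, -Pu)
    h = 1.0 / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(tuple(y[i] + 0.5 * h * k1[i] for i in range(3)))
        k3 = rhs(tuple(y[i] + 0.5 * h * k2[i] for i in range(3)))
        k4 = rhs(tuple(y[i] + h * k3[i] for i in range(3)))
        y = tuple(y[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0 for i in range(3))
    return y

def shoot(Q, Pu, s0=0.0, s1=1.0, tol=1e-12, max_iter=50):
    # Secant iteration on the unknown slope H'(0) so that H(1) = 1.
    f0 = integrate(Q, Pu, s0)[0] - 1.0
    f1 = integrate(Q, Pu, s1)[0] - 1.0
    for _ in range(max_iter):
        if abs(f1) < tol or f1 == f0:
            break
        s0, s1, f0 = s1, s1 - f1 * (s1 - s0) / (f1 - f0), f1
        f1 = integrate(Q, Pu, s1)[0] - 1.0
    d2H1 = integrate(Q, Pu, s1)[2]
    return s1, -d2H1   # slope H'(0) and downstream pressure Pd = -H''(1)

print(shoot(0.0, 1.0))   # no flow: H is a parabola and Pd = Pu
print(shoot(0.1, 1.0))
```

With Q = 0 the exact solution is the parabola H = 1 + ½P_u X(1 − X), which provides a simple check before moving to non-zero flux.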
4 Continuation techniques
The Newton-Raphson method usually requires a “good” initial guess in order to converge to a
solution, so to generate solutions corresponding to strongly deformed channels a continuation
technique should be used. (The shooting method can also benefit from this approach, but it is
not usually necessary.) Start the computation with parameter values corresponding to a known
solution (e.g. a slightly deformed channel with $P_d = 0$, $Q \ll 1$) and use the undeformed channel
(H = 1) as an initial guess. Having found this solution, slowly increment the parameters Q and
Pd to construct solutions with the channel highly deformed.
5 Questions
Throughout this project you should comment on the physical interpretation of your computed
results as well as their mathematical or numerical features.
References
[1] Acheson, D. J. Elementary Fluid Mechanics, Oxford University Press, 1990.
[2] Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P. Numerical Recipes (available
in various editions for different languages). Cambridge University Press, 1992.
A smoke ring is a vortex tube wrapped around into a closed circle (a vortex ring), which
propagates normal to the plane of the circle under its self-induced velocity field. The politically
incorrect method of generating them involves the inhalation of noxious substances; a more
socially acceptable method involves a volcano [6]. We will, throughout this project, neglect
various effects and crudely model a three-dimensional axisymmetric vortex ring of diameter a
and strength κ by a pair of point vortices in two-dimensional fluid of strengths κ and −κ, a
distance a apart.
where
\[ r_{ij} = \bigl((x_i - x_j)^2 + (y_i - y_j)^2\bigr)^{1/2} . \tag{1c} \]
Question 2 Show carefully that the equations of motion can be written in the form
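\[ \frac{dx_i}{dt} = \kappa_i^{-1}\frac{\partial H}{\partial y_i} , \qquad \frac{dy_i}{dt} = -\kappa_i^{-1}\frac{\partial H}{\partial x_i} \quad\text{(no summation)} , \tag{2} \]
where
\[ H = -\frac{1}{4\pi} \sum_{\substack{i,j \\ i \ne j}} \kappa_i \kappa_j \log r_{ij} . \tag{3} \]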
H is invariant under space translations and rotations, which implies the existence of three
scalar conserved quantities. What are they?
Programming Task: Write a program to integrate the equations of motion (1a), (1b)
and (1c). You should use an adaptive stepsize ODE integrator (such as the Matlab
function ode45). You will find it useful to write your code to handle arbitrarily many
vortices.
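A possible starting point for such a program is a routine returning the velocities of an arbitrary number of point vortices, which can then be handed to whichever adaptive integrator is preferred. The sketch below takes the velocities that follow from differentiating the Hamiltonian (3) as in (2); the function name and the list-based interface are assumptions, and no integrator is included.

```python
import math

def vortex_velocities(x, y, kappa):
    # Velocity of each point vortex induced by all the others, obtained by
    # differentiating (3) as in (2):
    #   dx_i/dt = -(1/2pi) sum_{j != i} kappa_j (y_i - y_j) / r_ij^2
    #   dy_i/dt = +(1/2pi) sum_{j != i} kappa_j (x_i - x_j) / r_ij^2
    n = len(kappa)
    dxdt = [0.0] * n
    dydt = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rx, ry = x[i] - x[j], y[i] - y[j]
            r2 = rx * rx + ry * ry
            dxdt[i] -= kappa[j] * ry / (2.0 * math.pi * r2)
            dydt[i] += kappa[j] * rx / (2.0 * math.pi * r2)
    return dxdt, dydt

# A single 'smoke ring': strengths +kappa and -kappa a distance a apart.
kappa, a = 1.0, 2.0
u, v = vortex_velocities([0.0, 0.0], [a / 2, -a / 2], [kappa, -kappa])
print(u, v, kappa / (2.0 * math.pi * a))
```

In this configuration both vortices should move together along the axis at speed κ/(2πa), which is a useful first test of any implementation.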
Question 4 What happens when two smoke rings are fired towards each other on the
same axis? Describe the resulting motion, giving clear physical explanations for each
behaviour observed. You should start by considering two rings with equal strengths and
widths, but should also explain what happens in the general case.
Question 5 What happens when two smoke rings are fired in the same direction on
the same axis? Describe the resulting motion, giving clear physical explanations for each
behaviour observed. You should start by considering two rings with equal strengths and
widths, but should also explain what happens in the general case. You should not need
to integrate to large times, but you should (where relevant) show several cycles of the
motion.
Programming Task: Produce a program which can only model coaxial smoke rings,
but which explicitly enforces the symmetry about the axis. In other words, use the mirror
symmetry of the model system to reduce the number of ODEs that you have to solve.
Note that this is not the best way to handle symmetries and conserved quantities in the
numerical solution of ODEs. See Iserles et al. [2] for more information. For this project,
however, this method will suffice.
Question 8 Use this new program to repeat the simulation you did for question 7.
Show your output and comment.
Question 9 Consider three coaxial smoke rings, fired in the same direction. Use your
new program to investigate the resulting motion. Give a survey of the different kinds of
behaviour you observe, including a selection of your plots (four or so should suffice).
Note that the parameter space you have to search is rather large. You should think of ways
to reduce it.
References
[1] Acheson, D. J. 2000, Instability of vortex leapfrogging, Eur. J. Phys., 21, 269–273.
Follow links from https://iopscience.iop.org/journal/0143-0807 for a downloadable
version; at the time of writing https://iopscience.iop.org/article/10.1088/0143-0807/21/3/310
should take you there directly.
[2] Iserles, A., Munthe-Kaas, H. Z., Nørsett, S. P. & Zanna, A. 2000, Lie-group methods, Acta
Numerica, 9, 215–365.
[3] Pullin, D. I. & Saffman, P. G. 1991, Long-time symplectic integration: the example of
four-vortex motion, Proc. Roy. Soc. Lond. A, 432, 481–494.
[5] Tophøj, L. & Aref, H. 2013, Instability of vortex pair leapfrogging, Physics of Fluids, 25,
014107; https://doi.org/10.1063/1.4774333.
[6] http://www.swisseduc.ch/stromboli/etna/etna00/etna0002photovideo-en.html.
1 Introduction
The angular velocity with respect to principal axes of inertia in a rigid body is taken to be
ω = (ω1 , ω2 , ω3 ). (1)
If the principal moments of inertia are A, B, C with respect to these axes, then the angular
momentum is
h = (Aω1 , Bω2 , Cω3 ). (2)
These axes are fixed in the body and have angular velocity ω with respect to an inertial frame
instantaneously coincident with the principal axes. The rate of change of angular momentum
with respect to such an inertial frame is
\[ \frac{d\mathbf{h}}{dt} + \boldsymbol{\omega} \wedge \mathbf{h} . \tag{3} \]
In the case when there is no net moment of external forces acting on the body, the law “rate of
change of angular momentum = moment of external forces” gives:
\[ \frac{d\mathbf{h}}{dt} + \boldsymbol{\omega} \wedge \mathbf{h} = 0 . \tag{4} \]
Expanding this equation into components gives:
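\[ \begin{aligned} A\,\frac{d\omega_1}{dt} + (C - B)\,\omega_2\omega_3 &= 0 \\ B\,\frac{d\omega_2}{dt} + (A - C)\,\omega_3\omega_1 &= 0 \\ C\,\frac{d\omega_3}{dt} + (B - A)\,\omega_1\omega_2 &= 0 \end{aligned} \tag{5} \]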
It can be shown analytically that these equations have two first integrals, which say that the
energy and the magnitude of the angular momentum remain constant, as follows:
\[ \tfrac12 A\omega_1^2 + \tfrac12 B\omega_2^2 + \tfrac12 C\omega_3^2 = E \tag{6} \]
\[ A^2\omega_1^2 + B^2\omega_2^2 + C^2\omega_3^2 = H^2 \tag{7} \]
Since the moment of external forces is zero, we also know that the angular momentum vector
h is constant when measured in an inertial frame.
2 Project work
Write a program to solve Euler’s equations (5) numerically and plot the results. You should
use MATLAB’s 64-bit (8-byte) double-precision floating-point values or the equivalent in other
programming languages. Output from your program should include:
Equations (6) and (7) can be used to check the accuracy of your numerical results by calculating
and displaying the values of these expressions at the beginning and end of runs.
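The accuracy check described here can be sketched as follows: integrate (5) with some standard scheme and compare the left-hand sides of (6) and (7) at the start and end of the run. The fixed-step Runge-Kutta integrator, the function names, the initial condition and the step size below are all illustrative assumptions rather than recommended choices.

```python
def euler_rhs(w, A, B, C):
    # Equations (5) solved for the time derivatives.
    w1, w2, w3 = w
    return ((B - C) * w2 * w3 / A,
            (C - A) * w3 * w1 / B,
            (A - B) * w1 * w2 / C)

def rk4_step(w, dt, A, B, C):
    def shifted(k, f):
        return tuple(w[i] + f * dt * k[i] for i in range(3))
    k1 = euler_rhs(w, A, B, C)
    k2 = euler_rhs(shifted(k1, 0.5), A, B, C)
    k3 = euler_rhs(shifted(k2, 0.5), A, B, C)
    k4 = euler_rhs(shifted(k3, 1.0), A, B, C)
    return tuple(w[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0 for i in range(3))

def energy(w, A, B, C):
    return 0.5 * (A * w[0] ** 2 + B * w[1] ** 2 + C * w[2] ** 2)    # (6)

def h_squared(w, A, B, C):
    return (A * w[0]) ** 2 + (B * w[1]) ** 2 + (C * w[2]) ** 2      # (7)

A, B, C = 1.4, 1.0, 0.7
w = (1.0, 0.1, 0.1)
E0, H0 = energy(w, A, B, C), h_squared(w, A, B, C)
for _ in range(10000):
    w = rk4_step(w, 0.001, A, B, C)
print(energy(w, A, B, C) - E0, h_squared(w, A, B, C) - H0)
```

Small drifts in these two quantities indicate, but do not by themselves prove, that the trajectory is accurate.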
The objective of the project is to investigate and classify all possible types of motion. The
following questions provide some guidelines for the investigation.
Question 1 Since A, B, C, ω1 (0), ω2 (0) and ω3 (0) may all in principle take arbitrary
values, the parameter space to be explored may seem very large. If A, B and C take
distinct values, explain how the results from taking
A>B>C (8)
can be generalised. Briefly explain what happens if any two (or all three) of A, B, C are
equal. For given values of A/B and C/B, explain why we may take B = 1 without loss
of generality.
Show further that choosing E = 1 is equivalent to re-scaling the time variable t, and give
the scaling factor.
From this point on, take B = 1 and E = 1 and build these values into your program. Your
program should allow you to set and change the values of A/B and C/B. Assume that A/B > 1
and C/B < 1, and work with A = 1.4, B = 1 and C = 0.7 unless other values are suggested.
You may find it convenient to accept arbitrary input for ω1 (0), ω2 (0) and ω3 (0) and then scale
the input values so that E = 1.
Question 2 Use your program to demonstrate that solutions are possible in which the
vector ω(t) rotates around the OX axis with small amplitude deviation from $(\sqrt{2/A}, 0, 0)$
(i.e., (1, 0, 0) before scaling), and that similar stable solutions exist near the OZ axis.
Include copies of your results.
Question 4 Are your numerical results consistent with equations (6) and (7)? To
what extent are further checks on numerical accuracy needed?
Question 5 What solutions do you obtain if starting conditions are chosen so that
ω(0) lies very close to the OY axis? Describe the motion physically.
Question 6 Make a plausible case based on your computed results that there exists a
solution ω(t) which begins away from the OY axis but which tends towards the steady
(but unstable) solution parallel to the OY axis as t → ∞. What happens if you attempt
to simulate such a solution numerically? What value must H take for such a solution?
Question 8 The rigid body is now subjected to slow friction via a retarding couple
−kω, where k is a very small parameter. How does this affect equations (5), (6) and (7)?
Alter your program to incorporate the couple and investigate a few types of solutions for
the original values of A, B and C. You may find it useful to consider 3D phase plots of a
suitably normalised ω(t), as well as your normal plots.
Question 9 Is your classification in question 7 still of any use or has it become irrele-
vant? Is there still a division of the solution space into regions?
1 Introduction
Scattering theory can be used to compute what happens when a beam of electrons is fired
towards a target object. The electrons collide with the object and are scattered off in new
directions. The angular distribution of the outgoing electrons is described by a sum of contri-
butions, in which the simplest term corresponds to an isotropic distribution. If the energy of
the incoming electron is not too large, this isotropic contribution dominates the distribution.
This situation is called s-wave scattering.
2 Theory
The electrons are incident on a target that is located at the origin and is described by an isotropic
interaction potential U(r). The time-independent Schrödinger equation for the outgoing electrons
has a general solution
\[ \psi(r,\theta) = \sum_{\ell=0}^{\infty} \frac{\chi_\ell(r)}{r}\, P_\ell(\cos\theta) \tag{1} \]
\[ \frac{d^2\chi_\ell}{dr^2} + \left[ k^2 - U(r) - \frac{\ell(\ell+1)}{r^2} \right] \chi_\ell = 0 \tag{2} \]
with boundary condition $\chi_\ell(0) = 0$. The constant k is determined by the energy E of the
incoming electrons, with E proportional to $k^2$.
Suppose that the target has radius $r_0$ and that U(r) = 0 for $r > r_0$. Then for large r one has
asymptotically $\chi_\ell(r) \approx A_\ell \sin(kr - \tfrac12 \ell\pi + \delta_\ell)$, where $\delta_\ell$ is the (k-dependent) phase shift of the
ℓth partial wave. The total scattering cross-section σ is given by
\[ \sigma = \frac{4\pi}{k^2} \sum_{\ell=0}^{\infty} (2\ell + 1) \sin^2\delta_\ell . \tag{3} \]
The s-wave contribution to (1) is the term with ℓ = 0. The associated (k-dependent) cross
section is therefore $\sigma_0 = (4\pi/k^2) \sin^2\delta_0$. To characterise the low-energy behaviour, a useful
quantity is the scattering length a, which is defined by $1/a = \lim_{k\to 0}\,[-k \cot\delta_0(k)]$.
For further information on this theory, see for example
Programming Task: Numerically solve (2) for $\chi_0(r)$ (i.e. the case ℓ = 0) taking $\chi_0(0) = 0$
and $\chi_0'(0)$ an arbitrary value. (You should verify that this arbitrary value only affects
the normalisation of the wavefunction.)
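A minimal sketch of such a solver is shown below, written for a caller-supplied potential so that nothing is assumed about the form of U(r). It uses a fixed-step fourth-order Runge-Kutta scheme on the first-order system (χ₀, χ₀′); the name `solve_chi0`, the step count and the free-particle test are illustrative assumptions, and a discontinuous potential would normally call for a step that lands on the discontinuity.

```python
import math

def solve_chi0(k, U, r_max, dchi0=1.0, steps=2000):
    # RK4 for chi'' = (U(r) - k^2) chi with chi(0) = 0, chi'(0) = dchi0.
    h = r_max / steps
    r, chi, dchi = 0.0, 0.0, dchi0

    def acc(r_, chi_):
        return (U(r_) - k * k) * chi_

    for _ in range(steps):
        k1c, k1d = dchi, acc(r, chi)
        k2c, k2d = dchi + 0.5 * h * k1d, acc(r + 0.5 * h, chi + 0.5 * h * k1c)
        k3c, k3d = dchi + 0.5 * h * k2d, acc(r + 0.5 * h, chi + 0.5 * h * k2c)
        k4c, k4d = dchi + h * k3d, acc(r + h, chi + h * k3c)
        chi += h * (k1c + 2 * k2c + 2 * k3c + k4c) / 6.0
        dchi += h * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
        r += h
    return chi, dchi

def free(r):
    # U = 0, for which chi_0(r) = chi_0'(0) sin(kr)/k exactly.
    return 0.0

chi, dchi = solve_chi0(2.0, free, 1.0)
print(chi, math.sin(2.0) / 2.0)
```

Because the equation is linear, doubling `dchi0` should simply double the returned values, which is the normalisation property the task asks to be verified.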
Question 1 Explain your choice of numerical method and discuss the accuracy of the
solutions you obtain.
Question 2 Investigate the solutions for χ0 as you vary k in the range 0 to 5 and U0
from 0 to 10. Discuss the dependence of the wavefunction on U0 for a few different values
of k. Your report should include a few plots to support your observations, which should
include the case U0 = 0.
Note: throughout this project, you should provide graphs that illustrate clearly the sim-
ilarities and differences between the various cases. Large numbers of graphs are very
unlikely to be effective in communicating this information.
Question 3 Explain how a poor choice of r1 and r2 in Equation (5) can lead to a large
error in δ0 . How did you avoid this? Note also that Equation (5) has multiple solutions
for δ0 : explain which solution you have taken.
Question 4 For a few values of U0 , compute the phase shift and the cross-section as
functions of k, with 0 < k < 5. Plot graphs of these quantities. Take care to resolve the
behaviour near k = 0.
Question 5 For the results of Question 4, plot the phase shift δ0 as a function of U0 ,
for some small value(s) of k. Give a physical (quantum-mechanical) interpretation of the
observed behaviour.
Question 6 Still for the results of Question 4, plot the cross section σ as a function
of U0 (always for small k). How are the features of this curve (extrema, etc) related to
physical properties of the scatterer?
Question 7 Using again the results of Question 4, determine (numerically) the scat-
tering length a from the small-k behaviour of δ0 . Is the result consistent with your results
for the cross section σ in question 5?
1 Introduction
One-dimensional bound states in quantum mechanics are investigated by using a matrix method
to estimate eigenvalues of the Schrödinger operator. Several cases are considered and the answers
are compared with theory, including the predictions of perturbation theory and variational
methods.
\[ -\frac12 \frac{d^2\psi_i}{dx^2} + V(x)\,\psi_i = E_i \psi_i . \]
To obtain approximate solutions to this equation, the real-valued position x is replaced by a
discrete set of 2N points spaced by ε, such that $-N\varepsilon \le x < N\varepsilon$. The eigenfunction, ψ(x), is
replaced by a 2N-dimensional vector, e, where $\psi(x_n) = e_n$, with $x_n = (n - N)\varepsilon$, $0 \le n < 2N$.
The Schrödinger equation becomes the matrix eigenvalue equation
\[ M e_i = \varepsilon^2 E_i e_i , \]
off-diagonal entries $b_n = -\tfrac12$ ∀n
3 Given’s Procedure
Given a symmetric tri-diagonal matrix, M, with diagonal entries $c_n$ and off-diagonal entries $b_n$,
consider the sequence $q_n$ ($0 \le n < 2N$), for fixed real parameter λ:
\[ q_0 = c_0 - \lambda , \qquad q_n = (c_n - \lambda) - \frac{b_n^2}{q_{n-1}} , \quad n > 0 . \tag{1} \]
Let s(λ) be the number of the $q_n$ that are negative. Then the number of eigenvalues of
M whose values are less than λ is s(λ). That is, if the eigenvalues are ordered so that
$E_i \le E_{i+1}$, then
s(λ) can be computed as a function of λ by starting with a sufficiently small value of λ, incre-
menting λ in small steps and computing the sequence {qn } for each value. When s(λ) increases
\[ e_0 = 1 , \qquad e_1 = 2(c_0 - \varepsilon^2 E) , \qquad e_{n+1} = 2(c_n - \varepsilon^2 E)\, e_n - e_{n-1} , \quad n > 0 . \]
Note: for bound states the relevant eigenvectors are required to decay exponentially for large
|x|. It can be shown that the matrix M only has eigenvectors which satisfy this boundary
condition.
There are three cautions:
(a) In Equation (1) there is a division by $q_{n-1}$. Should $q_{n-1}$ become too small it is permissible
to replace it by a small default value, to avoid numerical instabilities: the results are
unaffected by this procedure. For the cases considered below this eventuality has not
been found to occur in practice.
before computing em .
(c) The wavefunction also decays exponentially for large positive x. This means that for large
n (bigger than N), the values of $e_n$ will become very small. However, if you continue the
calculation to very large n, numerical (round-off) errors can lead to exponential growth
of the numerical estimates of $e_n$. The calculation is not accurate in this regime: if this
happens you should stop the calculation at some $n_{\max} < 2N$, in order to obtain an accurate
estimate of the true eigenvector e. Alternatively, use the fact that all wavefunctions are
either even or odd so the $e_n$ only need to be computed for $0 \le n \le N$. (Take care with
the normalisation if you use this method).
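The counting property of the sequence (1) lends itself to a bisection search for individual eigenvalues. The sketch below is one hedged illustration: it assumes the diagonal entries are $c_n = 1 + \varepsilon^2 V(x_n)$, which is what the second-difference discretisation with $b_n = -\tfrac12$ would give, and it uses bisection rather than stepping λ in small increments. The function names, the bracketing interval and the harmonic test potential are illustrative choices.

```python
def count_below(E, N, eps, V):
    # Number of eigenvalues of M below lambda = eps^2 * E, i.e. the number of
    # negative q_n in (1).  Assumes c_n = 1 + eps^2 V(x_n) and b_n = -1/2.
    lam = eps * eps * E
    count = 0
    q = 1.0
    for n in range(2 * N):
        c = 1.0 + eps * eps * V((n - N) * eps)
        q = (c - lam) if n == 0 else (c - lam) - 0.25 / q
        if abs(q) < 1e-300:
            q = 1e-300   # small default value, as permitted by caution (a)
        if q < 0.0:
            count += 1
    return count

def eigenvalue(i, N, eps, V, lo, hi, tol=1e-10):
    # Bisection for the i-th energy (i = 0 is the ground state); requires
    # count_below(lo) <= i < count_below(hi).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(mid, N, eps, V) > i:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def harmonic(x):
    return 0.5 * x * x

for i in range(4):
    print(i, eigenvalue(i, 50, 0.1, harmonic, 0.0, 10.0))
```

For the harmonic potential the printed values should lie close to i + ½, with a small negative offset that shrinks as ε is reduced.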
[See the Appendix for some theoretical results which may be of use here and in later sections.]
As a check of your code give the four lowest eigenenergies for the potential
\[ V(x) = \tfrac12 x^2 . \]
Adjust N and ε to get results to at least 3 decimal places for the eigenvalues and accurate to
within 1% for the significant part of the wavefunctions. Make sure that N is not too big since
the wavefunctions are very small for x = Nε and so nothing is gained. However, if ε is too
big and/or N is too small the results will be inaccurate. A bit of trial and error will yield
good values with which to work, but in each case considered you should check that results are
insensitive to changes in N and ε within the accuracy required. Reasonable values to start with
are N = 50 and ε = 0.1 but you should be able to increase N up to 500 at least.
Question 1 State the values of N and ε that you have chosen, to obtain the required
accuracy. Justify these choices. For these values, include plots of the wavefunctions
corresponding to the two lowest energies and compare with the known analytic form.
Note: throughout this project, you should provide graphs that illustrate clearly the effect
of the parameters and the similarities and differences between the wavefunctions. Large
numbers of graphs are very unlikely to be effective in communicating this information.
5 Anharmonic Oscillator
Question 2 Labeling the harmonic oscillator eigenstates by $|n\rangle$, i.e. $H_0|n\rangle = E_n|n\rangle$
with
\[ H_0 = \frac{p^2}{2} + \frac{x^2}{2} = a^\dagger a + \frac12 , \]
explain or show that $\langle n+k|x^6|n\rangle = 0$ for all k > 6.
Question 3 Using perturbation theory derive expressions for the lowest two energies
to second order in b. You may use without proof, for integer j > 0, that
\[ \int_{-\infty}^{\infty} dz\, z^{2j} e^{-z^2} = \Gamma(j + \tfrac12) , \]
with $\Gamma(\tfrac12) = \sqrt{\pi}$ and $\Gamma(\sigma + 1) = \sigma\Gamma(\sigma)$.
Question 4 Compute (numerically) the four lowest energy eigenvalues and plot the
corresponding wavefunctions for b = 0.02. Compare the results with your perturbative
estimates for the lowest two energies. Try values b = 0.001 and 0.1, as well as any others
that you feel might be relevant. You need not plot the wavefunctions for these cases, but
you should consider tables and/or plots of the energies vs. b. How well does perturbation
theory work here? Is second order perturbation theory an improvement over first order?
Finally consider
\[ V(x) = \frac{1}{9d^4}\,(x^2 - d^2)^2 \left( d^2 + \frac{x^2}{8} \right) \tag{2} \]
This problem is not so easy to study by perturbation theory. Instead we use trial wavefunctions.
(This approach is the same as the variational method described in the Applications of Quantum
Mechanics course, but no knowledge of this method is required as full details are given below.)
Observe that for small y, the potential V resembles the harmonic oscillator from question 1.
A similar result occurs if we change variable to y = x + d instead: the potential is symmetric
in x so this must happen. Based on this observation we introduce two wavefunctions that are
ground states of the relevant harmonic oscillators:
\[ \psi_+(x) = \frac{1}{\pi^{1/4}}\, e^{-\frac12 (x - d)^2} , \qquad \psi_-(x) = \frac{1}{\pi^{1/4}}\, e^{-\frac12 (x + d)^2} . \]
Define two trial wavefunctions as
\[ \phi_\pm = C_\pm (\psi_+ \pm \psi_-) . \]
You will investigate how close these wavefunctions are to the solutions of the Schrödinger
equation.
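Question 6 Determine the normalization constants $C_\pm$ and show that the expectation
values of the Hamiltonian, $E_\pm = \langle\phi_\pm|H|\phi_\pm\rangle$, are
\[ E_\pm = \frac{A \pm B}{1 \pm e^{-d^2}} , \]
where
\[ A = \frac12 + \frac{7}{32d^2} + \frac{5}{192d^4} \quad\text{and}\quad B = e^{-d^2} \left( -\frac{7d^2}{18} + \frac{7}{48} + \frac{1}{16d^2} + \frac{5}{192d^4} \right) . \]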
Question 9 For the cases considered in question 7, how close are the bounds E± to
the true energies? For two interesting values of d, plot the potential and indicate the four
lowest lying energy levels on the same plot. Is it possible for some of the energy levels to
be lower than the height of the central peak, i.e. than V at x = 0?
Question 10 How well does the trial wavefunction method estimate the energy difference,
∆E, between the first excited state and the ground state? What happens to ∆E
as β → 0?
Appendix
$H_0 = 1$, $C_0 = 1$
$H_1 = 2x$, $C_1 = \frac{1}{\sqrt{2}}$
$H_2 = 2(2x^2 - 1)$, $C_2 = \frac{1}{2\sqrt{2}}$
$H_3 = 4x(2x^2 - 3)$, $C_3 = \frac{1}{4\sqrt{3}}$
$H_4 = 4(4x^4 - 12x^2 + 3)$, $C_4 = \frac{1}{8\sqrt{6}}$
$H_5 = 8x(4x^4 - 20x^2 + 15)$, $C_5 = \frac{1}{16\sqrt{15}}$
$H_6 = 8(8x^6 - 60x^4 + 90x^2 - 15)$, $C_6 = \frac{1}{96\sqrt{5}}$
$H_7 = 16x(8x^6 - 84x^4 + 210x^2 - 105)$, $C_7 = \frac{1}{96\sqrt{70}}$
and so on.
1 Introduction
There are many numerical methods for finding the least value of a function of N variables,
$f(x_1, x_2, x_3, \ldots) = f(x)$, say, given that the first derivatives
$$g_i = \frac{\partial f}{\partial x_i}, \quad i = 1, 2, \ldots, N, \qquad (1)$$
can be calculated. Most of the methods are iterative and each iteration reduces the value of
f (x) by searching along a descent direction in the space of the variables in the following way.
The iteration begins with a starting point x0 , and at this point the gradient vector g is calcu-
lated. Then a search direction, s, say, is chosen, that satisfies the condition g · s < 0 (the dot
denotes a scalar product). It follows that if we move from x0 in the direction of s, then the
value of f (x) becomes smaller initially. In other words the function of one variable
$$\phi(\lambda) = f(x_0 + \lambda s)$$
satisfies the condition $\phi'(0) < 0$, which is equivalent to g · s < 0. The next stage is to consider
the function φ (λ), and choose a value of λ, λ∗ say, that satisfies the inequality
$$x + y + \frac{x^2}{4} - y^2 + \left(y^2 - \frac{x}{2}\right)^2, \qquad (4)$$
∗ In your program for this project, you should allow for manual input of estimates for λ∗ , based on plots of
φ (λ). To speed up longer runs you will probably wish also to use MATLAB routines or other automated search
algorithms to minimise φ (λ), but this is not required. In either case, the values of λ∗ used should appear in the
hard copy of results.
In addition the following quadratic function of three variables will be used to demonstrate some
properties of the DFP algorithm:
2 Steepest Descents
The Steepest Descents method simply uses the search direction s = −g. Write a program
to implement the algorithm as described above. Use a simple x–y plot of φ (λ) to help you
determine λ∗ at each stage (you need never determine λ∗ to more than 2 significant figures). At
each stage after the first, arrange for your program to display the current value of f (x) and the
decrease achieved over the last step. Also arrange for a plot of the iteration points x0 , x1 , x2 ,
etc., (a sequence of line segments will illustrate the methods well). The iteration point plot may
be built up as the calculation proceeds, or you can store the data and produce it on command
from your program at a point of your choosing.
[N.B. A well-implemented fully automatic algorithm for general use will need to have checks
for special cases and exceptions built into it. For example, if a point xn is encountered for
which g ≈ 0 then a stationary point has been found and the process should quit. Likewise, if
the iteration points are not changing significantly a fully automatic algorithm ought to quit.
You may find it helpful to include such features in your program. If you wish to proceed semi-
automatically, with λ∗ being decided by eye from the plot at each stage, there is no need to
include the special checks in your code.]
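The fully automatic variant mentioned in the note can be sketched as follows. This is a minimal illustration rather than a model answer: it assumes that function (4) is $x + y + x^2/4 - y^2 + (y^2 - x/2)^2$, and it replaces the by-eye choice of λ∗ with a simple backtracking rule (halve λ until f has decreased sufficiently). The names `backtrack` and `steepest_descent` are illustrative only.

```python
import numpy as np

def f(v):
    """Function (4), as read from the text (an assumption)."""
    x, y = v
    return x + y + x**2 / 4 - y**2 + (y**2 - x / 2) ** 2

def grad(v):
    """Analytic gradient of f."""
    x, y = v
    u = y**2 - x / 2
    return np.array([1 + x / 2 - u, 1 - 2 * y + 4 * y * u])

def backtrack(phi, slope, lam=1.0, shrink=0.5, c=1e-4, max_halvings=60):
    """Shrink lam until phi(lam) <= phi(0) + c*lam*slope, where slope = g.s < 0."""
    phi0 = phi(0.0)
    for _ in range(max_halvings):
        if phi(lam) <= phi0 + c * lam * slope:
            return lam
        lam *= shrink
    return 0.0  # no acceptable step found; the caller stops

def steepest_descent(x0, n_iter=10, tol=1e-10):
    """Return the iteration points and the values of f along the way."""
    x = np.asarray(x0, dtype=float)
    path, values = [x.copy()], [f(x)]
    for _ in range(n_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:      # stationary point reached: quit
            break
        s = -g                           # steepest-descent direction
        lam = backtrack(lambda t: f(x + t * s), slope=g @ s)
        if lam == 0.0:                   # points no longer changing: quit
            break
        x = x + lam * s
        path.append(x.copy())
        values.append(f(x))
    return np.array(path), np.array(values)

path, values = steepest_descent([-1.0, -1.3], n_iter=10)
for k in range(1, len(values)):
    print(k, values[k], values[k - 1] - values[k])  # current f and decrease
```

Backtracking only guarantees a decrease at each step; it does not locate the minimum of φ (λ), so the iteration path will differ from one obtained with carefully chosen values of λ∗.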
Question 1 Obtain contour plots and/or surface plots of functions (4) and (5) (this
should be fairly straightforward to do using MATLAB).
Work out analytically where they have minima and find their minimum values. Suitable
axis intervals are −1.5 ≤ x, y ≤ 1.5.
Question 2 Using function (4) and starting from (−1.0, −1.3), run the Steepest Descents
method for 10 iterations. Produce a plot of the progress of the iteration. On the
basis of your numerical results (i.e., imagine that you do not know the analytical answer),
estimate the minimum value of the function at the point to which your iteration is
converging, and estimate intervals in which the co-ordinates of the minimum lie. What
general statement can you make about the precision with which the minimum value itself
can be found, compared to the precision with which the minimum point is known? What
property of the function being minimised gives rise to this effect?
Question 3 Using function (5) and starting from (0.676, 0.443), run the Steepest Descents
method for 9–15 iterations, and produce a plot of the progress of the iteration. To
what point do you think the iteration will eventually converge? Comment on the rate of
convergence. How sensitive is the iteration path to variations in the choice of λ∗ at each
stage? Comment on the circumstances that can make steepest descents inefficient.
3 Conjugate Gradients
The conjugate gradients algorithm uses steepest descents for its first step and then adjusts the
search direction in an attempt to overcome the problems of steepest descents alone. Let x0 ,
x1 be two successive points where x1 has been obtained using steepest descents from x0 , and
let g0 , g1 be the corresponding gradients (the initial search direction is s0 = −g0 ). Take the
second search direction as
$$s_1 = -g_1 + \beta s_0 = -g_1 - \beta g_0 \quad\text{where}\quad \beta = \frac{g_1 \cdot g_1}{g_0 \cdot g_0}. \qquad (7)$$
If f (x) is a quadratic function of N variables then the choice of directions may be continued up
to the N th search direction to give the N conjugate directions
$$s_k = -g_k + \beta s_{k-1} \quad\text{where}\quad \beta = \frac{g_k \cdot g_k}{g_{k-1} \cdot g_{k-1}}.$$
In this case, if all the values of λ∗ had been chosen to minimise the φ (λ) exactly at each stage,
the algorithm would have converged. In practice of course f (x) may not be quadratic and the
values λ∗ may not be chosen exactly, and in this case it is usual in practice to restart the method
after N steps. When N = 2, as it is for functions (4) and (5), this implies that every other step
is a steepest descent.
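The termination property for quadratic functions can be illustrated with a short sketch. The test problem here is an assumption made for illustration: $f(x) = \frac{1}{2} x^T G x$ with a symmetric positive-definite 3 × 3 matrix G, for which the exact minimiser of φ (λ) along s is $\lambda^* = -(g \cdot s)/(s^T G s)$.

```python
import numpy as np

# Assumed test problem: f(x) = 0.5 * x^T G x, with minimum at the origin.
G = np.array([[0.8, 0.0, 1.0],
              [0.0, 0.4, 0.0],
              [1.0, 0.0, 2.0]])

def grad(x):
    """Gradient of the quadratic f."""
    return G @ x

def conjugate_gradients(x0, n_steps):
    """Conjugate gradients with exact line searches on the quadratic f."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    s = -g                                  # first step is steepest descent
    for _ in range(n_steps):
        if np.linalg.norm(g) < 1e-14:       # already at a stationary point
            break
        lam = -(g @ s) / (s @ G @ s)        # exact minimiser of phi(lambda)
        x = x + lam * s
        g_new = grad(x)
        beta = (g_new @ g_new) / (g @ g)    # ratio of squared gradient norms
        s = -g_new + beta * s               # next conjugate direction
        g = g_new
    return x, g

x, g = conjugate_gradients([1.0, 1.0, 1.0], n_steps=3)
print(x, np.linalg.norm(g))
```

With N = 3 the gradient vanishes, up to rounding error, after three steps. For a non-quadratic function the exact formula for λ∗ is not available and a line search on φ (λ) is needed instead.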
Write a program to implement the conjugate gradients algorithm, with the same features as used
for the steepest descents method, but with the search direction determined as just described.
Question 4 For the function (4), repeat Q2 using the conjugate gradients algorithm,
and compare results.
Question 5 For the function (5), repeat Q3 using the conjugate gradients algorithm,
and compare results.
Does the conjugate gradients algorithm offer much of an improvement over steepest descents?
4 DFP Algorithm
The Taylor Series expansion of any smooth function f (x) may be written
$$f(a + \Delta x) \cong f(a) + g \cdot (\Delta x) + \tfrac{1}{2} (\Delta x)^T H^{-1} (\Delta x) + \cdots$$
where the gradient vector g is evaluated at x = a and $H^{-1} \equiv G$ is the Hessian matrix, i.e., the
matrix of second derivatives
$$G_{ij} = \frac{\partial^2 f}{\partial x_i\, \partial x_j}.$$
Finding a point where g vanishes is therefore similar to the Newton–Raphson method for a
system of equations, and if H were known and f (x) were a quadratic function, the point could be
found in a single step. However the matrix H is not available initially unless second derivatives
are calculated; this is not always easy and in any case can be time-consuming, especially for
large N . Therefore we now study a very successful technique that extends the steepest descent
method by forming a suitable H-matrix as the calculation proceeds. It is known as the DFP
algorithm and is one of the class of “variable metric methods”. It can be shown that H−1
converges to the Hessian (you are not required to prove this).
$$H^* = H - \frac{H p p^T H}{p^T H p} + \frac{q q^T}{p^T q}, \qquad (8)$$
where p and q are column vectors giving the changes in g and x respectively during the step,
that is
p = g (x0 + λ∗ s) − g (x0 ) , q = λ∗ s (9)
(Note: H∗ p = q which is useful when checking your program.)
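The update (8) and the check H∗ p = q can be sketched as below. Several things here are assumptions not stated in the surviving text: the initial matrix is taken as H = I, the search direction as s = −Hg, and the test function as the quadratic $f = 0.4x^2 + 0.2y^2 + z^2 + xz$, chosen because its Hessian is the matrix displayed in (10). Exact line searches use $\lambda^* = -(g \cdot s)/(s^T G s)$, which is valid only for a quadratic.

```python
import numpy as np

# Hessian of the assumed quadratic f = 0.4x^2 + 0.2y^2 + z^2 + xz.
G = np.array([[0.8, 0.0, 1.0],
              [0.0, 0.4, 0.0],
              [1.0, 0.0, 2.0]])

def grad(x):
    """Gradient of the assumed quadratic."""
    return G @ x

def dfp_update(H, p, q):
    """Formula (8): p is the change in gradient, q the change in position."""
    Hp = H @ p                                   # H is symmetric
    return H - np.outer(Hp, Hp) / (p @ Hp) + np.outer(q, q) / (p @ q)

def dfp(x0, n_iter):
    """Run n_iter DFP steps with exact line searches on the quadratic."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(len(x))                           # assumed starting matrix
    g = grad(x)
    lams = []
    for _ in range(n_iter):
        s = -H @ g                               # assumed search direction
        lam = -(g @ s) / (s @ G @ s)             # exact minimiser of phi
        q = lam * s
        g_new = grad(x + q)
        p = g_new - g
        H = dfp_update(H, p, q)
        assert np.allclose(H @ p, q)             # the check H* p = q
        x, g = x + q, g_new
        lams.append(lam)
    return x, H, lams

x, H, lams = dfp([1.0, 1.0, 1.0], n_iter=3)
print(lams)
print(H)
print(np.linalg.inv(G))
```

Printing H after each step, rather than only at the end, shows how the approximation to the inverse Hessian is built up one direction at a time.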
Write a program to implement the DFP algorithm, with the same features as used for the
two preceding programs, but with the search direction determined as just described. Include
provision to print out H.
Question 6 A property of the DFP algorithm is that it calculates the least value of a
quadratic function in at most N iterations for any initial choice of x0 if on each iteration
the value of λ∗ is calculated to minimise exactly the function φ (λ). Apply the DFP
algorithm to (6) for three iterations from starting point x0 = (1, 1, 1) using the sequence
of values
λ∗ = 0.3942, 2.5522, 4.2202.
There is no need to verify these values to this precision, but your program will already
have facilities for checking that these values are appropriate. Investigate how sensitive
the result obtained after three iterations is to small changes in these values. Verify that
H does indeed tend to the inverse Hessian matrix. You may note that
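$$\begin{pmatrix} 0.8 & 0 & 1 \\ 0 & 0.4 & 0 \\ 1 & 0 & 2 \end{pmatrix}^{-1} = \begin{pmatrix} 3.3333 & 0 & -1.6667 \\ 0 & 2.5 & 0 \\ -1.6667 & 0 & 1.3333 \end{pmatrix} \qquad (10)$$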
Question 7 For the function (4), repeat Q2 using the DFP algorithm. Examine H
and compare with the true value.
Question 8 For the function (5), repeat Q3 using the DFP algorithm. Examine H
and compare with the true value.
Question 9 Compare the performance of the three methods for these functions.
References
[1] Fletcher, R. and Powell, M.J.D. Rapidly convergent descent method for minimisation, Com-
puter Journal, 7 (1963).
[2] McKeown, J.J., Meegan, D., and Sprevak, D. An Introduction to Unconstrained Optimi-
sation - A Computer Illustrated Text, IOP Publishing (1990). Although references to the
computer programs (designed for BBC micro) are best ignored, the text is still relevant.
1 Introduction
The Airy functions Ai(z) and Bi(z), where z is a complex variable, are two linearly independent
solutions of the differential equation
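$$\frac{d^2}{dz^2}\, y(z) = z\,y(z) \qquad (1)$$
satisfying
$$\mathrm{Ai}(0) = \alpha, \quad \mathrm{Ai}'(0) = -\beta, \quad \mathrm{Bi}(0) = \sqrt{3}\,\alpha, \quad \mathrm{Bi}'(0) = \sqrt{3}\,\beta,$$
where
$$\alpha = \frac{1}{3^{2/3}\,\Gamma(\frac{2}{3})} \approx 0.355028053887817, \qquad \beta = \frac{1}{3^{1/3}\,\Gamma(\frac{1}{3})} \approx 0.258819403792807.$$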
Here Γ denotes the Gamma function, but you do not need to know anything about its properties for this project. The Airy functions
are useful in many problems involving transition regions of all kinds, for example in optical
diffraction (the transition between relatively light and dark regions), wave theory, electron
tunnelling, and asymptotic analysis. Ai and Bi have Maclaurin series given by
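$$\mathrm{Ai}(z) = \alpha f(z) - \beta g(z), \qquad \mathrm{Bi}(z) = \sqrt{3}\,\bigl(\alpha f(z) + \beta g(z)\bigr),$$
where
$$f(z) = 1 + \frac{1}{3!}\, z^3 + \frac{1 \cdot 4}{6!}\, z^6 + \frac{1 \cdot 4 \cdot 7}{9!}\, z^9 + \cdots$$
and
$$g(z) = z + \frac{2}{4!}\, z^4 + \frac{2 \cdot 5}{7!}\, z^7 + \frac{2 \cdot 5 \cdot 8}{10!}\, z^{10} + \cdots.$$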
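These series can be summed term by term without forming the factorials explicitly, since successive terms of f differ by the factor $z^3/((3k-1)(3k))$ and successive terms of g by $z^3/((3k)(3k+1))$. The following is a minimal sketch; the function name `airy_series` and the fixed number of terms are illustrative choices.

```python
import math

ALPHA = 1.0 / (3 ** (2 / 3) * math.gamma(2 / 3))   # Ai(0)
BETA = 1.0 / (3 ** (1 / 3) * math.gamma(1 / 3))    # -Ai'(0)

def airy_series(z, n_terms=60):
    """Return (Ai(z), Bi(z)) from the Maclaurin series; z may be complex."""
    z3 = z ** 3
    f_term, g_term = 1.0, z          # leading terms of f and g
    f_sum, g_sum = f_term, g_term
    for k in range(1, n_terms):
        f_term *= z3 / ((3 * k - 1) * (3 * k))
        g_term *= z3 / ((3 * k) * (3 * k + 1))
        f_sum += f_term
        g_sum += g_term
    ai = ALPHA * f_sum - BETA * g_sum
    bi = math.sqrt(3) * (ALPHA * f_sum + BETA * g_sum)
    return ai, bi

print(airy_series(1.0))
```

For moderate |z| this converges quickly. For large positive z the two terms in Ai nearly cancel, so the relative error of the result grows even though each series is summed accurately.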
For large |z|, any solution y(z) of (1) is given asymptotically by the relation
and
$$G(z) = \frac{1}{\sqrt{\pi}}\, z^{-1/4} \exp\left(\tfrac{2}{3} z^{3/2}\right)\left(1 + \tfrac{5}{48} z^{-3/2} + \cdots\right),$$
is a solution of (1). Here C is any contour that starts at $\infty e^{-2\pi i/3}$ and ends at $\infty e^{2\pi i/3}$.
Show furthermore that this solution satisfies $y(0) = \alpha$, $y'(0) = -\beta$ and that it is therefore
equal to Ai(z). [Hint: deform C into two (straight) rays that meet at the origin. You may
assume without proof the reflection formula for the Gamma function, viz. Γ(z) Γ(1 − z) =
π/sin(πz).]
This integral representation of Ai(z) can be used to check the asymptotic expansion given above
for large |z|, but you are not required to do this.
Question 3 Modify your program to instead calculate Ai(z), and try to evaluate Ai(z)
at the same points as in Question 2. You may find it useful to know that Ai(1) ≈ 0.13529.
Draw a graph of Ai(z) for real positive z. Which of your evaluations are you confident
are accurate? What goes wrong with the method? Why is this unavoidable?
One way to avoid this problem is, instead of integrating from z = 0 towards infinity, to start
from a value of z with large modulus, and step towards the origin. The asymptotic expansion for
Ai(z) (and the derivative of this expansion) can be used to approximate the initial conditions.
Question 4 Explain why this alternative approach should work. Write a program to
implement it; start from |z| = a, for some large fixed constant a, and integrate towards the
origin. Use only the zeroth order term of the asymptotic expansion (i.e., ignore $\frac{5}{48} z^{-3/2}$
and higher order terms in F (z)); a more advanced implementation might take more terms
into account.
To start with you might like to use a = 20; but you should experiment with other values
and explain what difference they might make. State the value you finally settle on and
why.
Use your program to evaluate Ai(z) at the same points as in Question 2.
3 Matched expansions
Question 5 By finding series expansions about the origin, or otherwise, prove that
the given expressions for the Maclaurin series of Ai(z) and Bi(z) are correct.
A much quicker, and more accurate, approach to evaluating the Airy functions is to avoid
numerical integration altogether and instead use the analytic series expansions. In theory, the
Maclaurin series for Ai and Bi are valid for all z, but in practice they are not very helpful for
larger values of |z| because of rounding errors caused by adding together large numbers of terms.
Here we will try an approach based on using the Maclaurin series when |z| < b, for some fixed
constant b, and using the asymptotic expansion when |z| > b; we hope to achieve accuracy at
least as high as 4 significant figures, and preferably more.
Question 6 Investigate the feasibility and potential accuracy of this approach for
evaluating Ai(z) on the positive real axis. You should use only the first two terms in the
asymptotic expansion (i.e., do not attempt to find more terms in F (z) than are given
above), though you may use as many terms of the Maclaurin series as you wish. You
should try various different values of b, and experiment with the number of terms to use
from the Maclaurin series for best results. What level of accuracy is attainable?
Include a plot of your composite approximation and some sample values close to |z| = b.
How did you sum the Maclaurin series in order to minimize rounding errors?
How do you expect the time taken by this algorithm to compare with that for Question 4?
A professional implementation of this method (at least for real z) would use a selection of
Chebyshev polynomial approximations in different overlapping regions and choose the best one
automatically.
Question 7 Use the programs you have developed in previous questions to describe
how the behaviour of Ai and Bi with |z| changes as the rays approach the Stokes line at
arg z = π/3 from within R.
Question 8 By experimenting with rays outside R, determine the location of the third
Stokes line. How do Ai and Bi behave on this line?
Question 9 What can you say about the values of A and B in each of the three
regions which lie between each pair of Stokes lines? Can you estimate these values from
your numerical results?
What, if anything, can you say on the Stokes lines?
[Note that no knowledge of Quantum Mechanics is required for this section of the project: all
required equations are given below.]
A one-dimensional quantum-mechanical particle is confined to the region x > 0 and is subjected
to a force of constant magnitude k directed towards the origin. The governing equation for the
wavefunction ψ(x) is
$$-\frac{\hbar^2}{2m} \frac{d^2\psi}{dx^2} + kx\psi = \lambda\psi$$
with boundary conditions ψ(0) = 0, ψ(x) → 0 as x → ∞, where λ is the energy of the particle.
This is a Sturm–Liouville problem with eigenvalue λ.
Question 10 Show, using your computed results from earlier questions, that there is
a discrete set of energy eigenvalues λn . Find an approximate value for the first two of
these eigenvalues in units where $\hbar^2 k^2/2m = 1$.
Car owners are haunted by the following problem. Every day, the operating cost for their car
increases, as does the probability that the car breaks down. Even worse, when trading in the
car for a different one dealers will pay less for older cars and charge more for newer ones. The
problem, then, is to find an optimal policy for trading in the car.
We model the problem as a Markov decision process. Let gj (u) be the instantaneous cost
incurred if one takes action u in state j and let pjk (u) be the probability of then moving to
state k. Define sequences $\gamma^{(n)}$, $f_j^{(n)}$, $u_j^{(n)}$ by the recursions
and
$$u_j^{(n+1)} \text{ is the } u\text{-value minimising } g_j(u) + \sum_k p_{jk}(u)\, f_k^{(n)}. \qquad (2)$$
Question 1 Consider the following stationary policy: for fixed n, whenever state j
occurs take action $u_j^{(n)}$. What is the long-term average cost of this policy? Explain.
Note that the values $f_j^{(n)}$ determined by (1) are arbitrary up to an additive constant and can
be normalised, for example by letting $f_1^{(n)} = 0$. If the matrix of transition probabilities is
irreducible in every stage, then (1) will always have a solution for f . The sequence $\gamma^{(n)}$ is
non-increasing, and will converge to a minimum value γ in a finite number of steps if u can take
only a finite number of values. The policy $u_j^{(n)}$ will then have converged to an average optimal
policy.
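The iteration can be sketched in a few lines. Note the assumption: equation (1) is taken here to be the usual policy-evaluation relation $\gamma + f_j = g_j(u_j) + \sum_k p_{jk}(u_j) f_k$, normalised so that the first state has f = 0; the surviving text does not display it. The two-state data at the end are made up purely for illustration and are unrelated to the car replacement tables.

```python
import numpy as np

def evaluate(policy, g, P):
    """Solve gamma + f_j = g_j(u_j) + sum_k p_jk(u_j) f_k with f = 0 in state 0."""
    n = len(policy)
    Pu = np.array([P[policy[j]][j] for j in range(n)])   # rows used by the policy
    gu = np.array([g[j][policy[j]] for j in range(n)])   # costs under the policy
    A = np.eye(n) - Pu
    A[:, 0] = 1.0            # f_0 = 0, so column 0 carries gamma instead
    sol = np.linalg.solve(A, gu)
    f = sol.copy()
    f[0] = 0.0
    return float(sol[0]), f

def improve(policy, f, g, P):
    """Step (2): choose in each state the action minimising g_j(u) + sum_k p_jk(u) f_k."""
    n, m = g.shape
    new = list(policy)
    for j in range(n):
        q = np.array([g[j, u] + P[u, j] @ f for u in range(m)])
        best = int(np.argmin(q))
        if q[best] < q[policy[j]] - 1e-12:   # keep the old action on ties
            new[j] = best
    return new

def policy_improvement(g, P, policy=None, max_iter=100):
    """Alternate evaluation and improvement until the policy stops changing."""
    policy = [0] * g.shape[0] if policy is None else list(policy)
    gammas = []
    for _ in range(max_iter):
        gamma, f = evaluate(policy, g, P)
        gammas.append(gamma)
        new = improve(policy, f, g, P)
        if new == policy:
            break
        policy = new
    return policy, gammas

# Made-up example: g[j, u] is the cost of action u in state j,
# P[u, j, k] the probability of moving from j to k under action u.
g_demo = np.array([[1.0, 3.0], [4.0, 3.0]])
P_demo = np.array([[[0.5, 0.5], [0.2, 0.8]],
                   [[0.9, 0.1], [0.9, 0.1]]])
print(policy_improvement(g_demo, P_demo))
```

Keeping the current action whenever it ties with the minimiser avoids cycling between equally good policies, so the loop stops as soon as no strict improvement is available.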
Question 2 Instantiate the above framework for the car replacement problem. You
may want to introduce states representing the age of the car in appropriately chosen units
of time, and an additional state in which the car is written off and has a trade-in value
of zero. Describe the set of actions, and define the instantaneous costs gj (u) and the
transition probabilities pjk (u).
Question 3 Write a program to find the optimal replacement policy. You are not
required to write your own linear algebra routines, but you should describe any math-
ematical manipulations involved in bringing the equations in the desired form. Give a
Table 1: Instance of the car replacement problem with time units of two years and N = 6
j: 1 2 3 4 5 6 7
sell/keep: keep keep keep keep keep sell sell
buy car of age: – – – – – 2 2
Table 3: Instance of the car replacement problem with time units of six months and N = 21
Question 4 Give the optimal replacement policy for the data in the file table2.csv
available from the CATAM website and displayed in Table 3. What is the value of γ?
Question 5 Suppose that purchase price, trade-in price, operating cost, and survival
probability are all monotonically increasing or decreasing in the obvious direction. Sup-
pose further that the optimal policy tells you to sell a car when it reaches a certain age,
but that you neglect to do so. Is it possible that the same policy stipulates hanging on to
the car now that it is older? Either construct an example for which the optimal policy is
of this kind, or prove that this is impossible.
1 Black-Scholes model
A standard model used in option pricing is that the logarithm of the stock price follows a
Brownian motion. Hence, if $S_t$ is the stock price at time t, we assume that $\log(S_t/S_0)$ is
normally distributed with mean µt and variance $\sigma^2 t$, where σ is the volatility, $\mu = \rho - \sigma^2/2$,
and ρ is the continuously-compounded riskless interest rate.
The celebrated Black-Scholes formula gives the price of a call option (exercised only at expiry).
The price of the option is
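$$S_0\,\Phi\!\left(\frac{\log(S_0/c) + (\rho + \sigma^2/2)\,t_0}{\sigma\sqrt{t_0}}\right) - c\,e^{-\rho t_0}\,\Phi\!\left(\frac{\log(S_0/c) + (\rho - \sigma^2/2)\,t_0}{\sigma\sqrt{t_0}}\right) \qquad (1)$$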
where c is the strike price and t0 is the expiry time. (See the Appendix for details of how to
calculate Φ.)
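Formula (1) translates directly into code. In this sketch Φ is evaluated through the error function from the standard library instead of the approximation in the Appendix; that substitution, and the function names, are choices made for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S0, c, sigma, rho, t0):
    """Price (1) of a call with strike c and expiry time t0."""
    root = sigma * math.sqrt(t0)
    d1 = (math.log(S0 / c) + (rho + sigma**2 / 2) * t0) / root
    d2 = d1 - root    # equals (log(S0/c) + (rho - sigma^2/2) t0) / root
    return S0 * norm_cdf(d1) - c * math.exp(-rho * t0) * norm_cdf(d2)

for S0 in (52, 100, 107):
    for t0 in (2, 3):
        print(S0, t0, black_scholes_call(S0, 40, 0.5, 0.035, t0))
```

A quick sanity check on any implementation is that the price lies between $S_0 - c\,e^{-\rho t_0}$ and $S_0$.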
Question 1 Write a routine to evaluate the Black-Scholes price (1). Compile a table
of the price when c = 40, S0 = 52, 100 or 107, σ = 0.5, ρ = 0.035, and t0 = 2 or 3.
Question 2 How does the price vary with each of the parameters c, S0 , σ, ρ, t0 ? Keep
your explanations brief, but support them with solid mathematics where necessary.
2 Bernoulli approximation
The most widely-used method for approximating option prices which are based on the Black-
Scholes model is to replace the Brownian motion by a discrete-time simple random walk. This
approximation breaks up the interval [0, t0 ] into [0, t0 /n, 2t0 /n, . . . , (n − 1)t0 /n, t0 ] and assumes
that between the times it0 /n and (i + 1)t0 /n, i = 0, . . . , n − 1, the increment in the logarithm
of the price is g or −g with probability p or 1 − p respectively, where g and p are chosen so that
the increment has mean $\mu t_0/n$ and variance $\sigma^2 t_0/n$.
This approximation is primarily of interest for cases such as the American put where exact
formulae are not available. For a European (or American) call option, the price obtained from
the random walk will approximate the true price obtained from the Black-Scholes formula, and
this can be a useful benchmark to judge the performance of the approximation.
One way to implement the approximation is to set
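$$V_{i,j} = \bigl[p\,V_{i+1,j+1} + (1 - p)\,V_{i+1,j}\bigr]\, e^{-\rho t_0/n} \quad\text{for } j = 0, \ldots, i \text{ and } i = n - 1, \ldots, 0 \qquad (2)$$
with boundary conditions
$$V_{n,j} = \bigl(S_0\, e^{(2j-n)g} - c\bigr)^{+} \quad\text{for } j = 0, \ldots, n. \qquad (3)$$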
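The recursion (2) with boundary condition (3) can be sketched as below for the call option. The expressions for g and p are derived here from the stated moment conditions and are not quoted from the text: matching the second moment gives $g^2 = \sigma^2 h + (\mu h)^2$ with $h = t_0/n$, and matching the mean gives $p = \frac{1}{2}(1 + \mu h/g)$.

```python
import math
import numpy as np

def binomial_call(S0, c, sigma, rho, t0, n):
    """Approximate call price from the random-walk recursion (2)-(3)."""
    h = t0 / n
    mu = rho - sigma**2 / 2
    g = math.sqrt(sigma**2 * h + (mu * h) ** 2)   # so that E[X^2] = g^2
    p = 0.5 * (1.0 + mu * h / g)                  # so that E[X] = (2p - 1) g = mu h
    j = np.arange(n + 1)
    V = np.maximum(S0 * np.exp((2 * j - n) * g) - c, 0.0)   # boundary values (3)
    disc = math.exp(-rho * h)
    for _ in range(n):
        # one backward step of (2): V[1:] holds V_{i+1,j+1}, V[:-1] holds V_{i+1,j}
        V = disc * (p * V[1:] + (1 - p) * V[:-1])
    return float(V[0])

print(binomial_call(52, 40, 0.5, 0.035, 2, 27))
```

Only one array of values is kept, shrinking by one entry per step, so the memory cost is O(n) while the running time is O(n²).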
Question 4 For the data in Question 1 and n = 27, compile a table of the approximate
prices. How do they compare to the prices obtained from the Black-Scholes formula?
Question 7 Estimate the rate at which the error decreases as n increases. Explain
every step and justify your answers.
3 American Put
Consider the case of an American put; now early exercise of the option may be optimal and no
closed-form formula exists for the price.
Question 8 Modify your programs to approximate the price of the option by consid-
ering how equations (2) and (3) should be changed in this situation. Compile a table of
the approximate price of the option for the same values of the parameters used in Question
1. Comment briefly on how the approximate price varies with n in this case.
4 Extrapolation
Suppose that fn is the approximation to the option price, and we wish to find the limiting value
of fn as n → ∞. One method of extrapolation assumes that fn may be approximated by a
polynomial in 1/n:
$$f_n \approx g_0 + g_1 n^{-1} + g_2 n^{-2} + \cdots + g_s n^{-s}.$$
The limiting value of $f_n$ is then approximated by $g_0$. One way to achieve this is as follows. Let
$n_m = r^m n_0$ and calculate $f_n$ at $n = n_0, \ldots, n_s$. Set
Question 9 Experiment with this extrapolation procedure for small values of r and
s, say 2 to 4, for the at-the-money European call case studied above. How does this
extrapolation compare in accuracy with just calculating fn for a single suitably large
value of n? Try to estimate the error analytically. Does your answer depend on whether
n0 is odd or even? If so, carefully explain why.
The approximation in Section 2 uses a Bernoulli (two-valued) distribution between time steps.
The method may be refined by replacing the Bernoulli distribution between times it0 /n and
(i + 1)t0 /n by, say, a binomial distribution taking k + 1 equally-spaced values (for some k > 1)
with mean and variance chosen to match those of the Brownian motion.
Question 10 Implement this refinement, and explain how you calculate p and g in
this case. How do prices produced by the refined algorithm (k > 1) differ from those
produced by the Bernoulli scheme (k = 1)? Does your answer depend on whether you
are pricing the European call or American put option, and if so why? How does the
computation time for this algorithm vary with n and k?
Appendix
An easy method to approximate the standard normal distribution function is as follows. For
x > 0 set
$$1 - \Phi(x) = \frac{t}{2} \exp\left(-\frac{x^2}{2} + \sum_{i=0}^{9} a_i t^i\right)$$
where $t = (1 + x/\sqrt{8})^{-1}$ and
References
[1] J. Hull, Options, Futures and Other Derivative Securities. Prentice-Hall, 1989.
Introduction
Markov Chain
The key idea is quite simple. We want to sample from a distribution π(x), $x \in \mathbb{R}^m$, but cannot
do so directly. Instead we create a discrete-time Markov chain X(n) (taking values in $\mathbb{R}^m$) such
that X(n) has equilibrium distribution π. Then
where f is any real-valued function on $\mathbb{R}^m$ for which the right-hand side above is well-defined.
The second of these limits can be used to calculate means and variances of components of X,
as well as approximations to the distribution functions. For example, $f(x) = x_i$ gives the mean
of the ith component $X_i$, and $f(x) = I(x_i \le b)$ gives the distribution function of $X_i$ at point b.
Gibbs sampler
Suppose we do not have a tractable closed-form expression for the equilibrium density π(x) =
π(x1 , . . . , xm ), but we do know the induced full conditional densities π(xi |x−i ), where x−i is
the vector x omitting the ith component, x−i = (x1 , . . . , xi−1 , xi+1 , . . . , xm ).
A systematic form of the Gibbs sampler algorithm proceeds as follows. First, pick an arbitrary
starting value $x^0 = (x^0_1, \ldots, x^0_m)$. Then successively make random drawings from the full
conditional distributions $\pi(x_i \mid x_{-i})$, i = 1, . . . , m, as follows:
This cycle completes a transition from $x^0 = (x^0_1, \ldots, x^0_m)$ to $x^1 = (x^1_1, \ldots, x^1_m)$. Repeating
the cycle produces a sequence $x^0, x^1, x^2, \ldots$ which is a realization of a Markov chain, which is
Question 1 Assume that the Markov chain X(n) takes values in a finite subset
$S \subset \mathbb{R}^m$. Verify that π is an equilibrium distribution for this chain. That is, check that
for all y ∈ S,
$$\sum_x \pi(x)\,\pi(x, y) = \pi(y).$$
It can be shown that this implies that π is the equilibrium distribution of the Gibbs
sampler,
in the sense of (1), but do not attempt to prove it. Thus our estimate of Eπ f (X) , taken over
N iterations, is
$$\frac{1}{N} \sum_{n=1}^{N} f(x^n).$$
Football data
Data from the performance of K football teams, over T years has been scored on a scale of 0
(no wins) to 114 (win in all 38 games), with a win scoring three and a draw scoring one point.
Let us model Ykt , the score of the kth team in year t, as
with the hierarchical prior structure that the team mean $\mu_k$ and variance $\sigma_k^2$ are independently
distributed, given θ, as
$$\mu_k \mid \theta \sim N(\theta, \sigma_0^2)$$
$$\sigma_k^{-2} \sim \Gamma(\alpha_0, \beta_0),$$
where $\sigma_0^2$, $\alpha_0$ and $\beta_0$ are known parameters, and θ is a second-stage prior with distribution
$$\theta \sim N(\mu_0, \tau_0^2),$$
If you wish, you can use a package to simulate distributions, but you should implement the
Gibbs sampler yourself without using library routines.
Question 4 Use your Gibbs sampler to estimate the posterior mean of each parameter
θ, $\mu_k$, $\sigma_k^2$. Plot a histogram of the posterior distribution of θ and comment on your
histogram. Explain how you obtained it.
Question 5 Now choose a team k. Estimate the posterior probability that your chosen
team is above average, P(µk > θ|y).
Question 6 Build up an idea of how accurate your estimates for µk and P(µk >
θ|y) are, for your chosen team k, by performing independent runs of the Gibbs sampler,
computing estimates for each of the parameters on each run and then computing the
sample variances of these estimates. Comment on how fast your estimates converge by
considering sample variances at different values of N .
Question 7 Now try letting the algorithm run for an initial period of M cycles before
calculating estimates based on a further N iterations. This might allow the distribution
to settle down to equilibrium before being measured. Calculate sample variances (as the
previous question) for a few suitable values of M to see if this makes any noticeable
difference. Explain why you do or do not see a difference.
Question 8 In any MCMC procedure we must ensure that we are exploring the full
sample space. One way to check this is to run a number of chains that start from different
points. Using a few widely dispersed starting points confirm, or otherwise, that your
results are independent of the starting point. What is the effect of running an initial M
cycles in this situation?
$$f(x) = \frac{\lambda^{\gamma} x^{\gamma - 1} e^{-\lambda x}}{\Gamma(\gamma)},$$
with mean γ/λ and variance $\gamma/\lambda^2$. Also recall that a Gamma $\Gamma(n/2, \lambda)$ has the same distribution
as the scaled chi-squared $(2\lambda)^{-1}\chi^2_n$.
You can assume that a Γ(2.50001, λ) is approximately distributed as a $\chi^2_5$ (suitably scaled),
which in turn is exactly equal to the sum of two independent exponentials plus an independent
normal squared.
1 Introduction
In this project you shall analyse a dataset scraped from the website tennis-data.co.uk. The
file mensResults.csv available from the CATAM website contains information about matches
between the top male professional players in the period 2000–2016. Each row corresponds to a
match and lists the following variables:
• W1-W5: The number of games won in sets 1-5 by the winner. If an entry is missing, the
match ended before the corresponding set.
• Series: Name of ATP tennis series (Grand Slam, Masters, International or International
Gold).
• Date
2 Bradley–Terry model
The Bradley–Terry model for a tournament considers each match to be independent, and
$$\Pr(\text{Player } a \text{ wins a match against Player } b) = \frac{\exp(\beta_a - \beta_b)}{1 + \exp(\beta_a - \beta_b)}, \qquad (1)$$
for a vector of parameters β whose length equals the number of players. This model was first
proposed by the mathematician Ernst Zermelo in 1929 as a way to rank chess players.
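Model (1) and its negative log-likelihood can be written compactly, as in the sketch below. The representation of the data as two integer arrays of player indices, `winners` and `losers`, is an assumption made for illustration and is not the layout of mensResults.csv.

```python
import numpy as np

def win_prob(beta, a, b):
    """Probability (1) that player a beats player b."""
    return 1.0 / (1.0 + np.exp(-(beta[a] - beta[b])))

def neg_log_likelihood(beta, winners, losers):
    """Minus the log-likelihood of independent matches under model (1)."""
    diff = beta[winners] - beta[losers]
    # -log(exp(d) / (1 + exp(d))) = log(1 + exp(-d)), computed stably
    return float(np.sum(np.logaddexp(0.0, -diff)))

beta = np.array([0.5, -0.2, 0.0])
winners = np.array([0, 0, 1, 2])
losers = np.array([1, 2, 2, 1])
print(win_prob(beta, 0, 1), neg_log_likelihood(beta, winners, losers))
```

Only differences of the entries of β enter the model, so adding the same constant to every entry leaves the likelihood unchanged; one player's parameter has to be fixed (or a penalty added) for the estimate to be unique.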
Question 3 Obtain a 68% confidence interval for the probability that Roger Federer
beats Andy Murray in a match.
It is well-known among tennis enthusiasts that certain players enjoy an advantage on specific
surfaces. For example, Rafael Nadal does very well at the Roland–Garros tournament, which
is played on clay, while Roger Federer has an excellent record at Wimbledon, played on grass.
You suggest a new model with
$$\Pr(\text{Player } a \text{ wins a match against Player } b \text{ on surface } s) = \frac{\exp(\beta_a + \beta_{a,s} - \beta_b - \beta_{b,s})}{1 + \exp(\beta_a + \beta_{a,s} - \beta_b - \beta_{b,s})},$$
where βa,s can be interpreted as the advantage of player a on surface s compared to a baseline
fitness βa .
Question 5 Perform a formal hypothesis test which might allow you to reject the
model in Question 2 in favour of the alternative in Question 4. Can you reject the simpler
model at the 1% level? Does this agree with the cross-validation comparison of the two
models?
Question 6 Discuss how you might use the variables W1-W5 and L1-L5 as an output
in a GLM.
3 Regularisation
The lasso estimator $\hat\beta^{(\lambda)}$ for a GLM without an intercept solves the problem
$$\underset{\beta \in \mathbb{R}^p}{\text{minimise}} \;\; -\frac{1}{n} L(\beta) + \lambda \|\beta\|_1, \qquad (2)$$
where L(β) is the log-likelihood, λ ≥ 0 and $\|\beta\|_1 = \sum_{j=1}^{p} |\beta_j|$. When λ = 0, $\hat\beta^{(0)}$ is the
maximum likelihood estimator. When λ > 0, the second term penalises large values of the
coefficients. While this biases the estimator toward 0, it can actually reduce the mean squared
error, especially when there are many input variables.
This estimator is also convenient because it can be exactly 0 for a subset of the coefficients,
which is a form of automatic variable selection. Increasing the parameter λ produces estimators
of increasing sparsity.
The penalty term in (2) is separable, i.e. it is a sum over the coefficients in the model. The
function glmnet in R allows us to choose a vector w ∈ Rp through the argument penalty.factor
in order to solve the problem
$$\underset{\beta \in \mathbb{R}^p}{\text{minimise}} \;\; -\frac{1}{n} L(\beta) + \lambda \sum_{j=1}^{p} w_j |\beta_j|. \qquad (3)$$
Question 8 With the optimal value of λ, how many of the estimates for coefficients
βa,s are non-zero? Compute the logistic loss on the held-out data from 2015 and 2016 and
compare to the maximum likelihood estimator in Question 4.
Question 9 It is likely that the ability of each player drifts from year to year. Mod-
ify the model in Question 2 to take account of this fact and apply a lasso penalty of
your choice. Plot the probability that Federer wins against Nadal as a function of time.
Compare your proposal against previously considered alternatives in any reasonable way.
The dataset contains betting odds from the site Bet365. Each number is a ratio of the payoff,
including the gambler’s stake, to the stake. For example, suppose that in a match between
Federer and Nadal which was won by Federer, the variable B365W equals 1.85 and the variable
B365L equals 1.90. Then, betting £1 on the event that Federer won would have a payoff of
£0.85 in addition to the gambler’s stake, while betting £1 on the event that Nadal won would
have a payoff of £0.90 in addition to the gambler’s stake.
Suppose you have a budget of £1,000 to bet on k tennis matches played in 2015, and you have
access to data from 2000–2014. Let φi,W and φi,L be the amounts we bet on the winner and the
loser in match i, respectively∗ . We call the vector φ = (φ1,W , φ1,L , . . . , φk,W , φk,L ) the portfolio.
A Markowitz portfolio aims to maximise the expected profits while minimising the risk, mea-
sured by the variance of the profit, under a model which specifies the probabilities of each
outcome in each match. Letting Ri,W , Ri,L be the profit made on the bet on the winner and
loser of match i, and $R = \sum_{i=1}^{k} [R_{i,W} + R_{i,L}]$ be the total profit, we can formalise the problem
as follows
for some positive semidefinite matrix Q and a vector q. Provide an expression for Q and
q in terms of the betting odds and the predictions of a logistic regression model.
∗ The winner and loser are not known a priori, but we use these labels as a way to distinguish the two players
in each match.
The variance of the profits R in the definition of the Markowitz portfolio (4) assumes that the
probability of each outcome for each match is known exactly. This does not take into account
the uncertainty of the model parameters, which can increase a portfolio’s risk. A Bayesian
Markowitz portfolio assumes that the probabilities predicted for each match by the model are
random and distributed according to some posterior distribution. The expectation and variance
in Eq. (4) take into account this source of randomness.
In reality bets are made sequentially, as we wouldn’t know which matches are played in 2015
in advance. One approach to sequential betting is the Kelly criterion. Before match i, we bet
fractions φi,W and φi,L of our current bankroll on the winner and loser of the match, which
maximise the expectation of the logarithm of the bankroll after the bet. Here, φi,W , φi,L are
constrained to be non-negative and φi,W + φi,L ≤ 1. The fractional Kelly strategy further
reduces the risk by staking fractions ρφi,W and ρφi,L of the bankroll on each player, where ρ is
a constant smaller than 1.
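One possible shape for such a calculation is sketched below in Python. It assumes a model probability p_w that the player labelled W wins, and odds quoted as payoff including stake; the function names, the grid resolution and the example numbers are all hypothetical.

```python
import math

def kelly_grid(p_w, odds_w, odds_l, n_grid=100):
    """Grid search for (phi_w, phi_l) maximising the expected log bankroll."""
    best = (0.0, 0.0)
    best_val = 0.0  # betting nothing leaves the log bankroll ratio at 0
    for i in range(n_grid + 1):
        for j in range(n_grid + 1 - i):  # enforces phi_w + phi_l <= 1
            fw, fl = i / n_grid, j / n_grid
            ratio_w = 1.0 + fw * (odds_w - 1.0) - fl  # bankroll ratio if W wins
            ratio_l = 1.0 - fw + fl * (odds_l - 1.0)  # bankroll ratio if L wins
            if ratio_w <= 0.0 or ratio_l <= 0.0:
                continue  # possible ruin: the logarithm is undefined
            val = p_w * math.log(ratio_w) + (1.0 - p_w) * math.log(ratio_l)
            if val > best_val:
                best_val, best = val, (fw, fl)
    return best

def fractional_kelly(p_w, odds_w, odds_l, rho=0.1, n_grid=100):
    """Scale the Kelly fractions by a constant rho < 1."""
    fw, fl = kelly_grid(p_w, odds_w, odds_l, n_grid)
    return rho * fw, rho * fl

print(kelly_grid(0.6, 1.9, 1.9))
print(fractional_kelly(0.6, 1.9, 1.9, rho=0.1))
```

Because the search starts from the option of not betting, the sketch returns (0, 0) whenever no point on the grid has positive expected log growth.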
Question 13 Write a function to compute the Kelly fractions numerically using a grid
search. Evaluate the fractional Kelly strategy with ρ = 0.1 on the data from 2015 using
the predictions of the model in Question 4. Plot the bankroll as a function of the date in
2015.
The results of this section should be rather surprising, as the betting market pools the opinions
of many experts and the betting odds ensure that the bookmaker turns a small profit. It is
possible to improve the statistical model significantly, using any of the variables provided for
the period 2000–2014. One way to obtain excellence marks in this project, but not
the only way, is to try to improve upon the models outlined above and share your
results† .
5 Programming notes
The use of R is strongly recommended. In particular, the package glmnet can be used to fit all
the models in the project and may be installed through the following commands.
install.packages("glmnet")
library(glmnet)
The function glmnet fits the more general elastic net estimator, in which the penalty term in
Eq. (2) is replaced by
\[ \lambda \left( \frac{1-\alpha}{2} \|\beta\|_2^2 + \alpha \|\beta\|_1 \right); \]
† You may also guard them as a trade secret.
References
[1] James, G., Witten, D., Hastie, T. and Tibshirani, R. An Introduction to Statistical Learning
with Applications in R. Springer, New York, 2013. [http://www-bcf.usc.edu/~gareth/ISL/ISLR%20First%20Printing.pdf]
1 Introduction
Consider a gas of N non-interacting classical particles. The momentum of the ith particle is pi
and its kinetic energy is Ei . The energy Eg of the gas is
\[ E_g = \sum_{i=1}^{N} E_i . \]
To this system we add one additional degree of freedom, which acts as a thermometer. The
thermometer stores energy, and can exchange it with the gas. The energy of the thermometer is
Ed and the total energy E = Eg + Ed is conserved (we consider the microcanonical ensemble).
We will show that measuring the average value of Ed can be used to infer the temperature of
different kinds of classical gas.
2 Algorithm
We use a stochastic (random) algorithm to calculate the statistical behaviour of this system.
This is an example of a Monte Carlo algorithm. It operates as follows:
1. As an initial configuration, set pi = e1 for all i, where e1 is a unit vector in the x-direction.
Initialise also Ed = 0.
2. Choose one of the N particles at random and compute its current energy Ecurr . Generate
a random vector ∆p and propose a change of the particle’s momentum, from pi to pi +∆p.
A good choice is to take each component of the vector ∆p to be a random number from
(−ε, ε) with ε = 0.1. Compute the energy that the particle would have if its momentum
was pi + ∆p: this is the proposed energy Eprop .
3. Define ∆E ≡ Eprop − Ecurr . If ∆E ≤ Ed then accept the change. That is, update the
momentum of particle i to a new value pi + ∆p, and update Ed to a new value Ed − ∆E.
If ∆E > Ed then the change is rejected and no variables are updated.
4. Whether or not the change was accepted, record the value of Ed as a new value in an
array (or list) which will later be used to plot a histogram. Also record the energy of
the particle. (This is called the single-particle energy.) If the change was accepted, you
should record these values after the update was performed.
5. Repeat steps 2-4 until the total number of attempted updates is Nupdates . Since each
update only affects one particle, it is useful to define Nsweeps = Nupdates /N so that Nsweeps
is the typical number of times that each particle has been chosen for an update.
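Steps 1 to 5 might be organised along the following lines. This is only a minimal Python sketch: it assumes the nonrelativistic energy |p|²/2 in three dimensions, and the function names, default arguments and random seed are hypothetical choices.

```python
import random

def energy(p):
    """Kinetic energy of one particle; assumed here to be |p|^2 / 2."""
    return 0.5 * sum(c * c for c in p)

def simulate(n_particles=100, n_sweeps=100, eps=0.1, dim=3, seed=0):
    rng = random.Random(seed)
    # Step 1: every momentum equals the unit vector e1 and the thermometer is empty.
    momenta = [[1.0] + [0.0] * (dim - 1) for _ in range(n_particles)]
    e_d = 0.0
    ed_record, single_record = [], []
    for _ in range(n_sweeps * n_particles):
        # Step 2: choose a particle and propose a random change of momentum.
        i = rng.randrange(n_particles)
        e_curr = energy(momenta[i])
        proposal = [c + rng.uniform(-eps, eps) for c in momenta[i]]
        delta_e = energy(proposal) - e_curr
        # Step 3: accept only if the thermometer can supply the energy.
        if delta_e <= e_d:
            momenta[i] = proposal
            e_d -= delta_e
        # Step 4: record the values whether or not the move was accepted.
        ed_record.append(e_d)
        single_record.append(energy(momenta[i]))
    return momenta, e_d, ed_record, single_record

momenta, e_d, ed_record, single_record = simulate(n_particles=20, n_sweeps=50)
print(sum(ed_record) / len(ed_record))  # average thermometer energy
```

Passing the energy function around as a separate routine makes it straightforward to swap in a different E(p) later.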
P (Ed ) ∝ Ωg (Eg ) .
3 Ideal gas
Programming Task: Write a program to simulate a gas of N particles using the Monte
Carlo algorithm outlined above. Consider a 3-dimensional gas of nonrelativistic particles,
so p = (p1 , p2 , p3 ) and
\[ E(p) = \frac{|p|^2}{2}. \]
You will need to keep track of the momentum vectors for the N particles in the gas. It will
be useful in later questions if your program includes a function which returns the particle
energy, given p as input.
You will also need to plot histograms of the quantities that were recorded in step 4 of
the algorithm: the value of Ed and the single-particle energy. Remember, a histogram is
a graph of the relative frequency that a quantity such as Ed lies within a particular bin.
This relative frequency is f (Ed ).
Your program should also calculate the average of Ed .
Throughout this project, you should compare your results with the behaviour that you would expect
from the theory of statistical physics. The results should be presented in such a way that this
comparison is clear.
Question 3 For N = 100, plot a histogram of Ed for Nsweeps = 10, 100, 1000. [You may
wish to plot log f (Ed ) instead of f (Ed ).] Your program should not take more than a few
minutes to run. Discuss (and explain) the results, including the dependence on Nsweeps .
Do the results depend on the parameter ε that appears in step 2 of the algorithm?
Programming Task: Modify your program so that each particle is initialised with a
randomly assigned momentum (instead of all starting with pi = e1 ). For example, assign
each component of pi independently at random from (−a, a), with a = 1.
(Note: depending on a, you may want to change the parameter ε that appears in step 2
of the algorithm.)
Question 6 How does this change in initial conditions affect the histograms of Ed
and the single-particle energy? What happens for different values of a? How does the
temperature depend on a? Explain your observations, including their consistency with
the theory of ideal gases from statistical physics.
4 Relativistic gases
Question 7 For a = 1, compute and plot histograms of Ed and of the single particle
energy. Estimate the temperature of the gas. Vary a and compute the temperature. Plot
this temperature as a function of the total energy of the system. Compare the result
with the case considered in question 5 (non-relativistic particles in three dimensions), and
discuss their consistency with the theory of ideal gases from statistical physics.
Question 8 Consider different values of the total energy by varying a in the range
0.1 to 2.0. How does the temperature depend on the total energy? By considering the
behaviour of E(p) for large and small values of |p|, comment on the relation of this result
to the cases from previous questions. Compare the histograms of single-particle energies
for a few representative cases.
The Lorenz equations are named after the meteorologist who first studied them in 1963:
Question 1 Integrate the equations for values of r = 0, 15, 21 and 29. Use x = y = 1,
z = r − 1 as the initial conditions. You may use any standard integrating packages
that are available and enable you to choose an appropriate step-length and then fix on
it∗ ; comment however on the effect of changing the step-length and why you chose your
particular value. You should plot x(t) against z(t) to show your results. For some of the
above values of r consider plotting the solution only for t > T for some time T > 0; why
can this be useful?
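A fixed-step integration of this kind could be sketched in Python as follows. The equations themselves are not reproduced here, so the sketch assumes the standard Lorenz form ẋ = σ(y − x), ẏ = rx − y − xz, ż = xy − bz with σ = 10 and b = 8/3; the classical fourth-order Runge-Kutta scheme, the step length and the function names are illustrative choices only.

```python
def lorenz(state, r, sigma=10.0, b=8.0 / 3.0):
    """Right-hand side of the (assumed) standard Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), r * x - y - x * z, x * y - b * z)

def rk4_step(state, dt, r):
    """One classical fourth-order Runge-Kutta step of length dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    ka = lorenz(state, r)
    kb = lorenz(shifted(state, ka, dt / 2), r)
    kc = lorenz(shifted(state, kb, dt / 2), r)
    kd = lorenz(shifted(state, kc, dt), r)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, ka, kb, kc, kd))

def integrate(r, t_end=20.0, dt=0.01, state=None):
    """Return the list of states from t = 0 to t = t_end."""
    if state is None:
        state = (1.0, 1.0, r - 1.0)  # initial condition x = y = 1, z = r - 1
    path = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, r)
        path.append(state)
    return path

path = integrate(r=15.0)
print(path[-1])
```

Re-running with dt halved and comparing the two paths is one simple way to judge whether a chosen step length is adequate.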
The persistent erratic non-periodic oscillations seen when r = 27 are due to the existence of a
“strange attractor” in the flow. (The existence of this attractor was discovered numerically by
Lorenz but there is still no completely rigorous proof that it exists and has the properties we
are about to study). This attractor is stable for (approximately) r > 24.06, but for r < 24.06
some solutions spend a long time wandering about near it before eventually tending towards
a stable stationary point. (It exists, but is unstable, for approximately 13.9236 < r < 24.06.)
This phenomenon is known as intermittency.
Question 3 For various initial conditions as given in the following list, plot x(t)
against t at r-values of your choice in 23 < r < 25: in each case include in your write-up
one or two plots showing the different possible behaviours.
(i) Start very close to the origin (0, 0, 0) but not on the z-axis (why not?).
(ii) Start very close to one of the other fixed points.
∗ For example you can download a suitable solver from http://www.mathworks.com/matlabcentral/answers/98293
Which type of initial condition is best suited for deciding the r-value at which the strange
attractor becomes attracting? Which is useful to confirm your stability analysis for the
stationary points obtained in question 2 above? Which best illustrates intermittency?
You will find that nearby initial conditions sometimes give very different values of tc , and as r
increases towards 24.06 you may find that it becomes increasingly difficult to find tc for all of
your chosen initial conditions; you should start with r = 20 and be prepared to stop increasing
r when the amount of machine time used becomes excessive.
Question 5 Suggest a formula for the way in which the average tc value increases
with r. You will need a fairly large sample of tc values to make a reasonable estimate.
You should not plot the first few points obtained from any given trajectory in order to give any
transient behaviour time to die out. You may generate points from many trajectories or from
one long trajectory. You will observe that the points on this scatter diagram all lie very near
to a certain curve C, which can therefore be used as a predictor for the successive zi values.
Question 7 Describe in some detail the chief features of this curve and how they
relate to your numerical solutions. In particular, consider the following points:
Question 8 On a copy of your diagram, draw an approximation to the curve C and use
this hand-drawn curve (which you should include in your write-up) to predict a succession
of zi values. For how many steps does your prediction agree well with an actual sequence
produced by the numerically computed trajectory? Are there any features of the curve
which would lead you to expect this result?
Question 9 How does the curve C vary as r decreases? Draw the curves for r = 24.3
and 22.9, extending them in a sensible way to z = r − 1. (For r = 22.9 you will need
to use initial conditions which give intermittent trajectories in order to generate much of
the curve). Describe how the features of the curve change, and explain how these changes
relate to the other aspects of behaviour studied in this project.
References
[1] Colin Sparrow The Lorenz equations: bifurcations, chaos, and strange attractors. Springer,
1982.
This project considers issues that arise in the numerical solution of dynamical systems which
display complicated ‘chaotic’ behaviour. We first consider the discrete-time case, defining Lya-
punov exponents which measure the rate at which nearby points separate under iteration. Then
we discuss how ‘noisy’ trajectories of an iterated map, where the ‘noise’ arises through numerical
errors, are actually close to true trajectories of the system - this property is known as ‘shadowing’.
Finally the project considers a continuous-time (ODE) example of complicated motion
motivated by celestial mechanics.
Let D be a closed bounded subset of Rm , and let F (x) be a continuously differentiable map
from D to itself. A major task in dynamical systems is to characterise the behaviour of points
under repeated iteration of the map F . We call the sequence of points x0 , x1 , x2 , . . . constructed
by setting xn+1 = F (xn ) the trajectory from the initial condition x0 . The standard notation
for the repeated composition of F is to let F n denote the n-fold composition of F with itself,
i.e. xn = F (xn−1 ) = F 2 (xn−2 ) = · · · = F n (x0 ).
In many situations the rate at which nearby trajectories separate from each other is of interest.
This can be characterised by the Lyapunov exponents λ(x0 , v), defined to be the asymptotic
rate of divergence of trajectories with initial conditions x0 and x0 + v, where v is a small
perturbation from x0 :
Under the conditions given above it can be shown that the limit exists. For a given x0 there
will in general be m (possibly non-distinct) values of λ(x0 , v) as we choose different vectors v –
divergence occurs at different rates in directions corresponding to the different eigenvectors of
the Jacobian matrix DF evaluated at x0 . The formula above for λ(x0 , v) will give the largest
positive Lyapunov exponent of the system for almost all choices of the vector v. We denote
the largest positive Lyapunov exponent by λ(x0 ), or simply by λ. If x0 is a fixed point then
the Lyapunov exponents are simply the (real parts of the) Floquet multipliers, so in some sense
the idea of a Lyapunov exponent developed above is a generalisation of the idea of a Floquet
multiplier to arbitrary trajectories.
For the purposes of this project we will define a map to be chaotic if it appears that λ(x0 ) > 0
for almost all x0 , so that in general neighbouring points will separate exponentially.
Here we consider a 2-dimensional (area preserving) map on the unit square (x, y) ∈ [0, 1]2 .
Given some initial condition (x0 , y0 ), we define
\[ x_{n+1} = x_n + \frac{K}{2\pi} \sin(2\pi y_n) \pmod 1 \tag{2} \]
\[ y_{n+1} = y_n + x_{n+1} \pmod 1 \tag{3} \]
Note that here (mod 1) means that the map is restricted to the unit square.
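Equations (2) and (3) translate directly into code. A minimal Python sketch follows; the function names are hypothetical.

```python
import math

def standard_map(x, y, K):
    """One iteration of equations (2) and (3) on the unit square."""
    x_next = (x + K / (2.0 * math.pi) * math.sin(2.0 * math.pi * y)) % 1.0
    y_next = (y + x_next) % 1.0  # uses the updated x, as in equation (3)
    return x_next, y_next

def trajectory(x0, y0, K, n_steps):
    """Return the list of points (x_0, y_0), ..., (x_n, y_n)."""
    points = [(x0, y0)]
    for _ in range(n_steps):
        x, y = points[-1]
        points.append(standard_map(x, y, K))
    return points

for point in trajectory(0.3, 0.2, K=3.0, n_steps=5):
    print(point)
```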
In contrast to the asymptotic quantity λ(x0 ) as defined above, a possibly more useful quantity
is the local Lyapunov exponent, λl (x0 ), defined as
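\[ \lambda_l(x_0) = \lim_{\Delta \to 0} \frac{1}{N} \sum_{n=0}^{N} \log \frac{\| x_{n+1}^{(\Delta)} - x_{n+1} \|}{\| x_n^{(\Delta)} - x_n \|} \tag{4} \]
where $x_n$ is the nth iterate of $x_0$, $x_n^{(\Delta)}$ is the nth iterate of $x_0 + \Delta$, and N is a suitable finite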
number of iterations of the map. For infinitesimal perturbations ∆, λl (x0 ) > 0 indicates a local
expansion of trajectories starting near x0 .
Note that N should be chosen neither too small, nor too large. In practice, you might also want
to discard the first few terms of this sum in your numerical calculations.
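One way to evaluate the sum in equation (4) numerically is sketched below in Python, for the map of equations (2) and (3). Several choices here are assumptions rather than part of the definition: a fixed small perturbation stands in for the limit ∆ → 0, the perturbation is applied to x only, distances are measured on the torus so that wrapping mod 1 does not create artificial jumps, and the sum is divided by the number of terms actually kept.

```python
import math

def standard_map(x, y, K):
    x_next = (x + K / (2.0 * math.pi) * math.sin(2.0 * math.pi * y)) % 1.0
    return x_next, (y + x_next) % 1.0

def torus_distance(p, q):
    """Euclidean distance on the unit torus (assumed treatment of the mod 1)."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    return math.hypot(min(dx, 1.0 - dx), min(dy, 1.0 - dy))

def local_lyapunov(x0, y0, K, n_iter=20, delta=1e-8, n_discard=0):
    """Average log growth of the separation, dropping the first n_discard terms."""
    p = (x0, y0)
    q = ((x0 + delta) % 1.0, y0)
    d_old = torus_distance(p, q)
    total, count = 0.0, 0
    for n in range(n_iter):
        p = standard_map(p[0], p[1], K)
        q = standard_map(q[0], q[1], K)
        d_new = torus_distance(p, q)
        if n >= n_discard:
            total += math.log(d_new / d_old)
            count += 1
        d_old = d_new
    return total / count

print(local_lyapunov(0.3, 0.2, K=3.0))
```

Once the separation has grown to the size of the unit square it can no longer increase, which is one reason N should not be taken too large for a given ∆.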
Question 2 For K = 3, find the maximum local Lyapunov exponent for different
initial conditions (x0 , y0 ) and small perturbation ‖∆‖ ≪ 1, using the Euclidean norm in
equation (4). What value of N did you use? How did you decide? What is your estimate
of the global maximum Lyapunov exponent?
Question 3 We can define a different “Lyapunov exponent”, $\lambda_2 = \log_2 e^{\lambda_l}$. Why, when
doing binary arithmetic, might λ2 be more interesting than λ? What is your interpretation
of the information λ2 provides? Given that the calculations here are done to 16 signifi-
cant figures (or to whatever precision achieved by the code you have used), what would
you expect the number of iterations required to be before the results obtained become
meaningless? How does your answer compare with what you found numerically?
Numerical calculations introduce round-off and truncation errors into iteration. For chaotic
maps, such as the 2D map above, this introduces an effective error at each iteration; this is in
some sense equivalent to the explicit perturbation in the initial conditions we considered above.
Since a large class of interesting problems is reducible to iterating nonlinear, and chaotic, maps,
it is of some interest to consider whether any numerical calculation can be said to follow the
“true” trajectory of such systems.
Here we will consider the simple example used above, assuming the “true” trajectory is given
by a double precision calculation of the trajectory, while a single precision calculation provides
a “noisy” trajectory.
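In a language whose floating-point numbers are double precision only, a single precision calculation has to be emulated. The Python sketch below does this crudely, by rounding the result of each iteration to the nearest single-precision value with the standard struct module; intermediate arithmetic within a step is still carried out in double precision, so this is an assumption-laden stand-in for genuine single precision code. All function names are hypothetical.

```python
import math
import struct

def to_single(value):
    """Round a double precision float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]

def step_double(x, y, K):
    x_next = (x + K / (2.0 * math.pi) * math.sin(2.0 * math.pi * y)) % 1.0
    return x_next, (y + x_next) % 1.0

def step_single(x, y, K):
    """Same map, with the output of each step rounded to single precision."""
    x_next, y_next = step_double(x, y, K)
    return to_single(x_next), to_single(y_next)

def separation_history(x0, y0, K, n_steps):
    """Distance between the 'true' and 'noisy' trajectories at each step."""
    p_true = (x0, y0)
    p_noisy = (to_single(x0), to_single(y0))
    history = []
    for _ in range(n_steps):
        p_true = step_double(p_true[0], p_true[1], K)
        p_noisy = step_single(p_noisy[0], p_noisy[1], K)
        history.append(math.hypot(p_true[0] - p_noisy[0], p_true[1] - p_noisy[1]))
    return history

for n, d in enumerate(separation_history(0.3, 0.2, K=3.0, n_steps=30), start=1):
    print(n, d)
```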
For some nonlinear systems it is possible to define a “shadow” trajectory to a noisy trajec-
tory (obtained by adding a small perturbation), such that the shadow trajectory is a “true”
trajectory of the system, and the “shadow distance” (initially the perturbation) of the shadow
trajectory from the noisy trajectory is bounded ([1], [2], [5]). In 2D when there exists one
unstable (expanding) direction and one stable (contracting) direction, it has been proved that,
for sufficiently small perturbations, “shadow” trajectories can exist for arbitrarily long times.
In many other systems it is still possible to define a “shadow” trajectory for a finite time.
Question 4 Let Ln be the Jacobian matrix of the map at iteration n. Construct and
write down explicitly the four elements of the Jacobian matrix of the standard map above.
Define en+1 = pn+1 − f (pn ), where en is the error iterating the map, f , on the vector p by one
step. We want to construct a correction term, Φn , such that
p̃n = pn + Φn (5)
defines a “shadow” orbit of p, i.e. {p̃n } is a true orbit of the dynamical system.
Solving for Φ, we find:
Φn+1 = f (p̃n ) − en+1 − f (pn ). (6)
For Φn small, we can expand f (p̃n ) to linear order, and obtain
\[ \Phi_{n+1} = L_n \Phi_n - e_{n+1}. \tag{7} \]
At each iteration, small perturbations along the contracting direction will decay exponentially
forward in time, while small perturbations in the expanding direction will grow exponentially
forward in time. The reverse will happen when evolving backwards in time.
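This behaviour can be illustrated numerically: repeatedly applying a matrix and renormalising aligns almost any starting vector with the expanding direction. The Python sketch below uses a hypothetical constant 2 × 2 matrix with one expanding and one contracting direction in place of the Jacobians of the map; the function names are also illustrative.

```python
import math

def apply(matrix, v):
    """Multiply a 2 x 2 matrix (tuple of rows) by a 2-vector."""
    return (matrix[0][0] * v[0] + matrix[0][1] * v[1],
            matrix[1][0] * v[0] + matrix[1][1] * v[1])

def normalise(v):
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def expanding_direction(jacobians, v0):
    """Apply each Jacobian in turn, renormalising after every step."""
    v = normalise(v0)
    for L in jacobians:
        v = normalise(apply(L, v))
    return v

# Hypothetical constant Jacobian with determinant 1 (area preserving).
L_example = ((2.0, 1.0), (1.0, 1.0))
u = expanding_direction([L_example] * 40, (1.0, -1.0))
print(u)
```

Running the same iteration with the inverse matrices picks out the contracting direction instead, which corresponds to evolving backwards in time.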
We therefore want to find basis vectors un , sn aligned with the directions defining the maximum
expansion and contraction of the local volume of phase space at step n. We can construct un ,
sn by iterating the equations:
\[ u_{n+1} = \frac{L_n u_n}{\| L_n u_n \|} \tag{8} \]
and
\[ s_{n+1} = \frac{L_n s_n}{\| L_n s_n \|}. \tag{9} \]
That is, take some initial u0 , s0 (e.g. (1/√2, 1/√2), (−1/√2, 1/√2)), and a vector p0 = (x0 , y0 ).
Iterate un forwards, i.e. start with your u0 and iterate equation (8) forward until it has converged
to the local direction of expansion. To construct sn , do the same, but take the initial sN for
Φn = αn un + βn sn (10)
and
en = ηn un + ξn sn (11)
for some α, β, η, ξ.
Using equation (7) we find
αn+1 un+1 + βn+1 sn+1 = Ln (αn un + βn sn ) − (ηn+1 un+1 + ξn+1 sn+1 ). (12)
Question 5 Substitute equations (8) and (9) into equation (12) to find a recursion
relation for αn , βn . As before, solve for the αn by forward iteration from n = 0 and for
the βn by backward iteration from n = N for some suitable, fixed N .
Question 6 Integrating the standard map in double precision, from some known ini-
tial condition, with a known error, show that the shadow map of the erroneous initial
conditions follows the true trajectory within some shadow distance. If necessary, iterate
the shadowing to get a more closely shadowed orbit.
Now integrate your initial condition with single precision (introducing some, in principle
unknown) error per iteration, and construct the corresponding double precision shadow
trajectory.
Plot your trajectories and comment. (Finding and plotting such trajectories can be tricky!)
Note that shadowing does not always work. A trivial counter-example is provided by the one
dimensional logistic map $f(x) = 1 - 2x^2$, $x \in (-1, 1)$.
Near x = 0, no true orbit can shadow general noisy orbits, as noise in f (x) may take the map
out of the domain and iterating the subsequent trajectory will take x to −∞.
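The escape can be seen in a short numerical experiment. In this Python sketch the function names, the escape bound and the size of the "noise" (a starting point just outside the domain) are hypothetical choices.

```python
def f(x):
    """The map f(x) = 1 - 2 x^2."""
    return 1.0 - 2.0 * x * x

def escape_time(x0, max_iter=1000, bound=1e6):
    """Number of iterations until |x| exceeds bound, or None if it never does."""
    x = x0
    for n in range(max_iter):
        if abs(x) > bound:
            return n
        x = f(x)
    return None

print(escape_time(0.3))         # starting point inside the domain
print(escape_time(1.0 + 1e-6))  # point pushed just outside the domain by noise
```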
It is known that the N -body problem, of N > 2 bodies moving under their own mutual gravi-
tational attraction only, is chaotic.
Here we consider a well known special case of the restricted three-body problem, where one of
the bodies has zero mass. In this particular problem, known as the Sitnikov problem ([4], [3]),
the motion of the zero mass is restricted to the z axis, defined by the normal to the plane of
motion of the two massive bodies, through the centre of mass. The two massive bodies move
on Keplerian ellipses with eccentricity ε ∈ [0, 1] around their centre of mass.
Question 7 Write down the energy of the third mass, i.e. $\lim_{m \to 0} (E/m)$, and solve for
the critical velocity, vc , for which the energy is zero. Write down z(t) for ε = 0. Write
down the Jacobian matrix of this map.
It is useful to define the initial velocity as some multiple of vc . We are interested in
(initially) bound motion, so v0 ≤ vc .
For ε = 0.03, 0.04, 0.05 and v0 /vc = 0.92, 0.94, 0.96, plot z(t) vs t. Comment.
In continuous time we can define a (maximum) Lyapunov exponent exactly analogous to the
discrete-time case:
\[ \lambda(z_0) = \lim_{t \to \infty} \lim_{\epsilon \to 0} \frac{1}{t} \log \frac{\| \phi_t(z_0 + \epsilon w) - \phi_t(z_0) \|}{\| \epsilon w \|} \tag{14} \]
for almost all choices of perturbation w, where z = (z, ż) and φt denotes the evolution operator
defined by integrating the ODEs forwards in time.
Question 8 As before, construct a trajectory in (z, ż) space with an initial “error”, δ,
and integrate the true and erroneous trajectories for a chosen value of v0 ≈ 0.95vc .
Estimate numerically the Lyapunov exponent of the mapping for the different cases. Is
the motion chaotic?
Question 9 Using the method discussed in the previous section, construct a shadow
trajectory for the zero mass body, and compare “true” trajectories integrated with double
precision arithmetic, with the corresponding “shadow” trajectories integrated from the
same initial condition with single precision arithmetic.
Comment on the integrability of the N -body problem. Do you think numerical integrations
of N -body systems are reliable – or can be made reliable – in some sense?
References
[1] Bowen, R, 1975 J. Diff. Eqns., 18, 333.
[2] Grebogi, C., Hammel, S.M., Yorke, J.A., Sauer, T., 1990 Phys. Rev. Lett., 65, 1527
[3] Liu, J., Sun, Y.-S., 1990 Cel. Mech. and Dyn. Astro., 49, 285.
[4] Marchal, C., 1990 The Three-Body Problem, Elsevier (Oxford).
[5] Quinlan, G.D., Tremaine, S., 1992 MNRAS, 259, 505.
A deterministic finite-state automaton (DFA) is one of the simplest computing machines, con-
sisting of a finite number of states and bounded read/write memory. Such machines are very
limited in the things they can compute. However, the tasks they do compute are performed
extremely fast, like integer reductions mod m.
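As an illustration of the last point, reduction mod m can be carried out by a DFA with m states that reads a numeral one digit at a time. In the Python sketch below the states are labelled 0, ..., m − 1 for convenience, and the function names are hypothetical.

```python
def remainder_dfa(m, base=10):
    """Transition table of a DFA whose state after reading a numeral is its value mod m."""
    return [[(state * base + digit) % m for digit in range(base)]
            for state in range(m)]

def run(table, digits, start=0):
    """Follow the transitions for a sequence of digits and return the final state."""
    state = start
    for d in digits:
        state = table[state][d]
    return state

table = remainder_dfa(7)
print(run(table, [1, 2, 3, 4, 5]))  # remainder of 12345 on division by 7
```

Each input digit costs a single table lookup, however large the number being reduced.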
Associated to each DFA is a unique minimal state DFA; the unique one (up to renumbering of
states) which accepts the same language and has the smallest possible number of states. Thus,
every regular language can be represented by a unique minimal state DFA. From here on, an
(n, k)-DFA is one with n states and an alphabet of size k.
There are many well-known algorithms that, on input of any DFA, produce the minimal DFA
associated to it. We suggest that in this project you use Hopcroft’s table-filling algorithm, as
detailed in [1] or in the lecture notes. The purpose of this project is to count all the minimal
(n, k)-DFAs for sufficiently small (n, k), and to investigate which of these define finite languages.
There are many symmetry properties of DFAs you can appeal to when going through this
project. These might make your programs run faster and more efficiently (and in some cases,
remove the need for unnecessary additional code). Moreover, you may find that you can verify
some of the small cases of what you are asked to compute, by hand, which might be useful to
do to check that your programs are running correctly.
2 Conventions
Without loss of generality, we will assume that our states are always labelled by integers, and
that any DFA with n states has these labelled by {1, . . . , n}. We will also always assume that
the state labelled 1 is the start state. Finally, we will assume that any DFA on k letters has
alphabet {1, . . . , k}. We will keep all these conventions throughout the project.
An (n, k)-transition table is a transition table for an (n, k)-DFA (recalling all the conventions
set out above).
Before you begin this project, you will need to establish a way to store transition tables. As we
have adopted the convention that state 1 is the start state, this does not need to be reflected in
the table. One way to store such tables could be as a matrix (indeed, Matlab uses matrices
as one of its primary data structures). You can use an n × (k + 1) matrix to represent an
(n, k)-transition table, with the final column consisting of 1’s and 0’s to denote which states
are accepting or non-accepting. Conventions vary, but in this project the start state can be an
accept state.
The number of (n, k)-DFAs can be vast, even for relatively small k and n. However, the number
of unique languages defined by these is much less than the total number of possible DFAs.
Question 1 Give a closed-form expression for the number of DFAs with n states and
alphabet of size k. Present this as a 6×4 table of values (in exponent form, to 2 significant
figures) for 1 ≤ n ≤ 6 and 1 ≤ k ≤ 4.
4 Accessible states
A state in a DFA is said to be accessible if it can be reached by starting at the start state and
following a path in the transition diagram labelled by some word w. Otherwise, it is inaccessible.
Clearly, a minimal DFA can have no inaccessible states.
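One way to find the accessible states is a breadth-first search from the start state. The Python sketch below assumes the storage convention suggested earlier (an n × (k + 1) table whose last column is the accept flag, with states labelled 1, ..., n and start state 1); the function name and the example table are hypothetical.

```python
from collections import deque

def accessible_states(table):
    """Set of states reachable from state 1.

    Row s-1 holds the k successor states of state s, followed by a 0/1 accept flag.
    """
    k = len(table[0]) - 1
    seen = {1}
    queue = deque([1])
    while queue:
        state = queue.popleft()
        for letter in range(k):
            successor = table[state - 1][letter]
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen

# Hypothetical (3, 2)-transition table in which state 3 cannot be reached.
example = [[2, 1, 0],
           [1, 2, 1],
           [3, 1, 0]]
print(sorted(accessible_states(example)))
```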
In all subsequent questions, an arithmetic operation can be taken to be the addition, subtraction,
multiplication or division of two integers; or the reading or re-writing of a matrix entry.
Question 2 Write a program to determine the set of accessible states from any arbi-
trary transition table. Run your program on the DFAs given by Table1, Table2, Table3
and Table4 in the file DataTables on the CATAM website. What is the complexity of
your algorithm for an arbitrary (n, k)-transition table, in terms of n and k? Compute the
number of (n, k)-DFAs with no inaccessible states, for k = 2 and n = 1, 2, 3, 4.
Two states p, q of a DFA are said to be equivalent if, for every word w, if we follow the two
(unique) paths in the transition diagram from p and from q labelled by w, then either both end
at accepting states or both end at non-accepting states. Hopcroft’s table-filling algorithm is a
quick and efficient way to determine which states of a DFA are equivalent.
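A simple fixed-point form of the table-filling idea is sketched below in Python: pairs distinguished by the empty word are marked first, and further pairs are marked whenever some letter leads them to an already marked pair. It is not tuned for efficiency, it assumes the n × (k + 1) table convention with states labelled 1, ..., n, and the function name and example are hypothetical.

```python
def equivalent_pairs(table):
    """Return the pairs (p, q), p < q, of states that are never distinguished."""
    n, k = len(table), len(table[0]) - 1
    # Pairs distinguished by the empty word: exactly one of the two is accepting.
    marked = {(p, q) for p in range(1, n + 1) for q in range(p + 1, n + 1)
              if table[p - 1][k] != table[q - 1][k]}
    changed = True
    while changed:
        changed = False
        for p in range(1, n + 1):
            for q in range(p + 1, n + 1):
                if (p, q) in marked:
                    continue
                for a in range(k):
                    r, s = table[p - 1][a], table[q - 1][a]
                    if r != s and (min(r, s), max(r, s)) in marked:
                        marked.add((p, q))
                        changed = True
                        break
    return [(p, q) for p in range(1, n + 1) for q in range(p + 1, n + 1)
            if (p, q) not in marked]

# Hypothetical (3, 2)-transition table: states 2 and 3 behave identically.
example = [[2, 3, 0],
           [2, 3, 1],
           [2, 3, 1]]
print(equivalent_pairs(example))
```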
It is often useful in mathematics to compute and analyse some special cases of a mathematical
object, to gain some intuition for how the object behaves as a whole. In this project, we will
try and understand properties of minimal DFAs and the language they define, by looking at all
the minimal (n, k)-DFAs for small n and k. This section will require you to use the programs
developed in the previous sections as sub-routines.
7 Finite languages
It is immediate that every finite language is regular. We will now investigate which of the
minimal DFAs computed in previous questions define finite languages.
Question 5 Let D be a DFA with n states. Prove that L(D) is infinite if and only if
L(D) contains a word w of length n ≤ |w| ≤ 2n − 1.
From Question 4 you should have a list (possibly with some repetition) of all the minimal
(n, 2)-transition tables, for n = 1, 2, 3, 4. You should make use of this list in the remaining
questions.
Question 8 For n = 1, 2, 3, 4, write out all minimal (n, 2)-transition tables (only one
for each equivalence class) which give the finite languages of size 1 and of size $2^{n-1} - 1$ (if
any exist). Write out the language each of these defines.
Question 9 For n > 1, give closed-form expressions in terms of n for f (n, 2, 1) and
f (n, 2, $2^{n-1} - 1$). Prove your expressions hold. What about f (n, 2, s) when $2^{n-1} \le s \le 2^n - 1$?
References
[1] J.E. Hopcroft, R. Motwani and J.D. Ullman, Introduction to automata theory, languages
and computation (Chapters 2–4), 2nd edn, Addison-Wesley, 2001.
1 Introduction
In cosmology there are many ways to specify the distance between two points because, in the
expanding Universe, the distances between objects are changing and Earth-bound observers look
back in time as they look out in distance. All these distances measure the separation between
events on radial null trajectories, trajectories of photons which terminate at the observer.
The metric for a homogeneous isotropic universe, in spherical polar coordinates for the spatial
part, is
\[ ds^2 = c^2 dt^2 - R^2(t) \left[ \frac{dr^2}{1 - kr^2} + r^2 \left( d\theta^2 + \sin^2\theta \, d\phi^2 \right) \right], \tag{1} \]
where, by a suitable choice of radial coordinate r, k = −1, 0 or 1 for open, Euclidean or closed
geometries.
In this metric the redshift relative to an observer at the spatial origin is given by
1 + z = R(t0 )/R(t1 ), (2)
where t0 is the coordinate time at which the photon is received and t1 that at which it was
emitted. Thus, for a given observer, the redshift depends only on the radial scale factor of the
Universe at the time the photon was emitted divided by its value at the observer’s time. The
redshift is important because it can be measured easily from the observed wavelengths of atomic
transition lines with known rest wavelengths.
When the matter density at time t is ρ and the pressure is zero one of the Einstein field equations
with the cosmological term becomes
\[ \frac{\dot{R}^2}{R^2} + \frac{kc^2}{R^2} - \frac{\Lambda c^2}{3} = \frac{8\pi G}{3} \rho, \tag{3} \]
where $\dot{R} = dR/dt$ and Λ is a constant. The other field equation can be combined with this to give
the conservation of matter equation
\[ \rho R^3 = \text{const.} \tag{4} \]
For small distances the redshift satisfies cz = H0 d, where d is the distance to the source. Then H0 ,
the Hubble constant, gives the local expansion rate. It is often written in the form
$H_0 = 100h\ \mathrm{km\,s^{-1}\,Mpc^{-1}} = 3.2409 \times 10^{-18}\,h\ \mathrm{s^{-1}}$, where h is dimensionless. The actual value of h
is still uncertain, and hotly debated, but most would agree on measurements of 0.72 ± 0.08.
The megaparsec is an astronomical length unit appropriate for separations between galaxies,
$1\ \mathrm{Mpc} = 3.0856 \times 10^{22}\ \mathrm{m}$. The Hubble time $t_H = 1/H_0 = 3.0856 \times 10^{17}\,h^{-1}\ \mathrm{s}$ and the Hubble
distance $D_H = c/H_0 = 9.26 \times 10^{25}\,h^{-1}\ \mathrm{m}$. Take the number of seconds in one year to be
$3.1556926 \times 10^{7}\ \mathrm{s}$.
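These unit conversions are easy to check numerically. The Python sketch below uses the values quoted above, together with the standard value of the speed of light, which is an assumption since it is not stated here; the function names are hypothetical.

```python
MPC_IN_M = 3.0856e22            # metres per megaparsec (value quoted above)
SECONDS_PER_YEAR = 3.1556926e7  # value quoted above
C_IN_M_PER_S = 2.99792458e8     # speed of light (assumed standard value)

def hubble_constant_si(h):
    """H0 in s^-1 for H0 = 100 h km/s/Mpc."""
    return 100.0 * h * 1.0e3 / MPC_IN_M

def hubble_time_gyr(h):
    """Hubble time 1/H0 in Gyr."""
    return 1.0 / hubble_constant_si(h) / SECONDS_PER_YEAR / 1.0e9

def hubble_distance_m(h):
    """Hubble distance c/H0 in metres."""
    return C_IN_M_PER_S / hubble_constant_si(h)

print(hubble_constant_si(1.0))
print(hubble_time_gyr(0.72))
print(hubble_distance_m(1.0))
```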
Our Universe can be described by two parameters, the matter density now ρ0 and the cosmo-
logical constant Λ, and we can express these in a dimensionless form using H0 as
\[ \Omega_m \equiv \frac{8\pi G \rho_0}{3 H_0^2} \tag{5} \]
Ωm + ΩΛ + Ωk = 1. (7)
2 Lookback Time
The lookback time tL is the difference between the age t0 of the Universe now and the age te
when the photons were emitted
\[ t_L = t_H \int_0^z \frac{dz'}{(1 + z') E(z')}. \tag{8} \]
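The integral in (8) can be evaluated with any standard quadrature rule. The Python sketch below uses composite Simpson's rule and returns the result in units of tH. It assumes the usual form E(z) = [Ωm(1 + z)³ + Ωk(1 + z)² + ΩΛ]^(1/2), with Ωk fixed by equation (7); that definition is not reproduced in this part of the text, so treat it as an assumption, and the function names are hypothetical.

```python
import math

def E(z, omega_m, omega_l):
    """Assumed dimensionless expansion rate, with omega_k from equation (7)."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m * (1 + z) ** 3 + omega_k * (1 + z) ** 2 + omega_l)

def lookback_time(z, omega_m, omega_l, n=1000):
    """Lookback time in units of the Hubble time, by composite Simpson's rule."""
    if z == 0:
        return 0.0
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    step = z / n

    def integrand(zp):
        return 1.0 / ((1.0 + zp) * E(zp, omega_m, omega_l))

    total = integrand(0.0) + integrand(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3.0

print(lookback_time(1.0, 1.0, 0.0))    # Einstein-de-Sitter
print(lookback_time(1.0, 0.27, 0.73))
```

For Ωm = 1, ΩΛ = 0 the integral can be done by hand, giving (2/3)[1 − (1 + z)^(−3/2)], which provides a check on the numerical result.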
Question 2 Write a program to determine the lookback time in Gyr for general H0 ,
Ωm and ΩΛ . If $H_0 = 72\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$, tabulate the lookback time to z = 0.1, 1.0, 2.0,
4.0 and 6.7 (one of the highest individual object redshifts measured so far) for
(1) an Einstein-de-Sitter universe Ωm = 1, ΩΛ = 0,
(2) a classical closed universe Ωm = 2, ΩΛ = 0,
(3) a baryon dominated low density universe Ωm = 0.04, ΩΛ = 0 and
(4) the currently popular Universe Ωm = 0.27, ΩΛ = 0.73.
What is the age of the Universe for each of these models [to the nearest 100 million years]?
Produce a graph showing lookback time against redshift for the four models and comment
on any overall trends.
3 Distance Measures
(2) The angular diameter distance $D_A$ is the ratio of an object’s physical size to its angular size (in
radians). For an object of size $\ell$ at redshift $z$ the angular size is $\theta = \ell/D_A$, where $\theta$ is a small
angle (so $\sin\theta \approx \tan\theta \approx \theta$).
(3) The luminosity distance $D_L$ is defined by the relationship between the observed photon
energy flux $f$, integrated over all frequencies, and the intrinsic energy output from the source
$L$ by
\[ f = \frac{L}{4\pi D_L^2}. \]
It is related to the angular diameter distance by
\[ D_L = (1+z)^2 D_A. \tag{11} \]
Question 4 Write a program to determine the luminosity and angular diameter dis-
tances given the redshift z and plot the dimensionless values DA /DH and DL /DH for
redshifts 0 < z < 7 for (Ωm , ΩΛ ) = (1, 0), (0.04, 0) and (0.27, 0.73). For these three cases
tabulate the values at redshifts z = 1, 1.25, 2.0 and 4.0.
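The sketch below shows one way the dimensionless distances might be computed. It assumes the standard relations $D_C = D_H \int_0^z dz'/E(z')$, a transverse distance $D_M$ obtained from $D_C$ with a sinh (open) or sin (closed) curvature correction, and $D_A = D_M/(1+z)$; these are not given in the text above and should be checked against your notes. Only equation (11) is taken directly from the text.

```python
import math

def E(z, om, ol):
    """ASSUMED standard expansion rate, with Omega_k = 1 - Omega_m - Omega_Lambda."""
    ok = 1.0 - om - ol
    return math.sqrt(om * (1 + z) ** 3 + ok * (1 + z) ** 2 + ol)

def comoving_los(z, om, ol, n=2000):
    """D_C / D_H = integral_0^z dz'/E(z'), by composite Simpson (n even)."""
    if z == 0:
        return 0.0
    h = z / n
    total = 1.0 / E(0.0, om, ol) + 1.0 / E(z, om, ol)
    for i in range(1, n):
        total += (4 if i % 2 else 2) / E(i * h, om, ol)
    return total * h / 3

def transverse(z, om, ol):
    """D_M / D_H: curvature correction applied to D_C / D_H (assumed form)."""
    ok = 1.0 - om - ol
    dc = comoving_los(z, om, ol)
    if abs(ok) < 1e-12:
        return dc
    s = math.sqrt(abs(ok))
    return math.sinh(s * dc) / s if ok > 0 else math.sin(s * dc) / s

def angular_diameter(z, om, ol):
    """D_A / D_H."""
    return transverse(z, om, ol) / (1 + z)

def luminosity(z, om, ol):
    """D_L / D_H from equation (11)."""
    return (1 + z) ** 2 * angular_diameter(z, om, ol)
```

For the Einstein-de-Sitter case the integral can be done by hand, which gives a useful check on the numerical values.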
4 Comoving Volume
The comoving volume $V_C$ is the volume measure in which the number density of non-evolving
objects is constant with redshift. The comoving volume element in solid angle $\sin\theta\,d\theta\,d\phi$ and
redshift interval $dz$ is
\[ dV_C = D_H \frac{(1+z)^2 D_A^2}{E(z)} \sin\theta \, dz \, d\theta \, d\phi. \]
Integrating this from the present to redshift $z$ gives the total comoving volume over the whole
sky to redshift $z$,
\[ V = \frac{4\pi}{3} \frac{D_L^3}{(1+z)^3} = \frac{4\pi}{3} D_C^3 \quad \text{for } \Omega_k = 0. \tag{12} \]
A method to test whether a sample of objects has a uniform comoving density and luminosity
which does not change with cosmic time is to use the $\langle V/V_{\max} \rangle$ test. It is assumed that all
objects with observed flux $f > f_0$ are detected and the observed flux $f$ and the redshift $z$ are
measured for each object. For a given luminosity $L$ there is a maximum redshift $z_{\max}(L)$ at
which the observed flux is $f_0$ so that the object is just included. Corresponding to this redshift
is a maximum volume $V_{\max}(L)$. Then, if we have a distribution of luminosities so that $\Phi(L)\,dL$
is the number per unit comoving volume with luminosity between $L$ and $L + dL$, the total
number of objects in the sample is
\[ \int_0^\infty \Phi(L) \int_0^{V_{\max}(L)} dV \, dL. \]
Question 6 Write a program to read pairs of numbers $z$ and $f/f_0$, determine $V$ and
$V_{\max}$ for these for Universe models for which $\Omega_k = 0$ and determine the average value
$\langle V/V_{\max} \rangle$ for each model.
Verify that for small values of $z$ the program gives the Euclidean limit, for individual cases
\[ V/V_{\max} \propto (f/f_0)^{-3/2}. \]
Apply the program to the sample, listed below, of 114 quasars from an area of sky. What
is the value of $\langle V/V_{\max} \rangle$ for this sample if $\Omega_m = 0.27$ and $\Omega_\Lambda = 0.73$? Is the value of
$\langle V/V_{\max} \rangle$ what you would expect from a constant comoving population? How might
you interpret the result you obtain?
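A possible structure for the $V/V_{\max}$ calculation is sketched below for flat models only. It assumes $E(z) = \sqrt{\Omega_m (1+z)^3 + \Omega_\Lambda}$ with $\Omega_\Lambda = 1 - \Omega_m$, $D_L = (1+z) D_C$, bolometric fluxes with no spectral correction, and $f/f_0 \ge 1$; all function names are illustrative. The redshift $z_{\max}$ is found by bisection on $D_L(z_{\max}) = D_L(z)\sqrt{f/f_0}$.

```python
import math

def comoving(z, om, n=400):
    """D_C / D_H for a flat model (Omega_Lambda = 1 - Omega_m), by Simpson's rule."""
    ol = 1.0 - om
    f = lambda x: 1.0 / math.sqrt(om * (1 + x) ** 3 + ol)
    h = z / n
    total = f(0.0) + f(z)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

def z_max(z, flux_ratio, om):
    """Redshift at which the same source would be seen with f = f0 (bisection on D_L)."""
    target = (1 + z) * comoving(z, om) * math.sqrt(flux_ratio)
    lo, hi = z, max(2 * z, 1e-6)
    while (1 + hi) * comoving(hi, om) < target:     # expand until the root is bracketed
        hi *= 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if (1 + mid) * comoving(mid, om) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def v_over_vmax(z, flux_ratio, om):
    """V / V_max using (12) with Omega_k = 0, so that V is proportional to D_C^3."""
    return (comoving(z, om) / comoving(z_max(z, flux_ratio, om), om)) ** 3

def mean_v_over_vmax(sample, om):
    """Average of V / V_max over a list of (z, f/f0) pairs."""
    return sum(v_over_vmax(z, fr, om) for z, fr in sample) / len(sample)
```

A call such as `v_over_vmax(1e-4, 2.0, 0.27)` can be compared with $2^{-3/2}$ as a check of the Euclidean limit.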
Question 7 The sample in the previous question was also subject to the constraints
$z > 0.20$ and $z < 3.0$, because it is only in this range that an object can be recognised as a
quasar. How would you modify the $V/V_{\max}$ quantity so that for a uniform distribution
in this redshift range the average value is still $\tfrac12$? What is the result of using this on the
sample of 114 quasars?
References
[1] Peebles, P. J. E., 1993, Principles of Physical Cosmology, Princeton University Press.
Quasar data. The following may also be found in the file quasar.dat in the data directory
on the CATAM website:
z f/f0 z f/f0 z f/f0 z f/f0 z f/f0 z f/f0
0.202 1.570 0.217 3.250 0.225 2.884 0.237 3.630 0.246 1.213 0.259 1.330
0.274 1.614 0.298 1.330 0.315 2.032 0.322 1.066 0.332 1.976 0.351 1.018
0.362 1.096 0.373 1.191 0.385 2.937 0.402 2.355 0.433 1.853 0.449 4.168
0.460 5.105 0.479 1.706 0.492 1.629 0.507 1.940 0.530 1.472 0.549 1.419
0.571 2.511 0.582 2.089 0.590 1.599 0.609 1.406 0.624 1.018 0.641 1.018
0.659 3.564 0.672 2.511 0.679 2.013 0.692 1.294 0.714 1.294 0.723 1.584
0.737 1.342 0.754 1.076 0.774 1.753 0.781 2.128 0.791 2.779 0.803 2.421
0.832 1.106 0.847 1.527 0.874 1.158 0.892 1.202 0.913 2.167 0.934 1.629
0.955 1.887 0.973 2.208 0.993 2.558 1.012 1.355 1.025 1.247 1.040 1.318
1.056 1.213 1.072 1.803 1.092 1.330 1.115 1.342 1.140 1.086 1.152 1.180
1.182 1.180 1.205 1.393 1.220 1.247 1.234 1.342 1.247 2.535 1.263 1.047
1.288 1.541 1.313 1.028 1.332 1.037 1.343 1.235 1.376 2.779 1.388 1.202
1.400 1.086 1.440 1.127 1.455 1.009 1.469 1.056 1.487 1.330 1.511 1.330
1.543 1.028 1.559 1.202 1.583 1.819 1.593 1.294 1.619 1.614 1.641 3.047
1.664 1.393 1.684 1.158 1.700 1.513 1.727 1.629 1.756 1.137 1.776 1.355
1.810 2.831 1.844 1.018 1.878 2.208 1.913 2.606 1.941 1.342 1.961 1.555
1.976 2.910 2.005 1.106 2.035 1.306 2.075 1.887 2.092 1.958 2.106 1.367
2.134 1.406 2.187 2.089 2.244 1.202 2.297 1.106 2.329 1.009 2.388 1.737
2.442 1.644 2.523 1.000 2.595 1.393 2.649 1.282 2.786 1.527 2.936 1.445
\[ ds^2 = g_{tt}\,dt^2 + 2 g_{t\phi}\,dt\,d\phi + g_{\phi\phi}\,d\phi^2 + g_{rr}\,dr^2 + g_{\theta\theta}\,d\theta^2 \tag{1} \]
where the metric components are functions of the coordinates r and θ only.
are constants of geodesic motion in any axisymmetric spacetime (1). Hence, derive the
mass conservation integral
\[ g_{rr}\left(\frac{dr}{d\tau}\right)^2 + g_{\theta\theta}\left(\frac{d\theta}{d\tau}\right)^2 = -V_{\rm eff}(r, \theta, E, L_z) \tag{4} \]
where the effective potential, $V_{\rm eff}$, should be found in terms of $E$, $L_z$ and the metric
components.
The effective potential defines the allowed regions of geodesic motion for a particular choice of
the energy, $E$, and angular momentum, $L_z$. Motion is only possible where $V_{\rm eff} \ge 0$.
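The Kerr metric has components
\[ g_{tt} = 1 - \frac{2mr}{\Sigma}, \quad g_{t\phi} = \frac{2 a m r \sin^2\theta}{\Sigma}, \quad g_{\phi\phi} = -\left(\Delta + \frac{2 m r (r^2 + a^2)}{\Sigma}\right) \sin^2\theta, \]
\[ g_{rr} = -\frac{\Sigma}{\Delta}, \quad g_{\theta\theta} = -\Sigma \tag{5} \]
where $\Sigma = r^2 + a^2\cos^2\theta$, $\Delta = r^2 - 2mr + a^2$, $m$ is a constant (the mass of the black hole) and
$a$ is another constant (the spin parameter of the black hole).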
Programming Task: Write a program to numerically integrate the second order timelike
geodesic equations
\[ \frac{d^2 x^i}{d\tau^2} = -\Gamma^i_{jk} \frac{dx^j}{d\tau} \frac{dx^k}{d\tau} \tag{7} \]
for the Kerr metric (5). Do not make use of the first integrals derived above (3)–(4), but
write the four second order equations as a set of eight coupled first order equations. We
will use the first integrals to verify the numerical accuracy of the integrations.
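One way to organise such a program is sketched below. It is only an illustration under stated assumptions: the derivatives of the metric (5) are approximated by central differences rather than by analytic Christoffel symbols, the integrator is a fixed-step fourth-order Runge-Kutta scheme, and the coordinate order (t, r, θ, φ) and all function names are choices made here. The contraction used is $\Gamma^i_{jk} u^j u^k = g^{il} (\partial_j g_{lk} u^j u^k - \tfrac12 \partial_l g_{jk} u^j u^k)$.

```python
import math

def kerr_metric(r, th, m=1.0, a=0.0):
    """Covariant components of (5); index order (t, r, theta, phi)."""
    s2 = math.sin(th) ** 2
    sigma = r * r + (a * math.cos(th)) ** 2
    delta = r * r - 2 * m * r + a * a
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = 1 - 2 * m * r / sigma
    g[0][3] = g[3][0] = 2 * a * m * r * s2 / sigma
    g[3][3] = -(delta + 2 * m * r * (r * r + a * a) / sigma) * s2
    g[1][1] = -sigma / delta
    g[2][2] = -sigma
    return g

def inverse(g):
    """Inverse of a metric with a (t, phi) block and diagonal r, theta entries."""
    det = g[0][0] * g[3][3] - g[0][3] ** 2
    gi = [[0.0] * 4 for _ in range(4)]
    gi[0][0], gi[3][3] = g[3][3] / det, g[0][0] / det
    gi[0][3] = gi[3][0] = -g[0][3] / det
    gi[1][1], gi[2][2] = 1 / g[1][1], 1 / g[2][2]
    return gi

def rhs(y, m, a, h=1e-6):
    """dy/dtau for y = [t, r, theta, phi, and their four tau-derivatives], from (7)."""
    x, u = y[:4], y[4:]
    gi = inverse(kerr_metric(x[1], x[2], m, a))
    dg = {}                          # central differences in r (1) and theta (2)
    for c in (1, 2):
        xp, xm = x[:], x[:]
        xp[c] += h
        xm[c] -= h
        gp = kerr_metric(xp[1], xp[2], m, a)
        gm = kerr_metric(xm[1], xm[2], m, a)
        dg[c] = [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(4)] for i in range(4)]
    low = [0.0] * 4                  # g_{il} Gamma^i_{jk} u^j u^k
    for l in range(4):
        for c in (1, 2):
            low[l] += u[c] * sum(dg[c][l][k] * u[k] for k in range(4))
        if l in dg:
            low[l] -= 0.5 * sum(dg[l][j][k] * u[j] * u[k]
                                for j in range(4) for k in range(4))
    acc = [-sum(gi[i][l] * low[l] for l in range(4)) for i in range(4)]
    return u + acc

def rk4_step(y, dtau, m, a):
    """One classical Runge-Kutta step of size dtau."""
    def shifted(k, s):
        return [yi + s * ki for yi, ki in zip(y, k)]
    k1 = rhs(y, m, a)
    k2 = rhs(shifted(k1, dtau / 2), m, a)
    k3 = rhs(shifted(k2, dtau / 2), m, a)
    k4 = rhs(shifted(k3, dtau), m, a)
    return [yi + dtau / 6 * (p + 2 * q + 2 * r_ + s)
            for yi, p, q, r_, s in zip(y, k1, k2, k3, k4)]

def norm(y, m, a):
    """g_{ab} u^a u^b, which should stay constant along a geodesic."""
    g, u = kerr_metric(y[1], y[2], m, a), y[4:]
    return sum(g[i][j] * u[i] * u[j] for i in range(4) for j in range(4))
```

Monitoring `norm` along the orbit gives one accuracy check that does not rely on the first integrals themselves; an adaptive step size may be preferable for orbits that pass close to the black hole.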
The Schwarzschild metric may be obtained by setting a = 0 in the Kerr metric (5).
Question 3 What are the non-zero Christoffel symbols for the Schwarzschild metric?
Take m = 1 and find the zeros of the effective potential (4) in the equatorial plane,
θ = π/2, for the case E = 0.97, Lz = 4. Hence determine the allowed range of radii, r, of
bound orbits in the equatorial plane. Then, using your geodesic code, do the following:
(a) Take initial conditions r = 15, θ = π/2, dr/dτ = 0 and the value of dθ/dτ determined
from the effective potential (4). Plot the coordinates, (t, r, θ, φ), of the particle as a
function of τ over several orbits. Check that the three conservation laws (3)–(4) are
satisfied at a reasonable level of numerical accuracy.
(b) For the same choice of E and Lz , take a range of initial conditions that lead to bound
motion (e.g., consider initial conditions in the equatorial plane with dr/dτ = 0 and
a range of values of r(0)). Output the values of r and dr/dτ every time the orbit
crosses the equatorial plane, θ = π/2, with dθ/dτ > 0. Plot these values on a graph,
with r on the horizontal axis, and dr/dτ on the vertical axis. What do you notice?
(c) Experiment with a few different values of E, Lz and initial conditions.
You have plotted a Poincaré map for these orbits. If the Poincaré map of an orbit is a closed
curve it indicates the possible existence of an extra isolating integral for the motion.
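The crossing detection needed for a Poincaré map can be sketched as follows. The sketch assumes the integrator stores (r, dr/dτ, θ) at each step and locates the crossing by linear interpolation between consecutive steps; the function name and data layout are illustrative, and a more accurate root-finder could be substituted.

```python
import math

def equatorial_crossings(samples):
    """samples: list of (r, dr_dtau, theta) at successive integration steps.

    Returns the (r, dr_dtau) pairs, linearly interpolated, at each step where
    theta increases through pi/2 (that is, a crossing with dtheta/dtau > 0).
    """
    half_pi = math.pi / 2
    points = []
    for (r0, v0, th0), (r1, v1, th1) in zip(samples, samples[1:]):
        if th0 < half_pi <= th1:
            w = (half_pi - th0) / (th1 - th0)     # fraction of the step to the crossing
            points.append((r0 + w * (r1 - r0), v0 + w * (v1 - v0)))
    return points
```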
Question 4 Take a = 0.9, E = 0.95 and Lz = 3, and use the effective potential to find
the allowed range of r0 for which the initial conditions θ = π/2, r = r0 and dr/dτ = 0 lead
to bound motion. Plot a Poincaré map as described above for a range of initial conditions
of this type. Is the result similar to what you saw for the Schwarzschild metric?
is conserved for geodesic motion in the Kerr metric, where δ is a numerical constant
that should be determined. You may use a CAS to help demonstrate this, but should
include evidence of the calculation. What does Q become in the limit a = 0, i.e., for the
Schwarzschild metric? Provide a physical interpretation if possible.
References
[1] Chandrasekhar, S.; The Mathematical Theory of Black Holes; Clarendon Press: Oxford;
1992.
[2] D’Inverno, R.; Introducing Einstein’s Relativity; Clarendon Press: Oxford; 1992.
[3] Goldstein, H., Poole, C. & Safko, J.; Classical Mechanics, third edition, Pearson Education
International: New Jersey; 2002.
A primality test is an algorithm used to determine whether or not a given integer is prime.
In this project we consider several different primality tests, and apply them to numbers in the
range from 3 up to $10^{10}$.
1 Trial division
The simplest primality test is trial division: $N$ is prime if and only if it is not divisible by any
integer $t$ with $1 < t \le \sqrt{N}$.
Question 1 Write a program to test for primality using trial division. Use your
program to list the primes in the intervals [188000, 188200] and $[10^9, 10^9 + 200]$.
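A minimal sketch of trial division in Python is given below. The function names are illustrative, and the loop tests only odd divisors $t$ with $t^2 \le N$, which is a small refinement of the criterion stated above.

```python
def is_prime_trial(n):
    """Trial division: n is prime iff no integer t with 1 < t <= sqrt(n) divides it."""
    if n < 2:
        return False
    if n < 4:
        return True            # 2 and 3
    if n % 2 == 0:
        return False
    t = 3
    while t * t <= n:          # avoids computing a floating-point square root
        if n % t == 0:
            return False
        t += 2
    return True

def primes_in(lo, hi):
    """All primes in the closed interval [lo, hi]."""
    return [n for n in range(lo, hi + 1) if is_prime_trial(n)]
```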
Fermat’s Little Theorem states that if $p$ is prime then $a^{p-1} \equiv 1 \bmod p$ for any $a$ coprime to $p$.
The Fermat test base $a$ for $N$, if $1 < a < N$, is to compute $a^{N-1} \bmod N$: if this is not $\equiv 1 \bmod N$
then $N$ is certainly composite. A Fermat pseudoprime base $a$ is a composite $N$ which passes
the Fermat test base $a$.
Question 2 Write a program to carry out the Fermat test base $a$, capable of working
for $N$ up to $10^{10}$. Run it on the intervals in Question 1, say for $a$ up to 13. Check
that the output is consistent with your earlier answer, and make a note of any Fermat
pseudoprimes that you find.
You should take care to devise a good algorithm for computing $a^b \bmod N$, ensuring that
there is no possibility of integer overflow during the calculation, in view of the possibility
that the integers in your chosen language may be limited to $10^{15}$ or similar. (See the note
on programming at the end of the project.) If the integers in your language are large
enough for there to be no chance of overflows, then you should still comment briefly on
how you might manage if you were limited to $10^{15}$. (Hint: A multiplication modulo $N$
can be done in two pieces.)
Briefly discuss the complexity of your algorithm, i.e. the time theoretically taken. To do
this, you should consider (without coding it) how your program might extend to arbitrarily
large N , and work out roughly how many basic operations are needed as N → ∞, where
a basic operation could be addition or multiplication of two numbers of similar size to N .
(A more sophisticated analysis might also take into account the time required to add and
multiply large numbers on a finite machine, but it isn’t necessary to go into such details
here.)
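The hint about multiplying in two pieces can be illustrated as follows. This sketch assumes $N \le 10^{10}$ and splits one factor at $10^5$, so that every intermediate product stays below $10^{15}$; Python integers do not overflow, so the splitting is shown purely to illustrate the idea. The function names and the splitting constant are choices made here.

```python
SPLIT = 10 ** 5    # assumed: n <= 10**10, so both halves of y are below 10**5

def mulmod(x, y, n):
    """x*y mod n for 0 <= x, y < n <= 10**10, with every intermediate below 10**15."""
    y_hi, y_lo = divmod(y, SPLIT)          # y = y_hi * SPLIT + y_lo
    high = (x * y_hi) % n                  # x * y_hi < 10**15
    high = (high * SPLIT) % n              # high * SPLIT < 10**15
    return (high + (x * y_lo) % n) % n     # x * y_lo < 10**15

def powmod(a, b, n):
    """a**b mod n by repeated squaring, using O(log b) modular multiplications."""
    result, a = 1 % n, a % n
    while b > 0:
        if b & 1:
            result = mulmod(result, a, n)
        a = mulmod(a, a, n)
        b >>= 1
    return result

def fermat_test(n, a):
    """True if n passes the Fermat test base a."""
    return powmod(a, n - 1, n) == 1
```

For example, 341 = 11 × 31 passes the test base 2 but fails base 3.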
The existence of infinitely many absolute Fermat pseudoprimes (proven in 1994) means that
the Fermat test cannot be relied on to prove the primality of N any faster than trial division,
although it can usually detect compositeness quickly.
Question 4 Write a procedure to evaluate the Jacobi symbol and modify your previous
program to carry out the Euler test. Find the Euler pseudoprimes base 2 and the absolute
Euler pseudoprimes up to $10^6$. How many values of a are necessary to determine the
primality or otherwise of those N in this range which are not absolute Euler pseudoprimes?
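A sketch of a Jacobi symbol procedure is given below. It uses the usual reduction based on the values of $(2/n)$ and on quadratic reciprocity; this is one standard approach rather than a method prescribed by the text, and the function name is illustrative.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, using (2/n) and quadratic reciprocity."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:              # remove factors of 2: (2/n) = -1 iff n = 3, 5 mod 8
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                    # reciprocity: sign flips iff both are 3 mod 4
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0     # a shared factor gives symbol 0
```

For prime $n$ the result can be checked against Euler's criterion, $a^{(n-1)/2} \bmod n$.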
Question 7 Combining all your previous results, devise the most efficient algorithm
you can for determining the primality or otherwise of a number N in the range from 3 up
to $10^{10}$. (This may be a mixture of the above methods.)
Experiment on a suitable number of random numbers, say 10000, in the range to compare
the time taken by your algorithm and trial division. Comment also on the time required
theoretically for these algorithms for large N , making clear any assumptions you are
making.
An important question in applications is the reliability of these tests for determining the pri-
mality of randomly chosen numbers in some range.
Question 8 Fix a value of k and suppose that N is chosen uniformly at random from
all odd integers of exactly $k$ bits, that is, between $2^{k-1}$ and $2^k - 1$. (You should be able
to take k at least 15.) Investigate the probability that if N passes t rounds of the strong
test with randomly chosen a, then N is in fact composite.
Programming note
The quantities of interest in this project are natural numbers. You might be aware that computer
languages typically like to be told if a number is an integer rather than a general real, because
it means there is no need to save a fractional part, and it avoids concerns over rounding errors.
Matlab, though, is incorrigibly real-minded. Hence, although Matlab allows you to tell it a
number is an integer with the int32 command (see the CATAM manual), it is better for this
project if you don’t do so. In other words, you need not worry about distinguishing between
reals and integers.
You should, however, be aware that Matlab can handle integers in this way only up to about
15 digits. (Try eps(10^15) and eps(10^16). Try also 10^8 * 10^8 - (10^16-1).) So you
must avoid calculations involving integers exceeding $10^{15}$.
You may use the Matlab functions conv and deconv to multiply and divide polynomials.
Obviously no credit can be given in this project for using inbuilt functions such as isprime or
factor — you are expected to write and analyse your own programs. You may however use
the inbuilt Matlab function gcd.
[1] Riesel, H., Prime numbers and computer methods for factorisation, 2nd edition, Progress in
Mathematics 126, Birkhäuser, 1994.
[2] Koblitz, N., A Course in Number Theory and Cryptography, Graduate Texts in Mathematics
114, Springer, 1987.
1 Factor bases
In this project N will be a (usually large) integer, that we would like to factor, and B will be
a finite set of (usually small) primes. We call B the factor base. Sometimes it is convenient to
allow −1 as an element of B.
The integers that may be accurately represented in your chosen language may be limited to $10^{15}$
or similar. For example in MATLAB numbers are represented by default as doubles, meaning
that they are stored to 16 significant (decimal) figures. For integers larger than $10^{15}$ functions
such as mod and int2str may give incorrect answers. If your language is able to handle larger
integers, then you are still expected to comment where appropriate on how you would manage
if you were restricted to $10^{15}$.
On a modern computer it is practical to factor integers of the size considered in this project by
trial division. We study a factoring method that remains practical for much larger values of N .
To enable comparisons, we therefore make the following artificial restriction: the factor base B
is only allowed to contain primes that are less than 50.
2 Continued fractions
The continued fraction algorithm applied to a real number $x_0 = x$ forms a sequence of partial
quotients $a_n$ by the transformation
\[ a_n = \lfloor x_n \rfloor, \qquad x_{n+1} = \frac{1}{x_n - a_n}, \]
where as usual $\lfloor x \rfloor$ denotes the greatest integer $\le x$. The algorithm terminates if $x_n = a_n$; this
happens if and only if the initial $x$ is rational.
The convergents $P_n/Q_n$ are defined for $n \ge 0$ by
\[ P_n = a_n P_{n-1} + P_{n-2}, \qquad Q_n = a_n Q_{n-1} + Q_{n-2} \]
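For $x = \sqrt{N}$ the recurrences above can be carried out entirely in integers, as in the following sketch. It assumes the usual initial values $P_{-1} = 1$, $P_0 = a_0$, $Q_{-1} = 0$, $Q_0 = 1$, which are not stated in the text above, and it uses the standard integer recurrence for the partial quotients of a quadratic surd instead of floating-point arithmetic. Python integers are unbounded, so no overflow precautions are shown here.

```python
import math

def sqrt_convergents(n, count):
    """First `count` convergents (P, Q) of sqrt(n), for a positive non-square integer n."""
    a0 = math.isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0          # x_k = (sqrt(n) + m) / d, with a = floor(x_k)
    p_prev, p = 1, a0           # ASSUMED initial values P_{-1} = 1, P_0 = a_0
    q_prev, q = 0, 1            # ASSUMED initial values Q_{-1} = 0, Q_0 = 1
    out = [(p, q)]
    for _ in range(count - 1):
        m = d * a - m
        d = (n - m * m) // d    # exact division for a quadratic surd
        a = (a0 + m) // d
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append((p, q))
    return out
```

Tabulating `p * p - n * q * q` over the returned pairs gives the quantities that appear in the questions below.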
For a fixed value of the positive integer $N$, the equation $x^2 - N y^2 = 1$, in integer unknowns $x$
and $y$, is called Pell’s equation. The negative Pell equation is $x^2 - N y^2 = -1$.
Question 3 Tabulate the quantities $P_n^2 - N Q_n^2$ for some values of $N$ and hence
comment on the use of continued fractions to solve Pell’s equation, and the negative Pell
equation. Can you see any simple condition on $N$ which ensures that the negative Pell
equation is insoluble?
Write a procedure that given $x, y, N \le 10^{15}$ tests whether $x^2 - N y^2 = \pm 1$, being careful
to avoid integer overflow. (Hint: try working mod $p$ for several primes $p$.)
Use your observations to write a program to find non-trivial solutions to Pell’s equation.
Tabulate the solutions found for each $N$ in the ranges $1 \le N \le 100$ and $500 \le N \le 550$,
when such a solution exists. You may find a few values of $N$, such as 509, beyond the
capacity of your program; you do not need to correct for this. You should however make
sure that all answers you do give are correct.
3 Factorization
One source of integers $x$, $y$ with $x^2 \equiv y^2 \bmod N$ arises from the convergents of the continued
fraction expansion of $\sqrt{N}$.
Question 5 Modify your programs to compute $P_n \bmod N$ and $P_n^2 \bmod N$ for $N$ up to
$10^{10}$. Explain how you avoid integer overflow. (Hint: a multiplication mod $N$ can be done
in two pieces.) Run your program for $N$ = 2012449237, 2575992413 and 3548710699.
Question 6 Let $A$ be a matrix over the field $\mathbb{F}_2 = \mathbb{Z}/2\mathbb{Z}$. Write a program using
Gaussian elimination to determine whether there exists a non-zero column vector $v$ with
$Av = 0$, and to find one if there is.
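A possible shape for such a program is sketched below. The matrix is assumed to be a list of rows of 0/1 entries; the routine reduces it to reduced row echelon form by row operations mod 2 and, if some column has no pivot, reads off a kernel vector by setting that free variable to 1. The function name and data layout are illustrative.

```python
def kernel_vector(A):
    """Return a non-zero v with A v = 0 over F_2, or None if only v = 0 works.

    A is a list of equal-length rows with entries 0 or 1.
    """
    rows = [r[:] for r in A]
    ncols = len(rows[0]) if rows else 0
    pivot_row = {}                       # pivot column -> row index
    rank = 0
    for col in range(ncols):
        pr = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pr is None:
            continue                     # no pivot here: a free column
        rows[rank], rows[pr] = rows[pr], rows[rank]
        for i in range(len(rows)):       # clear this column in every other row
            if i != rank and rows[i][col]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[rank])]
        pivot_row[col] = rank
        rank += 1
    free = [c for c in range(ncols) if c not in pivot_row]
    if not free:
        return None
    f = free[0]
    v = [0] * ncols
    v[f] = 1
    for col, r in pivot_row.items():     # each pivot variable equals its row's entry in column f
        v[col] = rows[r][f]
    return v
```

For larger matrices, storing each row as a single integer bitmask would make the row additions considerably faster.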
References
[3] Riesel, H., Prime numbers and computer methods for factorization.
1 Introduction
The Galois group G(f ) of a polynomial f defined over a field K is the group of K-automorphisms
of the field generated over K by the roots of f (the Galois group of the splitting field for f over
K).
We shall consider Galois groups over the rationals and polynomials f which are monic and have
coefficients in Z. Assume that f has no repeated factor. Let f have degree ∂f = n with Galois
group a subgroup of the symmetric group Sn acting on the roots of f . The decomposition group
of f modulo p is the Galois group Gp (f ) of f regarded as a polynomial over the finite field
GF(p), provided that f modulo p does not have a repeated factor. The key result we shall use
is that the decomposition group (when defined) is always cyclic and isomorphic to a subgroup
of the Galois group of f . Furthermore, it is isomorphic in a way which preserves cycle types, so
the cycle type of the generator of the decomposition group will also occur in the Galois group.
We shall use the decomposition groups to derive information about the Galois group of a
polynomial f . For example, if G(f ) contains a 2-cycle, an (n−1)-cycle and an n-cycle, then it must be
Sn . As the decomposition group is always cyclic, this information does not distinguish between
groups with the same abelian subgroups, but computing a sufficient number of decomposition
groups will usually determine the cyclic subgroups and hence often determine G(f ), although
there is always the possibility that our answer will be too small if we do not compute enough.
2 The algorithms
To find the decomposition group of $f$ modulo $p$, we need information about the factorisation
of $f$ over GF($p$). There is a repeated factor in $f$ iff $f$ has a factor in common with its formal
derivative $f'$ and this can be determined by applying the Euclidean algorithm. Since GF($p^r$) is
the splitting field of any irreducible polynomial of degree $r$ over GF($p$) and the Galois group of
GF($p^r$) over GF($p$) is the cyclic group $C_r$ generated by $x \mapsto x^p$, it is only necessary to find the
degrees of the irreducible factors in $f$ in order to find its Galois group over GF($p$). We let $f_r$
be the product of all the irreducible factors of degree $r$ in $f$: then there are $n_r = \partial f_r / r$ factors
of degree $r$ in $f$ and the Galois group $G_p(f)$ of $f$ over GF($p$) is cyclic, where the generator has
$n_r$ $r$-cycles for each $r$.
We determine $f_r$ by the observation that the elements of GF($p^r$) all satisfy the equation
$\phi_r(X) = X^{p^r} - X = 0$ and hence, if we proceed by successively removing the factors $f_1, \ldots, f_{r-1}$ then
at the $r$th stage we can obtain $f_r$ by taking the highest common factor of the residue with $\phi_r$.
Question 1 Write procedures to compute the quotient and remainder from dividing
two polynomials over GF(p) and use them to write a procedure to find the highest common
factor of two polynomials over GF(p). Include in your report some test output from all
three procedures. Describe an efficient way of using your procedures to compute a large
power of one polynomial modulo another polynomial.
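The procedures described above might be organised as in the following sketch. Polynomials are assumed to be stored as lists of coefficients from the constant term upwards, $p$ is assumed prime, and $f$ is assumed to have no repeated factor modulo $p$; all names are illustrative. The power $X^{p^r}$ is never formed explicitly: it is built up modulo $f$ by repeated squaring.

```python
def trim(f):
    """Drop leading zero coefficients in place (coefficients are stored low to high)."""
    while f and f[-1] == 0:
        f.pop()
    return f

def poly_divmod(f, g, p):
    """Quotient and remainder of f by g over GF(p)."""
    f, g = trim([c % p for c in f]), trim([c % p for c in g])
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    inv = pow(g[-1], p - 2, p)            # inverse of the leading coefficient (p prime)
    q, r = [0] * max(len(f) - len(g) + 1, 0), f
    while len(r) >= len(g):
        shift, c = len(r) - len(g), r[-1] * inv % p
        q[shift] = c
        for i, gc in enumerate(g):
            r[shift + i] = (r[shift + i] - c * gc) % p
        trim(r)
    return q, r

def poly_gcd(f, g, p):
    """Monic highest common factor over GF(p), by the Euclidean algorithm."""
    f, g = trim([c % p for c in f]), trim([c % p for c in g])
    while g:
        f, g = g, poly_divmod(f, g, p)[1]
    if f:
        inv = pow(f[-1], p - 2, p)
        f = [c * inv % p for c in f]
    return f

def poly_mulmod(a, b, f, p):
    """a * b mod f over GF(p)."""
    prod = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_divmod(prod, f, p)[1]

def poly_powmod(base, e, f, p):
    """base**e mod f over GF(p), by repeated squaring."""
    result, base = poly_divmod([1], f, p)[1], poly_divmod(base, f, p)[1]
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result

def degree_counts(f, p):
    """{r: n_r}, the number of irreducible factors of each degree r of squarefree f mod p."""
    f = trim([c % p for c in f])
    counts, h = {}, [0, 1]                 # h holds X^(p^r) mod f
    for r in range(1, len(f)):
        if len(f) <= 1:
            break
        h = poly_powmod(h, p, f, p)
        phi = h + [0] * max(0, 2 - len(h))
        phi[1] = (phi[1] - 1) % p          # phi_r = X^(p^r) - X, reduced mod f
        g = poly_gcd(f, phi, p)
        if len(g) > 1:
            counts[r] = (len(g) - 1) // r
            f = poly_divmod(f, g, p)[0]    # remove f_r before the next stage
    return counts
```

For instance, `degree_counts([1, 0, 1], 5)` treats $X^2 + 1$ modulo 5, where it splits into two linear factors.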
$X^2 + X + 41$,
$X^3 + 2X + 1$,
$X^3 + X^2 - 2X - 1$,
$X^4 - 2X^2 + 4$,
$X^4 - X^3 - 4X + 16$,
$X^4 - 2X^3 + 5X + 5$,
$X^4 + 7X^2 + 6X + 7$,
$X^4 + 3X^3 - 6X^2 - 9X + 7$,
$X^5 + 36$,
$X^5 - 5X + 3$,
$X^5 + X^3 - 3X^2 + 3$,
$X^5 - 11X^3 + 22X - 11$,
$X^6 + X + 1$,
$X^7 - 2X^6 + 2X + 2$,
$X^7 + X^4 - 2X^2 + 8X + 4$,
$X^7 + X^5 - 4X^4 - X^3 + 5X + 1$.
Your program should tabulate its output in columns, so that the results for this question
take only a few pages in total.
Question 4 Discuss the Galois groups of these polynomials in the light of your output,
with special reference to the reducible polynomials. Assuming, in each case, that the group
is the smallest possible, formulate a conjecture as to the relative frequencies of the various
cycle shapes for a fixed polynomial f as p varies. Do any of the polynomials (especially
those of smaller degree) appear to contradict this conjecture? If so, run your programs
for these polynomials for higher values of p and see if this rectifies the matter.
Programming note
If you use Matlab then you may wish to use the DocPolynom class that is included as an
example in the help browser. To use this you should create a directory @DocPolynom and place
DocPolynom.m into it. This will enable you to define and display (non-zero) polynomials and to
carry out standard algebraic manipulations with them. There is no need to include the class file
in your program listings (assuming you do not modify it). [The latest version requires MATLAB
2022b or later to run.]
References
1 Introduction
Suppose we are given a set of permutations of $X = \{1, \ldots, n\}$. They generate a finite
permutation group $G \le S_n$. The aim of this project is to replace the given set of generators of $G$
with another generating set for G which is of greater utility, hopefully allowing us to deal with
various questions. For programming purposes you do not need to go above n = 20 (although
you are welcome to if you so wish).
2 Permutations
3 Groups
Question 2 Show that the modified set of permutations generates the group G. Give
an upper bound for the size of the modified set of generators and for the number of
operations needed to complete the algorithm. (As a function of n and the size of the
original generating set, noting that, e.g., storing a permutation is O(n) operations.)
Question 4 Write down a bijection between the set of left cosets of Gα in G and the
orbit of α. State the orbit-stabilizer theorem.
Question 5 Write a procedure which computes the orbit with witnesses of a given
element under a permutation group G generated by a given set of permutations. It should
receive as input a set of permutations and an element α ∈ X and should return as a
output a list of elements forming the orbit of α, together with a witness in each case.
Briefly explain how your procedure works.
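One possible procedure is sketched below. It assumes a permutation of {1, ..., n} is stored as a tuple of images, that a witness for β is an element of G mapping α to β, and that products act right to left; these conventions and the function names are choices made here. The orbit is explored breadth first, so each new point inherits a witness from the point it was reached from.

```python
from collections import deque

def compose(g, h):
    """(g o h)(x) = g(h(x)); permutations of {1..n} are tuples of images."""
    return tuple(g[h[x] - 1] for x in range(len(h)))

def orbit_with_witnesses(generators, alpha):
    """Map each beta in the orbit of alpha to a witness w in G with w(alpha) = beta."""
    n = len(generators[0])
    identity = tuple(range(1, n + 1))
    witness = {alpha: identity}
    queue = deque([alpha])
    while queue:
        beta = queue.popleft()
        for s in generators:
            gamma = s[beta - 1]
            if gamma not in witness:
                # s maps beta to gamma, so s o witness[beta] maps alpha to gamma
                witness[gamma] = compose(s, witness[beta])
                queue.append(gamma)
    return witness
```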
\[ \{\phi(yt)^{-1} \cdot y \cdot t \mid y \in Y,\ t \in T\}. \]
Question 9 For the group $S_n$ (throughout this question you may take $n \ge 5$), we
consider the probability $P_n$ that a pair of elements $g$, $h$ picked uniformly at random generates
$S_n$. In other words we have
\[ P_n = \frac{|\{(g, h) \in S_n \times S_n : \langle g, h \rangle = S_n\}|}{|S_n|^2}. \]
Why do you know from IA that $P_n > 0$? Give a straightforward argument to show that
there is $k < 1$ independent of $n$ such that $P_n \le k$. What is your value for $k$? What is the
value of $P_n$ for very small $n$?
For each of a few moderate values of n, generate 100 or so random pairs of permutations.
Describe how you generate a random permutation. (To do this, you may assume you have
a random number generator which, with input an integer N from 1 to say 100, will output
an integer uniformly at random between 1 and N inclusive.)
Using your previous program, what sort of estimates do you obtain for Pn ?
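One common way to generate a uniformly random permutation from such a random number generator is the Fisher-Yates shuffle, sketched here. The function `uniform_1_to` is a stand-in for the generator described above and is an assumption of this sketch, as is the tuple-of-images representation.

```python
import random

def uniform_1_to(n, rng=random):
    """Stand-in for the assumed generator: a uniform integer between 1 and n inclusive."""
    return rng.randint(1, n)

def random_permutation(n, rng=random):
    """Fisher-Yates shuffle: returns a uniformly random permutation of 1..n as a tuple."""
    perm = list(range(1, n + 1))
    for i in range(n, 1, -1):
        j = uniform_1_to(i, rng)                 # position i swaps with a uniform j <= i
        perm[i - 1], perm[j - 1] = perm[j - 1], perm[i - 1]
    return tuple(perm)
```

Each of the $n!$ orderings arises from exactly one sequence of choices of $j$, which is why the result is uniform.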
Question 1 Write a procedure which applies the greedy algorithm to a graph with
a given ordering of the vertices. Test your program on ten members of G(70, 0.5), and
compare the number of colours used when the vertices are ordered in the following ways:
(i) by increasing degree, (ii) by decreasing degree, (iii) where $v_j$ has minimum degree in
the graph $G - \{v_{j+1}, \ldots, v_n\}$, (iv) at random.
Do the same for $G_3(70, 0.75)$.
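A sketch of the greedy algorithm is given below. It assumes a graph is stored as a dictionary mapping each vertex to its set of neighbours, and that G(n, p) denotes the usual model in which each possible edge is present independently with probability p; colours are numbered from 0 and the function names are illustrative.

```python
import random

def random_graph(n, p, rng=random):
    """A member of G(n, p) (assumed model): each edge present independently w.p. p."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_colouring(adj, order):
    """Give each vertex, in the given order, the smallest colour unused by its neighbours."""
    colour = {}
    for v in order:
        used = {colour[w] for w in adj[v] if w in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return colour

def by_decreasing_degree(adj):
    """Ordering (ii): vertices sorted by decreasing degree."""
    return sorted(adj, key=lambda v: -len(adj[v]))
```

The number of colours used is `len(set(colouring.values()))`, and it can depend strongly on the ordering even for small bipartite graphs.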
Question 2 What ordering will guarantee that the greedy algorithm uses no more
than 3 colours for G3 (70, 0.75)? Why do you think the probability 0.75 was chosen here?
For each n give an example of a graph G of order 3n such that χ(G) = 3 but on which
greedy might need n + 2 colours.
2 Cliques
A clique in a graph G is a complete subgraph of largest order in G. (This definition differs from
some in the literature.) Notice that χ(G) is at least as large as the order of a clique.
A greedy-type algorithm for finding a complete subgraph in G would start with a subgraph of
order one (a vertex) and repeatedly try to find a vertex joined to all vertices of the subgraph
selected so far, until no further such vertex could be found.
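The greedy-type search just described can be sketched as a single pass through the vertices in some chosen order; a vertex rejected once can never become eligible later, so one pass suffices. The adjacency-dictionary representation and the function name are assumptions of this sketch.

```python
def greedy_complete_subgraph(adj, order):
    """Scan the vertices in the given order, keeping each one joined to all kept so far."""
    chosen = []
    for v in order:
        if all(v in adj[w] for w in chosen):
            chosen.append(v)
    return chosen
```

The result is a complete subgraph that cannot be extended, but it need not be a largest one, and it depends on the order in which the vertices are scanned.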
3 Colouring
An independent set in a graph is a subset of the vertex set which spans no edges. A colouring
is thus just a partition of the vertices into independent sets.
None of the above methods for bounding χ(G) is guaranteed to find χ(G) exactly.
Question 6 Estimate (crudely) the theoretical running times of all the algorithms
used above as functions of n when the input is a typical member of G(n, 0.5). Describe
in outline, but do not implement, a procedure for colouring a graph with exactly χ(G)
colours, and estimate its running time.
References
Question 2 Estimate the theoretical running time of your algorithm as best you can.
Compare the answers for the worst case and an average case.
You will notice that your algorithm rapidly becomes prohibitively expensive as the order of the
graph increases. In this case an “approximation algorithm” can be useful. An approximation
algorithm for the Hamiltonian cycle problem would seek to make a very good attempt at finding
a cycle in a short space of time. If it succeeds, well and good. If it fails, there may have been a
cycle it missed, but it is hoped that the probability of this will be small.
Here is a simple algorithm to search for a Hamiltonian cycle. Construct a sequence of paths
P1 , P2 , . . . , where P1 is just a single vertex v0 . Given a path Pj from v0 to vk , proceed as
follows:
Question 4 Implement this algorithm, and try it on your earlier examples. You should
set a stopping time T for the procedure so that, if it has constructed PT and still found no
cycle, it quits. What functions work well in practice (i.e. fairly reliably find a cycle but
aren’t too expensive)?
References
[1] Bollobas, B., Modern Graph Theory, Springer 1998.
Question 1 Write a procedure to find the minimum distance of a code. Use your
procedure to write a program which generates random codes of length n and size r and
then computes the minimum distance.
Run your program several times with various values of n and r, for each choice finding
the best (i.e., largest) d that you can.
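A minimal sketch of these procedures follows. It assumes binary codes, with each word of length n stored as an integer whose bits are the components, and it takes the r code words to be distinct; these representation choices and the function names are illustrative.

```python
import random
from itertools import combinations

def hamming(x, y):
    """Hamming distance between two words stored as integers."""
    return bin(x ^ y).count("1")

def minimum_distance(code):
    """Smallest distance between two distinct code words (needs at least two words)."""
    return min(hamming(x, y) for x, y in combinations(code, 2))

def random_code(n, r, rng=random):
    """r distinct random binary words of length n (requires r <= 2**n)."""
    return rng.sample(range(2 ** n), r)
```

The pairwise search costs of order $r^2$ distance evaluations, which limits the sizes of code that can be examined this way.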
Question 2 Now generate codes of length n and minimum distance d by starting with
an initial code vector, say (0, . . . , 0) and randomly generating further vectors, adding a
new vector to the code if it has distance at least d from all the vectors already in the code.
Run your program several times for each choice of parameters n and d, finding the best
(i.e., largest) r that you can.
Question 3 Take the output from the two previous questions and plot the corre-
sponding points on a graph with information rate and error-control rate as the two axes.
Comment on your results.
We call a code linear if it forms a subspace of the Hamming space, regarded as a vector space
over the field F of 2 elements. The weight w(x) of a vector x is the number of non-zero
components, that is, the Hamming distance d(x, 0). The minimum distance of a linear code is
just the minimum non-zero weight. The rank k of a linear code is the dimension of the code as
a subspace, and the size of a linear code is $r = 2^k$.
Question 4 Write a procedure to find the minimum non-zero weight of a code
generated over $F$ by a set $g_1, \ldots, g_k$ of $k$ generators. Use your procedure to find linear codes of
given length n and either given rank k or given minimum distance d by considering random
sets of generators. As before, run your programs several times to plot the information
and error-control rates and comment on the results.
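For a linear code the search can run over the $2^k - 1$ non-trivial sums of generators, as in this sketch. Words are assumed to be stored as integers whose bits are the components, so that addition over F is bitwise exclusive-or; the sums are visited in Gray-code order so that each step adds or removes a single generator. The function name is illustrative.

```python
def min_nonzero_weight(generators):
    """Minimum non-zero weight among all sums of the generators (None if all sums are 0)."""
    best = None
    word = 0
    k = len(generators)
    for i in range(1, 2 ** k):
        # Gray-code step: toggle the generator indexed by the lowest set bit of i
        low_bit = (i & -i).bit_length() - 1
        word ^= generators[low_bit]
        w = bin(word).count("1")
        if w and (best is None or w < best):
            best = w
    return best
```

If the generators are linearly dependent some sums vanish; these are skipped, and the rank is then smaller than the number of generators supplied.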
[1] C.M. Goldie and R.G.E. Pinch, Communication theory, CUP, 1991.
1 Introduction
This project concerns certain probability models for bond percolation. The book by Grimmett
[1] is a good source to learn more about percolation.
We work on a connected graph G = (V, E), that is, a collection of nodes V connected by edges
E. To each edge e ∈ E, we assign, independently, a uniform random variable Ue ∼ U [0, 1]. We
decide on a value p ∈ [0, 1]; we declare the edge e to be p-open if Ue < p, and we declare it to
be p-closed otherwise.
When p is very small, very few edges are open; but as we increase p, there appear open clusters,
i.e. sets of nodes connected by open edges.
Percolation theory is the study of the geometry of the open clusters. In particular, important
questions are whether or not there exists an infinite cluster of open edges; and if one does exist,
how many infinite clusters there are. Clearly if p = 0 there is none and if p = 1 there is one
open cluster, namely the graph G itself.
Let V , the set of nodes of the graph, consist of finite strings, as follows: V contains the
empty string ‘’ (also known as Eve), and the three strings ‘1’, ‘2’ and ‘3’ (also known as Eve’s
daughters), and also every string that is one of Eve’s daughters followed by a finite sequence
of ‘1’s and ‘2’s. Two nodes are connected by an edge if one can be obtained by appending one
digit to the other. For example, ‘3221’ is connected to ‘322’ (its mother) and to ‘32211’ and
‘32212’ (its two daughters). As before, each edge e is assigned a random variable Ue ∼ U [0, 1].
(We can use this as a crude model to describe the propagation of a defective gene in a popula-
tion.)
Question 1 Let φp be the probability that Eve’s daughter ‘1’ is in an infinite open
cluster consisting of her own descendants. Show that
It can be shown that φp is the maximal solution to this equation. Find φp . (One way to
obtain a ‘merit’ mark in this project, though not the only way, is to show that φp is the
maximal solution.)
Now let θp be the probability that Eve is in an infinite open cluster. Find θp and draw its
graph as a function of p.
Question 2 Show that, for $p \le \tfrac12$, there are almost surely no infinite clusters.
How many infinite clusters are there if $\tfrac12 < p < 1$? Justify your answer.
The square lattice in two dimensions $L^2$ is a graph with $V = \mathbb{Z}^2 = \{(m, n) : m, n \in \mathbb{Z}\}$. If the
distance function is
\[ d\big((k, l), (m, n)\big) = |k - m| + |l - n|, \]
the edges of the graph are straight lines connecting nodes which are distance 1 apart.
Let us look at two techniques which will help us estimate the critical probability pc above which
there is an infinite cluster and below which there is none.
Lower bound
Let us start at the origin. Let σn be the number of self-avoiding paths (i.e., paths which traverse
each edge at most once) of length n leading away from the origin.
Question 3 Let $\lambda = \limsup_{n\to\infty} \sigma_n^{1/n}$. Show that $\lambda \le 3$. Show further that $p_c \ge \lambda^{-1}$.
Upper bound
The general behaviour of large clusters in the subcritical region p < pc is described in the
following result. Some notation first: We say x ↔ y if there exists an open connected path
between x and y. Define the open sphere Sn to be
\[ S_n = \{ x \in \mathbb{Z}^2 : d(x, 0) \le n \}. \]
The boundary ∂Sn consists of the nodes where d(x, 0) = n. Let Pp (0 ↔ ∂Sn ) be the probability
that there exists a p-open path connecting the origin to some node in ∂Sn . It can be shown
that, for p < pc , there exists ψp > 0 such that
The proof is beyond the scope of this project but can be found in [1, Sections 5.2 and 6.1].
Let En be all the edges in E connecting these nodes. We call {(k, l) ∈ Vn : k = 0} the left
boundary, and {(k, l) ∈ Vn : k = n} the right boundary. Let A be the event that some node in
the left boundary is connected to the right boundary via a path consisting of open edges.
Question 4 Show that $P_{1/2}(A) = \tfrac12$. Hint. You may find it useful to consider the
dual graph $\bar{G}_n$, which has nodes at
\[ V_n' = \{ (k + \tfrac12,\, l - \tfrac12) : 0 \le k \le n - 1,\ 0 \le l \le n \}, \]
and edges joining those nodes which are distance 1 apart, and whose edges are open or
closed depending on whether the edges of $G$ are open or closed, in a manner which you
should specify.
Question 5 By constructing $n$ events $\{A_i\}$ such that $A = \cup A_i$, and using (1), prove
that $p_c \le \tfrac12$.
We can try and use computing power to estimate pc and θp . Suppose that at time n = 0 you
are an invading force standing at the origin. We will call In the set of nodes you have invaded
by time n. At time n + 1 you invade another node by looking at the edge-boundary of your
territory and walking along the edge with the least value of Ue attached to it. Formally, we
define
\[ \partial I_n = \{ e \in E : I_n \overset{e}{\leftrightarrow} \mathbb{Z}^2 \setminus I_n \}. \]
We use the notation $x \overset{e}{\leftrightarrow} y$ to mean that the edge e connects x and y. You walk from In along
the edge $e_n \in \partial I_n$ which satisfies
\[ U_{e_n} = \min \{ U_e : e \in \partial I_n \}, \]
so that In+1 is In with the node at the other side of en added. It can be shown that, almost
surely,
\[ \limsup_{n\to\infty} U_{e_n} = p_c. \qquad (2) \]
(The proof is not hard, and is outlined at the end of this project.) The advantage of using
the sequence $U_{e_n}$ to estimate $p_c$ is that the amount of memory required to store $U_{e_n}$ and to
calculate $U_{e_{n+1}}$ is O(n). This is true whether we are working in $L^2$ or $L^{47}$.
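The invasion process described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name invasion_weights, the use of a binary heap for the edge-boundary, and the choice to draw each edge's uniform value lazily (the first time one of its endpoints is invaded) are all assumptions made here. Only invaded nodes and boundary edges are stored, which keeps the memory O(n).

```python
import heapq
import random


def invasion_weights(n_steps, dim=2, seed=0):
    """Run the invasion process on Z^dim and return the sequence U_{e_1}, ..., U_{e_n}."""
    rng = random.Random(seed)
    origin = (0,) * dim
    invaded = {origin}
    boundary = []  # heap of (U_e, node at the far end of e)

    def push_edges(node):
        # Every edge from a newly invaded node to a non-invaded neighbour is new,
        # so it receives a fresh uniform value exactly once.
        for axis in range(dim):
            for step in (-1, 1):
                nb = node[:axis] + (node[axis] + step,) + node[axis + 1:]
                if nb not in invaded:
                    heapq.heappush(boundary, (rng.random(), nb))

    push_edges(origin)
    weights = []
    while len(weights) < n_steps:
        u, node = heapq.heappop(boundary)
        if node in invaded:
            continue  # stale entry: both endpoints are invaded, so e has left the boundary
        invaded.add(node)
        weights.append(u)
        push_edges(node)
    return weights


if __name__ == "__main__":
    w = invasion_weights(5000, dim=2, seed=1)
    # One crude estimate of the lim sup in (2): the largest value in the tail.
    print(max(w[len(w) // 2:]))
```

Taking the maximum over the second half of the run is only one possible way of approximating the lim sup; how to choose the tail, and how to assess the error, is left open.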
Question 7 Use your program to estimate $p_c$ for $L^2$, and explain your method. Estimate
$p_c$ for $L^3$ (which is defined like $L^2$, but using $\mathbb{Z}^3$ rather than $\mathbb{Z}^2$). Include in your
Question 8 Explain how the invasion process can be used to estimate θp simultane-
ously for all p. Produce a plot of θp against p for L2 by running the simulation for large n
several times (at least n = 5000, at least 500 times). For what values of p do you expect
your plot to be inaccurate? Why?
Appendix
Then we have (with a positive probability) an infinite cluster, with all but finitely many of its
edges p-open. It follows that (with a positive probability) there exists an infinite p-open cluster.
This contradicts the definition of pc as the critical probability.
References
Introduction
Nearly a century ago the mathematician Erlang, working for the Copenhagen Telephone Com-
pany, devised the first mathematical theories for telecommunications networks. The technology
we have now would seem like science fiction to Erlang, yet his insight into the essential struc-
ture of networks means that his theorems are just as useful for designing an optically-routed
backbone for the Internet as they were for early Danish telephony.
In this project we will deal with certain models for telephone networks, arising from Erlang’s
work. We consider a network to be a collection of links (cables). Each telephone call occupies
a certain amount of space on certain links; for example, a telephone call from a Cambridge
college to a London house might occupy 8 kbit/sec of space on the link from the college to the
university exchange, and on the link from the university exchange to BT’s Cambridge exchange,
and on the link from there to a London exchange, and so on. This space is occupied for the
duration of the call. The links which comprise the national telephone network only have limited
capacity, and when they are full we get a busy signal. Some interesting questions are: What is
the probability of a busy signal? How does it depend on the volume of traffic? Can we reduce
this probability by strategies such as offering multiple routes for a call?
For further reading see [1].
Consider a single link with the capacity to carry C simultaneous calls. Suppose that new calls
arrive as a Poisson process of rate ν, that each call lasts for a duration which is exponential with
mean 1, and that all call durations are independent of each other and of the arrival process. If
a new call arrives when the link is already carrying C calls, then the new call is blocked. This
system is known as the “Erlang link”.
Question 1 Set up a continuous-time Markov model for the Erlang link. Calculate
the equilibrium probability that there are i calls in progress.
Define E(ν, C) to be the equilibrium probability that there are C calls in progress.
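For capacities as large as C = 600, evaluating powers and factorials directly risks overflow. The following is a minimal sketch that assumes the standard Erlang B recursion E(ν, 0) = 1, E(ν, c) = νE(ν, c−1) / (c + νE(ν, c−1)); this recursion is a well-known identity but is not stated in this project, and the function name erlang_b is hypothetical.

```python
def erlang_b(nu, C):
    """Blocking probability E(nu, C), computed with the Erlang B recursion."""
    b = 1.0  # E(nu, 0) = 1: a link with no capacity blocks every call
    for c in range(1, C + 1):
        b = nu * b / (c + nu * b)
    return b


if __name__ == "__main__":
    for nu in (500.0, 600.0, 700.0):
        print(nu, erlang_b(nu, 600))
```

Every intermediate value stays in [0, 1], so the recursion remains well behaved for large C and also accepts non-integer loads ν.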
We say that an arriving call “sees” the system in state s if the system is in state s just before
it arrives. The PASTA property (Poisson arrivals see time averages) says that the long-run
proportion of arrivals which see the system in state s is equal to the equilibrium probability
that the system is in state s.
Question 2 Show that the PASTA property holds for the Erlang link. Hint. One
approach is to consider the discrete-time Markov chain which records the state of the
system after each event, where an event is an attempted arrival or a departure, and to
Question 3 Write a program to simulate the Erlang link and to measure the blocking
probability. Compare the empirical blocking probability to that given by E(ν, C) for a
range of values of C up to 600 and an appropriate range of values of ν.
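One possible shape for the simulation in Question 3 is sketched below. It assumes that only the sequence of events matters for the blocking probability, so it simulates the jump chain: with n calls in progress, the next event is an arrival with probability ν / (ν + n) and a departure otherwise. The function name simulate_blocking and its parameters are hypothetical.

```python
import random


def simulate_blocking(nu, C, n_arrivals, seed=0):
    """Estimate the blocking probability of the Erlang link from n_arrivals attempted calls."""
    rng = random.Random(seed)
    n = 0  # calls currently in progress
    arrivals = 0
    blocked = 0
    while arrivals < n_arrivals:
        # Arrivals occur at rate nu, departures at total rate n (mean call duration 1).
        if rng.random() < nu / (nu + n):
            arrivals += 1
            if n == C:
                blocked += 1  # link is full: the call is lost
            else:
                n += 1
        else:
            n -= 1  # only reachable when n > 0
    return blocked / n_arrivals


if __name__ == "__main__":
    print(simulate_blocking(5.0, 5, 200000, seed=1))
```

The run starts from an empty link, so for large C some initial portion of the run may need to be discarded before measuring; that choice is not made here.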
2 Alternative Routing
Consider now a network of links. For simplicity, suppose that the network is a complete graph
on K nodes, i.e., there is a link between every pair of nodes {1, . . . , K}, and that each link has
capacity C. Suppose that for every pair of nodes (a, b), calls between a and b arise as a Poisson
process of rate ν. It might seem reasonable, in order to reduce the blocking probability, to offer
an alternative route if the direct link is full. Specifically, suppose that calls between a and b are
routed as follows:
1. If there is spare capacity on the direct link a ↔ b, route the call over that link.
2. Otherwise, pick a new node c uniformly at random from the other K − 2 nodes. If there
is spare capacity on a ↔ c and on c ↔ b, route the call over these two links.
3. Otherwise, the call is blocked.
We will call this the “Alternative Routing” system. One way (but not the only way) to obtain a
‘merit’ mark in this project is to prove that the Alternative Routing system satisfies the PASTA
property.
As you can see, it is possible in principle to set up a Markov process model for this system,
and thereby to calculate the equilibrium distribution; but the number of states is so large
that, even for moderate K, it is not computationally practical to do so. Instead, we can use
a famous approximation called the Erlang fixed point approximation, which is that blocking
occurs independently on different links. This leads to the formula
B = E(ν + 2νB(1 − B), C) (1)
where B is the probability that an incoming call cannot be routed on its chosen direct link.
Question 4 Give a careful intuitive explanation for (1). In what sense could ν +
2νB(1 − B) be called the “offered load” on a link?
Now pick C = 600 and choose ν such that (1) has multiple solutions.
Question 7 Give an intuitive explanation for why there are multiple solutions. It
is a standard result from Markov chain theory that this Markov process has a unique
equilibrium distribution; comment briefly on how this result relates to the existence of
multiple solutions.
Question 8 You should observe that there is one solution to (1) which is not reflected
in your simulations. By considering fixed points of the map B ← E(ν + 2νB(1 − B), C),
suggest why this is so.
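The map B ← E(ν + 2νB(1 − B), C) mentioned in Question 8 can be explored numerically. The sketch below simply iterates the map from a given starting value; the function names are hypothetical, and E is evaluated with the standard Erlang B recursion, which is an assumption not taken from this project. Which solution of (1) the iteration reaches, if any, depends on the starting value.

```python
def erlang_b(load, C):
    """Erlang blocking probability E(load, C) via the standard recursion (assumed here)."""
    b = 1.0
    for c in range(1, C + 1):
        b = load * b / (c + load * b)
    return b


def iterate_fixed_point(nu, C, b_start, n_iter=10000, tol=1e-12):
    """Iterate B <- E(nu + 2 nu B (1 - B), C) from b_start until successive values agree."""
    b = b_start
    for _ in range(n_iter):
        b_new = erlang_b(nu + 2.0 * nu * b * (1.0 - b), C)
        if abs(b_new - b) < tol:
            return b_new
        b = b_new
    return b  # may not have converged within n_iter iterations


if __name__ == "__main__":
    for b0 in (0.0, 0.25, 0.5, 0.75):
        print(b0, iterate_fixed_point(1.0, 5, b0))
```

Running the iteration from several starting values in [0, 1] and comparing the limits is one way to see how many distinct solutions it can find for a given ν and C.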
Question 9 Use the Erlang fixed-point approximation to find the probabilities that
an incoming call is blocked in the high-blocking regime and in the low-blocking regime.
How do these compare to a network without alternative routing, i.e., in a network in which
a call may only be routed on the direct link?
3 Trunk reservation
A telecoms operator would be alarmed at the situation you investigated in the last section, and
would seek to control the network so that it stays in the low-blocking state. One way to do this
is with a technique called “trunk reservation”. We modify the call admission procedure, so that
calls arising between a and b are routed as follows:
1. Consider the direct link a ↔ b. If this link has spare capacity, route the call over it.
2. Otherwise, pick a new node c uniformly at random from the other K − 2 nodes, and
consider the two links a ↔ c and c ↔ b. If on each of these links the number of calls in
progress is less than C − s, then route the call over these two links.
Question 10 Develop a fixed-point approximation for this system. Hint. First set
up a suitable Markov model for the number of calls in progress on a single link with two
classes of traffic.
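The single-link model in the hint can be sketched as a birth-death chain. The code below is an illustration under stated assumptions: directly routed calls and alternatively routed calls are treated as independent Poisson streams with rates nu_direct > 0 and nu_alt, the alternative stream is admitted only when fewer than C − s calls are in progress, and all names are hypothetical. How nu_alt should depend on the blocking probabilities is the substance of Question 10 and is not addressed here.

```python
import math


def trunk_reservation_link(nu_direct, nu_alt, C, s):
    """Equilibrium of one link with two traffic classes and trunk reservation parameter s.

    Returns (pi, block_direct, block_alt), where pi[n] is the equilibrium
    probability of n calls in progress.
    """
    log_w = [0.0]  # unnormalised log-weights, to avoid overflow for large C
    for n in range(1, C + 1):
        # Birth rate out of state n - 1: direct calls always, alternative calls
        # only if n - 1 < C - s. The death rate in state n is n.
        rate = nu_direct + (nu_alt if n - 1 < C - s else 0.0)
        log_w.append(log_w[-1] + math.log(rate) - math.log(n))
    shift = max(log_w)
    weights = [math.exp(v - shift) for v in log_w]
    total = sum(weights)
    pi = [v / total for v in weights]
    block_direct = pi[C]          # direct calls are refused only when the link is full
    block_alt = sum(pi[C - s:])   # alternative calls are refused in states C - s, ..., C
    return pi, block_direct, block_alt


if __name__ == "__main__":
    _, b_direct, b_alt = trunk_reservation_link(550.0, 50.0, 600, 10)
    print(b_direct, b_alt)
```

With s = 0 the two classes are treated identically and the link reduces to an Erlang link with arrival rate nu_direct + nu_alt.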
Question 11 Compare the blocking probability to those you found in Question 9. Has
trunk reservation improved matters? How large should the trunk reservation parameter
be?
References
[1] F. P. Kelly, Network Routing, Philosophical Transactions of the Royal Society series A, 1991.
http://www.statslab.cam.ac.uk/~frank/loss/
1 Introduction
The interstellar medium surrounding a hot star is ionized by the radiation from the star. In this
project we calculate the size of the ionized region, and gain some insight into how its structure
depends on the nature of the radiation from the star.
Assuming that there is a uniform, static, constant temperature gas surrounding a spherically
symmetric star provides a good approximation, which keeps the essentials of the situation
without allowing unnecessary distractions. Each gas element is assumed to be in ionization
equilibrium, so the ionization rate for each atom is balanced by recombinations. We assume
that the sole source of ionization is by absorption of radiation from the star giving a bound
electron enough energy to escape from the atom. Recombination occurs when a free electron is
captured by an ion with the creation of a photon. Since hydrogen is the most common element
in the Universe, its behaviour will dominate in most cases, so we consider only a pure hydrogen
interstellar medium.
\[ I_\nu \, d\nu = \frac{2\pi h}{c^2} \frac{\nu^3}{\exp(h\nu / kT_*) - 1} \, d\nu, \]
where h is Planck’s constant, c is the velocity of light, and k is Boltzmann’s constant (given
below). Thus for a star of radius R
\[ L_\nu = 4\pi R^2 \, \frac{2\pi h}{c^2} \frac{\nu^3}{\exp(h\nu / kT_*) - 1}. \]
Question 1 The sun has radius $R = 6.96 \times 10^8$ metres, and the total luminosity
$L = 3.90 \times 10^{26}$ W. Show, using the above equation, that its surface temperature is close
to 5800 K.
A 7 solar mass star has a surface temperature $T_* = 20{,}000$ K and a luminosity $L =
4.0 \times 10^{29}$ W. What is its radius? What is the radius of a 12 solar mass star which has a
surface temperature $T_* = 25{,}000$ K and a luminosity $L = 4.0 \times 10^{30}$ W?
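A quick numerical check of Question 1 can be sketched as follows. It assumes the standard result that integrating $L_\nu$ over all frequencies gives $L = 4\pi R^2 \sigma T_*^4$ with $\sigma = 2\pi^5 k^4 / (15 h^3 c^2)$; this closed form is not derived in this project, and the function names are hypothetical. The constants are those listed under "Useful constants".

```python
import math

H_PLANCK = 6.626e-34   # J s
C_LIGHT = 2.998e8      # m / s
K_BOLTZ = 1.381e-23    # J / K

# Stefan-Boltzmann constant from the frequency integral of the Planck spectrum
# (assumed closed form, not derived in the project text).
SIGMA_SB = 2.0 * math.pi ** 5 * K_BOLTZ ** 4 / (15.0 * H_PLANCK ** 3 * C_LIGHT ** 2)


def surface_temperature(luminosity, radius):
    """Surface temperature in K from L = 4 pi R^2 sigma T^4."""
    return (luminosity / (4.0 * math.pi * radius ** 2 * SIGMA_SB)) ** 0.25


def stellar_radius(luminosity, temperature):
    """Stellar radius in metres from L = 4 pi R^2 sigma T^4."""
    return math.sqrt(luminosity / (4.0 * math.pi * SIGMA_SB * temperature ** 4))


if __name__ == "__main__":
    print(surface_temperature(3.90e26, 6.96e8))
    print(stellar_radius(4.0e29, 2.0e4))
    print(stellar_radius(4.0e30, 2.5e4))
```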
where $n_{H^0}$ is the number density of neutral hydrogen atoms (i.e. number of neutral hydrogen
atoms per cubic metre). The coefficient $a_\nu$ depends only on the atomic species being considered,
and for neutral hydrogen
\[ a_\nu = a_{\nu_0} \left( \frac{\nu_0}{\nu} \right)^3 \ \text{for } \nu \ge \nu_0, \qquad a_\nu = 0 \ \text{for } \nu < \nu_0, \]
where $a_{\nu_0} = 6.3 \times 10^{-22}$ m$^2$. If absorption can occur at the gas element, it can also occur in all
the elements between the star and the one under consideration. As we go along a path dr the
radiation is attenuated, so at a given frequency ν,
\[ \frac{dI_\nu}{dr} = -n_{H^0} a_\nu I_\nu, \]
and so
\[ I_\nu(r) = I_\nu(R) e^{-\tau_\nu}, \]
where $\tau_\nu$ is defined by
\[ \frac{d\tau_\nu}{dr} = n_{H^0}(r) a_\nu \]
and τν (R) = 0. τν is referred to as the optical depth at the frequency ν. There is no redistri-
bution of the photons in frequency (they are effectively absorbed from the point of view of this
calculation), so we can determine τν from
\[ \frac{d\tau_{\nu_0}}{dr} = n_{H^0}(r) a_{\nu_0} \]
and
\[ \tau_\nu = \tau_{\nu_0} \left( \frac{\nu_0}{\nu} \right)^3. \]
This absorption of radiation and the ionization it causes must be balanced by recombination of
protons and electrons at the same rate per unit volume. This rate depends on the number den-
sities of the protons (np ) and the electrons (ne ), their relative velocity and a velocity-dependent
cross-section for the interaction which has to be integrated over the velocity distribution at
whatever temperature the gas is at. These velocity-dependent terms are combined into recom-
bination coefficients αB which are tabulated for various gas temperatures T :
A typical interstellar medium hydrogen number density is $n_H = 10^6$ m$^{-3}$, and a typical
temperature is $T = 10^4$ K.
Question 2 Write a program to solve the ionization equations to obtain the neutral
hydrogen and proton densities as a function of distance from the centre of the star, as-
suming that the interstellar gas has a constant temperature and density. Note that the
coefficients involved have large powers of 10, so for some compilers a rescaling of variables
may be desirable. Describe any transformations used. Are there any advantages to using
τν0 instead of r as the radial coordinate?
4.1 An approximation to r1
Equation (1) can be integrated over volume from r = 0 to r = ∞ by using the definition of τν
to replace dr and assuming that the recombination term is well approximated by $n_p = n_e = n_H$
for $r \le r_1$ and $n_p = n_e = 0$ for $r > r_1$. This $r_1$ is called the Strömgren radius.
where Q(H) is the total number of ionizing photons emitted by the star per second.
Calculate the values of Q(H) for the cases given above and compare the r1 determined
from this approximation with the values you have computed for T = 10,000 K.
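The photon rate Q(H) can be evaluated by numerical quadrature. The sketch below assumes the standard definition $Q(H) = \int_{\nu_0}^{\infty} L_\nu / (h\nu) \, d\nu$ with the black-body $L_\nu$ given earlier, and takes the hydrogen ionization threshold as $\nu_0 \approx 3.288 \times 10^{15}$ Hz (13.6 eV); neither the defining integral nor this value of $\nu_0$ appears in the text above, so both are assumptions. With $x = h\nu / kT_*$ the integral becomes $4\pi R^2 (2\pi / c^2) (kT_*/h)^3 \int_{x_0}^{\infty} x^2 / (e^x - 1) \, dx$, which is evaluated here with the trapezium rule on a truncated range.

```python
import math

H_PLANCK = 6.626e-34   # J s
C_LIGHT = 2.998e8      # m / s
K_BOLTZ = 1.381e-23    # J / K
NU_0 = 3.288e15        # Hz; assumed hydrogen ionization threshold (13.6 eV)


def ionizing_photon_rate(radius, t_star, n_steps=20000, x_span=60.0):
    """Ionizing photons per second from a black-body star of given radius (m) and T_* (K)."""
    x0 = H_PLANCK * NU_0 / (K_BOLTZ * t_star)
    x1 = x0 + x_span  # the integrand is negligible beyond this point
    dx = (x1 - x0) / n_steps

    def integrand(x):
        return x * x / math.expm1(x)

    # Composite trapezium rule on [x0, x1].
    total = 0.5 * (integrand(x0) + integrand(x1))
    for i in range(1, n_steps):
        total += integrand(x0 + i * dx)
    total *= dx

    prefactor = (4.0 * math.pi * radius ** 2 * 2.0 * math.pi / C_LIGHT ** 2
                 * (K_BOLTZ * t_star / H_PLANCK) ** 3)
    return prefactor * total


if __name__ == "__main__":
    # Illustrative inputs only: a radius of 1.9e9 m and T_* = 20,000 K.
    print(ionizing_photon_rate(1.9e9, 2.0e4))
```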
Useful constants:
velocity of light c = 2.998 × 10^8 m s^-1
Planck’s constant h = 6.626 × 10^-34 J s
Boltzmann’s constant k = 1.381 × 10^-23 J K^-1
References
[1] Osterbrock, D.E., 1989. Astrophysics of Gaseous Nebulae and Active Galactic Nuclei. University
Science Books: Mill Valley, CA (especially chapter 2).
1 Fluid Equations
Accretion discs are composed of fluid orbiting a central object. They evolve viscously so that
matter falls inwards while angular momentum drifts outwards. Accretion discs are found in
binary star systems, around forming stars and in active galactic nuclei. We use cylindrical
polar coordinates (R, φ, z) and assume axisymmetry for this project. When the central object
dominates the gravitational field the angular velocity of the matter in the disc is Keplerian so
that
\[ \Omega = \left( \frac{GM}{R^3} \right)^{1/2}, \qquad (1) \]
where M is the mass of the central object and G is the gravitational constant. The disc is made
up of annuli of matter lying between R and R + ∆R with mass 2πR∆RΣ, where Σ(R, t) is the
surface density (with dimensions $ML^{-2}$) of the disc at time t.
The equation describing conservation of mass is
\[ R \frac{\partial \Sigma}{\partial t} + \frac{\partial}{\partial R} (R \Sigma V_R) = 0, \qquad (2) \]
where VR is the radial velocity in the disc. Conservation of angular momentum gives us
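\[ R \frac{\partial (\Sigma R^2 \Omega)}{\partial t} + \frac{\partial}{\partial R} (R \Sigma V_R R^2 \Omega) = \frac{1}{2\pi} \frac{\partial \Gamma}{\partial R}, \qquad (3) \]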
where the viscous torque is
\[ \Gamma = 2\pi R \nu \Sigma R^2 \Omega', \qquad (4) \]
with $\Omega' = d\Omega / dR$, and $\nu(R, \Sigma)$ is the viscosity in the disc.
Question 1
Show that, for Ω independent of time,
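\[ R \frac{\partial \Sigma}{\partial t} = -\frac{\partial}{\partial R} \left[ \frac{1}{2\pi (R^2 \Omega)'} \frac{\partial \Gamma}{\partial R} \right]. \qquad (5) \]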
This is the basic equation that governs the evolution of the surface density of a Keplerian
accretion disc, which we shall assume for the rest of this project. The viscosity may be a
function of Σ, R and t and so this equation may be non-linear.
Show that
\[ V_R = -\frac{3}{\Sigma R^{1/2}} \frac{\partial}{\partial R} (\nu \Sigma R^{1/2}). \qquad (8) \]
In a steady state disc the accretion rate through the disc is constant. Find, analytically,
the steady state solution for νΣ from equation (6). Use the inner boundary condition
Σ = 0 at R = Rin where ν is finite at R = Rin . Plot νΣ in units of ṁ against R/Rin in
the range 1 < R/Rin < 100.
We can make the problem dimensionless by setting r = R/R0 , τ = t/t0 , σ(r, τ ) = Σ/Σ0 ,
and η(r, σ) = ν/ν0 . Find a condition for t0 such that equation (6) remains the same
for these dimensionless variables. Assume that the accretion rate is small enough that
M ≈ const.
Question 3
Use the substitution $X = r^{1/2}$ in equation 6 to derive
\[ \frac{\partial f}{\partial \tau} = \frac{\partial^2 g}{\partial X^2} \qquad (9) \]
where f and g are to be determined. We consider X to be discrete with values $X_i$ where
i = 1, 2, ..., 100 and $X_1 = \Delta X$ where $\Delta X$ is a constant. We let $X_{i+1} = X_i + \Delta X$ and time
be $\tau_n$ where n = 1, 2, ... with $\tau_{n+1} = \tau_n + \Delta\tau$. We have $\tau_1 = 0$ and we let $f_i^n = f(X_i, \tau_n)$
and $g_i^n = g(X_i, \tau_n)$.
Show that equation (9) can be represented by the difference equation
\[ f_i^{n+1} = f_i^n + \frac{\Delta\tau}{(\Delta X)^2} \left( g_{i+1}^n - 2 g_i^n + g_{i-1}^n \right). \qquad (10) \]
Question 4
Write a program to solve equation (9) taking η = 1 and with the boundary conditions
σ(rin , τ ) = 0 and σ(rout , τ ) = 0. Set rin = 0.0004 and rout = 4 and use 100 grid points
equally spaced in X between X = 0.02 and X = 2. For formal stability the timestep must
satisfy
\[ \Delta\tau \le \frac{1}{2} (\Delta X)^2 \frac{f}{g} \qquad (11) \]
at all points in the disc. However you may find that you need to use something smaller.
Evolve from an initial mass distribution
\[ \sigma(r, 0) = \exp\left( -\frac{(r^{1/2} - 1)^2}{0.001} \right). \qquad (12) \]
Plot the initial surface density, σ(r, 0), against r. Plot σ against r at times τ = 0.002,
0.008, 0.032, 0.128 and 0.512 on the same axes.
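The explicit update (10) can be sketched with array slicing. This is an illustration under a loudly stated assumption: equation (6) does not appear in the text above, so the sketch takes it to be the standard thin-disc diffusion equation in dimensionless form, $\partial\sigma/\partial\tau = (3/r) \, \partial/\partial r [ r^{1/2} \, \partial(\eta \sigma r^{1/2})/\partial r ]$, for which $f = X^3 \sigma$ and $g = \tfrac{3}{4} \eta \sigma X$. If your f and g differ, only the two marked lines change. The function name evolve_disc and the safety factor are hypothetical.

```python
import math

import numpy as np


def evolve_disc(tau_end, n_grid=100, x_max=2.0, eta=1.0, safety=0.5):
    """Evolve the initial profile (12) to time tau_end with the explicit scheme (10).

    Returns (r, sigma) on the grid X_i = i * dX, i = 1..n_grid, with r = X^2.
    """
    dX = x_max / n_grid
    X = dX * np.arange(1, n_grid + 1)            # X_1 = dX, ..., X_100 = x_max
    sigma0 = np.exp(-(X - 1.0) ** 2 / 0.001)     # equation (12) with r^(1/2) = X

    f = X ** 3 * sigma0                          # ASSUMED form of f
    f[0] = f[-1] = 0.0                           # sigma = 0 at r_in and r_out

    # Stability bound (11) with the assumed f and g is (2/3) dX^2 X^2 / eta;
    # it is tightest at the innermost grid point.
    dtau_max = safety * (2.0 / 3.0) * dX ** 2 * X[0] ** 2 / eta
    n_steps = max(1, math.ceil(tau_end / dtau_max))
    dtau = tau_end / n_steps

    for _ in range(n_steps):
        g = 0.75 * eta * f / X ** 2              # ASSUMED form of g = (3/4) eta sigma X
        # Equation (10) on interior points; boundary values of f stay at zero.
        f[1:-1] += dtau / dX ** 2 * (g[2:] - 2.0 * g[1:-1] + g[:-2])

    return X ** 2, f / X ** 3


if __name__ == "__main__":
    r, sigma = evolve_disc(0.002)
    print(r[np.argmax(sigma)], sigma.max())
```

Because the timestep is set by the innermost grid point, reaching τ = 0.512 takes several million steps with these settings; storing the profile at the requested output times during a single run avoids repeating the work.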
Question 6
The evolution of a particle in the disc is given by dR/dt = VR (R, t) where VR is given by
equation 8. Rewrite equation 8 in a form similar to equation 10 so that you can adapt the
code written for question 4 to plot radial velocity (in dimensionless units) as a function
of radius for the timesteps used in question 4 on the same figure, taking care with choice
of axis to show where this is positive and negative.
Question 7
Use the radial velocities found by your code to follow the evolution of particles’ orbits,
and plot that evolution up to τ = 1 for particles initially at r0 = 0.9, 0.95, 1.0, 1.05, 1.1.
Plot the maximum radius attained by a particle as well as the time it took to reach that
distance and, for those particles that do so, the time it takes to reach a boundary as a
function of its initial radius for the range r0 = 0.9 − 1.1.
Question 8
Use the results from question 7 to work out the range of initial radii r0 that have reached
the inner boundary by τ = 0.512 and hence, using equation 12, the fraction of the initial
mass that has reached the inner boundary by this time. Also do the corresponding calcu-
lation for the outer boundary. Compare these values with the fraction of the initial mass
that remains at that time that you derive using the surface density profile from question 4.
Use of the MATLAB built-in function gradient is prohibited, as neither its algorithm nor its
accuracy is documented.