
PHY485 - Microcomputing for Physical Sciences - with Python


Lecture Notes

By L. C. Moffat
Physics Department
Faculty of Science
University of Botswana

January 26th, 2025


Contents

List of Figures 3

List of Tables 4

1 INTRODUCTION TO COMPUTER PROGRAMMING & THE PYTHON LANGUAGE 5
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 What is Computer Programming? . . . . . . . . . . . . . . . . 5
1.1.2 Computer programming languages . . . . . . . . . . . . . . . 6
1.1.3 Converting programming languages into machine code . . . . 8
1.1.4 Integrated Development Environment (IDE) . . . . . . . . . . 9
1.1.5 A special note on programming errors . . . . . . . . . . . . . . 9
1.1.6 Algorithm design . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2 An introduction to the Python language . . . . . . . . . . . . . . . . . 12
1.2.1 Why use Python? . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2.2 What is Python? . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.2.3 Python program structure and syntax . . . . . . . . . . . . . . 13
1.2.4 Control Flow Structures . . . . . . . . . . . . . . . . . . . . . . 26
1.2.5 Functions and Modules . . . . . . . . . . . . . . . . . . . . . . 32
1.2.6 Data structures . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.2.7 Numerical computations with Numpy . . . . . . . . . . . . . . 41
1.2.8 Data visualisation . . . . . . . . . . . . . . . . . . . . . . . . . 42

1.2.9 File handling (reading/ writing files) . . . . . . . . . . . . . . . 50
1.3 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2 NUMERICAL METHODS 53
2.1 Iterative Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.1.1 Bisection method . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.1.2 Newton-Raphson (or Newton’s) method . . . . . . . . . . . . 58
2.1.3 Secant method . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.2 Solutions of Integral Eqns. . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.2.1 Trapezoidal rule . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.2.2 Simpson’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.3 Solutions of Diff. Equations . . . . . . . . . . . . . . . . . . . . . . . . 69

3 DATA REDUCTION AND ERROR ANALYSIS 76


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.2 Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.2.1 Empirical Distributions . . . . . . . . . . . . . . . . . . . . . . 77
3.2.2 Theoretical Distributions . . . . . . . . . . . . . . . . . . . . . 79
3.2.3 Goodness of fit test . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.3 Error Prop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.4 Mean & Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.5 Least-squares Fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

Moffat/PHY485 - Microcomputing for Physical Sciences notes 2


List of Figures

1.1 Computer programming languages. Depending on the level of abstraction, two
major divisions exist: low-level and high-level programming languages. Low-
level programming languages are the oldest and relate to a specific architecture
and hardware of a particular type of computer. High-level languages are not
restricted to a specific computer architecture; they are portable. . . . . . . . . . 7
1.2 The compilation process in the context of C++. . . . . . . . . . . . . . . . . 8
1.3 The IDLE environment for Windows 10 and above. . . . . . . . . . . . . . 10
1.4 Examples of bitwise operations. . . . . . . . . . . . . . . . . . . . . . . 24
1.5 Simple alternative selection control structure. . . . . . . . . . . . . . . . 27
1.6 Line plot with Matplotlib. . . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.7 Scatter plot with Matplotlib. . . . . . . . . . . . . . . . . . . . . . . . . . 45
1.8 Histogram plot with Matplotlib. . . . . . . . . . . . . . . . . . . . . . . . 46
1.9 Adding labels and titles in a Matplotlib plot. . . . . . . . . . . . . . . . . . 47
1.10 Adding legends in a Matplotlib plot. . . . . . . . . . . . . . . . . . . . . . 48
1.11 Adding a grid to a Matplotlib plot. . . . . . . . . . . . . . . . . . . . . . . 49

2.1 Illustration of the bisection method for finding the root of a function
$f(x)$. Notice that after each step (iteration $N$) the interval is halved, i.e.,
it becomes $(b - a)/2^N$. If the root $r$ is found within a tolerance $\epsilon$ (i.e., $|r_i - r_{i-1}| < \epsilon$),
the number of iterations $N$ can be determined from $(b - a)/2^N < \epsilon$. 54

2.2 Illustration of the Newton-Raphson method for finding an isolated
root of a function $f(x)$. $x_{n+1}$ is a better approximation than $x_n$ for the
root $r$ of $f(x)$. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.3 Illustration of the Secant method for finding an approximation to the
root of a function f (x). . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.4 Illustration of the trapezoidal rule for finding an approximate solution
to the definite integral $\int_a^b f(x)\,dx$. . . . . . . . . . . . . . . . . . . . . 64
2.5 Illustration of Simpson’s rule for finding an approximate solution to
the definite integral $\int_a^b f(x)\,dx$. . . . . . . . . . . . . . . . . . . . . . . 66

3.1 Histogram showing imaginary test marks for PHY485 students. . . . . 77
3.2 An example of the probability distribution functions for a discrete random
variable. Figure (a) shows the probability of any outcome occurring
when throwing a die. The corresponding cumulative distribution
is shown in (b). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.3 An example of a probability density function (PDF) f (x) showing the
heights of all women in the world in (a) and its cumulative distribu-
tion function (CDF) in (b). . . . . . . . . . . . . . . . . . . . . . . . . 81
3.4 Bell shape of a normal distribution. Note that 68% of the x values lie
between ±1 s.d. from the mean, 95% of the x values lie between ±2
s.d. from the mean and 99.7% of the x values lie between ±3 s.d. from
the mean. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.5 Observed and Expected absentees for five consecutive lectures for PHY485. 85
3.6 Extract of $\chi^2$ distribution table with the critical value shown for our
PHY485 example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.7 Principles of a $\chi^2$-test. This is for the example of absentee numbers
in five consecutive PHY485 lectures. The alternative hypothesis that
the observed and the expected distributions are independent is rejected
since the $\chi^2$ value is below the critical value. . . . . . . . . . . . . . . . 87
3.8 A schematic of a car uniformly accelerating from rest down a slope. . . 91
3.9 Least-squares fit of distance versus time-squared plot of a car uniformly
accelerating from rest down a slope. . . . . . . . . . . . . . . . . . . . 93



List of Tables

1.2 Python operator precedence and associativity . . . . . . . . . . . . . . . . . 25
1.3 Common Python modules in academic research and numerical
computing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

2.1 Comparison of the bisection, Newton-Raphson and secant methods in
approximating the root of the function $f(x) = e^{-x} - \sin(\pi x/2)$, $x \in [0.4, 0.5]$.
The tolerance used in all the methods is $\epsilon = 1 \times 10^{-8}$. . . . . 63
2.2 Numerical integration of some elementary functions using trapezoidal
and Simpson’s rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

3.1 Uncertainties of specific mathematical operations. Note that a and b are
constants. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

1 INTRODUCTION TO COMPUTER PROGRAMMING & THE PYTHON LANGUAGE

1.1 Introduction
1.1.1 What is Computer Programming?
Before you understand what computer programming is, you need to understand what
a computer program is. A computer program can be used for almost anything. It can
be used to
• help doctors cure diseases, e.g., in ultrasounds, X-rays, CT scans, etc.
• manufacture self-driving cars
• design robots that perform special tasks like defusing a bomb, or performing
non-invasive surgery
• make very fun games (e.g., simulation, puzzles, action, combat, etc.).
• create office applications like Word, Excel or PowerPoint.
• design websites (e.g., using JavaScript, HTML, CSS, Dreamweaver, Google Web
Designer, Webflow, PHP, SQL, XML, etc.).
• develop graphics for the web or for movie making, etc.


So then, what is a computer program? A computer program, or just a program, is a
sequence of instructions written using a computer programming language to perform a
specified task with a computer. A program can range from one line to millions of lines
of instructions. Computer program instructions are also called program source code and
computer programming is also called program coding. A collection of computer programs
and related data is referred to as computer software. With this background we can
now define what computer programming is:

Computer Programming is the process of writing or editing computer programs.


A person who does this is called a computer programmer or a software developer (the
latter generally deals with operating systems, drivers for hardware components, etc.);
such a person is sometimes also referred to as a coder.

1.1.2 Computer programming languages


The phrase “computer programming language” appears in the definition of a computer
program above. A computer programming language is a means of communication
between a programmer and a computer. There are hundreds of programming languages,
and they can be grouped into categories using various criteria, such as
level of abstraction, purpose or programming style. If we
consider level of abstraction only, we recognise two major divisions (see figure 1.1):
• Low-level programming languages - they are closer to machine language
than to human language and thus are machine dependent. Each CPU architecture
(e.g., x86-64 or ARM) has its own machine language, so code written for an
ARM-based CPU, for example, cannot be understood directly by an x86-64 CPU.
Low-level languages are typically
used to implement hardware specific tasks such as drivers (small programs that
control devices, e.g., a driver for a USB memory stick), operating systems, etc.

Examples:
(i) machine language - consists of a series of zeroes and ones. It can be executed
directly by the computer.
(ii) assembly language (also called assembly, ASM or symbolic machine code) -
unlike machine language, which consists entirely of numbers, assembly language
allows the programmer to use mnemonics and symbols, and it cannot
be directly executed by the computer. It requires an assembler to convert
it to machine code (a set of instructions written in zeroes and ones, i.e.,
binary, that a specific CPU can execute).

Figure 1.1: Computer programming languages. Depending on the level of abstraction, two major
divisions exist: low-level and high-level programming languages. Low-level programming languages
(machine language, written in 0's and 1's, and assembly language, written in mnemonics)
are the oldest and relate to a specific architecture and hardware of a particular type of computer.
High-level languages are not restricted to a specific computer architecture; they are portable.

• High-level programming languages - they are a class of computer languages
closer to human language. They are machine independent and their development
has brought portability to computer languages.

Examples: (i) BASIC (updated to Visual Basic by Microsoft) (ii) C (iii) C++ (an
extension of C) (iv) COBOL (v) Fortran (vi) Java (vii) PHP (viii) Python (ix) R
(x) MATLAB, etc.
In the PHY485 (Microcomputing for Physical Sciences) course, we will use the Python
programming language to complete basic tasks assigned in practical exercises and as-
signments.

1.1.3 Converting programming languages into machine code


Computers only understand machine code. Therefore, a program written in any high-level
language needs to be interpreted or translated into a language a computer
understands. There are two ways of doing this:
• through a compiler - the compilation process involves several distinct steps: pre-
processing, compilation and linking (see figure 1.2). The whole process consists of
transformation of source files into object files which are then linked with libraries
to produce an executable file - the final program that can be run by the operating
system.

[Flow diagram: Source file (.cpp) → Preprocessor → Preprocessed code (.i) → Compiler → Object file (.o or .obj) → Linker (together with libraries, .dll) → Executable (.exe)]

Figure 1.2: The compilation process in the context of C++.

Examples of languages that are compiled include BASIC, C/C++, C#, Objective-
C, Fortran, Java, COBOL, Julia, Ada, Pascal, etc.
• through an interpreter - an interpreter is a program that translates the instructions
of a program, one instruction at a time, into commands that the computer carries out
immediately, while the program is running. Examples of languages that are interpreted include
Python, R, MATLAB, Mathematica, PostScript, PHP, Perl, AWK, Ruby, Excel,
PowerShell, Unix Shell, etc.

1.1.4 Integrated Development Environment (IDE)


Generally, the process of generating an executable program involves the use of a
coding or programming environment called an integrated development environment
(IDE), which has built-in tools for editing, compiling, running and debugging
(checking a program for compilation and runtime errors, or simply “bugs”) programs.
There are various kinds of IDEs for every programming language. Dedicated
Python IDEs include PyCharm (proprietary), Spyder (freeware), IDLE (freeware),
Thonny (freeware), and Jupyter Notebook (freeware). Other powerful coding
environments that support Python development include Visual Studio Code (VS Code,
freeware), Sublime Text (proprietary), and Atom (freeware; its development was discontinued in December 2022).
Python can be run in two basic modes: script (or normal) mode and interactive mode.
In normal mode, you run completed programs (with the .py extension) in a Python
interpreter. Interactive mode provides a command-line shell with immediate feed-
back for each statement. An example of IDLE on Windows 10 and above is shown in
Figure 1.3.
While coding on a smartphone or tablet is possible, it is inefficient for complex tasks due
to the small screen and tiny keyboard, as well as the potential absence of some libraries.
However, these devices are adequate for small tasks, such as solving classroom problems
in PHY485. Several Python IDE apps are available on the Google Play Store; some
are excellent, while others are less so. Examples include Pydroid 3 and QPython 3L.
Android also offers code editors that support multiple languages, including Python,
such as Code Editor by Rhythm Software. All these Android applications were free as
of January 2025.

1.1.5 A special note on programming errors


Computer programs generally are not error-free, even for experienced programmers.
Different types of errors can occur at different stages of program development. Because
of this we divide them here into three main types:
(a) Syntax errors - These occur when the code violates the syntax rules of the
programming language. The interpreter/compiler detects these errors during the parsing phase
and halts execution. They might include incorrect indentation, missing colons
or parentheses, misspelled keywords, invalid use of operators, etc.


Figure 1.3: The IDLE environment for Windows 10 and above.

(b) Runtime errors (Exceptions) - these occur while the program is running, and
the interpreter/compiler has no way of knowing about them at the parsing stage,
e.g., division by zero, using an undefined variable, overflow (limited storage
space for generated data), underflow (number smaller than what the device can
handle), opening a file that does not exist, etc.
(c) Logical errors - these also appear only when the program runs, but no error is
reported; the program simply produces a wrong output. Examples include using wrong
input data, multiplying instead of dividing, adding instead of subtracting, etc.
These errors are harder to find in a program than syntax and runtime errors.
In general, the errors become more difficult to find and fix as you move down the
above list.
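
The three error types can be seen side by side in a short Python sketch. This is only an illustration: the names bad_source, mean_wrong and mean_right are hypothetical, and the built-in compile() function is used here simply so that the syntax error can be caught and reported without stopping the script.

```python
# Illustrative sketch of the three error types (all names are hypothetical).

# (a) Syntax error: detected while parsing, before any code runs.
bad_source = "if True print('hello')"   # the colon after True is missing
try:
    compile(bad_source, "<example>", "exec")
    syntax_message = "no error"
except SyntaxError as err:
    syntax_message = f"SyntaxError: {err.msg}"

# (b) Runtime error (exception): valid syntax, but it fails during execution.
try:
    result = 1 / 0
    runtime_message = "no error"
except ZeroDivisionError as err:
    runtime_message = f"ZeroDivisionError: {err}"


# (c) Logical error: runs without complaint but gives the wrong answer.
def mean_wrong(values):
    return sum(values) * len(values)    # bug: multiplies instead of dividing


def mean_right(values):
    return sum(values) / len(values)


print(syntax_message)
print(runtime_message)
print(mean_wrong([2, 4]), mean_right([2, 4]))
```

Notice that only the first two cases produce a message from Python; the logical error has to be found by checking the result against a value you already know.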

1.1.6 Algorithm design


Before writing a program you need to devise a step-by-step procedure which you will
use to solve the problem at hand. You write a finite set of well-defined
instructions called an algorithm. Algorithms are often presented as pseudocode - English-like
phrases used to describe the steps in an algorithm, that is, no particular programming language is used.

Algorithm 1 is an example of simple pseudocode to find the largest number from a
list of five numbers. This pseudocode can easily be converted to a program in Python or
any other programming language.


Algorithm 1 Pseudocode to find the largest number

    write "Please enter five numbers"
    read n1, n2, n3, n4, n5

    set max to n1

    if n2 > max then
        set max to n2
    end if

    if n3 > max then
        set max to n3
    end if

    if n4 > max then
        set max to n4
    end if

    if n5 > max then
        set max to n5
    end if

    write "The max is "
    write max
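
One possible Python translation of Algorithm 1 is sketched below. It is a minimal illustration rather than the only way to write it: the function name largest_of_five and the sample values are assumptions, and fixed sample values stand in for the keyboard input so that the snippet runs unattended.

```python
def largest_of_five(n1, n2, n3, n4, n5):
    """Return the largest of five numbers, following Algorithm 1 step by step."""
    largest = n1              # "set max to n1" (max is a built-in name, so avoid it)
    if n2 > largest:
        largest = n2
    if n3 > largest:
        largest = n3
    if n4 > largest:
        largest = n4
    if n5 > largest:
        largest = n5
    return largest


numbers = [3, 17, -4, 9, 12]  # sample values used instead of keyboard input
print("The max is", largest_of_five(*numbers))
```

In everyday code the built-in max() function does the same job in one call; the explicit comparisons are kept here only to mirror the pseudocode.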


1.2 An introduction to the Python language


1.2.1 Why use Python?
As mentioned earlier, Python will be the programming language used in this course.
It is one of the slower languages, it has relatively high memory usage, and its
dynamic typing (there is no need to declare the type of a variable; Python determines it
automatically at runtime) can occasionally lead to runtime errors. Even so, Python remains the most popular
programming language (e.g., IEEE Spectrum, 2024). Here’s why:
• Its syntax is clear, concise, and resembles plain English, making it relatively easy
to learn, even for those with little to no programming experience.
• Python is easy to maintain due to its concise syntax and object-oriented program-
ming features (more on this later).
• It has a vast collection of powerful libraries tailored for data science, including
NumPy, SciPy, Pandas, Matplotlib, Seaborn, and Scikit-learn.
• Python benefits from a large and supportive community of users and developers.
• It is a general-purpose language suitable for a wide range of applications beyond
data science, such as web development, automation, and scripting.
• It is open-source (meaning it is free to use and distribute) and cross-platform.

1.2.2 What is Python?


Python is a high-level programming language similar to C++, Java, and MATLAB. It is
an interpreted language, meaning instructions are executed line by line, sequentially,
unlike compiled languages like C++ and Java. Python is interactive, object-oriented, and
includes an extensive standard library. It is used in various fields, including data analysis,
artificial intelligence, machine learning, scripting, software development, web devel-
opment, robotics, game development, and 3D graphics. Major companies like NASA,
Google, YouTube, BitTorrent, Facebook, and Spotify use Python in their daily
operations. IEEE Spectrum ranked Python as the most popular programming language
in 2024 (C++, MATLAB, and Fortran ranked 4th, 24th, and 45th, respectively; refer
to IEEE Spectrum, 2024).
Unlike MATLAB, where the IDE is integral to the software package, Python’s default
IDE (IDLE) is a standalone application. Several other popular IDEs, such as Spyder,
PyCharm, Thonny, and Jupyter Notebook, are available and often preferred. Some


IDEs, like PyCharm, bundle Python with their installation. In Linux environments,
users can use the Python shell with any text editor, such as vi, Emacs, or gedit. For
scientific computing, additional packages like NumPy, SciPy, Matplotlib and Pandas
are commonly used.
In Python, everything (functions, modules, strings, and files) is an object. All objects
have associated methods, accessed using the syntax
object.methodname(parameters). For example, fruits.append("apple") adds "apple"
to the list fruits, where fruits might be defined as fruits = ['mango', 'orange',
'banana', 'pear', 'peach'].
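
The object.methodname(parameters) pattern can be tried out directly. The short sketch below uses the fruits list from the text; the variable greeting is an extra, hypothetical example added to show that strings have methods too.

```python
# Calling methods on built-in objects with the object.method(arguments) syntax.
fruits = ['mango', 'orange', 'banana', 'pear', 'peach']
fruits.append("apple")        # list method: add an item to the end of the list
print(fruits)

greeting = "hello"            # hypothetical example string
print(greeting.upper())       # string method: returns an upper-case copy

# Even functions are objects with a type of their own.
print(type(fruits), type(greeting), type(len))
```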
As mentioned, Python is an object-oriented programming language. OOP is a program-
ming style based on self-contained units called objects, which have unique attributes
and behaviors. Consider an everyday example, such as a bicycle. It has attributes (e.g.,
size, color) that represent data and behaviors (e.g., pedaling, braking) that represent
functions. OOP enables programmers to organize code in a more modular, reusable,
and manageable way by focusing on objects with specific characteristics and actions.
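
The bicycle example can be sketched as a small Python class. This is only an illustration of attributes and behaviours; the class name Bicycle, its attributes (size, color, speed) and its methods (pedal, brake) are assumptions made for this sketch.

```python
class Bicycle:
    """A toy model of a bicycle: attributes hold data, methods are behaviours."""

    def __init__(self, size, color):
        self.size = size      # attribute: wheel size in inches
        self.color = color    # attribute: frame colour
        self.speed = 0.0      # attribute: current speed in km/h

    def pedal(self, increase):
        """Behaviour: pedalling raises the speed by the given amount."""
        self.speed += increase

    def brake(self):
        """Behaviour: braking brings the bicycle to a stop."""
        self.speed = 0.0


bike = Bicycle(size=26, color="red")
bike.pedal(5.0)
bike.pedal(2.5)
speed_before_braking = bike.speed
bike.brake()
print(bike.color, speed_before_braking, bike.speed)
```

Each object created from the class carries its own data, so a second Bicycle could have a different colour and speed without affecting the first.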

1.2.3 Python program structure and syntax


Structure of a Python program
Consider the following simple Python program:
Program 1.1: Hello World Program.
    # A Hello World program

    print('Hello world!')

That’s it! This is a complete, working Python program. We will use it as an entry
point into the basic elements of a Python program. Let us explain it line-by-line:
1. The # symbol indicates that everything following it on the same line is a comment
and is ignored by the interpreter. To create a multi-line comment, you can either
begin each line with # (since Python doesn’t have explicit multi-line comment
syntax) or use a multi-line string literal (enclosed in triple quotes), for example:
"""This is a comment""". This works because Python ignores string literals
that are not assigned to a variable. Comments are crucial for explaining what
your code does, so use them liberally.
2. print('Hello world!'): This is the core of the program. print() is a function.
Functions are reusable blocks of code that perform specific tasks. The print()
function’s job is to display output to the console (your screen). 'Hello world!'
is a string literal. A string is a sequence of characters (letters, numbers, symbols)
enclosed in quotation marks (either single ' or double "). In this case, it’s the text
we want to display.
There are two main ways to run the Python program described above:
• Save the code to a file with a name ending in .py. Then, open a terminal (Command
Prompt on Windows or a Linux shell), type python filename.py and press Enter.
The output Hello world! will be displayed on your screen.
Note that you can also run the saved .py file directly from your Python IDE if
it provides that functionality.
• Use an interactive interpreter (available in many Python IDEs or directly via a
Python shell). Open a Python terminal, type print('Hello, world!') at the
>>> prompt, and press Enter. The output Hello, world! will be displayed
immediately below the command.

Python syntax and indentation


(a) Reserved words and identifiers for Python
Keywords
In Python there are reserved words (also called keywords) that are reserved for
use by the interpreter and as such these words cannot be used as variable names.
If you use them in a program, your IDE will change their colour to make them
visible. As of Python 3.12 there are 35 keywords in Python and these are:
False await else import pass
None break except in raise
True class finally is return
and continue for lambda try
as def from nonlocal while
assert del global not with
async elif if or yield


Note that all these keywords are in lowercase, except False, None and True,
which are capitalised.
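
The keyword list does not have to be memorised, because Python's standard keyword module can report it. The sketch below prints the list for whichever Python version is running; the exact count may differ slightly between versions.

```python
import keyword

# keyword.kwlist holds the reserved words of the running Python version.
print(keyword.kwlist)
print("Number of keywords:", len(keyword.kwlist))

# keyword.iskeyword() checks a single word; note that it is case-sensitive.
print(keyword.iskeyword("lambda"))
print(keyword.iskeyword("Lambda"))
```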
Identifiers
In Python, the name given to a variable, function, class, module, or other object
is called an identifier. An identifier should be purposeful, clearly indicating the
entity’s function. It should also be descriptive; for example, n might be a poor
choice for a variable representing the number of students taking PHY485. A
name like num_students would be much better. There are several rules to follow
when naming identifiers:
• they cannot be Python keywords.
• they can only consist of letters (A-Z, a-z), numbers (0-9), and the under-
score character (_). They cannot contain any other symbols or whitespace.
• an identifier must begin with a letter or an underscore; it cannot start with
a number.
Note that there are specific conventions associated with underscores:
– A single leading underscore (_variable) indicates that a variable is in-
tended for internal use within a module or class.
– A double leading underscore (__variable) makes it more difficult to
override a variable in a subclass (due to name mangling).
– A single trailing underscore (variable_) is used to avoid naming conflicts
with Python keywords (e.g., import_ to avoid conflict with the import
keyword).
– A single underscore by itself (_) has several meanings in Python. It can
be used (i) to store the result of the last executed expression in a Python
shell, (ii) to ignore a variable or list of variables when unpacking a list or
a tuple, (iii) as a counter variable in loops especially when the counter
is not needed in the future and (iv) as digit separator in numbers for
better understanding (e.g., 100_000_000 for 100 million).
Note that Python lists and tuples are ordered collections of values indexed
by integers. The difference between the two is that lists allow
insertion, deletion, and substitution of elements, whereas tuples do not. Tuples are
like read-only sequences and are written with parentheses. Lists, on
the other hand, are written with square brackets. Both lists and tuples
can accommodate elements of various data types, such as integers, floats,
strings, and even other lists or tuples.
• Python is case-sensitive. Therefore, myvariable and myVariable are treated
as two distinct identifiers.
Examples of acceptable and unacceptable identifiers:
acceptable          unacceptable
calculate_mean      calculate mean
readWindSpeed       return
cloud9              3birds
baby                digit-1
TRIP                $sum
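
The naming rules can be checked programmatically. The sketch below is a minimal illustration: the helper is_valid_identifier is a hypothetical name, built from the string method str.isidentifier() and keyword.iskeyword(). Note that str.isidentifier() also accepts non-English letters, so it is slightly more permissive than the letters listed above.

```python
import keyword


def is_valid_identifier(name):
    """True if name follows the identifier rules and is not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = ["calculate_mean", "readWindSpeed", "cloud9", "baby", "TRIP",
              "calculate mean", "return", "3birds", "digit-1", "$sum"]

for name in candidates:
    verdict = "acceptable" if is_valid_identifier(name) else "unacceptable"
    print(f"{name!r:18} {verdict}")
```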
(b) Indentation
Python relies on indentation (whitespace at the beginning of a line) to define
the structure of program blocks. Unlike other programming languages that use
curly braces {} or keywords to delimit blocks, Python strictly enforces indenta-
tion for readability and structure. The following code segment will produce an
indentation error if executed:
Program 1.2: Bad indentation.
    if True:
    print('I am learning indentation.')

The following code segment, on the other hand, will not produce an error since
correct indentation has been used.
Program 1.3: Good indentation.
    if True:
        print('I am learning indentation.')

The standard practice is to use four spaces per indentation level. Most editors
automatically insert spaces when the Tab key is pressed after the colon symbol.
Consistent indentation is critical; mixing tabs and spaces will result in an error.
Indentation is required in constructs like:
• Conditionals (if, elif, else).


• Loops (for, while).


• Function definitions (def).
• Class definitions (class).
• Context managers (with statement).
Here is an example with multiple indentation levels:
Program 1.4: Multiple indentation.
    for i in range(3):  # range(3) yields 0, 1, 2
        print("Outer loop", i)
        for j in range(2):
            print("Inner loop", j)

Indentation errors are a frequent source of frustration for Python programmers.


Fortunately, most modern IDEs designed for Python development provide fea-
tures to assist with indentation. These features typically include automatic inden-
tation, as mentioned above, where the IDE automatically indents the next line
when you press Enter after a line that starts a code block. This helps to ensure
consistent indentation and minimize the risk of introducing errors.
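
To see how Python reports an indentation problem without crashing a script, the two versions of the if block can be passed to the built-in compile() function as text. This is a sketch for illustration only; the names bad_block, good_block and check are hypothetical, and the exact wording of the error message depends on the Python version.

```python
# The same two lines of code, once without and once with indentation.
bad_block = "if True:\nprint('I am learning indentation.')\n"
good_block = "if True:\n    print('I am learning indentation.')\n"


def check(source):
    """Try to parse the source text and report what happened."""
    try:
        compile(source, "<example>", "exec")
        return "OK"
    except IndentationError as err:
        return f"IndentationError: {err.msg}"


print(check(bad_block))
print(check(good_block))
```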

Input and Output in Python


• Printing output
In Python, we use the print() function to display output on the screen. This is
useful for showing results, debugging, and providing user feedback. For example,
print("Hello, world!") will output Hello, world! to the screen. It is
possible to print multiple values by separating them with commas; print() then
places a single space between the values. For example, print("Name:", name, "Age:", age)
will output Name: Thabo Age: 24 if name and age have already been assigned the
values "Thabo" and 24, respectively.
Sometimes the output needs to be formatted before it is displayed on the screen.
Two kinds of string are useful here: formatted string literals, commonly called F-strings, and raw
strings, which are prefixed with the letter "r" or "R".
– F-strings - they allow embedding expressions within string literals using
{}. The following code segments, with the output highlighted below each,
demonstrate their usage.


Program 1.5: Basic syntax for F-strings.


name = "Alice"
age = 25
print(f"My name is {name} and I am {age} years old.")

My name is Alice and I am 25 years old.

Program 1.6: Expression evaluation.


print(f"2 + 3 = {2 + 3}")

2 + 3 = 5

Program 1.7: Number formatting.


pi = 3.14159
print(f"Pi rounded to 2 decimal places: {pi:.2f}")

Pi rounded to 2 decimal places: 3.14

Program 1.8: Padding & alignment.


value = 42
print(f"Left: {value:<5} | Center: {value:^5} | Right: {value:>5}")

Left: 42    | Center:  42   | Right:    42


Program 1.9: Including Braces {}


print(f"Curly brackets: {{ and }}")

Curly brackets: { and }

– Raw strings - they are useful when you want to include backslashes (\) in
your string without them being interpreted as escape characters. They can
be used in regular expressions, file paths and string literals. For instance,
in Matplotlib, raw strings (r"") are commonly used to format text using
LaTeX-style math expressions. LaTeX requires \ (e.g., \alpha for α), which
might be misinterpreted by Python without raw strings. Program 1.10 is a
code segment that shows how raw strings can be used in Matplotlib.
Program 1.10: Raw strings example.
# assumes "import matplotlib.pyplot as plt" and that x and y are already defined
plt.plot(x, y, label=r"$y = \sin(x)$")

r"$y = \sin(x)$" ensures \ is correctly passed to LaTeX.
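
The difference between an ordinary string and a raw string can be checked without Matplotlib. The sketch below, with variable names chosen for illustration, compares the two forms and then uses a raw string as a regular-expression pattern from the standard re module.

```python
import re

escaped = "a\tb"   # \t is interpreted as a single tab character
raw = r"a\tb"      # the backslash and the letter t are kept as two characters

print(len(escaped), len(raw))

# Raw strings keep regular-expression patterns readable:
# \d+ matches one or more digits and \. matches a literal decimal point.
pattern = r"\d+\.\d+"
match = re.search(pattern, "g = 9.81 m/s^2")
value = match.group()
print(value)
```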


• Taking user input
Python allows user interaction through the input() function, which reads data
from the user as a string. For example, the following code segment shows how
to receive user input.
Program 1.11: How to receive user input.
name = input("Enter your name: ")
print(f"Hello, {name}!")

The example run of this is as follows:

Enter your name: Pako


Hello, Pako!

Note that "Enter your name: " will be displayed first on the screen when you
run it. The second line will be printed after you type your name.


By default, input() returns a string. This means that if you want a numerical
input such as an integer or float, you will have to convert the received string
into that data type. For example, to obtain an integer one would write
age = int(input("Enter your age: ")), and for a float
height = float(input("Enter your height in meters: ")). Several values can also
be entered on a single line. The string method split() breaks the input at
whitespace, i.e., x, y = input("Enter two numbers: ").split(). You can then
convert the resulting strings to the desired data type. Program 1.12 shows
different ways you can deal with multiple inputs from a user:
Program 1.12: How to receive multiple user input.
num1, num2, num3, num4 = input("Enter four numbers: ").split()

# convert each number to an integer
num1 = int(num1)
num2 = int(num2)
num3 = int(num3)
num4 = int(num4)

# ALTERNATIVELY, you can combine the above instructions into one line
# using the function map()
num1, num2, num3, num4 = map(int, input("Enter four numbers: ").split())

print("The numbers entered are:", num1, num2, num3, num4)

# you can also create an integer list (array) out of these numbers using
# the function list()
numbers = list(map(int, input("Enter four numbers: ").split()))

print(numbers)
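
Converting user input can fail: int("four") raises a ValueError. The sketch below shows one possible way to guard the conversion. To keep it runnable without a keyboard, the helper functions (hypothetical names parse_numbers and parse_numbers_or_none) take the text as an argument; in a real program the text would come from input().

```python
def parse_numbers(text):
    """Convert a whitespace-separated string such as '3 4.5 6' into floats."""
    return [float(token) for token in text.split()]


def parse_numbers_or_none(text):
    """Return the parsed numbers, or None if any entry is not a number."""
    try:
        return parse_numbers(text)
    except ValueError:
        return None


print(parse_numbers("3 4.5 6"))
print(parse_numbers_or_none("3 four 6"))
```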

Data types
A data type refers to the type of data that a variable can hold. In Python, there is
no need to explicitly declare the variable’s type before using it. Python automatically
determines the data type during runtime or upon assignment. This differs from some
languages like C++ and Fortran where you have to explicitly declare the type of a
variable before using it.
Python provides a diverse range of built-in data types, broadly classified into five
major groups (refer to Table 1.1). These data types serve as the fundamental build-
ing blocks for constructing intricate data structures and executing various operations
within Python programs. Data types can also be categorized based on their properties,
such as numeric and non-numeric types. Numeric types (int, float, complex, and bool)
handle numerical values, while non-numeric types encompass strings, sequences (lists
and tuples), mappings (dictionaries), and sets. Another key property distinguishes
between mutable and immutable data types. Mutable data types allow for modification
of their values after creation (e.g., lists, dictionaries, sets), whereas immutable
data types remain unchanged once created (e.g., integers, floats, strings, tuples,
frozensets).

Table 1.1: Standard types built into the Python interpreter.

Type        Type Name    Description                          Example

None        type(None)   Represents the absence of a value    return None if b == 0 else a / b
Numbers     int          Integer                              ..., -2, -1, 0, 1, 2, ...
            float        Floating point                       3.04, 4.015, etc. (accurate to about 15 significant digits)
            complex      Complex number                       1+2j, 3-1j
            bool         Boolean (True or False)              x or y, not x
Sequences   str          Character string                     'Hello World', "2.31", '''dogs''', """PHY485"""
            list         List                                 a = ['apple', 'cherry', '2.5', True]
            tuple        Tuple                                b = (1, 3, 5, 7)
            range        Range of integers created by range   list(range(10))
Mapping     dict         Dictionary                           my_dict = {"name": "Gorata", "age": 22, "city": "Gaborone"}
Sets        set          Mutable set                          my_set1 = {1, 2, 3, 3, 2}
            frozenset    Immutable set                        my_set2 = frozenset({1, 2, 3})
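
The two properties discussed above, automatic type determination and mutability, can be checked directly. The following is a minimal sketch; the variable names are illustrative only.

```python
# The type follows the value that is assigned; no declaration is needed.
x = 5
first_type = type(x).__name__    # int
x = 2.5
second_type = type(x).__name__   # float

# Lists are mutable: an element can be replaced in place.
masses = [1.0, 2.0, 3.0]
masses[0] = 1.5

# Tuples are immutable: the same operation raises a TypeError.
position = (0.0, 1.0)
try:
    position[0] = 5.0
    tuple_changed = True
except TypeError:
    tuple_changed = False

# A set keeps only one copy of each value.
my_set1 = {1, 2, 3, 3, 2}

print(first_type, second_type, masses, tuple_changed, my_set1)
```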

Operators
An operator is a special symbol that tells a compiler or interpreter to perform a specific
operation on one or more operands to produce a final result. There are several types of
built-in operators in Python (see, Python Software Foundation, n.d.; TutorialsPoint,
2025), and these are:


• Arithmetic operators

+ addition
− subtraction
* multiplication
/ division
% modulus or remainder
// floor division
** exponent

Examples: (let a = 5, and b = 15)

a + b gives 20
b / a gives 3.0 (the / operator always returns a float)
b % a gives 0
a // 3 gives 1
a ** 2 gives 25
• Assignment operators

= assignment
+= addition assignment (note that a += b =⇒ a = a + b)
−= subtraction assignment
*= multiplication assignment
/= division assignment
%= modulus (remainder) assignment
//= floor division assignment e.g., a //= 6
**= exponent assignment e.g., a **= 4

• Comparison Operators

== equal to. (e.g., a == b)


!= not equal to. (e.g., a != b)
< less than.
> greater than.
<= less than or equal to. (e.g., a <= b)
>= greater than or equal to.

• Logical Operators


and logical AND. (e.g., a and b)


or logical OR. (e.g., a or b)
not logical NOT. (e.g., not (a > b))

• Bitwise Operators

Bitwise operators modify variables considering the bit patterns that represent
the values they store.

& bitwise AND.


| bitwise inclusive OR.
^ bitwise exclusive OR, (XOR).
~ unary complement (bit inversion), (NOT).
>> shifts bits right, (SHR). (e.g., 10 >> 2 equals 2)
<< shifts bits left, (SHL). (e.g., -1 << 3 equals −8)

Examples of how bitwise operators modify variables are shown in figure 1.4.


• Membership operators

These test for membership in a sequence, such as strings, lists, or tuples.

in      True if it finds a variable in a sequence, False otherwise (e.g., a in b)
not in  True if it does not find a variable in a sequence, False otherwise (e.g., a not in b)

• Identity Operators
These compare the memory locations of two objects.

is      True if both variables are the same object, False otherwise (e.g., a is b)
is not  True if both variables are not the same object, False otherwise (e.g., a is not b)
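
The operator families listed above can be tried out in a few lines. The sketch below reuses a = 5 and b = 15 from the arithmetic examples; the remaining names are illustrative.

```python
a = 5
b = 15

# Arithmetic operators
arithmetic = {
    "a + b": a + b,
    "b / a": b / a,     # true division returns a float
    "b % a": b % a,
    "a // 3": a // 3,
    "a ** 2": a ** 2,
}

# Assignment operator: a += b is shorthand for a = a + b
a += b

# Comparison and logical operators
comparison = a <= b                    # a is now 20
in_window = (a > 10) and not (b > 100)

# Membership operator
fruits = ["apple", "cherry"]
has_apple = "apple" in fruits

# Identity versus equality: p and q hold equal values but are separate objects
p = [1, 2]
q = [1, 2]
r = p
same_value = p == q
same_object = p is q
alias = p is r

print(arithmetic, a, comparison, in_window, has_apple, same_value, same_object, alias)
```

The last three lines illustrate why == (equal values) and is (same object in memory) should not be used interchangeably.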


Panel         Operand bit pattern(s)    Result bits   Value   Note
(a) 10 & 2    00001010 & 00000010       00000010      2
(b) 10 | 2    00001010 | 00000010       00001010      10
(c) 10 ^ 2    00001010 ^ 00000010       00001000      8       TRUE only if corresponding bits are opposite.
(d) ~10       ~00001010                 11110101      -11     Equals a bit inverter, NOT.
(e) 10 >> 2   00001010                  00000010      2       Vacated bit positions filled with zeros.
(f) 10 << 2   00001010                  00101000      40      Vacated bit positions filled with zeros.
(g) -1 << 3   11111111                  11111000      -8      Vacated bit positions filled with zeros.
(h) -1 >> 3   11111111                  11111111      -1      Vacated bit positions filled with ones!

Figure 1.4: Examples showing how bitwise operators modify variables using their bit patterns,
considering an 8-bit representation. Note that (10 >> 2) means shift 10 by two bits to the right. The
other important point concerns the shifting of negative numbers, which are represented in two's
complement rather than one's complement (this is beyond the scope of this course). For non-negative
numbers, the smallest value has the smallest bit pattern, i.e., 00000000 = 0. Negative numbers always
start with a 1, and the negative number closest to zero has the largest bit pattern, e.g.,
11111111 = −1, 11111110 = −2, 11111101 = −3, 11111100 = −4, etc.
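
The bit patterns in figure 1.4 can be reproduced in Python. The sketch below uses a small helper, given the hypothetical name bits8 here, that masks a number to its lowest 8 bits before formatting it in binary. Python integers are not limited to 8 bits, so the mask is only a display device.

```python
def bits8(n):
    """Return the 8-bit two's-complement pattern of n as a string."""
    return format(n & 0xFF, "08b")


examples = [
    ("10 & 2", 10 & 2),
    ("10 | 2", 10 | 2),
    ("10 ^ 2", 10 ^ 2),
    ("~10", ~10),
    ("10 >> 2", 10 >> 2),
    ("10 << 2", 10 << 2),
    ("-1 << 3", -1 << 3),
    ("-1 >> 3", -1 >> 3),
]

for label, value in examples:
    print(f"{label:8} = {value:4d}  {bits8(value)}")
```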


Order of precedence and associativity for Python operators

Recall that in arithmetic certain operations are performed before others, for exam-
ple, in an expression containing multiplication and addition, multiplication will take
precedence. The same is true for Python operators.
Table 1.2 summarises the order of precedence and associativity for Python operators.
Operators with the highest precedence appear at the top of the table while those with
the lowest precedence appear at the bottom. Examples are also included.

Table 1.2: Python operator precedence and associativity


Operator                 Description                                        Example                                                  Associativity
()                       grouping                                           a = (2 + 3)/5, Output: 1.0                               left to right
**                       exponent                                           a = 2 ** 3 ** 2, Output: 512                             right to left
*, /, //, %              multiplication, division, floor division, modulus  a = 12 * 2/4 // 4 % 3, Output: 1.0                       left to right
+, −                     addition and subtraction                           a = 6 - 3 + 2, Output: 5                                 left to right
<<, >>                   bitwise shift left and bitwise shift right         a = 2 << 4, Output: 32                                   left to right
&                        bitwise AND                                        a = 7 & 10, Output: 2                                    left to right
^                        bitwise exclusive OR (XOR)                         a = 2 ^ 5, Output: 7                                     left to right
|                        bitwise inclusive OR                               a = 2 | 6, Output: 6                                     left to right
==, !=, >, <, >=, <=     comparison operators                               a = 2 == 6, Output: False                                left to right
is, is not, in, not in   identity and membership operators (same            a = 3 in [2,3,5], Output: True                           left to right
                         precedence level as the comparison operators)
not                      logical NOT                                        is_dark = True, a = not is_dark, Output: False           right to left
and                      logical AND                                        a = 2, b = 4, c = (a > 3) and (b < 6), Output: False     left to right
or                       logical OR                                         a = 2, b = 4, c = (a > 3) or (b < 6), Output: True       left to right
=                        assignment operator                                a = 3                                                    right to left

In general, operators in Python are left-associative, meaning that operators of equal
precedence are applied from left to right, as can be seen in table 1.2. For example,
6 − 3 + 2 is evaluated as (6 − 3) + 2, giving 5. The exponent operator is
right-associative: 2 ** 3 ** 2 is evaluated as 2 ** (3 ** 2), giving 512 rather than 64.
In a chained assignment such as a=b=5, the value 5 is evaluated once and then assigned
to both a and b. Precedence, in contrast, decides between different operators: in
2 × 3 + 4 the multiplication is performed first, followed by the addition, giving a
result of 10. Note that the order of precedence can be overridden with the inclusion
of parentheses, as in 2 × (3 + 4) if 14 is the intended result. More examples are
shown in table 1.2.
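
A quick way to build confidence in the precedence rules is to let Python evaluate a few expressions with and without parentheses. The sketch below collects the results in a dictionary named checks (an illustrative name).

```python
checks = {
    "2 * 3 + 4": 2 * 3 + 4,            # multiplication before addition
    "2 * (3 + 4)": 2 * (3 + 4),        # parentheses override precedence
    "6 - 3 + 2": 6 - 3 + 2,            # left to right: (6 - 3) + 2
    "2 ** 3 ** 2": 2 ** 3 ** 2,        # right to left: 2 ** (3 ** 2)
    "(2 ** 3) ** 2": (2 ** 3) ** 2,
    "-2 ** 2": -2 ** 2,                # exponent binds tighter than unary minus
    "not 1 == 2": not 1 == 2,          # comparison before logical NOT
}

for expression, result in checks.items():
    print(f"{expression:15} -> {result}")
```

When in doubt, adding parentheses costs nothing and makes the intended order explicit to the reader.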


1.2.4 Control Flow Structures


The execution of a program follows logically defined rules. These rules (or conditions)
determine what needs to be done, how it should be done, and what to do with the
results once the process is complete.
Program execution is commonly referred to as a run. That is, when a program is
executing, we say it is running. Most programs follow a sequential flow, meaning they
proceed to the next step only after the current instruction has been executed.
Controlling the flow of execution in a program is essential. This is achieved using
control flow structures (for more information, see Colestock, 2017; McKinney, 2018;
Sweigart, 2019), which fall into three main categories:
(a) sequence
(b) selection, and
(c) repetition
The sequence is the one that is explained above, that is, program execution is done line
after line from top to bottom. The general structure of a sequential control structure
looks like the following:
statement 1
statement 2
...
statement n

So each statement above will be executed after the preceding one, i.e., execution will
start with statement 1, followed by statement 2 and so on.
In the selection control structure program flow is directed by choice, that is, execution of
the next instruction follows from the selection made. Examples include the following:
1. Simple-alternative decision format
Consider the following examples
(a) Two alternatives
Program 1.13: Two alternatives.
if labtime_use == 0:
    print("I like spending time on social media.")
else:
    print("My Python programming skills are important.")

Figure 1.5: Simple alternative selection control structure. [Flowcharts: a Test that branches to Action 1 or Action 2 (two alternatives), and a Test that leads to a single Action (single dependent).]

(b) Single dependent


Program 1.14: Single dependent.
if labtime_use != 0:
    print("I'll do well in PHY485.")
3

Any of the statements in the two examples can be compound; that is, they
can contain a sequence of statements. The statements are indented to indi-
cate that they belong to the "if" or "else" block. This means that all indented
lines of code following the colon (:) belong to the same block of code and
will be executed together.
2. Multiple-alternative decision format
The two alternatives can be extended by following the if with one or more else if
branches (written elif in Python), so that the number of choices expands. This is
commonly referred to as a nested if. The resulting if...elif...else chain is what can
be termed the multiple-alternative decision format: a series of conditions is tested
in order and only the block belonging to the first true condition is executed, as in
the awarding of grades after an examination.
(a) The if...else if...else (nested if ) statement
The following gives the general structure of the nested if statement
if condition_1:
    statement_1
elif condition_2:
    statement_2
...
elif condition_n:
    statement_n
else:
    statement_n+1

Example program 1.15 illustrates how you can use an if...else if...else statement
in a Python program. The program reports whether a number is positive, negative,
or zero:


Program 1.15: Implementation of the nested if statement.
number = 10

if number > 0:
    print("The number is positive.")
elif number < 0:
    print("The number is negative.")
else:
    print("The number is zero.")
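
The awarding of grades mentioned earlier is a natural use of the multiple-alternative format. The sketch below is one possible version; the function name grade and the grade boundaries are assumptions made for illustration and are not an official grading scheme.

```python
def grade(mark):
    """Return a letter grade for a mark between 0 and 100 (illustrative boundaries)."""
    if mark >= 80:
        return "A"
    elif mark >= 70:
        return "B"
    elif mark >= 60:
        return "C"
    elif mark >= 50:
        return "D"
    else:
        return "F"


for mark in (85, 70, 49.5):
    print(mark, grade(mark))
```

Because the conditions are tested from top to bottom, each elif only needs the lower boundary: a mark of 70 has already failed the test mark >= 80 by the time the second condition is reached.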

(b) The match-case statement


Multiple-alternative decisions can also be implemented using the match-case
control statement, which is available from Python 3.10 onwards. match is coupled
with a case which acts as an evaluation point that replaces the conditions in the
if...else statement. The match-case statement uses a single variable or expression
matched to various alternative outcomes.
The following is the general structure of the match-case statement.
match expression:
    case pattern_1:
        statement_1
    case pattern_2:
        statement_2
    ...
    case pattern_n:
        statement_n
    case _:  # default case
        statement_n+1

In a Python match-case statement, each case is followed by a block of code


that executes only when that specific case pattern matches the value of the
expression. The expression can be of various data types, including integers,
floats, strings, tuples, lists, classes, and more. Each case pattern can be a
literal value, a variable, a pattern (like tuples or classes), or a wildcard (_). If
none of the case patterns match the expression, the code within the optional
default block is executed. If no default block is present, the match statement
does not execute any code.


An implementation of the match-case statement is shown in program 1.16 below:
Program 1.16: Colour of fruits using the match-case statement.
fruit = input("Enter fruit of your choice: ")

match fruit:
    case "apple":
        print("Apples are red or green!")
    case "banana":
        print("Bananas are yellow!")
    case "orange":
        print("Oranges are orange!")
    case _:
        print("I don't know that fruit.")

Notice that the underscore (_) symbol is used after the last case. This serves
as a wildcard, matching any value entered by the user that does not match
any of the specified case patterns in the match-case statement. The match-
case statement provides the same functionality as the if-else if-else statement
but in a more concise and potentially more readable way, especially for
more complex conditional logic.
Pitfalls to Watch

if...else
(i) Readability and maintenance issues for deeply nested and complex blocks.
(ii) Incorrect indentation.

match-case

The most common mistake in Python's match-case statements is forgetting to include
a default case (represented by _).

Repetition control structures are called "loops". Python has two loop control
statements: for and while. Recall that selection control structures execute statements
if certain conditions are met. In the case of loops, certain statements are executed
repeatedly while certain conditions are met.
(a) The for loop
In Python the for loop iterates over a sequence or collection (like a list, tuple,
string, or range) and executes a block of code for each item in the sequence.


The general syntax for a for loop is:


for value in collection:
    statements
The following code prints a sequence of prime numbers between 2 and 11
using a for loop.
Program 1.17: Display prime numbers between 2 and 11.
primes = [2, 3, 5, 7, 11]
for value in primes:
    print(value)

(b) The while loop


The while loop repeats a block of code as long as a specified condition is
True. The following shows the general structure of a while loop:
while condition:
    statement_1
    statement_2
    ...
    statement_n

In the while loop, as long as the condition is met, the block of statements
will be executed repeatedly. A loop that repeats a fixed number of times can
equally be written with the for loop statement.
Here is an example:
Program 1.18: Use a <while> loop to display "Hello, world!" four times.
i = 0  # initialize i

while i < 4:
    print("Hello, world!")
    i = i + 1

The above program first checks whether the condition is met, then it prints
the statement "Hello, world!" to the screen and increases i by one. Since i has
been initialised as "i = 0", the statement "Hello, world!" is printed four times,
each on a new line, after which i equals 4 and the condition is no longer met.
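
As a rough guide, a for loop suits a known number of repetitions, while a while loop suits repetition until some condition changes. The sketch below first rewrites Program 1.18 with for and range(), and then uses while for a case where the number of passes is not known in advance. The decay example and its variable names are illustrative assumptions.

```python
# for-loop version of Program 1.18: the body runs for i = 0, 1, 2, 3
lines = []
for i in range(4):
    lines.append("Hello, world!")
print("\n".join(lines))

# while-loop example: halve an activity until it drops below 1 unit
activity = 1000.0
half_lives = 0
while activity >= 1.0:
    activity = activity / 2
    half_lives += 1

print(f"Below 1 unit after {half_lives} half-lives (activity = {activity})")
```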


In loops, you can use the break, continue, and pass keywords to modify
the loop’s behavior. The break statement completely terminates the loop,
regardless of whether there are more iterations remaining. Program ex-
ecution continues from the line of code immediately following the loop.
The continue statement, in contrast, skips the remaining code within the
current iteration of the loop and immediately moves to the next iteration,
allowing you to exclude specific elements or conditions from processing.
The pass statement is a placeholder that does nothing. It is typically used
as a placeholder for code not yet implemented or to avoid syntax errors in
empty loops or conditionals. Below are examples:
Program 1.19: Skipping specific elements in a sequence.
total = 0
primes = [2, 3, 5, 7, 11]
for value in primes:
    if value == 5:
        continue  # skip value 5
    total += value

Program 1.20: Exiting or terminating a loop.

total = 0
primes = [2, 3, 5, 7, 11]
for value in primes:
    if value == 5:
        break  # exit the loop
    total += value

Program 1.21: Empty loops for future use.


for num in range(10):
    if num == 2:
        pass  # Do nothing (new code to be added here in the future)
    print(num)
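
Programs 1.19 and 1.20 differ only in the keyword used, yet they produce different totals. The sketch below runs both variants side by side; the names total_continue and total_break are chosen here to keep the two results apart.

```python
primes = [2, 3, 5, 7, 11]

# continue: 5 is skipped, the loop carries on with 7 and 11
total_continue = 0
for value in primes:
    if value == 5:
        continue
    total_continue += value

# break: the loop stops as soon as 5 is reached
total_break = 0
for value in primes:
    if value == 5:
        break
    total_break += value

print(total_continue)  # 2 + 3 + 7 + 11 = 23
print(total_break)     # 2 + 3 = 5
```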

1.2.5 Functions and Modules


Functions
A Python program is built using fundamental components called functions, also known
as subroutines, procedures, subprograms, or methods in other programming languages.
A function is a group of statements that need to be written only once but can be called


multiple times within a program to perform a specific task. The general syntax for
defining a function in Python is:
def function_name(parameter1, parameter2, ...):
    statements
where:
- def is a keyword used to define a function.
- function_name is the identifier used to call the function.
- parameters are variables that receive input values when the function is called. A
function can have multiple parameters as needed.
- statements form the body of the function, containing the logic to be executed.
For example, suppose we want to multiply two numbers. We can define a function as
follows:
Program 1.22: Function to calculate the product of two numbers.
1 def multiplication(a, b):
2     """Multiplies two numbers and returns the result."""
3     return a * b
4
5 z = multiplication(4, 5)  # function call
6 print(f"The result is: {z}")

The result is: 20

In program 1.22, the function multiplication() returns a single value, the product
of two numbers, using the return keyword. The text in line 2 is a docstring, a string literal
that documents the function's purpose. The variables a and b are parameters, while the values
4 and 5 are arguments passed to the function during a function call in line 5. The order
of arguments in the function call must match the order of parameters in the function
definition; such arguments are called positional arguments.
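
Arguments can also be passed by name, in which case their order no longer matters. The sketch below contrasts positional and keyword arguments using a hypothetical function kinetic_energy, introduced here only for illustration.

```python
def kinetic_energy(mass, speed):
    """Return the kinetic energy 0.5 * m * v**2 (SI units assumed)."""
    return 0.5 * mass * speed ** 2


e_positional = kinetic_energy(2.0, 3.0)          # mass = 2.0, speed = 3.0
e_keyword = kinetic_energy(speed=3.0, mass=2.0)  # same call, order irrelevant
e_swapped = kinetic_energy(3.0, 2.0)             # positional order swapped

print(e_positional, e_keyword, e_swapped)
```

The third call runs without any error but silently computes a different quantity, which is why keyword arguments are often preferred when a function takes several numbers of the same type.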
Scope of variables in a function
A function can access variables from two different scopes: local and global (McKinney,
2018). This is illustrated in the example below:
Program 1.23: Demonstrating local and global variables.
a = 10  # Global variable

def multiplication():
    """Demonstrates the use of local and global variables."""
    global a
    b = 2  # Local variable
    return a * b

z = multiplication()
print(f"The result is: {z}")

The result is: 20
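
Reading a global variable inside a function works without any declaration; the global keyword only becomes necessary when the function assigns a new value to it. The sketch below illustrates the difference with two hypothetical functions, increment and shadow.

```python
counter = 0  # global variable


def increment():
    """Modify the global counter; this requires the global declaration."""
    global counter
    counter += 1


def shadow():
    """Create a local variable that merely shares the global's name."""
    counter = 100  # local: the global counter is left untouched
    return counter


increment()
increment()
local_value = shadow()

print(counter, local_value)
```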

Multiple function calls


A function can be called multiple times with different arguments. For example:
Program 1.24: Calling a function multiple times.
def multiplication(a, b):
    """Multiplies two numbers and returns the result."""
    return a * b

x = multiplication(2, 20)
y = multiplication(-3, -4)
z = multiplication(4, 5)
print(f"The results are {x}, {y}, and {z}.")

The results are 40, 12, and 20.

Returning multiple values


A function can return multiple values, as demonstrated in program 1.25:
Program 1.25: Returning multiple values from a function.
def f():
    a = 5
    b = 6
    c = 7
    return a, b, c

a, b, c = f()
print(a, b, c)

5 6 7


Default parameter values


A function definition can include default parameter values, allowing function calls to
omit those arguments. For example:
Program 1.26: Function with default parameters.
def multiplication(a, b=6):
    """Multiplies two numbers and returns the result."""
    return a * b

z = multiplication(2)  # Only one argument is passed; b defaults to 6
print(f"The result is: {z}")

The result is: 12

Arbitrary number of arguments


A function can accept an arbitrary number of arguments using an asterisk (*) before
the parameter name. This allows the function to handle a variable number of inputs:
Program 1.27: Function accepting an arbitrary number of arguments.
def sum_numbers(*args):
    """Returns the sum of all given numbers."""
    return sum(args)

# Example usage:
print(sum_numbers(1, 2, 3, 4))    # Output: 10
print(sum_numbers(5.5, 2.3, 8))   # Output: 15.8
print(sum_numbers())              # Output: 0
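
An ordinary parameter can be combined with the asterisk form, and an existing list can be spread into separate arguments with the same symbol at the call site. The function name scaled_sum below is hypothetical and serves only as a sketch.

```python
def scaled_sum(factor, *values):
    """Multiply the sum of any number of values by a required factor."""
    return factor * sum(values)


readings = [1, 2, 3]

direct = scaled_sum(2, 1, 2, 3)      # values = (1, 2, 3)
unpacked = scaled_sum(2, *readings)  # the list is unpacked into three arguments
empty = scaled_sum(10)               # values = (), so the sum is 0

print(direct, unpacked, empty)
```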

Modules
Most libraries (also referred to as modules) are not included in Python’s standard in-
stallation. They are developed by third parties and provide additional functionality
through collections of functions and tools. To use these libraries, you must first in-
stall them on your computer using Python’s package manager, pip. You can do this
by writing ‘pip install <module>’ on the command prompt (in Windows) or
terminal (Linux or MacOS), where module refers to the required module. If you are
using the Anaconda distribution, many popular libraries are pre-installed or can be in-
stalled automatically through its package manager, conda. After installation, you can
import a library into your working environment using the import statement. For
example, to calculate the square root of a number, you need the Numerical Python


(NumPy) library, which provides the sqrt() function. You can import NumPy
with ‘import numpy’. Once imported, you can use the square root function as
numpy.sqrt(number). Alternatively, you can import the sqrt() function di-
rectly with ‘from numpy import sqrt’. This allows you to use the function more
concisely as sqrt(number).
The following code demonstrates how to import a module and use its functions:
Program 1.28: Importing the NumPy module for mathematical operations.
import numpy

a = numpy.array([1, 2, 6, 9])
b = numpy.sqrt(a)  # take the square root of every element in 'a'

print(b)

[1.         1.41421356 2.44948974 3.        ]

import statements are typically placed at the beginning of every script to prevent
errors and enhance code organization and readability. Modules can be categorized
into three types: built-in, third-party, and custom modules. There are several ways to
import modules:
• importing the whole module - e.g., ‘import math’. Every function then has
to have the math. prefix, e.g., math.sqrt(4). The advantage is that this
approach prevents name conflicts.
• importing a specific function or class - e.g., ‘from math import sqrt’. You
can then use the function directly without the math. prefix, e.g., sqrt(4).
• importing with an alias - e.g., ‘import numpy as np’. Functions are then called
through the alias, e.g., np.arange(5).
• importing all functions from a module - e.g., ‘from math import *’. You
can use the functions without the math. prefix as in the second bullet point
above. However, this is not recommended because it can lead to name conflicts
and reduced code readability.
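
The first three import styles can be compared using the built-in math module, which needs no installation. In the sketch below the alias m is an arbitrary choice made for illustration; all three calls reach the same function.

```python
import math              # whole module: functions need the math. prefix
from math import sqrt    # specific function: no prefix needed
import math as m         # alias: shorter prefix

r1 = math.sqrt(16)
r2 = sqrt(16)
r3 = m.sqrt(16)

print(r1, r2, r3)
print(sqrt is math.sqrt)  # both names refer to the same function object
```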
Table 1.3 lists common modules used in academic research and numerical computing
for data analysis, statistical and time series analysis, modeling and simulations, visual-
ization, and machine learning.


Table 1.3: Common Python modules in academic research and numerical computing.

Module        Focus                                            Example usage

math          Basic mathematical operations                    math.sqrt(25), math.pi
numpy         Arrays, linear algebra, numerical operations     numpy.array([1, 2, 3]), numpy.dot(A, B)
scipy         Scientific computing (optimization,              scipy.optimize.minimize(), scipy.integrate.quad()
              integration, stats)
matplotlib    Plotting and visualization                       plt.plot(x, y), plt.imshow(data)
pandas        Data manipulation and analysis                   df = pd.read_csv("data.csv"), df.describe()
sympy         Symbolic mathematics and algebra                 sympy.solve(eq, x), sympy.diff(f, x)
statsmodels   Statistical modeling                             sm.OLS(y, X).fit()
sklearn       Machine learning                                 model = sklearn.linear_model.LinearRegression()
seaborn       Statistical data visualization                   sns.boxplot(data=df)
xarray        Multi-dimensional labeled data structures        ds = xr.open_dataset("data.nc")

When importing modules, it is often more convenient to use aliases to reduce clutter in
your code. This involves assigning shorter or more meaningful names to the imported
modules. Here are some common aliases adopted by the Python community:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import statsmodels.api as sm
Therefore, using np.sqrt() clearly indicates that you’re referring to the square root
function within the NumPy library. This convention improves code readability and
maintainability, especially when working with multiple libraries or collaborating with
other developers.


1.2.6 Data structures


Python provides several built-in data structures that are useful for organizing and ma-
nipulating data efficiently. The most common ones are:
(a) Lists - a list is an ordered, mutable (modifiable) collection of items. It can contain
elements of different data types. The elements in a list are accessed using zero-
based indexing (i.e., the first element is assigned the index 0).
Below is an example that shows a list and some operations that can be carried out
on it.
Program 1.29: Python list example.
numbers = [1, 2, 3, 4, 5]
numbers.append(6)  # Adds 6 to the end
numbers[0] = 10    # Modifies the first element
print(numbers)

[10, 2, 3, 4, 5, 6]
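Zero-based indexing extends naturally to negative indices and slices. The following is a minimal sketch; the list `readings` and its values are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data, chosen only for illustration
readings = [12.5, 13.1, 12.9, 14.2, 13.7]

first = readings[0]     # zero-based index: the first element
last = readings[-1]     # a negative index counts from the end
middle = readings[1:4]  # slice: indices 1, 2 and 3 (the stop index is excluded)

readings.insert(1, 12.8)  # insert 12.8 before index 1
removed = readings.pop()  # remove and return the last element

print(first, last, middle)
print(readings, removed)
```

Slicing returns a new list, so `middle` is unaffected by the later `insert` and `pop` calls.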

(b) Tuples - a tuple is an ordered, immutable (unchangeable) collection of items.


Like lists, tuples can store elements of different data types. Since tuples are im-
mutable, they are useful for data that should not be modified.
Here is an example:
Program 1.30: Python tuple example.
dimensions = (1920, 1080)  # number of rows = 1920, number of columns = 1080
print(dimensions[0])

1920
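Two properties of tuples are worth seeing in action: they can be unpacked into separate variables, and any attempt to modify them raises an error. This is a short sketch that reuses the `dimensions` tuple from the example above; the variable names `rows`, `columns` and `message` are illustrative.

```python
dimensions = (1920, 1080)

# Tuple unpacking: assign each element to its own variable
rows, columns = dimensions

# Tuples are immutable, so item assignment raises a TypeError
try:
    dimensions[0] = 1280
except TypeError as err:
    message = str(err)

print(rows, columns)
print("Cannot modify a tuple:", message)
```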

(c) Dictionaries - a dictionary is a collection of key-value pairs (since Python 3.7,
dictionaries preserve the order in which items are inserted). Keys must be unique and
immutable (e.g., strings, numbers, or tuples). Values can be any data type and are
accessed using keys.
Below is an example that shows a dictionary and some operations that can be
carried out on it.


Program 1.31: Python dictionary example.


student = {"name": "Atlang", "age": 23, "major": "Education"}
print(student["name"])  # Output: Atlang
student["age"] = 21     # Modifying a value

Specialised alternatives to the plain dictionary include OrderedDict, defaultdict,
Counter, and ChainMap, all of which are available in the collections module. For
creating structured objects, you might consider using the dataclass decorator from
the dataclasses module. If you are working with large datasets, a pandas DataFrame
is a suitable choice for handling tabular data (for a detailed treatment of pandas
DataFrames, see VanderPlas, 2017, chap. 3).
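To illustrate two of the collections classes mentioned above, here is a minimal sketch. The `records` list and its marks are hypothetical; `defaultdict` groups values without first checking whether a key exists, and `Counter` tallies occurrences.

```python
from collections import Counter, defaultdict

# Hypothetical (student, mark) records
records = [("Atlang", 72), ("Tlotlo", 85), ("Atlang", 64), ("Tlotlo", 90)]

# defaultdict(list) creates an empty list the first time a key is used
marks = defaultdict(list)
for name, mark in records:
    marks[name].append(mark)

# Counter counts how often each item occurs
words = "apple banana apple orange banana apple".split()
counts = Counter(words)

print(dict(marks))
print(counts.most_common(1))  # the single most frequent word with its count
```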
(d) Sets - a set is an unordered collection of unique elements. Sets are useful for
removing duplicates and performing mathematical set operations like union and
intersection. For example:
Program 1.32: Python sets.
unique_numbers = {1, 2, 3, 3, 4, 5}
print(unique_numbers)

{1, 2, 3, 4, 5}
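The set operations mentioned above can be sketched as follows. The two sets `group_a` and `group_b` are hypothetical examples.

```python
# Two hypothetical sets of integers
group_a = {1, 2, 3, 4}
group_b = {3, 4, 5, 6}

union = group_a | group_b         # elements in either set
intersection = group_a & group_b  # elements common to both sets
difference = group_a - group_b    # elements in group_a but not in group_b

print(union, intersection, difference)
print(3 in group_a)  # membership test
```

The methods `group_a.union(group_b)` and `group_a.intersection(group_b)` are equivalent to the `|` and `&` operators.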

(e) List comprehensions - list comprehensions provide a concise way to create lists
in Python. They are often faster and, for simple cases, more readable than an
equivalent for loop. The general syntax of a list comprehension is:
new_list = [expression for item in iterable if condition]

Examples:
Program 1.33: Python list comprehension without a condition.
squares = [x**2 for x in range(10)]
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


Program 1.34: Python list comprehension with a condition.


even_numbers = [x for x in range(10) if x % 2 == 0]  # list only those divisible by 2
print(even_numbers)

[0, 2, 4, 6, 8]
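To see how the general syntax maps onto an ordinary loop, the sketch below builds the same list twice: once with an explicit for loop and once with a comprehension. The variable names are illustrative.

```python
# Explicit loop: append the square of every even number below 10
even_squares_loop = []
for x in range(10):
    if x % 2 == 0:
        even_squares_loop.append(x**2)

# Equivalent list comprehension: expression, iterable and condition on one line
even_squares = [x**2 for x in range(10) if x % 2 == 0]

print(even_squares)
print(even_squares == even_squares_loop)
```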

(f) Practical examples with data

Program 1.35: Storing and accessing student records.


students = [
    {"name": "Tlotlo", "age": 22, "grades": [85, 90, 92]},
    {"name": "Maatla", "age": 23, "grades": [78, 80, 79]},
]

# Accessing data
print(students[0]["name"])  # Output: Tlotlo
print(sum(students[1]["grades"]) / len(students[1]["grades"]))  # Output: 79.0

Program 1.36: Counting word frequency in a text.


text = "apple banana apple orange banana apple"
words = text.split()
word_count = {word: words.count(word) for word in set(words)}
print(word_count)

{'banana': 2, 'orange': 1, 'apple': 3}

Program 1.37: Removing duplicates from a list.


numbers = [1, 2, 2, 3, 4, 4, 5]
unique_numbers = list(set(numbers))
print(unique_numbers)

[1, 2, 3, 4, 5]
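Because sets are unordered, converting a list to a set and back does not, in general, guarantee the original order of the elements. When the order matters, one option is sketched below; it relies on dictionaries preserving insertion order (Python 3.7 and later). The input list is a hypothetical example.

```python
# Hypothetical unsorted input with duplicates
numbers = [5, 3, 5, 1, 3, 2]

# dict.fromkeys keeps the first occurrence of each key, in insertion order
unique_in_order = list(dict.fromkeys(numbers))

# If a sorted result is wanted instead, sort the set explicitly
unique_sorted = sorted(set(numbers))

print(unique_in_order)
print(unique_sorted)
```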


Program 1.38: Filtering data using list comprehension.


ages = [18, 21, 16, 25, 30, 17]
adults = [age for age in ages if age >= 18]
print(adults)

[18, 21, 25, 30]

1.2.7 Numerical computations with NumPy


NumPy is a powerful library for numerical computing in Python (e.g., Johansson 2015,
chap. 2; McKinney 2018, chap. 4). It provides support for large, multi-dimensional
arrays and matrices, along with a collection of mathematical functions to operate on
these arrays efficiently. As mentioned earlier, NumPy is not bundled with the standard
installation of Python and therefore needs to be installed separately.

Creating NumPy arrays


A NumPy array (also known as ndarray, for N-dimensional array) is a grid of values,
all of the same type, indexed by a tuple of non-negative integers. NumPy arrays
are more efficient and versatile than Python's built-in lists, especially for numerical
operations. Arrays can be created in several ways, including the following:
Program 1.39: Creating arrays using NumPy.
import numpy as np

# create a 1D array from a Python list
arr = np.array([1, 2, 3, 4])

# creating arrays filled with zeros or ones
zeros_1d = np.zeros(5)       # 1D
zeros_2d = np.zeros((3, 3))  # 2D
ones = np.ones((2, 4))

# creating an array with a specific range
arange_arr = np.arange(0, 10, 2)  # Output: [0, 2, 4, 6, 8]

# creating an array of evenly spaced values
linspace_arr = np.linspace(0, 1, 5)  # Output: [0., 0.25, 0.5, 0.75, 1.]

NumPy arrays support element-wise operations, which are significantly more efficient
than looping over lists. Basic operations that can be performed with NumPy arrays
include arithmetic operations, aggregation functions (such as sum, mean, max and min),
vectorisation and broadcasting. Common numerical operations using NumPy are
demonstrated in program 1.40.
Program 1.40: Common numerical operations with NumPy.
import numpy as np

# vectorisation
a = np.array([1, 2, 3])  # create a vector of three numbers
result = np.sin(a)       # apply the sine function to each element

# creating ranges and sequences
np.linspace(0, 10, 5)  # 5 equally spaced numbers from 0 to 10 (both included)
np.arange(0, 10, 2)    # values from 0 up to (but excluding) 10 with step size 2

# dot product and matrix operations
A = np.array([[1, 2], [3, 4]])  # 2D array
B = np.array([[5, 6], [7, 8]])  # another 2D array
dot_product = np.dot(A, B)      # matrix product of the 2D arrays

# summation and aggregation
arr = np.array([[1, 2, 3], [4, 5, 6]])
np.sum(arr)          # sum of all elements
np.sum(arr, axis=0)  # column-wise sum (i.e., summing along the 1st dimension)
np.sum(arr, axis=1)  # row-wise sum (i.e., summing along the 2nd dimension)

# creating multi-dimensional arrays
array_3d = np.array([[[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]],

                     [[10, 11, 12],
                      [13, 14, 15],
                      [16, 17, 18]]])  # create a 3D array (2x3x3 tensor)
sum_along_axis = np.sum(array_3d, axis=0)  # sum along the first axis (axis=0)

1.2.8 Data visualisation


One of the most attractive aspects of Python is its powerful toolkit for creating static,
animated, and interactive visualizations. A cornerstone of this toolkit is Matplotlib,
a robust library that enables users to generate publication-quality plots. Matplotlib
can export figures in many common vector and raster formats (e.g., ‘pdf’, ‘svg’,
‘png’, ‘jpg’). It has also spawned the development of a number of powerful data
visualization libraries, such as Seaborn (e.g., VanderPlas, 2017, chap. 4), which are
beyond the scope of this course.
Note that Matplotlib is typically not included in the standard Python installation and
therefore needs to be installed separately, like the NumPy library. The ‘pyplot’
sub-module of Matplotlib provides a user-friendly interface for creating plots. It offers a
familiar syntax for those who are already acquainted with MATLAB, simplifying the
creation of various plot types: scatter plots, bar charts, histograms, statistical plots (e.g.,
box plots, errorbar plots, violin plots), gridded data plots, heatmaps, image plots, pie charts,
stem plots, 3D plots, and so on. To use ‘pyplot’, we import it as follows: ‘import
matplotlib.pyplot as plt’. The following programs demonstrate how to use
the Matplotlib module to create some of these plots and customize their appearance.

Generating a line plot


A line plot is used to represent data points connected by straight lines. It is useful for
showing trends over time. Here is an example:
Program 1.41: Line plot with Matplotlib.
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Create a line plot
plt.plot(x, y)

# Display the plot
plt.show()

Figure 1.6: Line plot with Matplotlib.

Generating a scatter plot


A scatter plot displays the values of (typically) two variables as individual points.
It is useful for showing the relationship between two variables. The following code
demonstrates how to generate a scatter plot:
Program 1.42: Scatter plot with Matplotlib.
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Create a scatter plot
plt.scatter(x, y)

# Display the plot
plt.show()

Figure 1.7: Scatter plot with Matplotlib.

Generating a histogram plot


A histogram is used to represent the distribution of a dataset, for example, the marks
for Test 1 of the PHY485 course. It is useful for showing the frequency of data points
within specified ranges (bins). Here is an example:
Program 1.43: Histogram plot with Matplotlib.
import matplotlib.pyplot as plt

# Sample data of test 1 marks for PHY485
data = [45, 56, 58, 78, 92, 83, 76, 67, 66, 45, 56,
        65, 58, 75, 89, 56, 67, 60, 75, 80]

# Create a histogram
plt.hist(data, bins=5, edgecolor='white')

# Display the plot
plt.show()

Figure 1.8: Histogram plot with Matplotlib.

Customising plots - adding labels and titles


Labels and titles make a plot more informative. They are added to a plot as
follows:
Program 1.44: Adding labels and titles.
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Create a line plot
plt.plot(x, y)

# Add labels and title
plt.xlabel('X-axis Label')
plt.ylabel('Y-axis Label')
plt.title('Sample Line Plot')

# Display the plot
plt.show()

Figure 1.9: Adding labels and titles in a Matplotlib plot.

Customising plots - adding legends


Legends are useful when plotting multiple datasets on the same plot. The following
shows how they are added.
Program 1.45: Adding a legend to a plot.
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y1 = [2, 3, 5, 7, 11]
y2 = [1, 4, 6, 8, 10]

# Create line plots
plt.plot(x, y1, label='Dataset 1')
plt.plot(x, y2, label='Dataset 2')

# Add legend
plt.legend()

# Display the plot
plt.show()

Figure 1.10: Adding legends in a Matplotlib plot.

To add legends easily, give each plotted dataset a label when you plot it. Otherwise,
you will have to pass the labels manually, one per dataset, to the legend function.

Customising plots - adding a grid


Grids make a plot easier to read. This is how they are added:
Program 1.46: Adding a grid to a plot.
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Create a line plot
plt.plot(x, y)

# Add grid
plt.grid(True)

# Display the plot
plt.show()

Figure 1.11: Adding a grid to a Matplotlib plot.

Saving plots
In Matplotlib, you can save figures using the ‘savefig()’ function. This function
takes the filename as the first argument and allows you to specify the file format by
including the extension in the filename. Supported file formats are as mentioned earlier.
Here is a basic example:
Program 1.47: Saving plots from Matplotlib.
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Create a line plot
plt.plot(x, y)

# Save the plot
plt.savefig('line_plot.png')

# Display the plot
plt.show()

Note that the figure will be saved to the current working directory (folder) unless the
filename includes a path.

1.2.9 File handling (reading/writing files)


Python provides built-in support for handling files, making it easy to read from and
write to different file formats. This is particularly useful in the PHY485 course for
saving and loading data.
To write data to a file, use Python’s built-in open() function with the write ("w")
mode. This will create a new file or overwrite an existing one, e.g.:
Program 1.48: Writing data to a text file.
with open("example.txt", "w") as file:
    file.write("This is the first line.\n")
    file.write("This is the second line.\n")

The with statement ensures the file is properly closed after writing. Each write()
operation adds content to the file, but it does not automatically add a newline. The user
can add a newline using the newline character (\n). If you do not want to overwrite
the existing data, you can use the append mode ("a") as shown below:
Program 1.49: Appending data to a file.
with open("example.txt", "a") as file:
    file.write("Appending a new line.\n")

Reading a file in Python is very easy as well. This is done using the open() function
with read ("r") mode, e.g.:
Program 1.50: Reading a text file.
with open("example.txt", "r") as file:
    content = file.read()
    print(content)

# read a file line by line
with open("example.txt", "r") as file:
    for line in file:
        print(line.strip())  # strip() removes surrounding whitespace, including the trailing \n

# read all lines into a list
with open("example.txt", "r") as file:
    lines = file.readlines()
    print(lines)  # Returns a list of strings

with open("example.txt", "r") as file:
    lines = [line.strip() for line in file]  # removes newline character
    print(lines)
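The write, append and read modes described above can be combined in one short round trip. The sketch below is illustrative only: it writes to a temporary directory (an assumption made so that no existing file is overwritten) rather than to the current folder.

```python
import os
import tempfile

# Work inside a temporary directory that is deleted automatically afterwards
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")

    # "w" creates the file (or overwrites an existing one)
    with open(path, "w") as file:
        file.write("This is the first line.\n")

    # "a" appends to the end without touching the existing content
    with open(path, "a") as file:
        file.write("Appending a new line.\n")

    # "r" reads the file; strip() drops the trailing newline of each line
    with open(path, "r") as file:
        lines = [line.strip() for line in file]

print(lines)
```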

Files can contain structured data such as Comma Separated Values (CSV). Python
provides the csv module, and the pandas library can also be used, for such files.
Programs 1.51 and 1.52 demonstrate how you can write to and read from a CSV file,
respectively, using the csv module.
Program 1.51: Writing data to a CSV file.
import csv

# writing data to a CSV file
with open("data.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Name", "Age", "Town"])
    writer.writerow(["Tshenolo", 23, "Lobatse"])
    writer.writerow(["Tlotlo", 22, "Gabane"])

Program 1.52: Reading data from a CSV file.


import csv

# reading data from a CSV file
with open("data.csv", "r", newline="") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # each row is a list of strings
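Note that csv.reader returns every field as a string. When the file has a header row, csv.DictReader lets you access fields by column name, and numeric columns can then be converted explicitly. In the sketch below an in-memory string (`csv_text`, an assumption made so that the example is self-contained) stands in for the data.csv file written above.

```python
import csv
import io

# In-memory stand-in for the contents of data.csv
csv_text = "Name,Age,Town\nTshenolo,23,Lobatse\nTlotlo,22,Gabane\n"

# DictReader uses the first row as the field names
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# Fields are read as strings, so convert the ages to integers explicitly
ages = [int(row["Age"]) for row in rows]

print(rows[0]["Town"])
print(sum(ages) / len(ages))
```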

The pandas module makes reading and saving CSV data even more convenient. This is
demonstrated below in program 1.53.
Program 1.53: Reading and saving CSV data using pandas.
import pandas as pd

# read a CSV file with pandas
df = pd.read_csv("data.csv")
print(df)

# save a DataFrame to a CSV file (dataframe = table with rows & columns)
df.to_csv("output.csv", index=False)

For large numerical datasets, the NumPy library provides efficient methods for saving
and loading data. The program below demonstrates how this is done:
Program 1.54: Saving and loading data using NumPy.
import numpy as np

# saving data to a text file
data = np.array([[1, 2, 3], [4, 5, 6]])
np.savetxt("data.txt", data)

# loading data from a text file
loaded_data = np.loadtxt("data.txt")
print(loaded_data)

# saving data to a binary .npy file
np.save("data.npy", data)

# loading data from a binary .npy file
loaded_data = np.load("data.npy")
print(loaded_data)

Note that using binary .npy files is more efficient for large numerical datasets than text
files.

1.3 Conclusion
In this chapter, we introduced the fundamentals of computer programming and the
Python language, equipping you with the basic skills to write and execute code. These
skills form the foundation for solving computational problems in the physical sciences.
In the next chapter, we will apply programming to numerical methods, focusing on
techniques for solving mathematical problems that lack exact solutions. Specifically,
we will explore iterative methods, the numerical solution of integral equations, and
the numerical solution of differential equations. Using Python, we will implement
these techniques to solve simple mathematical problems efficiently.



2 NUMERICAL METHODS
Numerical methods employ algorithms that make it possible to find a solution to a
problem that would otherwise be impossible to find using analytical means. They do
not seek exact answers but approximate solutions within reasonable bounds or errors.
Some of the most common numerical methods are the root-finding methods. Consider
a polynomial Pn (x) of degree n with the form

\[ P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \qquad (a_n \neq 0). \]

For $n \geq 1$, this polynomial has at least one zero (possibly complex). A general
analytical formula for the zeros exists only if $n < 5$; for example, it is easy to find the
roots of a quadratic equation $ax^2 + bx + c = 0$ (the case $n = 2$) analytically as

\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

For $n \geq 5$, no such general analytical formula exists, hence the use of special
root-finding methods. These methods are iterative in nature. The idea behind them is
to find the value $r$ such that $f(r) = 0$ for a function $f(x)$ continuous on a predefined
interval $[a, b]$. Two important features associated with iterative methods are convergence
and the stopping criterion. Both are discussed below for each individual method
(for more information, see Burden & Faires (2011, chapter 2) and Cheney & Kincaid
(2013, chapter 3)).


2.1 Iterative Methods


2.1.1 Bisection method
This is a root-finding method that repeatedly bisects an interval until an approximation
to the root is found within a reasonable tolerance or error. Let $f$ be a continuous
function on $[a, b]$ with $f(a) f(b) < 0$, that is, $f(a)$ and $f(b)$ have opposite signs. Then,
by the intermediate value theorem, there exists some number $r$, with $a < r < b$, such
that $f(r) = 0$. Since the root exists between the end points $a$ and $b$, they are said to
bracket the root, and hence the bisection method is considered a bracketing method.
Algorithm 2 below shows how one can implement the bisection method on a computer.
It is based on figure 2.1, which shows the interval within which the root of a function
$f(x)$ is determined using the bisection root-finding method.
Figure 2.1: Illustration of the bisection method for finding the root of a function $f(x)$. Notice
that the interval is halved at each step, so that after $N$ iterations its width is $(b - a)/2^N$. If the
root $r$ is to be found within a tolerance $\epsilon$ (i.e., $|r_i - r_{i-1}| < \epsilon$), the number of iterations $N$
can be determined from $(b - a)/2^N < \epsilon$.

The error or tolerance, $\epsilon$, can be defined such that

(i) $|r_i - r_{i-1}| < \epsilon$

(ii) $\dfrac{|r_i - r_{i-1}|}{|r_i|} < \epsilon, \quad r_i \neq 0$

(iii) $|f(r_i)| < \epsilon$

(iv) $|r_i - r| \leq \dfrac{b - a}{2^i}, \quad i \geq 1$


Algorithm 2 Algorithm for the bisection method

INPUT: function f, endpoint values a, b, tolerance TOL, maximum number of iterations N
OUTPUT: approximate solution r or message of failure

i ← 1

while i ≤ N do
    r ← a + (b − a)/2
    if f(r) = 0 or (b − a)/2 < TOL then
        Output (r)
        Stop
    end if
    i ← i + 1
    if f(a) f(r) > 0 then
        a ← r        ▷ set a new interval [r, b]
    else
        b ← r        ▷ set a new interval [a, r]
    end if
end while

Output ('Method failed after N =', N, 'iterations')
Stop


The bisection method converges slowly, since the interval is only halved at each
iteration, but it always converges to a solution. Program 2.1 shows how the bisection
method is implemented to find one of the roots of the equation $2x^2 + 2x - 12 = 0$.
For an equation with several roots, only one root can be found at a time within a given
interval. The above equation has two roots: 2 and -3. The program below approximates
the first root by searching within the interval [-1, 2.5]. The required number of
iterations for a tolerance of $\epsilon = 10^{-8}$ is found from

\[ \frac{2.5 - (-1)}{2^N} < 10^{-8}
   \;\Rightarrow\; \frac{3.5}{10^{-8}} < 2^N
   \;\Rightarrow\; \log\left(3.5 \times 10^{8}\right) < \log 2^N = N \log 2
   \;\Rightarrow\; N > 28.4, \]

i.e., at least 29 iterations are required. Notice that the program uses three functions:
one defining the equation to solve, another defining the bisection algorithm and a third
defining the main function. This makes the program easy to follow and debug. Note also
that the program starts its iteration counter at i = 0, whereas the algorithm above starts
at i = 1, and that it prints the current estimate at every iteration.
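The iteration estimate above can be checked numerically. The following is a small sketch using only the standard math module; the variable names `n_exact` and `n_required` are illustrative.

```python
import math

a, b = -1.0, 2.5  # search interval used in the text
epsilon = 1e-8    # tolerance

# Solve (b - a) / 2**N < epsilon for N using base-2 logarithms
n_exact = math.log2((b - a) / epsilon)

# Smallest integer N that satisfies the strict inequality
n_required = math.floor(n_exact) + 1

print(n_exact, n_required)
```

Using `math.floor(...) + 1` rather than `math.ceil(...)` keeps the inequality strict even in the special case where `n_exact` happens to be a whole number.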


Program 2.1: The bisection method to find one of the roots of the equation $y = 2x^2 + 2x - 12$.
def f(x):
    # Function f
    y = 2 * x**2 + 2 * x - 12  # This equation has two real roots: 2 & -3
    return y


def bisection(a, b, epsilon, N):
    # Function bisection
    print(f"{'i':>6} | {'r':>10} | {'f(r)':>14} | {'Error |b - a|/2':>10}")
    print("-" * 55)

    i = 0

    while i <= N:
        error = (b - a) / 2.0
        r = a + error

        print(f"{i:>6} | {r:>10.8f} | {f(r):>14.8f} | {abs(error):>10.8f}")

        if f(r) == 0 or abs(error) < epsilon:
            print(f"The root value is {r:.8f} after {i + 1} iterations.")
            return

        if f(a) * f(r) > 0:
            a = r
        else:
            b = r

        i += 1

    print("\nMethod failed since the error", abs(error), "is still greater than")
    print(f"epsilon = {epsilon}.")
    print("\nIncrease the number of iterations N and try again.")


def main():
    # Function main
    N = int(input("Enter the maximum number of iterations, N:\n"))
    a = float(input("Enter the lower limit of the search interval, a:\n"))
    b = float(input("Enter the upper limit of the search interval, b:\n"))
    epsilon = float(input("Enter the tolerance, epsilon:\n"))

    # N = 50
    # a = -1
    # b = 2.5
    # epsilon = 1e-8

    bisection(a, b, epsilon, N)


if __name__ == "__main__":  # it is best practice in Python to use this construct
    main()
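For use inside other programs, it can be convenient to have a variant that returns the result instead of printing it and that needs no keyboard input. The sketch below is one possible arrangement, not the version used in the course; the function name `bisect`, its default arguments and the sign check on the end points are assumptions added for illustration.

```python
def f(x):
    # Same test equation as in Program 2.1 (roots at 2 and -3)
    return 2 * x**2 + 2 * x - 12


def bisect(func, a, b, epsilon=1e-8, max_iter=100):
    """Return (root estimate, iterations used) for func on the bracket [a, b]."""
    if func(a) * func(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")

    for i in range(1, max_iter + 1):
        half_width = (b - a) / 2.0
        r = a + half_width  # midpoint of the current interval

        # Stop on an exact root or when the half-width is below the tolerance
        if func(r) == 0 or half_width < epsilon:
            return r, i

        # Keep the half-interval over which the function changes sign
        if func(a) * func(r) > 0:
            a = r
        else:
            b = r

    raise RuntimeError("tolerance not reached within max_iter iterations")


root, iterations = bisect(f, -1.0, 2.5)
print(root, iterations)
```

Returning the iteration count makes it easy to compare the run with the estimate $N > 28.4$ derived earlier.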


2.1.2 Newton-Raphson (or Newton’s) method


Newton's method is one of the most powerful numerical methods for finding the roots
or zeroes of a real-valued function $f$. The method assumes that the function $f(x)$,
where $x$ is a real number, defined within an interval $[a, b]$, is continuous and that its
derivative $f'(x)$ exists. An initial guess $x_0$ is chosen close to the root, such that
$f'(x_0) \neq 0$. The initial guess is then used to calculate a better approximation $x_1$
given by

\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \]

The process is repeated with subsequent $x_i$'s given by

\[ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \]

until a sufficiently accurate value is reached. Algorithm 3 shows how the method is
implemented on a computer. A graphical representation of how the method works
is shown in figure 2.2.

Figure 2.2: Illustration of the Newton-Raphson method for finding an isolated root of a
function $f(x)$. The tangent at $x_n$ has slope $f'(x_n)$; $x_{n+1}$ is a better approximation
than $x_n$ for the root $r$ of $f(x)$.

The following stopping criteria can be used for the Newton-Raphson method:

(i) $|x_i - x_{i-1}| < \epsilon$

(ii) $\dfrac{|x_i - x_{i-1}|}{|x_i|} < \epsilon, \quad x_i \neq 0$

(iii) $|f(x_i)| < \epsilon$

Algorithm 3 Algorithm for the Newton-Raphson method

INPUT: function f, initial value x_0, tolerance TOL, maximum number of iterations N
OUTPUT: approximate solution x or message of failure

i ← 1

while i ≤ N do
    x_i ← x_0 − f(x_0)/f′(x_0)
    if |x_i − x_0| < TOL then
        Output (x_i)
        Stop
    end if
    i ← i + 1
    x_0 ← x_i
end while

Output ('Method failed after N =', N, 'iterations')
Stop


Note that the Newton-Raphson method can be derived from the Taylor series of the
function $f$ at $x_0$. Suppose a correction $h$ is added to $x_0$ so that the root is obtained
exactly, i.e., $f(x_0 + h) = 0$. If the function $f$ is well behaved we can write the Taylor
series expansion as

\[ f(x_0) + h f'(x_0) + \frac{h^2}{2} f''(x_0) + \cdots = 0. \]

To determine $h$ from this expansion, we consider only the first two terms of the Taylor
series, such that

\[ f(x_0) + h f'(x_0) = 0. \]

Solving for $h$ yields

\[ h = -\frac{f(x_0)}{f'(x_0)}. \]

The new approximation is then

\[ x_1 = x_0 + h = x_0 - \frac{f(x_0)}{f'(x_0)}. \]
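The update formula derived above translates directly into a short program. The following is a minimal sketch, assuming the same test equation as Program 2.1, $f(x) = 2x^2 + 2x - 12$ with derivative $f'(x) = 4x + 2$; the function name `newton`, the starting guess of 2.5 and the default arguments are illustrative choices.

```python
def f(x):
    # Test equation from Program 2.1 (roots at 2 and -3)
    return 2 * x**2 + 2 * x - 12


def f_prime(x):
    # Analytical derivative of f
    return 4 * x + 2


def newton(func, dfunc, x0, tol=1e-8, max_iter=50):
    """Return (root estimate, iterations used) starting from the guess x0."""
    for i in range(1, max_iter + 1):
        slope = dfunc(x0)
        if slope == 0:
            raise ZeroDivisionError("derivative is zero; choose another x0")

        x1 = x0 - func(x0) / slope  # Newton-Raphson update

        # Stopping criterion (i): successive estimates agree within tol
        if abs(x1 - x0) < tol:
            return x1, i

        x0 = x1

    raise RuntimeError("tolerance not reached within max_iter iterations")


root, iterations = newton(f, f_prime, x0=2.5)
print(root, iterations)
```

Starting from a guess on the other side of the stationary point (for example x0 = -4) would lead the iteration to the other root, -3, which illustrates why the initial guess matters.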

2.1.3 Secant method


The need to know the derivative in the Newton-Raphson method presents a difficulty
if the function is not easy to differentiate. This problem is overcome by approximating
the slope of the tangent with the slope of a secant line. Since the derivative is defined by

\[ f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}, \]

for small $h$,

\[ f'(x) \approx \frac{f(x + h) - f(x)}{h}. \]

This is essentially a forward difference (one form of finite difference) approximation.
Setting $x = x_n$ and $h = x_{n-1} - x_n$ gives

\[ f'(x_n) \approx \frac{f(x_{n-1}) - f(x_n)}{x_{n-1} - x_n}. \]

Using this approximation of $f'(x_n)$ in Newton's method we get

\[ x_{n+1} = x_n - f(x_n) \left[ \frac{x_{n-1} - x_n}{f(x_{n-1}) - f(x_n)} \right]. \]

This is the secant method, so named because the term multiplying $f(x_n)$ is the
inverse of the slope of a secant line to the graph of $f(x)$ (see figure 2.3). Algorithm 4
shows how the secant method is implemented.

Figure 2.3: Illustration of the secant method for finding an approximation to the root of a
function $f(x)$. The secant line through the points at $x_{n-1}$ and $x_n$ crosses the x-axis at $x_{n+1}$.

The secant method has two advantages over Newton's method:
(i) No differentiation is required.
(ii) Only one new function evaluation is needed at each step once the computations have started.
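Both advantages are visible in a direct implementation of the secant update. The sketch below is illustrative rather than definitive: the function name `secant`, its default arguments and the guard against a zero denominator are assumptions, and the test function $f(x) = e^{-x} - \sin(\pi x/2)$ with starting values 0.4 and 0.5 is taken from the example in this section.

```python
import math


def f(x):
    # Test function from the example: f(x) = exp(-x) - sin(pi*x/2)
    return math.exp(-x) - math.sin(math.pi * x / 2)


def secant(func, x0, x1, tol=1e-8, max_iter=50):
    """Return (root estimate, last iteration index) from the guesses x0, x1."""
    f0, f1 = func(x0), func(x1)

    for i in range(2, max_iter + 1):
        if f1 == f0:
            raise ZeroDivisionError("secant line is horizontal; change the guesses")

        # Secant update: the derivative is replaced by a difference quotient
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)

        if abs(x2 - x1) < tol:
            return x2, i

        # Shift the points; only ONE new function evaluation per step
        x0, f0 = x1, f1
        x1, f1 = x2, func(x2)

    raise RuntimeError("tolerance not reached within max_iter iterations")


root, last_index = secant(f, 0.4, 0.5)
print(root, last_index)
```

Storing `f0` and `f1` between iterations is what keeps the cost at one function evaluation per step, and no derivative function is needed at all.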
Example

Let f (x) = e−x − πx


, x ∈ [0.4, 0.5]. The derivative of this function f ′ (x) =

sin 2
−e − 2 cos 2 . In order to apply the bisection method, we first check whether
−x π πx


f (a) f (b) < 0 is satisfied, i.e.,

Moffat/PHY485 - Microcomputing for Physical Sciences notes 62


2.1 Iterative Methods CHAPTER 2. NUMERICAL METHODS

Algorithm 4 Algorithm for the Secant method

INPUT: function f, initial values x0 & x1, tolerance TOL, maximum number of iterations N
OUTPUT: approximate solution x or message of failure

i ← 2
f0 ← f(x0)
f1 ← f(x1)

while i ≤ N do
    xi ← x1 − f1 (x1 − x0)/(f1 − f0)
    if |xi − x1| < TOL then
        Output (xi)
        Stop
    end if
    i ← i + 1
    x0 ← x1
    f0 ← f1
    x1 ← xi
    f1 ← f(xi)
end while

Output (‘Method failed after N =’, N, ‘iterations’)
Stop


f (a) f (b) = f (0.4) f (0.5) = 0.0825 × (−0.1006) = −0.0083 < 0.

Therefore, we can use the bisection method to approximate the root. Table 2.1 shows
the results of the three methods in approximating the root of f (x).

Table 2.1: Comparison of the bisection, Newton-Raphson and secant methods in approximating the root of the function $f(x) = e^{-x} - \sin\left(\frac{\pi x}{2}\right)$, $x \in [0.4, 0.5]$. The tolerance used in all the methods is $\epsilon = 1 \times 10^{-8}$.

Method            Root, x       No. of iterations, N
bisection         0.44357353    24
Newton-Raphson    0.44357353    4
secant            0.44357353    5

From table 2.1 it is clear that Newton’s method is the fastest of the three, followed by the secant method, while the bisection method comes in last.
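A comparison of this kind can be sketched in Python as shown next. The starting values (x0 = 0.4 for Newton-Raphson, the pair 0.4 and 0.5 for the secant method) and the stopping tests are assumptions made for this sketch, so the iteration counts it prints need not match table 2.1 exactly.

```python
import math

TOL = 1e-8


def f(x):
    return math.exp(-x) - math.sin(math.pi * x / 2)


def dfdx(x):
    return -math.exp(-x) - (math.pi / 2) * math.cos(math.pi * x / 2)


def bisection(a, b, tol=TOL, max_iter=100):
    for n in range(1, max_iter + 1):
        mid = (a + b) / 2
        if f(mid) == 0 or (b - a) / 2 < tol:
            return mid, n
        if f(a) * f(mid) < 0:   # root lies in the left half
            b = mid
        else:                   # root lies in the right half
            a = mid
    raise RuntimeError("bisection did not converge")


def newton(x, tol=TOL, max_iter=100):
    for n in range(1, max_iter + 1):
        x_new = x - f(x) / dfdx(x)
        if abs(x_new - x) < tol:
            return x_new, n
        x = x_new
    raise RuntimeError("Newton-Raphson did not converge")


def secant(x0, x1, tol=TOL, max_iter=100):
    for n in range(1, max_iter + 1):
        x_new = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        if abs(x_new - x1) < tol:
            return x_new, n
        x0, x1 = x1, x_new
    raise RuntimeError("secant method did not converge")


results = {
    "bisection": bisection(0.4, 0.5),
    "Newton-Raphson": newton(0.4),
    "secant": secant(0.4, 0.5),
}
for name, (root, n) in results.items():
    print(f"{name:<16}{root:.8f}  {n}")
```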
We will now look at solutions of integral equations, where we will consider the trapezoidal and Simpson’s rules.

2.2 Solutions of Integral Equations


There are two main reasons why numerical analysis may be needed to find the solution
to an integral equation:
(i) An analytical solution may be impossible, as for the functions $f(x) = \exp(-x^2)$ and $f(x) = x^x$, or
(ii) One may want to integrate tabulated data rather than known functions.
The basic problem in numerical integration is to compute an approximate solution to
a definite integral

$$\int_a^b f(x)\,dx$$

to a given degree of accuracy. Several methods exist that can be used to find an approximate solution to this integral to a desired precision. Amongst these methods are the trapezoidal and Simpson’s rules, which are the focus of the discussion here.


2.2.1 Trapezoidal rule


[Figure 2.4: Illustration of the trapezoidal rule for finding an approximate solution to the definite integral $\int_a^b f(x)\,dx$. The sketch shows the curve f (x) between a = x0 and b = x1, with the heights f (x0) and f (x1) marked.]

The trapezoidal rule works by approximating the curve between $(x_0, f(x_0))$ and $(x_1, f(x_1))$ with a trapezium (see figure 2.4) and calculating its area. The area is given by base × average height, i.e.,

$$\frac{1}{2}(x_1 - x_0)\,[f(x_0) + f(x_1)] \quad \text{or} \quad \frac{1}{2}h\,[f(a) + f(b)], \quad \text{where } h = b - a.$$

If we divide the region into n trapezoids, we can then add the areas of these trapezoids to obtain the total area under the curve in the region [a, b]. The easiest option is to divide the region into n equal sub-regions of width $h = (b - a)/n$, with nodes $x_i = a + ih$, which gives rise to what is known as the compound or composite trapezoidal rule:

$$I = \int_a^b f(x)\,dx = \frac{h}{2}\left[f(a) + 2\sum_{i=1}^{n-1} f(x_i) + f(b)\right] - \underbrace{\frac{b - a}{12}h^2 f''(\mu)}_{\text{error term}},$$

for some µ in [a, b]. The following shows how the trapezoidal rule can be implemented in a program.

Algorithm 5 Algorithm for the composite trapezoidal rule

INPUT: endpoints a & b, positive integer n
OUTPUT: approximation XI of I

h ← (b − a)/n
XI ← [f(a) + f(b)]/2

for i = 1 to n − 1 do
    x ← a + ih
    XI ← XI + f(x)
end for
XI ← XI × h
Output XI
Stop
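The composite trapezoidal rule translates almost line for line into Python. The sketch below is illustrative only: the function name `trapezoid` and the test integrand sin x on [0, π] (exact integral 2) are assumptions chosen for the example.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal sub-intervals."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))      # end points carry weight 1/2
    for i in range(1, n):
        total += f(a + i * h)        # interior nodes carry weight 1
    return total * h


# Hypothetical check: the integral of sin x from 0 to pi is exactly 2.
approx = trapezoid(math.sin, 0.0, math.pi, 100)
print(approx, abs(approx - 2.0))
```

Because the error term is proportional to h², doubling n should reduce the error by roughly a factor of four.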

2.2.2 Simpson’s rule


Simpson’s rule, rather than fitting a straight line to the curve, fits a parabola (see figure 2.5) and hence is generally more accurate than the trapezoidal rule. It is applied over two intervals with partition points $a = x_0$, $a + h = x_1$ and $b = a + 2h = x_2$, where $h = (b - a)/2$. It is formulated as

$$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[f(a) + 4f\left(\frac{a + b}{2}\right) + f(b)\right], \quad \text{or}$$

$$\int_{x_0}^{x_2} f(x)\,dx \approx \frac{h}{3}\left[f(x_0) + 4f(x_1) + f(x_2)\right], \quad \text{where } h = \frac{b - a}{2}.$$

If we divide the interval of integration into an even number n of sub-intervals, we can apply Simpson’s rule to each successive pair of sub-intervals, giving rise to the composite Simpson’s rule:


[Figure 2.5: Illustration of the Simpson’s rule for finding an approximate solution to the definite integral $\int_a^b f(x)\,dx$. The sketch shows the curve f (x) with the nodes a = x0, x1 and b = x2 on the x-axis.]

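$$I = \int_a^b f(x)\,dx \cong \frac{h}{3}\sum_{j=1}^{n/2}\left[f(x_{2j-2}) + 4f(x_{2j-1}) + f(x_{2j})\right], \quad \text{or}$$

$$I = \int_a^b f(x)\,dx = \frac{h}{3}\left[f(a) + 2\sum_{j=1}^{n/2 - 1} f(x_{2j}) + 4\sum_{j=1}^{n/2} f(x_{2j-1}) + f(b)\right] - \underbrace{\frac{b - a}{180}h^4 f^{(4)}(\mu)}_{\text{error term}},$$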

for some µ in the interval [a, b]. Just as in the composite trapezoidal rule, h is now $h = (b - a)/n$, with n an even integer. Algorithm 6 shows an implementation of the composite Simpson’s rule to approximate the integral $I = \int_a^b f(x)\,dx$.
The trapezoidal and Simpson’s rules form part of a class of numerical integration rules
known as the closed Newton-Cotes formulae with h = (xn − x0 ) /n. The following is
a list of the formulae:
• n = 1: Trapezoidal rule

$$\int_{x_0}^{x_1} f(x)\,dx \approx \frac{h}{2}\left[f(x_0) + f(x_1)\right]$$


Algorithm 6 Algorithm for the composite Simpson’s rule

INPUT: endpoints a & b, even positive integer n
OUTPUT: approximation XI to I

h ← (b − a)/n
XI0 ← f(a) + f(b)
XI1 ← 0    ▷ Summation of f(x_{2i−1})
XI2 ← 0    ▷ Summation of f(x_{2i})

for i = 1 to n − 1 do
    x ← a + ih
    if i is even then
        XI2 ← XI2 + f(x)
    else    ▷ i is odd
        XI1 ← XI1 + f(x)
    end if
end for
XI ← h × (XI0 + 2 × XI2 + 4 × XI1)/3
Output XI
Stop
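A Python version of the composite Simpson’s rule might look like the sketch below. The function name `simpson` and the two test integrands are assumptions for illustration; the second check relies on the fact, stated in these notes, that Simpson’s rule is exact for polynomials of degree 3 or less.

```python
import math


def simpson(f, a, b, n):
    """Composite Simpson's rule; n must be an even positive integer."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be an even positive integer")
    h = (b - a) / n
    odd_sum = sum(f(a + i * h) for i in range(1, n, 2))    # weight 4
    even_sum = sum(f(a + i * h) for i in range(2, n, 2))   # weight 2
    return h * (f(a) + 4 * odd_sum + 2 * even_sum + f(b)) / 3


# Hypothetical checks: integral of sin x on [0, pi] is 2,
# and the integral of x^3 on [0, 2] is 4.
approx_sin = simpson(math.sin, 0.0, math.pi, 100)
approx_cubic = simpson(lambda x: x**3, 0.0, 2.0, 2)
print(approx_sin, approx_cubic)
```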


• n = 2: Simpson’s rule

$$\int_{x_0}^{x_2} f(x)\,dx \approx \frac{h}{3}\left[f(x_0) + 4f(x_1) + f(x_2)\right]$$

• n = 3: Simpson’s three-eighths rule

$$\int_{x_0}^{x_3} f(x)\,dx \approx \frac{3h}{8}\left[f(x_0) + 3f(x_1) + 3f(x_2) + f(x_3)\right]$$

Note that Newton-Cotes formulae are considered open if the end points are not included in the nodes used to approximate the integral, that is, $x_0 = a + h$ and $x_n = b - h$, and therefore $h = (b - a)/(n + 2)$. Here are some examples of the closed rules:
(1) The trapezoidal rule for a function f on the interval [0, 2] is

$$\int_0^2 f(x)\,dx \approx f(0) + f(2), \quad \text{and}$$

(2) Simpson’s rule on the same interval is

$$\int_0^2 f(x)\,dx \approx \frac{1}{3}\left[f(0) + 4f(1) + f(2)\right].$$

The results to three places for some elementary functions are summarised in table 2.2. Note that Simpson’s rule gives better results than the trapezoidal rule; furthermore, it gives exact solutions for any polynomial of degree 3 or less.

Table 2.2: Numerical integration of some elementary functions on the interval [0, 2] using the trapezoidal and Simpson’s rules.

f(x)           $x^2$    $x^4$     $1/(x+1)$   $\sqrt{1+x^2}$   $\sin x$   $e^x$
Exact value    2.667    6.400     1.099       2.958            1.416      6.389
Trapezoidal    4.000    16.000    1.333       3.236            0.909      8.389
Simpson’s      2.667    6.667     1.111       2.964            1.425      6.421
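The single-panel rules on [0, 2] are simple enough to evaluate directly, which gives a quick way of checking entries such as those in table 2.2. The sketch below is illustrative; the dictionary `functions` and the helper names are assumptions introduced here.

```python
import math

# Integrands from table 2.2, keyed by a printable label.
functions = {
    "x^2": lambda x: x**2,
    "x^4": lambda x: x**4,
    "1/(x+1)": lambda x: 1 / (x + 1),
    "sqrt(1+x^2)": lambda x: math.sqrt(1 + x**2),
    "sin x": math.sin,
    "e^x": math.exp,
}


def trapezoid_0_2(f):
    """Trapezoidal rule on [0, 2]: h = 2, so (h/2)[f(0) + f(2)] = f(0) + f(2)."""
    return f(0) + f(2)


def simpson_0_2(f):
    """Simpson's rule on [0, 2]: h = 1, so (h/3)[f(0) + 4f(1) + f(2)]."""
    return (f(0) + 4 * f(1) + f(2)) / 3


print(f"{'f(x)':<14}{'Trapezoidal':>12}{'Simpson':>10}")
for label, func in functions.items():
    print(f"{label:<14}{trapezoid_0_2(func):>12.3f}{simpson_0_2(func):>10.3f}")
```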


2.3 Solutions of Differential Equations


Here we will consider two methods, Euler’s and the Runge-Kutta methods, which we will apply to differential equations known as initial-value problems. These are equations for which the value of the solution is known at the starting point. They have the form

$$y' = \frac{dy}{dz} = f(z, y), \quad \text{for } a \le z \le b,$$

subject to the initial condition

$$y(a) = \alpha.$$
Examples

Equation            Initial Value    Solution
$x' = x + 1$        $x(0) = 0$       $x = e^z - 1$
$x' = 6z - 1$       $x(1) = 6$       $x = 3z^2 - z + 4$
$x' = z/(x + 1)$    $x(0) = 0$       $x = \sqrt{z^2 + 1} - 1$

Differential equations are used to model problems in science and engineering that involve the change of some variable with respect to another. The majority of these problems require the solution to an initial-value problem, i.e., a solution to a differential equation that satisfies a given initial condition.
The Initial-value Problem

$$\frac{dy}{dz} = f(z, y), \quad a \le z \le b, \quad y(a) = \alpha,$$
is said to be a well-posed problem if:
(i) a unique solution, y (z), to the problem exists.
(ii) for any ϵ > 0, there exists a positive constant k (ϵ) with the property that, when-
ever |ϵ0 | < ϵ and δ (z) is continuous with |δ (z)| < ϵ on [a, b], a unique solution,
g (z), to the problem

$$\frac{dg}{dz} = f(z, g) + \delta(z), \quad a \le z \le b, \quad g(a) = \alpha + \epsilon_0,$$

exists with

$$|g(z) - y(z)| < k(\epsilon)\,\epsilon, \quad \text{for all } a \le z \le b.$$

The problem in (ii) above is called a perturbed problem associated with the original
problem in (i). Numerical methods will always be concerned with solving a perturbed
problem, since any error introduced in the representation will result in a problem of
this type.
Let our solution function y be represented by its Taylor series

$$y(z + h) = y(z) + h\,y'(z) + \frac{1}{2!}h^2 y''(z) + \frac{1}{3!}h^3 y'''(z) + \frac{1}{4!}h^4 y^{(4)}(z) + \cdots + \frac{1}{n!}h^n y^{(n)}(z) + \cdots$$

Truncating the Taylor series after n + 1 terms enables us to compute $y(z + h)$ rather accurately if h is small and if $y(z), y'(z), \ldots, y^{(n)}(z)$ are known. If the Taylor series is truncated after two terms, we have what is known as Euler’s method. To find approximate values of the solution to the initial-value problem

$$y' = f(z, y), \quad a \le z \le b, \quad y(a) = \alpha,$$

the approximation

$$y(z + h) \approx y(z) + h\,y'(z), \quad \text{that is,} \quad y(z + h) \approx y(z) + h f(z, y(z)),$$

can be used to step from z = a to z = b with N steps of size $h = (b - a)/N$. Writing $z_i = a + ih$ and $w_i \approx y(z_i)$, Euler’s method can thus be expressed as

$$w_0 = y(z_0) = \alpha,$$
$$w_{i+1} = w_i + h f(z_i, w_i), \quad \text{for } i = 0, 1, \ldots, N - 1.$$

Algorithm 7 approximates the solution of the initial-value problem using Euler’s method. A pseudocode with prescribed values for N, a, b and α (= y (a), stored in w) is shown in algorithm 8. Note that to turn the pseudocode in algorithm 8 into a program, code for the function f (z, w) is needed, an example of which is shown in program 2.2.


Algorithm 7 Implementation of Euler’s method

INPUT: endpoints a & b, integer N, initial condition α
OUTPUT: approximation w to y at the N + 1 values of z

h ← (b − a)/N
z ← a
w ← α
Output (z, w)

for i = 1 to N do
    w ← w + h f(z, w)    ▷ compute w_i
    z ← a + ih           ▷ compute z_i
    Output (z, w)
end for
Stop

Algorithm 8 Euler’s method with prescribed values

program euler
    integer N ← 100
    real a ← 1, b ← 2, w ← −4
    integer i
    real h, z
    h ← (b − a)/N
    z ← a
    Output 0, z, w

    for i = 1 to N do
        w ← w + h f(z, w)
        z ← z + h
        Output i, z, w
    end for
end program euler


Program 2.2: Function f (z, w) for the euler algorithm.

    def f(z, w):
        y = 1 + w**2 + z**3
        return y
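Putting the Euler pseudocode and the function f (z, w) together gives a complete program. The following is a minimal sketch, not the course’s reference implementation: the function name `euler` and the choice to return a list of (z, w) pairs are assumptions, while the values N = 100, a = 1, b = 2 and w = −4 are the prescribed values quoted in algorithm 8.

```python
def f(z, w):
    """Right-hand side of the ODE, as in program 2.2."""
    return 1 + w**2 + z**3


def euler(f, a, b, alpha, n):
    """Euler's method: returns the list of (z_i, w_i) for i = 0..n."""
    h = (b - a) / n
    z, w = a, alpha
    points = [(z, w)]
    for i in range(1, n + 1):
        w = w + h * f(z, w)   # w_i = w_{i-1} + h f(z_{i-1}, w_{i-1})
        z = a + i * h         # z_i
        points.append((z, w))
    return points


points = euler(f, a=1.0, b=2.0, alpha=-4.0, n=100)
for i, (z, w) in enumerate(points[:3]):
    print(i, z, w)
print("last point:", points[-1])
```

Computing z as a + ih rather than by repeated addition of h avoids accumulating rounding error in z over many steps.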

Euler’s method allows us to obtain approximate solutions to initial-value problems. To obtain better results we can use higher-order Taylor series. However, this requires differentiating the original equation, and the higher-order derivatives may be difficult to obtain or may not exist. Other methods exist that provide better accuracy without the complication of having to find the derivatives. These are discussed below.

Runge-Kutta methods

The Runge-Kutta methods have the accuracy of the higher-order Taylor series methods
without the determination of the derivatives. The Runge-Kutta methods are classified
by the order (number of terms/ functions to be evaluated). The Runge-Kutta method
of order 2 has two forms: the modified Euler and Heun’s methods.
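(i) Modified Euler Method
It evaluates

$$w_0 = \alpha,$$
$$w_{i+1} = w_i + \frac{h}{2}\left[f(z_i, w_i) + f\left(z_{i+1}, w_i + h f(z_i, w_i)\right)\right], \quad \text{for } i = 0, 1, 2, \ldots, N - 1.$$

(ii) Heun’s Method
It evaluates

$$w_0 = \alpha,$$
$$w_{i+1} = w_i + \frac{h}{4}\left[f(z_i, w_i) + 3f\left(z_i + \frac{2}{3}h, w_i + \frac{2}{3}h f(z_i, w_i)\right)\right], \quad \text{for } i = 0, 1, 2, \ldots, N - 1.$$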
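Both second-order formulas can be written as single-step functions and driven by the same loop. The sketch below is illustrative: the function names and the test problem y′ = y, y(0) = 1 on [0, 1] (exact value e at z = 1) are assumptions made for this example.

```python
import math


def modified_euler_step(f, z, w, h):
    """One step of the modified Euler method."""
    k = f(z, w)
    return w + 0.5 * h * (k + f(z + h, w + h * k))


def heun_step(f, z, w, h):
    """One step of Heun's method in the form given above."""
    k = f(z, w)
    return w + 0.25 * h * (k + 3 * f(z + 2 * h / 3, w + 2 * h * k / 3))


def integrate(step, f, a, b, alpha, n):
    """Apply a one-step method n times from z = a to z = b."""
    h = (b - a) / n
    w = alpha
    for i in range(n):
        w = step(f, a + i * h, w, h)
    return w


def rhs(z, w):
    # Hypothetical test problem: y' = y with y(0) = 1, so y(1) = e.
    return w


w_modified = integrate(modified_euler_step, rhs, 0.0, 1.0, 1.0, 100)
w_heun = integrate(heun_step, rhs, 0.0, 1.0, 1.0, 100)
print(w_modified, w_heun, math.e)
```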

Higher orders of the Runge-Kutta methods are available, with order 4 being the most
commonly used method:


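Runge-Kutta method of order 4 (RK4)

$$\begin{aligned}
w_0 &= \alpha, \\
k_1 &= h f(z_i, w_i), \\
k_2 &= h f\left(z_i + \frac{h}{2}, w_i + \frac{1}{2}k_1\right), \\
k_3 &= h f\left(z_i + \frac{h}{2}, w_i + \frac{1}{2}k_2\right), \\
k_4 &= h f(z_{i+1}, w_i + k_3), \\
w_{i+1} &= w_i + \frac{1}{6}(k_1 + 2k_2 + 2k_3 + k_4), \quad \text{for } i = 0, 1, 2, \ldots, N - 1.
\end{aligned}$$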

The use of $k_1$, $k_2$, $k_3$ and $k_4$ is to eliminate the need for successive nesting in the second variable of $f(z, y)$. Implementation of the RK4 method to approximate the solution of the initial-value problem

$$y' = f(z, y), \quad a \le z \le b, \quad y(a) = \alpha$$

at N + 1 equally spaced numbers in the interval [a, b] is shown in algorithm 9.


To illustrate algorithm 9, we consider the initial-value problem

$$\begin{cases} x' = 2 + (x - z - 1)^2 \\ x(1) = 2 \end{cases} \tag{2.10}$$

whose exact solution is $x(z) = 1 + z + \tan(z - 1)$ (Cheney & Kincaid, 2013). A Python program to solve this problem on the interval [a = 1, b = 1.5625] by the 4th order Runge-Kutta method is shown in program 2.3. The step size needed can be obtained from $h = (b - a)/n$, where n = 72, say. The final value of the computed numerical solution is x (1.5625) = 3.192937699.


Algorithm 9 Runge-Kutta method of order 4 (RK4)

INPUT: endpoints a & b, integer N, initial condition α
OUTPUT: approximation w to y at the N + 1 values of z

h ← (b − a)/N
z ← a
w ← α
Output (z, w)

for i = 1 to N do
    k1 ← h f(z, w)
    k2 ← h f(z + h/2, w + k1/2)
    k3 ← h f(z + h/2, w + k2/2)
    k4 ← h f(z + h, w + k3)
    w ← w + (k1 + 2k2 + 2k3 + k4)/6    ▷ compute w_i
    z ← a + ih                          ▷ compute z_i
    Output (z, w)
end for
Stop


Program 2.3: Implementation of the 4th order Runge-Kutta method.

    # rk4method.py

    # Program uses Runge-Kutta method to solve the following ODE problem:
    # -- approximate the solution to the initial value problem
    #    dx/dz = 2 + (x - z - 1)^2, 1 <= z <= 1.5625, x(1) = 2, using
    #    n = 72 steps.

    def df(z, x):
        # Define the ODE: dx/dz = 2 + (x - z - 1)^2
        return 2 + (x - z - 1)**2

    def rk4(n, a, b, alpha, df):
        # RK4 method implementation
        h = (b - a) / n  # Step size
        z = a            # Initial value of z
        w = alpha        # Initial value of x (w represents x in the RK4 method)

        # Print header
        print(f"{'i':<4} {'z':<16} {'w':<16}")
        print(f"{0:<4} {z:<16.6f} {w:<16.6f}")

        # Perform RK4 iterations
        for i in range(1, n + 1):
            k1 = h * df(z, w)
            k2 = h * df(z + h / 2, w + k1 / 2)
            k3 = h * df(z + h / 2, w + k2 / 2)
            k4 = h * df(z + h, w + k3)
            w = w + (k1 + 2 * k2 + 2 * k3 + k4) / 6
            z = a + i * h

            # Print current step
            print(f"{i:<4} {z:<16.6f} {w:<16.6f}")

        # Print final result
        print(f"\nThe computed value is x(1.5625) = {w:.6f}.")

    def main():
        # Set parameters and call RK4 method
        n = 72        # Number of steps
        a = 1.0       # Start of interval
        b = 1.5625    # End of interval
        alpha = 2.0   # Initial condition x(1) = 2

        rk4(n, a, b, alpha, df)

    if __name__ == "__main__":
        main()
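Since the exact solution x(z) = 1 + z + tan(z − 1) is known for this problem, the RK4 result can be checked against it. The sketch below is a compact variant written for that purpose; the helper names `rk4_final` and `exact` are assumptions, and unlike program 2.3 it returns only the final value instead of printing every step.

```python
import math


def df(z, x):
    """Right-hand side of problem (2.10): dx/dz = 2 + (x - z - 1)^2."""
    return 2 + (x - z - 1) ** 2


def exact(z):
    """Exact solution quoted in the notes."""
    return 1 + z + math.tan(z - 1)


def rk4_final(f, a, b, alpha, n):
    """Classical RK4; returns the approximation at z = b only."""
    h = (b - a) / n
    z, w = a, alpha
    for i in range(1, n + 1):
        k1 = h * f(z, w)
        k2 = h * f(z + h / 2, w + k1 / 2)
        k3 = h * f(z + h / 2, w + k2 / 2)
        k4 = h * f(z + h, w + k3)
        w = w + (k1 + 2 * k2 + 2 * k3 + k4) / 6
        z = a + i * h
    return w


w_final = rk4_final(df, 1.0, 1.5625, 2.0, 72)
error = abs(w_final - exact(1.5625))
print(f"RK4: {w_final:.9f}  exact: {exact(1.5625):.9f}  error: {error:.2e}")
```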



3 DATA REDUCTION AND ERROR ANALYSIS
3.1 Introduction
Definitions/ Considerations
• Data Reduction is presentation of the data in a manner that enables meaningful
conclusions to be drawn.
• Distribution: how the error is spread or the differences in measurements of the
same quantity.
• Propagation of errors: the manner in which errors come into measurements and
how they change with changes in observation.
• Estimate of mean and errors: mean is the average value or the centre of the data,
and the errors tell us how spread out the data is.
• Least-squares fit: a technique used to fit functions to experimental data such that
the square of the difference between the fitting function and experimental data
is as small as it can possibly be.
• Goodness of fit: how well the fitting function describes the experimental data. For example, the R² value reported by Microsoft Excel takes a value between 0 and 1, with a value closer to 1 indicating a better fit; e.g., a value of 0.8235 means that the fit is able to account for 82.35% of the total variation in the data.


3.2 Distributions
3.2.1 Empirical Distributions
Suppose we have gathered measurements of a parameter x, represented as

xi = {x1 , x2 , · · · , xN } . (3.1)

Organising these data points can unveil underlying patterns. One common method
for visualising data is through a histogram. It is constructed by grouping or arrang-
ing measurements or data points by intervals. For example, a grouping of imaginary
PHY485 students’ Test 1 marks are shown in the form of a histogram in figure 3.1.
The test marks are grouped into intervals 0−9, 10−19, 20−29, 30−39, 40−49, 50−59,
60 − 69 and 70 − 79. The intervals form the horizontal axis and numbers or frequency
form the vertical axis. A histogram provides valuable information on the characteristics
of the data such as central tendency (characterised by the mean and the median), disper-
sion (characterised by the range, the standard deviation and the variance) and general shape.

[Figure 3.1: Histogram showing imaginary test marks for PHY485 students. Horizontal axis: PHY485 Test Marks (0 to 80); vertical axis: Frequency.]


Measures of central tendency

Arithmetic mean (also called mean or average)

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \tag{3.2}$$

is the sum of all observations divided by number of observations. The arithmetic mean
is
(i) often used as an estimate of the population mean µ for the underlying theoretical
distribution.
(ii) sensitive to outliers (extreme values).
Median

$$\tilde{x} = x_{(N+1)/2}, \tag{3.3}$$

is the x-value in the middle of the sorted data (for even N, the two middle values are averaged). It is
(i) often used as an alternative measure of central tendency.
(ii) less affected by outliers than the arithmetic mean: only the position of an outlier in the ordered data influences the median, not its absolute value.
Mode
Mode is another important measure of central tendency. It is the most frequent value of the data. Data has no mode if no value appears more frequently than any other value. A unimodal frequency distribution is one with one mode, a bimodal one has two modes, and a frequency distribution with many modes is called multimodal. Just like the median, the mode is not highly influenced by outliers.
Measures of Dispersion
Range
Range is the difference between the highest and lowest values and it is given by

∆x = xmax − xmin . (3.4)

It is very susceptible to outliers, hence it is not a reliable measure of dispersion.


Standard deviation


Standard deviation is the most useful measure of dispersion. It measures the typical (root-mean-square) deviation of the data points from the mean. It is expressed as

$$s = \sqrt{\frac{1}{N - 1}\sum_{i=1}^{N}(x_i - \bar{x})^2}. \tag{3.5}$$

Note that equation 3.5 gives the standard deviation of a sample or empirical distribu-
tion and it is often used as an estimate of the population standard deviation σ. When
calculating the population standard deviation N is used instead of N − 1 in the denom-
inator.
Variance
This is another important measure of dispersion. It is simply the square of the standard deviation. It is defined by

$$s^2 = \frac{1}{N - 1}\sum_{i=1}^{N}(x_i - \bar{x})^2. \tag{3.6}$$

The variance is commonly used in many applications instead of the standard deviation.
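The measures of central tendency and dispersion described above are all available in Python’s standard `statistics` module. The sketch below uses a small set of hypothetical marks invented for this example; `stdev` and `variance` use the N − 1 denominator of equations 3.5 and 3.6, while `pstdev` uses N.

```python
import statistics

# Hypothetical test marks, invented for illustration only.
marks = [35, 42, 42, 48, 51, 55, 57, 63, 68, 74]

mean = statistics.mean(marks)          # arithmetic mean, equation 3.2
median = statistics.median(marks)      # middle of the sorted data
mode = statistics.mode(marks)          # most frequent value
data_range = max(marks) - min(marks)   # range, equation 3.4
s = statistics.stdev(marks)            # sample standard deviation (N - 1)
s2 = statistics.variance(marks)        # sample variance (N - 1)
sigma = statistics.pstdev(marks)       # population standard deviation (N)

print(f"mean = {mean}, median = {median}, mode = {mode}, range = {data_range}")
print(f"s = {s:.3f}, s^2 = {s2:.3f}, sigma = {sigma:.3f}")
```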

3.2.2 Theoretical Distributions


Before we go any further, let us get some things out of the way. Suppose you are playing a game in which you roll a dice. The six sides of the dice carry different numbers of spots, ranging from one to six. Every time you roll the dice you can get any of the six values. Essentially the outcome, call it X, is random, and data generated from rolling the dice is discrete because you can only get 1, 2, 3, 4, 5 or 6 and nothing else. The probability of getting any number on the dice is 1/6. We can define a probability function f (x) that assigns these probabilities to all the six outcomes of the random variable X as

$$f(x) = P(X = x) = \begin{cases} p_i & \text{if } x = x_i, \\ 0 & \text{otherwise.} \end{cases} \tag{3.7}$$

In this equation, x is any of the six numbers and f (x) represents the probability that X = x. We call this function the probability mass function (PMF) since the outcome is discrete. Clearly, the sum of all probabilities is unity, i.e.,


$$\sum_{i=1}^{N} f(x_i) = 1. \tag{3.8}$$

We may also be interested in knowing the probability of getting less than or equal to
any of the numbers on the dice. Writing that as F (x) = P (X ≤ x) defines what we
call the cumulative distribution function (CDF) (see figure 3.2).

[Figure 3.2: An example of the probability distribution functions for a discrete random variable. Panel (a), Probability Mass Function (PMF), shows the probability of any outcome occurring when throwing a dice. Panel (b), Cumulative Distribution Function (CDF), shows the corresponding cumulative distribution. Horizontal axes: Outcome (1 to 6).]
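The PMF and CDF of a fair dice can be tabulated in a few lines of Python. This is a minimal sketch; the names `pmf` and `cdf` are assumptions, and exact fractions are used so that the sum of the probabilities is exactly one.

```python
from fractions import Fraction

# PMF of a fair dice: each of the six outcomes has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def cdf(x):
    """F(x) = P(X <= x): sum the PMF over all outcomes not exceeding x."""
    return sum(p for outcome, p in pmf.items() if outcome <= x)


print("sum of probabilities:", sum(pmf.values()))
for x in range(1, 7):
    print(x, pmf[x], cdf(x))
```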

Suppose now that you are looking at the height of all women in the world. The global average height of adult women is about 165 cm with about 95% of heights falling within two standard deviations of the mean (i.e., within the interval of ~151 - 179 cm).
Between any two values, let us say 163 cm and 164 cm, there are an infinite number
of heights, e.g., 163.25 cm, 163.25678 cm, 163.2567896 cm , etc. If we use X the
same way as was used above in the case of discrete probability distribution, there are
an infinite number of values X can assume in this case and the probability of selecting
any one specific value is zero. The best thing one can do is get a value that is very close
to the exact one. Because of this situation, height is considered a continuous variable
and we now consider ranges of values (such as P (a < X < b), P (X > c), etc.) instead

of single values. In that sense, if an infinite number of measurements were to be taken of a distribution, it is convenient to consider a probability density function (PDF) f (x) (see figure 3.3) whose integral over the sample space (the total probability) is unity, i.e., the area under the entire curve is (see, for example, Cheney & Kincaid (2013); Walpole & Myers (1993)):

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1. \tag{3.9}$$

Its cumulative distribution function F (x) (that is, the probability that the random variable X will take a value less than or equal to x) is given by

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt \quad \text{for } -\infty < x < \infty. \tag{3.10}$$

[Figure 3.3: An example of a probability density function (PDF) f (x) showing the heights of all women in the world in (a) and its cumulative distribution function (CDF) in (b). Horizontal axes: Height (cm), from 140 to 190; vertical axes: Probability Density in (a) and Cumulative Probability in (b).]

Note that when x → ∞, the value of F (x) approaches unity. More precisely, a PDF specifies the probability of a random variable, call it X, falling within a particular range of values, say between x = a and x = b. The probability is given by


$$P(a < X < b) = \int_a^b f(x)\,dx. \tag{3.11}$$

Following from the definitions given in equations 3.10 and 3.11, we can conclude the following:

$$P(a < X < b) = F(b) - F(a) \quad \text{and} \tag{3.12a}$$

$$f(x) = \frac{dF(x)}{dx}. \tag{3.12b}$$

Uniform Distribution
A uniform distribution is a distribution with a constant probability. Its corresponding PDF is defined as

$$f(x) = \frac{1}{N} = \text{const}, \tag{3.13}$$

with a cumulative distribution function $F(x) = x/N$. The variable x in this case can take any of N possible values, each with the same probability.
Binomial Distribution
The binomial distribution gives the discrete probability of x successes out of N experiments or trials, with probability p of success in any given trial. Its probability mass function is defined as

$$f(x) = \binom{N}{x} p^x (1 - p)^{N - x} \tag{3.14}$$

with a cumulative distribution function given by

$$F(x) = \sum_{i=0}^{x} \binom{N}{i} p^i (1 - p)^{N - i} \tag{3.15}$$

where

$${}^{N}C_r = \binom{N}{r} = \frac{N!}{r!\,(N - r)!} \tag{3.16}$$
is the binomial coefficient. For the binomial distribution the mean is $\mu = Np$ and the variance is $\sigma^2 = Np(1 - p)$. Examples of results that can be modelled by a binomial distribution include the probability of obtaining a tail when tossing a coin, the probability of a drug curing a disease, the percentage chance of an adult Motswana suffering from a specific disease and the average percentage of defective computer chips at a manufacturing plant. The outcomes of such trials are said to be dichotomous: each trial is either a success or a failure.
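The binomial probabilities, and the stated mean Np and variance Np(1 − p), can be checked numerically. The sketch below is illustrative; the function names and the values N = 10, p = 0.5 (ten tosses of a fair coin) are assumptions chosen for the example.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials (equation 3.14)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """Probability of at most x successes in n trials (equation 3.15)."""
    return sum(binomial_pmf(i, n, p) for i in range(x + 1))


n, p = 10, 0.5   # hypothetical example: ten tosses of a fair coin
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in range(n + 1))

print("P(X = 5)  =", binomial_pmf(5, n, p))
print("P(X <= 5) =", binomial_cdf(5, n, p))
print("mean =", mean, " variance =", variance)
```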
Poisson Distribution
The Poisson distribution is an approximation to the binomial distribution for the special case where the number of trials N → ∞ and the success probability p → 0, with one single parameter λ = N p. It characterises events with extremely low occurrence, e.g., floods and storms. The probability mass function is defined as

$$f(x) = \frac{e^{-\lambda}\lambda^x}{x!}, \tag{3.17}$$

and the cumulative distribution function is given by

$$F(x) = \sum_{i=0}^{x} \frac{e^{-\lambda}\lambda^i}{i!}. \tag{3.18}$$

The parameter λ describes both the mean and the variance of this distribution, i.e., the
variance equals the mean of the distribution.
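The limiting relationship between the binomial and Poisson distributions can be illustrated numerically by keeping λ = Np fixed while N is made large. The values λ = 2 and N = 1000 in the sketch below are assumptions chosen for the example.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """Poisson probability of x events (equation 3.17)."""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x, n, p):
    """Binomial probability of x successes in n trials (equation 3.14)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


lam = 2.0          # hypothetical mean number of events
n = 1000           # large number of trials
p = lam / n        # small success probability, so that n * p = lam

max_diff = 0.0
for x in range(11):
    b = binomial_pmf(x, n, p)
    q = poisson_pmf(x, lam)
    max_diff = max(max_diff, abs(b - q))
    print(f"{x:>2}  binomial = {b:.6f}  Poisson = {q:.6f}")
print("largest difference:", max_diff)
```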
Normal (Gaussian) Distribution
This is used when the mean is the most frequent and most likely value. It is also an approximation for the binomial distribution for the special case where p = 0.5 (symmetric) and N → ∞. Its PDF is defined by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \tag{3.19}$$

and the cumulative distribution function is

$$F(x) = \frac{1}{\sigma\sqrt{2\pi}}\int_{-\infty}^{x}\exp\left[-\frac{1}{2}\left(\frac{y - \mu}{\sigma}\right)^2\right]dy. \tag{3.20}$$
Figure 3.4 shows a sketch of a probability density function for a normal distribution.

[Figure 3.4: Bell shape of a normal distribution, with the horizontal axis marked at the mean and at σ and 2σ on either side. Note that 68% of the x values lie within ±1 s.d. of the mean, 95% of the x values lie within ±2 s.d. of the mean and 99.7% of the x values lie within ±3 s.d. of the mean.]

When the mean is 0 and the standard deviation is 1 we have what is known as the standard normal distribution, with a PDF described by

$$f(x) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right). \tag{3.21}$$
The normal distribution has been found to characterise many physical events, for in-
stance, test marks for PHY112 take the form of a normal distribution.
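The 68%, 95% and 99.7% figures quoted for the normal distribution follow from P(a < X < b) = F(b) − F(a). The sketch below evaluates the normal CDF through the error function, using the standard identity F(x) = ½[1 + erf((x − µ)/(σ√2))], which is not derived in these notes; the function names are assumptions.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal probability density function (equation 3.19)."""
    u = (x - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF written in terms of the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


within = {}
for k in (1, 2, 3):
    # Probability of falling within k standard deviations of the mean.
    within[k] = normal_cdf(k) - normal_cdf(-k)
    print(f"P(|X - mu| < {k} sigma) = {within[k]:.4f}")
```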

3.2.3 Goodness of fit test


Sometimes one wishes to test the hypothesis that the observations were drawn from a specific theoretical distribution (Uniform, Binomial, Poisson, Gaussian, etc.). One of the most common statistical tests is the chi-square (χ²) test defined as:

$$\chi^2 = \sum_{i=1}^{k}\frac{(O_i - E_i)^2}{E_i}, \tag{3.22}$$

where $O_i$ is the observed number of events in a category/class or frequency bin, $E_i$ is the expected number according to some known theoretical distribution (or fitting function) and k is the number of categories (or frequency bins). The null hypothesis (i.e., that the $O_i$’s are drawn from the population represented by the $E_i$’s) can be rejected if χ² has a large value.
The computed χ² value is compared with a χ² distribution with df = k − 1 degrees of freedom. For example, let us say we want to perform a goodness of fit test on an example problem involving the number of absentees in five consecutive lectures of PHY485. We expect 6 students to be absent for each of the 5 lectures. The observed and expected numbers of absentees for the five consecutive lectures are shown in the table of figure 3.5 together with the corresponding χ² value. We want to test the null hypothesis that the observed and expected counts come from the same theoretical distribution, against the alternative hypothesis that they are independent, at the 5% significance level. In order to do this we need two parameters, the degrees of freedom df and the critical χ² value. For this particular data, df = 4 and the critical χ² value is 9.49. The critical χ² value is obtained from standard χ² distribution tables, an extract of which is shown in figure 3.6. Using this information we find that our χ² value is 9.17. Since this value is below the critical value, we cannot reject the null hypothesis! Figure 3.7 summarises this scenario.

Figure 3.5: Observed and Expected absentees for five consecutive lectures for PHY485.
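The χ² statistic is straightforward to compute by hand or in code. In the sketch below the observed counts are hypothetical values invented for illustration (they are not the data of figure 3.5); the expected count of 6 per lecture and the critical value 9.49 for df = 4 are the ones quoted above.

```python
# Hypothetical observed absentees for five lectures (NOT the data in figure 3.5).
observed = [8, 5, 3, 9, 5]
expected = [6] * 5              # 6 absentees expected per lecture

# Chi-square statistic, equation 3.22.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1          # degrees of freedom, k - 1
critical = 9.49                 # critical value for df = 4 at the 5% level

print(f"chi2 = {chi2:.2f}, df = {df}, critical value = {critical}")
if chi2 > critical:
    print("Reject the null hypothesis.")
else:
    print("Cannot reject the null hypothesis.")
```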
The next section highlights a way of incorporating errors in measured quantities in the computation of the final error in the dependent variable.

3.3 Error Propagation


In any experiment one would want to know how to calculate the final error or un-
certainty in a dependent variable, say z, which is a function of a number of measured
variables, say x and y. In other words, it is important to know how to propagate or
carry over uncertainties in the measured quantities to the final error in the dependent


Figure 3.6: Extract of χ2 distribution table with the critical value shown for our PHY485
example.


Figure 3.7: Principles of a χ2 -test. This is for the example of absentee numbers in five con-
secutive PHY485 lectures. The alternative hypothesis that the observed and the expected dis-
tributions are independent is rejected since the χ2 value is below the critical value.

variable. The standard deviations σ of the individual measurements can be combined in some way to estimate the final uncertainty of the result.
Considering the example of the dependent variable z given above, its error estimate or standard deviation can be found in terms of the variances $\sigma_x^2$ and $\sigma_y^2$ and the covariance $\sigma_{xy}^2$ of the variables x and y which are actually measured:

$$\sigma_z = \sqrt{\sigma_x^2\left(\frac{\partial z}{\partial x}\right)^2 + \sigma_y^2\left(\frac{\partial z}{\partial y}\right)^2 + 2\sigma_{xy}^2\left(\frac{\partial z}{\partial x}\right)\left(\frac{\partial z}{\partial y}\right)}. \tag{3.23}$$
Equation 3.23 is known as the error propagation equation. The third term in this equation is the average of cross terms involving products of deviations in x and y weighted by the product of the partial derivatives. If the uncertainties in x and y are uncorrelated, i.e., are not dependent on each other, equation 3.23 reduces to

$$\sigma_z = \sqrt{\sigma_x^2\left(\frac{\partial z}{\partial x}\right)^2 + \sigma_y^2\left(\frac{\partial z}{\partial y}\right)^2}. \tag{3.24}$$
In general, for a dependent variable z determined from an arbitrary number of variables
whose uncertainties are uncorrelated (note that errors can be correlated; one example
is in nutritional epidemiology, where errors in the estimation of the intake of different
nutrients are correlated), the final result can be estimated from

\[
\sigma_z^2 = \sigma_x^2 \left(\frac{\partial z}{\partial x}\right)^2 + \sigma_y^2 \left(\frac{\partial z}{\partial y}\right)^2 + \cdots \tag{3.25}
\]
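Equation 3.25 can be evaluated numerically when the partial derivatives are tedious to work out by hand. The following is a minimal sketch, not part of the original notes: the function name propagate_uncorrelated, the central-difference step size and the numbers in the example are all illustrative assumptions.

```python
import math


def propagate_uncorrelated(f, values, sigmas, rel_step=1e-6):
    """Estimate sigma_z for z = f(*values), assuming uncorrelated inputs (eq. 3.25).

    The partial derivatives are approximated by central differences, so the
    result is only as good as that approximation.
    """
    variance = 0.0
    for k, (v, s) in enumerate(zip(values, sigmas)):
        h = rel_step * max(abs(v), 1.0)          # step used for the k-th variable
        up = list(values)
        down = list(values)
        up[k] = v + h
        down[k] = v - h
        dz_dv = (f(*up) - f(*down)) / (2.0 * h)  # approximate partial derivative
        variance += (dz_dv * s) ** 2             # one term of eq. 3.25
    return math.sqrt(variance)


# Illustrative values only: z = x * y with x = 2.0 +/- 0.1 and y = 3.0 +/- 0.2
sigma_z = propagate_uncorrelated(lambda x, y: x * y, [2.0, 3.0], [0.1, 0.2])
print(f"sigma_z = {sigma_z:.4f}")
```

For this product the analytic result from equation 3.24 is sqrt((3 × 0.1)² + (2 × 0.2)²) = 0.5, which the numerical estimate should reproduce closely.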

Uncertainties for Specific Formulas


Table 3.1 shows a summary of specific error formulas. The parameters a and b are
defined as constants and x and y are variables.

Table 3.1: Uncertainties of specific mathematical operations. Note that a and b are constants.

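Operation                 z                 σ_z

addition & subtraction    $ax + by$         $\sqrt{a^2\sigma_x^2 + b^2\sigma_y^2 + 2ab\,\sigma_{xy}^2}$
multiplication            $axy$             $z\,\sqrt{\dfrac{\sigma_x^2}{x^2} + \dfrac{\sigma_y^2}{y^2} + 2\,\dfrac{\sigma_{xy}^2}{xy}}$
division                  $a\,\dfrac{x}{y}$         $z\,\sqrt{\dfrac{\sigma_x^2}{x^2} + \dfrac{\sigma_y^2}{y^2} - 2\,\dfrac{\sigma_{xy}^2}{xy}}$
powers                    $ax^b$            $b\,\dfrac{\sigma_x}{x}\,z$
exponentials              $ae^{bx}$         $b\,\sigma_x\,z$
                          $a^{bx}$          $(b \ln a)\,\sigma_x\,z$
logarithms                $a \ln(bx)$       $a\,\dfrac{\sigma_x}{x}$
angle functions           $a \cos(bx)$      $-\sigma_x\,ab \sin(bx)$
                          $a \sin(bx)$      $\sigma_x\,ab \cos(bx)$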
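A few rows of Table 3.1 can be written as small Python functions. This is a sketch under stated assumptions rather than code from the notes: the function names are hypothetical, the argument cov_xy stands for the covariance that the notes write as σ²_xy, and the absolute value in the product and power rules is added here so that the returned uncertainty is non-negative.

```python
import math


def sigma_sum(a, b, sx, sy, cov_xy=0.0):
    """Uncertainty of z = a*x + b*y (addition and subtraction row)."""
    return math.sqrt(a**2 * sx**2 + b**2 * sy**2 + 2 * a * b * cov_xy)


def sigma_product(a, x, y, sx, sy, cov_xy=0.0):
    """Uncertainty of z = a*x*y (multiplication row)."""
    z = a * x * y
    return abs(z) * math.sqrt((sx / x)**2 + (sy / y)**2 + 2 * cov_xy / (x * y))


def sigma_power(a, b, x, sx):
    """Uncertainty of z = a*x**b (powers row)."""
    z = a * x**b
    return abs(b * sx / x * z)


# Illustrative values only
print(sigma_sum(1, -1, 0.3, 0.4))            # z = x - y, uncorrelated
print(sigma_product(1, 2.0, 3.0, 0.1, 0.2))  # z = x*y, uncorrelated
print(sigma_power(1, 2, 3.0, 0.1))           # z = x**2
```

With the default cov_xy = 0 the functions reduce to the uncorrelated case of equation 3.24.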

3.4 Estimate of Mean and Errors


As pointed out earlier, the mean of a population µ is not known and has to be
estimated from experimental data, i.e., a sample of the population. The mean can be
estimated as follows:

\[
\mu \approx \bar{x} \equiv \frac{1}{N} \sum x_i, \tag{3.26}
\]

where $\bar{x}$ is the mean of the sample, and its standard deviation or standard error is
expressed as

\[
\sigma_\mu = \frac{\sigma}{\sqrt{N}} \approx \frac{s}{\sqrt{N}}, \tag{3.27}
\]

with s representing the standard deviation of the sample.
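Equations 3.26 and 3.27 can be evaluated with Python's standard statistics module. The short sketch below uses made-up readings (they are not data from the notes) and assumes that s is the sample standard deviation with N − 1 in the denominator, which is what statistics.stdev computes.

```python
import math
import statistics

# Hypothetical repeated measurements of the same quantity
readings = [9.79, 9.83, 9.80, 9.82, 9.81]

N = len(readings)
mean = statistics.mean(readings)   # x_bar, eq. 3.26
s = statistics.stdev(readings)     # sample standard deviation (N - 1 denominator)
sem = s / math.sqrt(N)             # standard error of the mean, eq. 3.27

print(f"mean = {mean:.4f}, s = {s:.4f}, standard error = {sem:.4f}")
```

The standard error shrinks as 1/√N, so quadrupling the number of readings roughly halves the uncertainty of the mean.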

3.5 Least-squares Fit


The method of least-squares (also called linear regression) can be applied analytically to
data as a criterion for optimizing the path of a line that best fits the data points. Let the
equation of the best-fit line and its measurement error or residual be

y = ax + b (3.28a)
δy = y − (ax + b) . (3.28b)

If the measurement errors are normally distributed, the least-squares principle allows
the manipulation of the above equations to determine the values of a and b. The
least-squares criterion states that for N measurements, we can find the values of a and b that
minimize

\[
\sum_{i=1}^{N} (\delta y_i)^2 = \sum_{i=1}^{N} \left[ y_i - (a x_i + b) \right]^2 = M. \tag{3.29}
\]

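To obtain a minimum, we set the first partial derivatives of M with respect to a and b to zero,
which gives

\begin{align}
\frac{\partial M}{\partial a} &= -2 \sum_{i=1}^{N} \left[ y_i - (a x_i + b) \right] x_i = 0, \quad \text{and} \tag{3.30a} \\
\frac{\partial M}{\partial b} &= -2 \sum_{i=1}^{N} \left[ y_i - (a x_i + b) \right] = 0. \tag{3.30b}
\end{align}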

Equation 3.30 yields two simultaneous equations from which a and b are determined
to be

\begin{align}
a &= \frac{N \sum_{i=1}^{N} y_i x_i - \sum_{i=1}^{N} y_i \sum_{i=1}^{N} x_i}{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2}, \quad \text{and} \tag{3.31a} \\
b &= \frac{\sum_{i=1}^{N} x_i^2 \sum_{i=1}^{N} y_i - \sum_{i=1}^{N} x_i y_i \sum_{i=1}^{N} x_i}{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2} \quad \text{or} \tag{3.31b} \\
  &= \bar{y} - a \bar{x}. \tag{3.31c}
\end{align}
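The step from equation 3.30 to equation 3.31 can be made explicit. Dividing both conditions by −2 and collecting the sums gives the two simultaneous equations (often called the normal equations; that name is not used in the notes), written here with all sums running over i = 1 to N:

```latex
\begin{align*}
a \sum x_i^2 + b \sum x_i &= \sum x_i y_i, \\
a \sum x_i + b N &= \sum y_i .
\end{align*}
% Dividing the second equation by N gives b = \bar{y} - a\bar{x} (eq. 3.31c).
% Substituting this b into the first equation and solving for a gives eq. 3.31a.
```

Substituting the resulting a back into b = ȳ − a x̄ and placing everything over the common denominator reproduces equation 3.31b.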

The quality of the fit is measured by the squared correlation coefficient R² (also called
the coefficient of determination), i.e.,

\[
R^2 = \frac{\left[ \sum_{i=1}^{N} (x_i - \bar{x}) (y_i - \bar{y}) \right]^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2 \, \sum_{i=1}^{N} (y_i - \bar{y})^2}. \tag{3.32}
\]

Note that R² is defined in the range 0 ≤ R² ≤ 1, with R² ≈ 1 indicating a good fit and
R² ≈ 0 indicating that the fit does not follow the trend of the data significantly better
than the mean of the data.
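Equation 3.32 translates directly into a few lines of Python. The sketch below is illustrative only: the function name r_squared and both small data sets are assumptions made for the example, not values from the notes.

```python
def r_squared(x, y):
    """Squared correlation coefficient of two equal-length sequences (eq. 3.32)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy**2 / (sxx * syy)


# Made-up data: an exactly linear set and a scattered set
print(r_squared([1, 2, 3, 4], [2, 4, 6, 8]))
print(r_squared([1, 2, 3, 4], [1, 3, 2, 4]))
```

The exactly linear set gives R² = 1, while the scattered set gives a noticeably smaller value, in line with the interpretation given above.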
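The variance of the data points or variable y is given by the sum of the squares of the
residuals divided by the number of degrees of freedom, N − 2 (two parameters, a and b,
are estimated from the data), i.e.,

\[
s_y^2 = \frac{\sum_{i=1}^{N} (\delta y_i)^2}{N - 2}. \tag{3.33}
\]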

Note that the standard deviation sy represents the distribution of δy values about the
best-fit line.
Since a and b are derived from yi and xi , we expect these values to have deviations that
can be expressed in terms of the measured variables. The standard deviations of a and
b are given as

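\begin{align}
s_a &= s_y \sqrt{\frac{N}{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2}}, \tag{3.34a} \\
s_b &= s_y \sqrt{\frac{\sum_{i=1}^{N} x_i^2}{N \sum_{i=1}^{N} x_i^2 - \left( \sum_{i=1}^{N} x_i \right)^2}}. \tag{3.34b}
\end{align}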

Figure 3.8: A schematic of a car uniformly accelerating from rest down a slope.

Thus, with the standard deviations of the slope a and the intercept b, we can indicate
regions of confidence using the attributes of the Normal Distribution.
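One way to turn a fitted value and its standard deviation into a confidence region is sketched below. It assumes the parameter estimate is normally distributed and uses NormalDist from the standard statistics module; the function name confidence_interval and the numbers are illustrative assumptions, not results from the notes.

```python
from statistics import NormalDist


def confidence_interval(value, sigma, level=0.95):
    """Symmetric interval value +/- k*sigma covering `level` of a Normal distribution."""
    k = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for a 95 % level
    return value - k * sigma, value + k * sigma


# Illustrative slope and standard deviation only
a, s_a = 2.47, 0.02
lo, hi = confidence_interval(a, s_a)
print(f"95% confidence interval for a: [{lo:.4f}, {hi:.4f}]")
```

For small N a Student's t multiplier would give a somewhat wider interval; the Normal multiplier is used here only because the text refers to the Normal Distribution.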
As an example, consider a car uniformly accelerating down a slope, such as in figure 3.8.
In the absence of friction, the car travels a distance $s(t) = \frac{1}{2} g \sin(\theta)\, t^2$ from
rest. Suppose the angle of the incline is 30° and a first-year student sets up an
experiment to measure the distance traveled after every second from rest. The student
records the distance traveled s(t) as 0, 3.31, 9.79, 15.05, 40.00, 62.01, 90.24, 120.17,
156.96, 200.09 and 245.25 in meters. We can write a program in Python that applies
a linear least-squares fit (with the line defined by y = ax + b, where y = s(t), x = t²,
$a = \frac{1}{2} g \sin(\theta)$ and b = 0) to this data to help the student find the values of a and b
and their associated errors. ‘g’ is then easily computed from the slope of the line, i.e.,
$g = \frac{2a}{\sin(\theta)}$. This is illustrated in Program 3.1 below. ‘g’ is found to be ≈ 9.88 m/s²
and the correlation coefficient is R² = 0.9972, approximately equal to 1, indicating an
excellent fit (see figure 3.9 for an illustration of this).
Program 3.1: Least-squares fit program for a car moving down an inclined plane.
Program 3.1: Least-squares fit program for a car moving down an inclined plane.
"""
lsq_demo.py

Description:
Uses the linear least-squares method to find the best-fitting line to a data set.

Method:
Assuming the best-fit line is defined by y = ax + b with residuals
del_y = y - (ax + b), we can minimize the sum of the squares of the residuals by
calculating its partial derivatives with respect to a and b and then setting
them to zero. This results in two simultaneous equations in the unknown
parameters a and b from which we can easily determine them.

Author:
@2022, L. C. Moffat, University of Botswana
Version 1.0 12/02/2025

Language:
Python
"""
import math


def main():
    # Data points
    x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # Time in seconds
    y = [0, 3.31, 9.79, 15.05, 40.00, 62.01, 90.24, 120.17, 156.96, 200.09, 245.25]  # Distance in meters

    N = len(y)  # Number of data points

    # Square the time vector: the fit is linear in t^2
    xx = [xi**2 for xi in x]

    # Sums needed for the fit (here xi denotes the squared time)
    xsum = sum(xx)                              # Sum of xi
    ysum = sum(y)                               # Sum of yi
    x2sum = sum(xi**2 for xi in xx)             # Sum of xi^2
    xysum = sum(xx[i] * y[i] for i in range(N)) # Sum of xi * yi

    # Calculate sums for R^2
    xxb = sum((xx[i] - xsum / N)**2 for i in range(N))  # Sum of (xi - x_bar)^2
    yyb = sum((y[i] - ysum / N)**2 for i in range(N))   # Sum of (yi - y_bar)^2
    xxbyyb = sum((xx[i] - xsum / N) * (y[i] - ysum / N) for i in range(N))  # Sum of (xi - x_bar) * (yi - y_bar)

    # Calculate slope (a) and intercept (b), equations 3.31a and 3.31b
    a = (N * xysum - xsum * ysum) / (N * x2sum - xsum**2)
    b = (x2sum * ysum - xsum * xysum) / (N * x2sum - xsum**2)

    # Calculate R^2, equation 3.32
    R2 = (xxbyyb**2) / (xxb * yyb)

    # Calculate fitted y values and residuals
    y_fit = [a * xx[i] + b for i in range(N)]
    error = [y[i] - y_fit[i] for i in range(N)]
    del_y2sum = sum(e**2 for e in error)

    # Calculate variance and standard deviations, equations 3.33 and 3.34
    y_var = del_y2sum / (N - 2)  # Variance of y about the fitted line
    sa = math.sqrt(y_var) * math.sqrt(N / (N * x2sum - xsum**2))      # Standard deviation in a
    sb = math.sqrt(y_var) * math.sqrt(x2sum / (N * x2sum - xsum**2))  # Standard deviation in b

    # Print results
    print(f"{'Time (s)':<10}{'Distance (observed)':<20}{'Distance (fitted)':<20}{'Error':<10}")
    print("-" * 60)
    for i in range(N):
        print(f"{x[i]:<10.2f}{y[i]:<20.2f}{y_fit[i]:<20.2f}{abs(error[i]):<10.2f}")

    print(f"\nThe linear fit line is of the form: y = {a:.4f}x + {b:.4f}")
    print(f"\nThe correlation coefficient R^2 = {R2:.4f}")
    print(f"\nThe variance in y is s^2 = {y_var:.4f} m^2")
    print(f"\nStandard deviations: sa = {sa:.4f}, sb = {sb:.4f}")
    # g = 2a / sin(theta), with sin(30 degrees) = 0.5
    print(f"\nTherefore, acceleration due to gravity g = {(a * 2 / 0.5):.4f} m/s^2")


if __name__ == "__main__":
    main()

[Plot: Distance (m) versus Time² (s²), showing the data points and the linear fit.]

Figure 3.9: Least-squares fit of distance versus time-squared plot of a car uniformly accelerating from rest down a slope.

The topic of linear least-squares is discussed in detail in books by Devore (2009) and
Walpole & Myers (1993) and also online by Wikipedia Contributors (2024), Weisstein
(n.d.) and others. If you are interested in learning more, especially for your projects,
consult those texts.



References

Burden, R. & Faires, J. D., 2011. Numerical Analysis, Cengage Learning, 9th edn.
Cheney, E. & Kincaid, D., 2013. Numerical Mathematics and Computing, International Edition, Cengage Learning, 7th edn.
Colestock, J., 2017. Python - Control Flow, https://www.youtube.com/watch?v=RpoUAGp7Pcc&t=11s&ab_channel=JamesColestock, Last Accessed: 2025-02-03.
Devore, J. L., 2009. Probability and Statistics for Engineering and the Sciences, Brooks/Cole, Cengage Learning, 8th edn.
IEEE Spectrum, 2024. The top programming languages 2024, https://spectrum.ieee.org/transportation-2024, Last Accessed: 2024-12-29.
Johansson, R., 2015. Numerical Python: Scientific Computing and Data Science Applications with Numpy, Scipy and Matplotlib, Apress, Urayasu, Chiba, Japan, 1st edn.
McKinney, W., 2018. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, O’Reilly Media, Sebastopol, CA, 2nd edn.
Python Software Foundation, n.d. Built-in Types, https://docs.python.org/3/library/stdtypes.html#ranges, Last Accessed: 2025-01-12.
Sweigart, A., 2019. Automate the Boring Stuff with Python: Practical Programming for Total Beginners, No Starch Press, San Francisco, CA, 2nd edn.
TutorialsPoint, 2025. Python Operators, https://www.tutorialspoint.com/python/python_operators.htm, Last Accessed: 2025-01-12.
VanderPlas, J., 2017. Python Data Science Handbook: Essential tools for working with data, O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
Walpole, R. E. & Myers, R. H., 1993. Probability and Statistics for Engineers and Scientists, Macmillan Publishing Company, 5th edn.
Weisstein, E. W., n.d. Least Squares Fitting, https://mathworld.wolfram.com/LeastSquaresFitting.html, Last Accessed: 2025-02-12.
Wikipedia Contributors, 2024. Simple Linear Regression, https://en.wikipedia.org/w/index.php?title=Simple_linear_regression&oldid=1212647874, Last Accessed: 2025-02-12.
