Python
Table of contents
Overview
Installation
Other resources
Interactive interpreter
Everything is an object
Basic types
Python as a calculator
Boolean values and comparison operators
Variable assignment
Strings
Special characters in strings
String formatting
Lists
Accessing list elements
List comprehensions
List operations and functions
Tuples and immutable versus mutable objects
Assignment and name binding
Multiple assignment
String functions and manipulation
Dictionaries
If statements
For loops
While loops
Functions
Overview
Python differs somewhat from languages like C, C++, or Fortran. In the latter, source code
must first be compiled to an executable format before it can be run. In Python, there is no
separate compilation step; instead, source code is interpreted on the fly, statement by statement.
That is, Python executes code as if it were a script. The main advantage of an interpreted
language is that it is flexible: variables do not need to be declared ahead of time, and the
program can adapt on the fly. The main disadvantage, however, is that numerically intensive
programs written in Python typically run slower than those in compiled languages. This would seem
to make Python a poor choice for scientific computing; however, time-intensive subroutines can be
compiled in C or Fortran and imported into Python in such a manner that they appear to behave just
like normal Python functions.
Fortunately, many common mathematical and numerical routines have been pre-compiled to
run very fast and grouped into two packages that can be added to Python in an entirely
transparent manner. The NumPy (Numeric Python) package provides basic routines for
manipulating large arrays and matrices of numeric data. The SciPy (Scientific Python) package
extends the functionality of NumPy with a substantial collection of useful algorithms, like
minimization, Fourier transformation, regression, and other applied mathematical techniques.
Both of these packages are also open source and growing in popularity in the scientific
community. With NumPy and SciPy, Python becomes comparable to, perhaps even more
competitive than, expensive commercial packages like MATLAB.
This tutorial will cover the Python 3 series language version. The older 2 series is not fully
compatible, although some legacy codes do exist.
Installation
To use Python, you must install the base interpreter. In addition, there are a number of
applications that provide a nice GUI-driven editor for writing Python programs. The freely
available Anaconda distribution includes a base Python installation and a huge array of packages:
https://www.anaconda.com/
Download the installation executable and proceed through the automated setup. Most of the
modules that you will need are pre-installed.
Other resources
Python comes standard with extensive documentation. The entire manual, and many other
helpful documents and links, can also be found at:
http://docs.python.org
The Python development community also maintains an extensive wiki. In particular, for
programming beginners, there are several pages of tutorials and help at:
http://wiki.python.org/moin/BeginnersGuide
For those who have had some programming experience and don't need to start learning Python
from scratch, the Dive Into Python website is an excellent tutorial that can teach you most of the
basics in a few hours:
https://diveintopython3.net/
Interactive interpreter
Open an Anaconda Prompt terminal or use the interactive Python terminal in Spyder that is
started automatically. If at the prompt, start Python by typing "python". You should see
something similar to the following:
Python 3.9.12 (main, Apr 4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)] ::
Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
The ">>>" at the bottom indicates that Python is awaiting your input. This is the interactive
interpreter; Python programs do not need to be compiled and commands can be entered directly,
step-by-step. In the interactive interpreter, Python reads your commands and gives responses:
>>> 1
1
Comments in Python are indicated using the "#" symbol. Python ignores everything after them
until reaching the end of the line.
Long commands in Python can be split across several lines using the line continuation character
"\", which must be the last character on the line. Although spacing in Python is generally
syntactic, as we will discuss in greater depth later, the indentation of continued lines is not
significant; consistent alignment simply aids readability. In the interactive interpreter, Python
displays an ellipsis prompt ("...") to indicate that the command you are entering spans more than
one line. Alternatively, lines are continued implicitly without using the "\" character if
enclosing characters (parentheses, brackets, or braces) are present.
Typically the use of parentheses is preferred over the "\" character for line continuation.
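As a minimal sketch of the two continuation styles (the variable names are chosen only for illustration), written as a script rather than at the interactive prompt:

```python
# Explicit continuation: the backslash must be the last character on the line.
total = 1 + 2 + 3 + \
    4 + 5

# Implicit continuation inside parentheses (the generally preferred style).
same_total = (1 + 2 + 3 +
              4 + 5)

print(total, same_total)
```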
It is uncommon in practice, but more than one command can be entered on the same line in a
Python script using the ";" symbol:
>>> 1 + 4 ; 6 - 2
5
4
Avoid using this notation in programs that you write, as it will densify your code at the expense
of legibility.
There is a generic help function in Python that will tell you about almost everything. For
example, it will tell you what the proper arguments for a function are:
>>> help(sum)
Help on built-in function sum in module builtins:
sum(iterable, /, start=0)
    Return the sum of a 'start' value (default: 0) plus an iterable of numbers
    When the iterable is empty, return the start value.
The help function will even work with functions and variables that you create yourself, and
Python provides a very easy way to add extra descriptive text that the help function can use
(via docstrings), as we will discuss later on.
Python is a case sensitive language. That means that variables and functions must be given the
correct case in order to be recognized. Similarly, the following two variables are different:
>>> Var = 1
>>> var = 2
>>> Var
1
>>> var
2
To exit the Python interactive prompt, we can use the exit() function:
>>> exit()
c:\>
Everything is an object
Python enforces a great democracy: everything in it (values, lists, classes, and functions) is
an object. An object comes with multiple properties and functions that can be accessed using dot
notation. For example,
>>> s = "hello"
>>> s.capitalize()
'Hello'
>>> s.replace("lo", "p")
'help'
>>> "hello".capitalize()
'Hello'
The fact that everything is an object has great advantages for programming flexibility. Any object
can be passed to a function; one can send values or arrays, for example, but it is equally valid to
send other functions as arguments to functions. Moreover, almost everything in Python can be
packaged up and saved to a file, since there are generic routines that pack and unpack objects
into strings.
Basic types
>>> type(1)
<class 'int'>
The type function tells you the Python type of the argument given to it. Here, the return value
tells you that "1" is interpreted as a Python "int" type, the name for an integer.
Python automatically handles the way that integers are stored: in Python 3 there is a single
int type of arbitrary size, so even very large integers remain ints:
>>> type(10000000000)
<class 'int'>
>>> type(1.)
<class 'float'>
Floating-point numbers in Python are double-precision reals. Their limitations are technically
machine-dependent, but generally they range in magnitude from about 10^-308 to 10^308 and carry
about 15 to 16 significant decimal digits. In other words, when expressed in scientific notation,
the exponent can vary between -308 and 308 and the coefficient holds roughly 15 to 16 digits.
Python can also handle complex numbers. The notation "j" indicates the imaginary unit:
>>> type(1+2j)
<class 'complex'>
>>> (1+2j)*(1-2j)
(5+0j)
For every type name in Python, there is an equivalent function that will convert arbitrary values
to that type:
>>> int(3.2)
3
>>> float(2)
2.0
>>> complex(1)
(1+0j)
Notice that conversion to an integer truncates the fractional part. The round function can be
used to round to the nearest integer value; with a single argument it returns an integer:
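A short sketch of rounding in Python 3 (the values are illustrative): with one argument, round returns an int; an optional second argument gives the number of decimal places and yields a float.

```python
nearest = round(3.7)             # one argument: nearest integer, as an int
two_places = round(3.14159, 2)   # second argument: number of decimal places

print(nearest, type(nearest))
print(two_places)
```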
Python as a calculator
Add two numbers together:
>>> 1+1
2
>>> 8/3
2.6666666666666665
>>> 8//3
2
Floating-point division with "/" returns a float, even if both arguments are integers, whereas
"//" performs integer (floor) division. When performing a mathematical operation on mixed types,
Python converts all values to the same type as the highest precision one:
>>> 8./3
2.6666666666666665
>>> 8**2
64
>>> 8**0.5
2.8284271247461903
>>> 8 % 3
2
>>> 4 % 3.
1.0
Boolean values and comparison operators
The equals comparison involves two consecutive equal signs, "==". A single equal sign is not a
comparison operator and is reserved for assignment (i.e., setting a variable equal to a value).
>>> 1 == 2
False
>>> 2 != 5
True
Alternatively, the not notation turns a True to a False and vice versa:
>>> not 2 == 5
True
The Boolean True and False constants have numerical values of 1 and 0 respectively:
>>> int(True)
1
>>> 0==False
True
Logical operators (and, or, not) can be used to combine these expressions. Parentheses help here:
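For instance, a minimal sketch combining comparisons with and, or, and not (the variable x and its value are assumptions for illustration):

```python
x = 5

in_range = (x > 0) and (x < 10)   # both comparisons must hold
outside = (x < 0) or (x > 10)     # at least one comparison must hold
not_zero = not (x == 0)           # negation of a comparison

print(in_range, outside, not_zero)
```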
Variable assignment
Variables can be assigned values. Unlike many other programming languages, their type does
not need to be declared in advance. Python is dynamically typed, meaning that the type of a
variable can change throughout a program:
>>> a = 1
>>> a
1
>>> b = 2
>>> b == a
False
>>> a = 1
>>> a = a + 1
>>> a
2
>>> a += 1
>>> a
3
>>> a -= 3
>>> a
0
>>> a = 2
>>> a *= 4
>>> a
8
>>> a /= 3
>>> a
2.6666666666666665
Notice in the last line that the variable a changed type from int to float, due to the default
floating-point division. We could have also done the equivalent integer division:
>>> a = 8
>>> a //= 3
>>> a
2
Strings
One of Python's greatest strengths is its ability to deal with strings. Strings are variable length
and do not need to be defined in advance, just like all other Python variables.
>>> s = "hello"*3
>>> s
'hellohellohello'
The len function returns the total length of a string in terms of the number of characters. This
includes any hidden or special characters (e.g., carriage return or line ending symbols).
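A brief illustration of len, assuming two sample strings; note that an escape sequence such as the line feed counts as a single character:

```python
n_plain = len("hello")        # five visible characters
n_newline = len("hello\n")    # the trailing line feed adds one more

print(n_plain, n_newline)
```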
Multi-line strings can be formed using triple quotation marks, which will capture any line breaks
and quotes literally within them until reaching another triple quote:
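A small sketch of a triple-quoted string (the text itself is made up for illustration); the embedded double quotes and the line break are kept literally:

```python
text = """He said "hello"
and then left."""

lines = text.split("\n")   # the captured line break separates two lines
print(text)
```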
Bracket notation for strings works just like it does for lists. We’ll talk more about bracket notation
when we discuss lists below.
Special characters in strings
Since the backslash is a special character for escape sequences, one has to use a double backslash
to include this character in a string:
One can suppress the recognition of escape sequences using literal strings by preceding the
opening quotes with the character "r":
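As a sketch of both approaches, using a hypothetical Windows-style path as the sample text:

```python
escaped = "C:\\temp\\new"   # each doubled backslash yields one backslash
raw = r"C:\temp\new"        # raw string: "\t" and "\n" are not interpreted

print(escaped)
print(escaped == raw)
```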
String formatting
Number values can be converted to strings at a default precision using Python's str function:
>>> str(1)
'1'
>>> str(1.0)
'1.0'
>>> str(1+2j)
'(1+2j)'
Notice that each of the return values is now a string, indicated by the single quote marks.
How do we include variable values in string expressions? We can do this easily using format
strings or f-strings. To form an f-string, we preface the string with an ‘f’ character:
This doesn’t reveal anything special about f-strings until we include variable names in braces:
For any f-string, Python converts expressions in braces to the corresponding string
representation, essentially replacing {x} with str(x) in the expression.
>>> age = 30
>>> weight = 150
>>> f"The patient's age is {age} years and weight is {weight} pounds."
"The patient's age is 30 years and weight is 150 pounds."
What if we want to control the way that numbers are represented in strings, i.e., the number of
significant digits, or perhaps presenting in scientific notation? To exert more control over the
formatting of values in strings, f-strings allow specific numerical formatting via format
specifications:
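A minimal example of a format specification, assuming a sample value x and the specification 8.3f (a width of 8 and three digits after the decimal point):

```python
x = 3.14159265

formatted = f"{x:8.3f}"   # width 8, three decimals, fixed-point float
print(repr(formatted))    # repr makes the leading spaces visible
```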
Notice the use of a colon followed by the format specification in the braces. The latter indicates
the size and precision of the string output. The first number indicates the minimum total number
of characters that the value will occupy after conversion to string; here it is 8.
The decimal point followed by a 3 tells Python to round to the nearest thousandth. The "f"
character is the final component of the format specification and it tells Python to display the
number as a float.
With a fixed width and precision, numbers are right-justified by default, so decimal points line
up when values are printed on successive lines. Alternatively, if you want numbers left-justified
within the width, use a “<” sign:
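A short sketch contrasting the default right-justification with the "<" flag (the value is illustrative):

```python
x = 3.14159265

right = f"{x:8.3f}"    # default for numbers: right-justified
left = f"{x:<8.3f}"    # "<" left-justifies within the same width

print(repr(right), repr(left))
```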
To explicitly show all zeros within the specification width, place a leading zero in the format
specification:
>>> x = 100
>>> print(f"{x:08.3f}")
0100.000
You don’t have to specify the width of the format if you simply want to set the number of digits
after the decimal point:
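For example, a sketch that sets only the precision (the value is an assumption for illustration):

```python
x = 2 / 3

short = f"{x:.2f}"   # no width given: only two digits after the decimal point
print(short)
```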
Python offers many other ways to format numbers. These are signaled using format codes other
than "f". For example, exponential notation can be signaled by "e", and integers by
"d":
>>> print(f"{1024:10.3e}")
 1.024e+03
>>> print(f"{1234:8d}")
    1234
Float format codes also work with integers; Python converts them automatically (the reverse
does not hold: the "d" code raises an error for a float):
>>> print(f"{1:8.3f}")
   1.000
Strings can also be values in format specifications, included using the "s" flag:
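A minimal sketch of the "s" flag with a width, using a made-up name; unlike numbers, strings are left-justified by default:

```python
name = "Ada"

padded = f"{name:10s}|"   # pad the string to 10 characters, then a marker
print(padded)
```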
If you want to specify the width of a format specification using the value of a variable, you use
additional braces in the format specification:
>>> width = 10
>>> print(f"{12345:0{width}d}")
0000012345
In f-strings, the "{" and "}" characters are special. If you need to include these in the string,
use a double brace:
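An illustrative sketch of literal braces next to a replaced expression:

```python
x = 5

text = f"{{x}} is replaced by {x}"   # doubled braces produce literal braces
print(text)
```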
Finally, an alternative to f-strings is the format function. Compare the two examples below,
which produce the same result:
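One possible pair of equivalent statements, reusing the age and weight values from the earlier example (the wording of the string is an assumption):

```python
age = 30
weight = 150

with_fstring = f"Age {age}, weight {weight}"
with_format = "Age {0}, weight {1}".format(age, weight)

print(with_fstring == with_format)
```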
Here, the {0} and {1} indicate ordered placeholders corresponding to the arguments provided to
the format function. Another alternative is the older "%" operator: a format specification such
as %d or %8.3f is inserted for every argument in the string, and the "%" sign after the string is
followed by a tuple of values for those arguments.
You still see this type of string formatting in legacy codes. The newer, f-style string approach is
more often preferred in modern code, because it places the arguments in the string itself, rather
than collecting them in a separate, following list.
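As a sketch of the legacy "%" style next to its f-string equivalent (the values and wording are illustrative):

```python
age = 30
weight = 150.0

legacy = "Age %d, weight %8.3f" % (age, weight)   # values follow in a tuple
modern = f"Age {age:d}, weight {weight:8.3f}"     # values sit in the string

print(legacy)
```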
By default, print separates its arguments with a space. This can be overridden by specifying
the separator explicitly with the sep argument. For example, using the line feed "\n" as the
separator prints each argument on a new line. By default, the print function also ends its output
with a line feed, which is apparent when you call it multiple times; the end argument changes this.
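A minimal sketch of the sep and end arguments. The output is sent to an in-memory buffer (via the file argument) purely so the result can be inspected; in normal use these calls would print to the screen.

```python
import io

buffer = io.StringIO()   # stands in for the screen, for inspection only

print("a", "b", "c", file=buffer)              # default separator: a space
print("a", "b", "c", sep="\n", file=buffer)    # one argument per line
print("no line feed", end="", file=buffer)     # suppress the trailing line feed

captured = buffer.getvalue()
print(repr(captured))
```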
Lists
Python's ability to manipulate lists of variables and objects is core to its programming style. There
are essentially two kinds of list objects in Python, tuples and lists. The difference between the
two is that the former is fixed and can't be modified once created, while the latter allows
modification. Lists are written with square brackets:
>>> l = [1,2,3,4,5]
>>> print(l)
[1, 2, 3, 4, 5]
Long lists can be spread across multiple lines. Here, the use of the line continuation character
"\" is optional, since Python automatically assumes a continuation until it finds the same number
of closing as opening brackets. Consistent indentation is not required here, but it aids legibility:
>>> l = [1, 2, 3,
...      4, 5, 6,
...      7, 8, 9]
>>> l
[1, 2, 3, 4, 5, 6, 7, 8, 9]
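Lists can be joined with the "+" operator. A small sketch with illustrative values:

```python
l1 = [1, 2, 3]
l2 = [4, 5, 6]

combined = l1 + l2   # concatenation creates a new list; l1 and l2 are unchanged
print(combined)
```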
Notice that addition does not correspond to vector addition, in which corresponding terms are
added elementwise. For vectors, we will use arrays, described in the tutorial on NumPy.
>>> [1,2]*3
[1, 2, 1, 2, 1, 2]
>>> l1 = [1,2,3]
>>> l2 = [4,5,6]
>>> l1 += l2
>>> l1
[1, 2, 3, 4, 5, 6]
The range function produces a sequence of numbers. Technically it returns a lazy range object
rather than a list, for efficiency of traversing indices in loops (more on that later). So
we have to explicitly convert it to a list:
>>> list(range(4))
[0, 1, 2, 3]
Accessing list elements
>>> l = [1,4,7]
>>> l[0]
1
>>> l[2]
7
Notice that the first element in a list has index 0, and the last index is one less than the length of
the list. This is different from Fortran, but is similar to C and C++. All sequence objects (lists,
tuples, and arrays) in Python have indices that start at 0.
>>> l[3]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
Because lists can be modified, individual elements can be set using bracket notation:
>>> l = [1,4,7]
>>> l[0] = 5
>>> l
[5, 4, 7]
Negative indices can be used to identify elements with respect to the end of a list:
>>> l = [1,4,7]
>>> l[-1]
7
>>> l[-3]
1
>>> l = [1,2,3,4,5]
>>> l[0:4]
[1, 2, 3, 4]
>>> l[0:4:2]
[1, 3]
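In slice notation l[lower:upper:step], the slice runs from index lower up to, but not including,
upper. If lower is omitted, it defaults to 0 (the first element in the list). If upper is omitted,
it defaults to the list length.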
>>> l = [1,2,3,4,5]
>>> l[:4]
[1, 2, 3, 4]
>>> l[2:]
[3, 4, 5]
>>> l[::2]
[1, 3, 5]
Negative indices can be used for list slicing as well. To take only the last 3 elements, for example:
>>> l = [1,2,3,4,5]
>>> l[-3:]
[3, 4, 5]
>>> l[:-2]
[1, 2, 3]
In slices, list indices that exceed the range of the list do not throw an error but are truncated
to fit:
>>> l = [1,2,3,4,5]
>>> l[2:10]
[3, 4, 5]
>>> l[-10:3]
[1, 2, 3]
List comprehensions
Python provides a convenient syntax for creating new lists from existing lists, tuples, or other
iterable objects. These list comprehensions have the general form [expression for item in iterable].
In an expression such as [i*i for i in range(4)], elements produced by the range function are
accessed in sequence and assigned to the variable i. The new list then takes each element and
squares it.
Keep in mind that Python creates a new list whenever a list construction is called. Any list over
which it iterates is not modified.
The iterable does not have to be returned by the range function. Some other examples:
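A few illustrative comprehensions over a range, a tuple, and a string (all values are assumptions chosen for the sketch):

```python
squares = [i * i for i in range(4)]     # iterate over a range
halves = [x / 2 for x in (4, 5, 6)]     # iterate over a tuple
upper = [c.upper() for c in "abc"]      # iterate over the characters of a string

print(squares, halves, upper)
```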
More than one iterable can be included in the same list. Python evaluates the rightmost iterables
the fastest. For example, we can create all sublists [j,k] for 0 < j < k ≤ 3:
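One way to write this, as a sketch: the rightmost loop (over k) varies fastest, and its range depends on the current j.

```python
# All sublists [j, k] with 0 < j < k <= 3.
pairs = [[j, k] for j in range(1, 4) for k in range(j + 1, 4)]

print(pairs)
```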
It is also possible to filter items in list comprehensions using if statements. The general form
is [expression for item in iterable if condition].
For example, we could have also written the above list of sublists as:
Here is another example that filters a list for elements containing the letter "l":
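A sketch of both filters; the word list is a made-up example:

```python
# Sublists [j, k] with 0 < j < k <= 3, selected with an if filter.
pairs = [[j, k] for j in range(1, 4) for k in range(1, 4) if j < k]

# Keep only the words containing the letter "l".
words = ["hello", "world", "python", "list"]
with_l = [w for w in words if "l" in w]

print(pairs, with_l)
```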
List operations and functions
>>> 3 in [1, 2, 3]
True
>>> 4 in range(4)
False
>>> l = [9,2,9,3]
>>> del l[1]
>>> l
[9, 9, 3]
>>> l = list(range(5))
>>> del l[1:3]
>>> l
[0, 3, 4]
>>> l = [1, 2, 3, 2, 1]
>>> l.remove(2)
>>> l
[1, 3, 2, 1]
Items can be added to lists at particular locations using insert(index, val) or slice
notation:
>>> l = [1, 2, 3, 4]
>>> l.insert(2, 0)
>>> l
[1, 2, 0, 3, 4]
>>> l = l[:4] + [100] + l[4:]
>>> l
[1, 2, 0, 3, 100, 4]
>>> l = []
>>> l.append(4)
The difference between the append and extend list methods is that append will add the
argument as a new member of the list, whereas extend will add all of the contents of a list
argument to the end of the list.
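The difference can be sketched as follows (list contents are illustrative):

```python
appended = [1, 2]
appended.append([3, 4])   # the list [3, 4] becomes a single new member

extended = [1, 2]
extended.extend([3, 4])   # each element of [3, 4] is added individually

print(appended, extended)
```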
You can find the index of the first instance of a list element. If the element is not in the list, an
error is produced.
>>> l = [1, 5, 2, 7, 2]
>>> l.index(2)
2
>>> l.index(8)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: list.index(x): x not in list
>>> l = [4, 2, 7, 4, 9, 1]
>>> sorted(l)
[1, 2, 4, 4, 7, 9]
>>> l
[4, 2, 7, 4, 9, 1]
>>> l.sort()
>>> l
[1, 2, 4, 4, 7, 9]
Notice that the function sorted returns a new list and does not affect the original one, whereas
the list function sort modifies the original list itself.
>>> 1 < 2
True
The sorting functions work with any type for which the "<" comparison is defined, which includes
strings.
For user-defined types called classes, it is possible to define the __lt__ method to tell Python
how to sort. We will discuss classes in greater detail later.
If list members are lists themselves, sorting operates using the first element of each sublist, and
subsequent elements as needed:
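A brief sketch of sorting strings and sublists (the data are made up). Note that string comparison is case sensitive, so uppercase letters sort before lowercase ones.

```python
words = sorted(["pear", "Apple", "banana"])

# Sublists are compared by first element, then by later elements to break ties.
nested = sorted([[2, 1], [1, 5], [1, 2]])

print(words, nested)
```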
>>> l = [1, 2, 3]
>>> l.reverse()
>>> l
[3, 2, 1]
Tuples and immutable versus mutable objects
>>> t = (1, 2, 3)
>>> t[1] = 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
Like lists, tuples can contain any object, including other tuples and lists:
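For instance, a sketch of a tuple holding mixed objects (contents chosen for illustration):

```python
t = (1, "two", (3, 4), [5, 6])   # a number, a string, a tuple, and a list

inner_first = t[2][0]   # index into the nested tuple
list_last = t[3][-1]    # index into the nested list

print(len(t), inner_first, list_last)
```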
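The advantage of tuples is that they are faster than lists, and Python often uses them behind the
scenes to achieve efficient passing of data and function arguments. In fact, one can write a
comma-separated list of values without any enclosing characters and Python will interpret it as a
tuple: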
>>> 1, 2, 3
(1, 2, 3)
>>> "hello", 5., [1, 2, 3]
('hello', 5.0, [1, 2, 3])
Tuples aren't the only immutable objects in Python. Strings are also immutable:
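A minimal sketch showing that item assignment on a string fails, and that the usual remedy is to build a new string:

```python
s = "hello"

try:
    s[0] = "j"               # strings do not support item assignment
except TypeError as err:
    message = str(err)

new_s = "j" + s[1:]          # build a modified copy instead
print(message, new_s)
```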
Floats, integers, and complex numbers are also immutable; however, this is not obvious to the
programmer. For these types, what immutable means is that new numeric values always involve
the creation of a new spot in memory for a new variable, rather than the modification of the
memory used for an existing variable.
Assignment and name binding
>>> a = 1
In other programming languages, this statement might be read as "put the value 1 in the spot in
memory corresponding to the variable a." In Python, however, this statement says something
quite different: "create a spot in memory for an integer variable, give it a value 1, and then point
the variable a to it." This behavior is called name binding in Python. It means that most variables
act like little roadmaps to spots in memory, rather than designate specific spots that they ‘own’.
>>> a = [1, 2, 3]
>>> b = a
>>> a[1] = 0
>>> a
[1, 0, 3]
>>> b
[1, 0, 3]
In the second line, Python bound the variable b to the same spot in memory as the variable a.
Notice that it did not copy the contents of a, and thus any modifications to a subsequently affect
b also. This can sometimes be a convenience and speed execution of a program.
If an explicit copy of an object is needed, one can use the copy module:
Here, the copy.copy function makes a new location in memory and copies the contents of a
to it, and then b is pointed to it. Since a and b now point to separate locations in memory,
modifications to one do not affect the other.
Actually the copy.copy function only copies the outermost structure of a list. If a list contains
another list, or objects with deeper levels of variables, the copy.deepcopy function must be
used to make a full copy.
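The difference between the two functions can be sketched with a nested list (values are illustrative):

```python
import copy

a = [1, [2, 3]]
shallow = copy.copy(a)       # new outer list, but the inner list is shared
deep = copy.deepcopy(a)      # fully independent copy

a[0] = 100       # affects only a
a[1][0] = 200    # also visible through the shallow copy

print(shallow, deep)
```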
The copy module should be used with great caution, which is why it is a module and not part of
the built-in command set. The vast majority of Python programs do not need this function if
one programs in a Pythonic style, that is, if one uses Python idioms and ways of doing things. If
you find yourself using the copy module frequently, chances are that your code could be
rewritten to read and operate much more cleanly.
>>> a = 1
>>> b = a
>>> a = 2
>>> b
1
Why did b not also change? The reason has to do with immutable objects. Recall that numeric
values are immutable, meaning they cannot be changed once in memory. In the second line, b points to
the location in memory where the value "1" was created in the first line. In the third line, a new
value "2" is created in memory and a is pointed to it—the old value "1" is not modified at all
because it is immutable. As a result, a and b then point to different parts of memory. In the
previous example using a list, the list was actually modified in memory because it is mutable.
>>> a = 1
>>> b = a
>>> a = []
>>> a.append(1)
>>> a
[1]
>>> b
1
Here in the third line, a is assigned to point at a new empty list that is created in memory.
The general rules of thumb for assignments in Python are the following:
• Assignment using the equals sign ("=") means point the variable name on the left hand
side to the location in memory on the right hand side.
• If the right hand side is a variable, point the left hand side to the same location in memory
that the right hand side points to. If the right hand side is a new object or value, create a
new spot in memory for it and point the left hand side to it.
• Modifications to a mutable object will affect the corresponding location in memory and
hence any variable pointing to it. Immutable objects cannot be modified and usually
involve the creation of new spots in memory.
It is possible to determine if two variable names in Python are pointing to the same value or
object in memory using the is statement:
>>> a = [1, 2, 3]
>>> b = a
>>> a is b
True
>>> b = [1, 2, 3]
>>> a is b
False
One might wonder if Python is memory-intensive given the frequency with which it must create
new spots in memory for new objects and values. Fortunately, Python handles memory
management quite transparently and intelligently. In particular, it uses a technique called
garbage collection. This means that for every spot in memory that Python creates for a value or
object, it keeps track of how many variable names are pointing at it. When no variable name any
longer points to a given spot, Python automatically deletes the value or object in memory, freeing
its memory for later use. Consider this example:
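A hypothetical sequence of this kind is sketched below; the lists are illustrative, and exactly when memory is reclaimed is up to the interpreter.

```python
a = [1, 2, 3, 4]   # a new list is created and a points to it
b = a              # b points to the same list
a = [5, 6, 7]      # a now points to a second list; b still holds the first
b = None           # nothing points to the first list any longer
```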
In the last line, there are no longer any variables that point to the first list and so Python
automatically deletes it from memory. One can explicitly delete a variable using the del
statement:
>>> a = [1, 2, 3, 4]
>>> del a
This will delete the variable name a. In general, however, it does not delete the object to which
a points unless a is the only variable pointing to it and Python's garbage-collecting routines kick
in. Consider:
>>> a = [1, 2, 3, 4]
>>> b = a
>>> del a
>>> b
[1, 2, 3, 4]
Generally, there is little need to use the del statement in Pythonic programs.
Multiple assignment
Lists and tuples enable multiple items to be assigned at the same time. Consider the following
example using lists:
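A sketch of multiple assignment with lists, including the error raised when the lengths differ (names and values are illustrative):

```python
[a, b, c] = [1, 5, 9]   # elements are matched up by position

try:
    [x, y] = [1, 2, 3]  # lengths differ, so Python raises a ValueError
except ValueError as err:
    mismatch = str(err)

print(a, b, c, mismatch)
```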
In this example, Python assigned variables by lining up elements in the lists on each side. The
lists must be the same length, or an error will be returned.
Tuples are more efficient for this purpose and are usually used instead of lists for multiple
assignments:
However, since Python will interpret any non-enclosed list of values separated by commas as a
tuple, it is more common to see the following, equivalent statement:
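Both tuple forms, sketched with illustrative values:

```python
(a, b, c) = (1, 5, 9)   # explicit tuples on both sides
d, e, f = 1, 5, 9       # the same assignment without parentheses

print(a, b, c, d, e, f)
```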
Here, each side of the equals sign is interpreted as a tuple and the assignment proceeds as before.
This notation is particularly helpful for functions that return multiple values. We will discuss this
in greater detail later, but here is a preview example of a function returning two values:
>>> a, b = f(c)
Technically, the function returns one thing – a tuple containing two values. However, the
multiple assignment notation allows us to treat it as two sequential values. Alternatively, one
could write this statement as:
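A sketch of the longer form. The function f here is a hypothetical stand-in, defined only so the example runs; it is not taken from the tutorial.

```python
def f(c):
    """Hypothetical function that returns two values (as one tuple)."""
    return c + 1, c * 2

a, b = f(3)          # multiple assignment unpacks the returned tuple

returned = f(3)      # the equivalent, more verbose form
a2 = returned[0]
b2 = returned[1]

print(a, b, a2, b2)
```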
Because of multiple assignment, list comprehensions can also iterate over multiple values:
In this example, the tuple (a,b) is assigned to each item in l, in sequence. Since l contains tuples,
this amounts to assigning a and b to individual tuple members. We could have done this
equivalently in the following, less elegant way:
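Both forms can be sketched as follows, assuming l is a list of two-element tuples (the contents are made up):

```python
l = [(1, 2), (3, 4), (5, 6)]

sums = [a + b for (a, b) in l]            # unpack each tuple into a and b
sums_indexed = [t[0] + t[1] for t in l]   # less elegant: index each tuple

print(sums, sums_indexed)
```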
Multiple assignment also makes it easy to swap the values of two variables:
>>> a = 1
>>> b = 5
>>> a, b = b, a
>>> a
5
>>> b
1
String functions and manipulation
Keep in mind two very important points with string functions: (1) strings are immutable, so
functions that modify strings actually return new strings that are modified versions of the
originals; and (2) all string functions are case sensitive so that 'this' is recognized as a different
string than 'This'.
Strings can be sliced just like lists. This makes it easy to extract substrings:
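For example, a sketch using a sample string:

```python
s = "hello world"

first = s[:5]    # the first five characters
last = s[-5:]    # the last five characters

print(first, last)
```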
Strings can also be split apart into lists. The split function will automatically split strings
wherever it finds whitespace (e.g., a space or a line break):
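A short sketch of split, first on whitespace and then on an explicit separator (the sample strings are illustrative):

```python
parts = "a b\tc\nd".split()         # split on any run of whitespace
fields = "2024-01-15".split("-")    # split on a given separator instead

print(parts, fields)
```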
To remove extra beginning and ending whitespace, use the strip function:
The replace function will make a new string in which all specified substrings have been
replaced:
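Both strip and replace return new strings, as this sketch with made-up text shows:

```python
padded = "  padded text \n"
stripped = padded.strip()               # whitespace removed from both ends

original = "hello world"
replaced = original.replace("o", "0")   # every "o" is replaced

print(repr(stripped), replaced)
```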
It is possible to test if a substring is present in a string and to get the index of the first character
in the string where the substring starts:
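A minimal sketch using the in operator and the find method; find returns -1 when the substring is absent:

```python
s = "hello world"

present = "wor" in s       # membership test
start = s.find("wor")      # index where the substring begins
missing = s.find("xyz")    # not found

print(present, start, missing)
```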
Sometimes you need to left- or right-justify strings within a certain field width, padding them
with extra spaces as necessary. There are two functions for doing that:
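These are the ljust and rjust methods, sketched here with a field width of 6:

```python
left = "abc".ljust(6)    # pad on the right
right = "abc".rjust(6)   # pad on the left

print(repr(left), repr(right))
```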
Finally, there are a number of very helpful utilities for testing strings. One can determine if a
string starts or ends with specified substrings:
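For example, using a hypothetical file name:

```python
name = "report.txt"

starts = name.startswith("rep")
is_txt = name.endswith(".txt")
is_csv = name.endswith(".csv")

print(starts, is_txt, is_csv)
```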
You can also test the kind of contents in a string. To see if it contains only alphabetical
characters or only digits:
>>> "string".isalpha()
True
>>> "string.".isalpha()
False
>>> "12834".isdigit()
True
>>> "50 cars".isdigit()
False
Dictionaries
Dictionaries are another type in Python that, like lists, are collections of objects. Unlike lists,
dictionary items are not accessed by position (although dictionaries preserve insertion order as of
Python 3.7). Instead, they associate keys with values, similar to a database. To create a
dictionary, we use braces. The following example creates a dictionary with three items:
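A sketch of such a dictionary. The "city" and "state" entries match values used elsewhere in this section; the "zip" entry is an assumption added to make three items.

```python
d = {"city": "Santa Barbara", "state": "CA", "zip": "93106"}

print(len(d), d["city"])
```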
Here, each element of a dictionary consists of two parts that are entered in key:value syntax. The
keys are like labels that will return the associated value. Values can be obtained by using bracket
notation:
>>> d["city"]
'Santa Barbara'
Dictionary keys do not have to be strings. They can be any immutable object in Python: integers,
tuples, or strings. Dictionaries can contain a mixture of these. Values are not restricted at all;
they can be any object in Python: numbers, lists, modules, functions, anything.
>>> d = {}
Items can be added to dictionaries using assignment and a new key. If the key already exists, its
value is replaced:
>>> len(d)
2
To remove all elements from a dictionary, use the clear object function:
One can obtain all keys and values (in insertion order as of Python 3.7):
Alternatively, one can get a list of (key,value) tuples for the entire dictionary:
>>> d.items()
dict_items([('city', 'Santa Barbara'), ('state', 'CA')])
For all three of these cases, Python returns a view object that can be converted into a simple
list if needed, using list.
Finally, dictionaries provide a method to return a default value if a given key is not present:
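The dictionary operations described above (assignment, keys, values, get, and clear) can be sketched as follows. The key "country" and the default value "unknown" are illustrative assumptions.

```python
d = {}                          # start with an empty dictionary
d["city"] = "Santa Barbara"     # new key: item is added
d["state"] = "California"       # new key: item is added
d["state"] = "CA"               # existing key: value is replaced

keys = list(d.keys())           # ['city', 'state']
values = list(d.values())       # ['Santa Barbara', 'CA']

# get() returns its second argument when the key is absent
country = d.get("country", "unknown")   # 'unknown'
city = d.get("city", "unknown")         # 'Santa Barbara'

d.clear()                       # remove every item
is_empty = len(d) == 0          # True
```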
If statements
if statements allow conditional execution. Here is an example:
>>> x = 2
>>> if x > 3:
...     print("greater than three")
... elif x > 0:
...     print("greater than zero")
... else:
...     print("less than or equal to zero")
... <hit return>
greater than zero
A very important concept in Python is that spacing and indentations carry syntactical meaning.
That is, they dictate how to execute statements. Colons occur whenever there is a set of sub-
commands after an if statement, loop, or function definition. All of the commands that are
meant to be grouped together after the colon must be indented by the same amount. Python
does not specify how much to indent, but only requires that the commands be indented in the
same way. Consider:
>>> if 1 < 3:
...     print("line one")
...       print("line two")
  File "<stdin>", line 3
    print("line two")
    ^
IndentationError: unexpected indent
>>> if 1 < 3:
...     print("line one")
...     print("line two")
... <hit return>
line one
line two
It is typical to indent four spaces after each colon. Ultimately Python's use of syntactical
whitespace helps make its programs look cleaner, easier to read, and standardized.
Any expression or function call that evaluates to a Boolean True or False value can be used in an
if statement. The number 0 is also interpreted as False, while any other number is considered
True. Empty lists and other empty containers (strings, tuples, dictionaries) are interpreted as
False, whereas non-empty ones are True.
>>> d = {}
>>> if d:
...     print("Dictionary is not empty.")
... else:
...     print("Dictionary is empty.")
... <hit return>
Dictionary is empty.
Single if statements (without elif or else constructs) that execute a single command can be
written in one line without indentation:
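A minimal sketch of the one-line form; the variable names and values are illustrative.

```python
x = 5               # illustrative value
result = "small"

# Single command on the same line as the if statement
if x > 3: result = "large"

print(result)       # large
```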
For loops
Like other programming languages, Python provides a mechanism for looping over consecutive
values. Unlike many languages, however, Python's loops do not intrinsically iterate over integers,
but rather elements in sequences, like lists and tuples. The general construct is:
Notice that anything falling within the loop is indented beneath the first line, similar to if
statements. Here are some examples that iterate over tuples and lists:
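The general construct and the tuple and list cases can be sketched as follows; the element values are illustrative assumptions.

```python
# General form:
#     for <variable> in <iterable>:
#         <commands>

collected = []

# Iterate over a tuple
for i in (1, 2, 3):
    collected.append(i * 10)

# Iterate over a list; the items need not share a type
for i in [3.14, "word", (1, 2)]:
    collected.append(type(i).__name__)

print(collected)    # [10, 20, 30, 'float', 'str', 'tuple']
```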
Notice that the items in the iterable do not need to be the same type. In each case, the variable
i is given the value of the current list or tuple element, and the loop proceeds over these in
sequence. One does not have to use the variable i; any variable name will do, but if an existing
variable is used, its value will be overwritten by the loop.
Iteration over a dictionary proceeds over its keys, not its values. Keys are returned in
insertion order as of Python 3.7; earlier versions guarantee no particular order. In general, it
is better to iterate explicitly over keys or values using the dictionary functions that return these:
Using Python's multiple assignment capabilities, it is possible to iterate over more than one value
at a time:
In this example, Python cycles through the list and makes the assignment (a,b) = element
for each element in the list. Since the list contains two-tuples, it effectively assigns a to the first
member of the tuple and b to the second.
Multiple assignment makes it easy to cycle over both keys and values in dictionaries at the same
time:
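Both uses of multiple assignment described above can be sketched as follows. The list of two-tuples and the dictionary contents are illustrative assumptions.

```python
# Iterating over a list of two-tuples: (a, b) = element on each pass
pairs = [(1, 2), (3, 4), (5, 6)]
sums = []
for (a, b) in pairs:
    sums.append(a + b)

# Iterating over keys and values of a dictionary at the same time
d = {"city": "Santa Barbara", "state": "CA"}
lines = []
for (key, value) in d.items():
    lines.append(f"{key}: {value}")

print(sums)     # [3, 7, 11]
print(lines)    # ['city: Santa Barbara', 'state: CA']
```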
In other programming languages, one might use the following idiom to iterate through items in
a list:
In Python, however, the following is more natural and efficient, and thus always preferred:
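The two idioms can be compared in a short sketch; the list contents and the doubling operation are illustrative assumptions.

```python
l = [8, 10, 12]     # illustrative list

# Index-based idiom carried over from other languages
by_index = []
for i in range(len(l)):
    by_index.append(l[i] * 2)

# Direct iteration over the elements
direct = []
for item in l:
    direct.append(item * 2)

print(by_index == direct)   # True
```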
Notice that the loop could have been written on a single line since there is a single
command within it, although this is not usually preferred because the loop is less clear
upon inspection:
If you need the index of the loop in addition to the iterated element, the enumerate command
is helpful:
Notice that enumerate returns indices that always begin at 0, whether or not the loop actually
iterates over a slice of a list:
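A minimal sketch of enumerate over a full list and over a slice; the list contents are illustrative.

```python
l = [8, 10, 12, 14]

seen = []
for (index, value) in enumerate(l):
    seen.append((index, value))

# Indices restart at 0 even though the slice begins at element 2
seen_slice = []
for (index, value) in enumerate(l[2:]):
    seen_slice.append((index, value))

print(seen)         # [(0, 8), (1, 10), (2, 12), (3, 14)]
print(seen_slice)   # [(0, 12), (1, 14)]
```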
It is also possible to iterate over two lists simultaneously using the zip function:
>>> l1 = [1, 2, 3]
>>> l2 = [0, 6, 8]
>>> for (a, b) in zip(l1, l2):
...     print(a, b, a + b)
... <hit return>
1 0 1
2 6 8
3 8 11
The zip function can be used outside of for loops. It simply takes two or more lists and groups
them together, making tuples of corresponding list elements. Since zip returns an iterator, use
list to make a list from its results:
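A short sketch using the same two lists as the loop above:

```python
l1 = [1, 2, 3]
l2 = [0, 6, 8]

# zip returns an iterator; list() collects its tuples
zipped = list(zip(l1, l2))
print(zipped)   # [(1, 0), (2, 6), (3, 8)]
```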
This behavior, combined with multiple assignment, is how zip allows simultaneous iteration
over multiple lists at once.
It is possible to skip forward to the next loop iteration immediately, without executing
subsequent commands in the same indentation block, using the continue statement. The
following produces the same output as the previous example using continue, but is ultimately
less efficient because more loop cycles need to be traversed:
One can also terminate the innermost loop using the break statement. Again, the following
produces the same result but is almost as efficient as the first example because the inner loop
terminates as soon as the break statement is encountered:
While loops
Unlike for loops, while loops do not iterate over a sequence of elements but rather continue so
long as some test condition is met. Their syntax follows indentation rules similar to the cases we
have seen before. The initial statement takes the form
The following example computes the first couple of values in the Fibonacci sequence:
>>> k1, k2 = 1, 1
>>> while k1 < 20:
...     k1, k2 = k2, k1 + k2
...     print(k1)
1
2
3
5
8
13
21
Sometimes it is desired to stop the while loop somewhere in the middle of the commands that
follow it. For this purpose, the break statement can be used with an infinite loop. In the
previous example, we might want to print all Fibonacci numbers less than or equal to 20:
>>> k1, k2 = 1, 1
>>> while True:
...     k1, k2 = k2, k1 + k2
...     if k1 > 20: break
...     print(k1)
1
2
3
5
8
13
Here the infinite while loop is created with the while True statement. Keep in mind that, if
multiple loops are nested, the break statement will stop only the innermost loop.
Functions
Functions are an important part of any program. Some programming languages make a
distinction between "functions" that return values and "subroutines" that do not return anything
but rather do something. In Python, there is only one kind, functions, but these can return single,
multiple, or no values at all. In addition, like everything else, functions in Python are objects.
That means that they can be included in lists, tuples, or dictionaries, or even sent to other
functions. This makes Python extraordinarily flexible.
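A minimal sketch of a function definition, assuming a two-argument function named add as the text describes; the argument names and the body are assumptions.

```python
def add(arg1, arg2):
    # Works for any types that support the + operator
    x = arg1 + arg2
    return x

print(add(1, 2))    # 3
```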
Here, def signals the creation of a new function named add, which takes two arguments. All of
the commands associated with this function are then indented underneath the def statement,
similar to the syntactic indentation used in loops. The return statement tells Python to do two
things: exit the function and, if a value is provided, use that as the return value.
Unlike statically typed programming languages, Python functions do not need to specify the types
of the arguments sent to them. Python checks these at runtime every time the function is called.
Using the above example, we could apply our function to many different types:
>>> add(1, 2)
3
>>> add("house", "boat")
'houseboat'
>>> add([1, 2, 3], [4, 5, 6])
[1, 2, 3, 4, 5, 6]
>>> add(1, "house")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 2, in add
TypeError: unsupported operand type(s) for +: 'int' and 'str'
In the last example, an error occurs because the addition operator is not defined for an integer
with a string. This error is only thrown when we call the function with inappropriate arguments.
If no return statement is present within a function, or if the return statement is used without
a return value, Python automatically returns the special value None:
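Both cases can be sketched as follows; the function names are hypothetical.

```python
def do_nothing():       # hypothetical name
    pass                # no return statement at all

def bare_return():      # hypothetical name
    return              # return without a value

result1 = do_nothing()
result2 = bare_return()

# Test for None with the is operator or with equality
print(result1 is None)  # True
print(result2 == None)  # True
```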
None is a reserved, special object in Python, similar to True and False. It essentially means
nothing, and the interactive interpreter does not echo it when it is the result of an expression.
However, as seen in the above example, one can test for the None value using conditional equality
or the is operator.
If you want a function that modifies its behavior depending on the type of the argument, it is
possible to test for different types using the type function:
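A minimal sketch of type-dependent behavior. The names arg1 and arg2 follow the text; the function name and the exact test are assumptions.

```python
def add(arg1, arg2):
    # If arg1 is a string, convert it to the type of arg2 first.
    # type(arg2) returns the type object (int, float, ...), which can
    # itself be called to perform the conversion.
    if type(arg1) is str:
        arg1 = type(arg2)(arg1)
    return arg1 + arg2

print(add("1", 2))       # 3
print(add("1.5", 2.0))   # 3.5
```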
Notice that in this example, the type(arg2) statement is also used to return the function that
converts generic objects to the type of arg2, e.g., int, float, or complex. Thus the statement
type(arg2)(arg1) actually runs this type-conversion function on the string arg1 to convert
it to the type of arg2.
Functions can return more than one value using Python's tuple capabilities. To do so, specify a
comma-separated list after the return statement:
Notice in the penultimate line that we needed to specify the unit optional argument explicitly,
since we skipped the optional format one. In general, it is good practice to explicitly specify
optional arguments in this way whether or not one needs to, since this makes it clearer that the
arguments in the call are optional:
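As a sketch of this practice, consider a hypothetical function with optional arguments named format and unit. The argument names follow the text; the function itself and its defaults are assumptions.

```python
# Hypothetical function: formats a number and appends a unit label
def describe(value, format="%.2f", unit="m"):
    return (format % value) + " " + unit

# Optional arguments named explicitly in each call
print(describe(3.14159))                             # 3.14 m
print(describe(3.14159, unit="km"))                  # 3.14 km
print(describe(3.14159, format="%.1f", unit="km"))   # 3.1 km
```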
Function namespaces
Argument variables and variables defined within functions exist in their own namespace. This means that
assignment of an argument to a new value does not affect the original value outside of the
function. Consider the following:
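A minimal sketch of this behavior; the function name and the values are illustrative assumptions.

```python
def increment(a):       # hypothetical function
    a = a + 1           # rebinds the local name a only
    return a

a = 5
returned = increment(a)

print(a)            # 5  (the outer a is unchanged)
print(returned)     # 6
```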
What happened here? Because a is an argument variable defined in the def statement, it is
treated as a new variable that exists only within the function. Once the function has finished and
the program exits it, this new a is destroyed in memory by Python's garbage-collecting routines.
The a that we defined outside of the function remains the same.
How, then, does one modify variables using functions? In other programming languages, you
may be used to sending variables to functions to change their values directly. This is not
the Python way of doing things. Instead, the Pythonic approach is to assign the function's
return value to the variable. This is actually clearer than the approach of many other programming
languages because it shows explicitly that the variable is being changed upon calling the function:
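A short sketch of assignment to a return value; the function name is hypothetical.

```python
def increment(a):       # hypothetical function
    return a + 1

a = 5
a = increment(a)        # the reassignment is explicit at the call site
print(a)                # 6
```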
There is one subtlety to this issue. Mutable objects can actually be changed by functions if one
uses object functions and/or element access. Consider the following example that uses both to
modify a list:
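A minimal sketch that uses both an object function and element access; the function name and list contents are assumptions.

```python
def modify_list(l):     # hypothetical function
    l.sort()            # object function: sorts the caller's list in place
    l[0] = -1           # element access: also changes the caller's list

values = [3, 1, 2]
modify_list(values)
print(values)           # [-1, 2, 3]
```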
The reason for the distinction with mutable objects has to do with Python's name-binding
approach. Consider the following generic construct:
When one calls fn(x), Python creates the new variable arg within the function namespace
and points it to the data residing in the spot of memory to which x points. Setting arg equal to
another value within the function simply has the effect of pointing arg to a new location in
memory.
Here, in the second line, the bracket notation tells Python to do the following: find the area in
memory where the indexth element of arg resides and put newvalue in it. This occurs
because the brackets after arg are actually treated as an object function of arg, and thus are
inherently a function of the memory and data to which arg points. A similar case would exist if
we had called some object function that modified its contents, like arg.sort(). In these
cases, x would be modified outside of the function.
Functions as objects
As alluded to previously, functions are objects and thus can be sent to other functions as
arguments. Consider the following:
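A minimal sketch using the function names squareme and applytolist mentioned in the text; the argument order and the function bodies are assumptions.

```python
def squareme(x):
    return x * x

def applytolist(l, fn):
    # Call fn on each element and collect the results in a new list
    return [fn(ele) for ele in l]

l = [1, 7, 9]
# Pass the function object itself: no parentheses after squareme
print(applytolist(l, squareme))     # [1, 49, 81]
```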
Here, we sent the squareme function to the applytolist function. Notice that when we
send a function to another function, we do not supply arguments. If we had supplied arguments,
we would have instead sent the return value of the function, rather than the function itself.
Python shows us that a function is an object. Consider, from the above example:
>>> squareme
<function squareme at 0x019F60F0>
The hexadecimal number in the return value simply tells us where in memory this function lies.
We can also test the type:
>>> type(squareme)
<class 'function'>
Function documentation
Functions can be self-documenting in Python. A docstring can be written after the def
statement that provides a description of what a function does. This is extremely useful for
documenting your code and providing explanations that both you and subsequent users can use.
The built-in help function uses docstrings to provide help about functions.
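A minimal sketch of a function with a docstring. The docstring text is taken from the help output shown in this section; the function name add and its body are assumptions.

```python
def add(x, y):
    """Adds two variables x and y, of any type. Returns single value."""
    return x + y

# help(add) displays the docstring; it is also stored in __doc__
print(add.__doc__)
```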
a(x, y)
Adds two variables x and y, of any type. Returns single value.
It is typical to enclose docstrings using triple-quotes, since complex functions might require
longer, multi-line documentation.
It is a good habit to ALWAYS write docstrings for your code. Each should contain three pieces of
information: (1) a basic description of what the function does, (2) what the function expects as
arguments, and (3) what the function returns (including the variable types).
Writing scripts
So far, the examples we have covered have involved commands interpreted directly from the
Python interactive prompt. Python also supports scripts, or lists of commands and function
definitions (and any other Python constructs) that are defined in files, similar to source code in
other programming languages. These scripts are no different from the commands and
instructions that you would enter at the command prompt. Python scripts end in the extension
.py in all platforms.
Consider the following contents of a script file called primes.py that finds all primes less than
or equal to 50:
primes.py
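The original listing is not reproduced here. The following is a sketch of what such a script might contain: the names l and nextprime are those mentioned later in the text, while the algorithm itself is an assumption.

```python
def nextprime(primelist):
    """Returns the next prime after the last (largest) entry in primelist.

    Assumes primelist holds every prime up to its last entry, in order.
    """
    i = primelist[-1]
    while True:
        i += 1
        # i is prime if no smaller prime divides it evenly
        if all(i % p != 0 for p in primelist):
            return i

l = [2]
while True:
    candidate = nextprime(l)
    if candidate > 50:
        break
    l.append(candidate)

print(l)
```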
We can run this program from the command line by calling Python with an argument that is the
name of our script. Python will run the contents of the file as if we typed them at the interactive
prompt and then exit. Under Windows, for example, this might look something like:
Modules
It is also possible to import scripts from within the Python interpreter. When files of Python
commands are imported in this way they are termed modules. Modules are a major basis of
programming efforts in Python as they allow you to organize reusable code that can be imported
as necessary in specific programming applications. Considering the previous example:
• Scripts are imported using the import command. Upon processing the import
statement, Python immediately executes the contents of the file primes.py.
• We do not use the .py extension in the import command; Python assumes the file
ends in this and is accessible in the current directory (if unchanged, the same directory
from which Python was started). If Python does not find the script to be imported in the
current directory, it will search a specific path called PYTHONPATH, discussed later.
• When Python executes the imported script, it creates an object from it of type module.
• Any objects created when running the imported file are not deleted but are placed as
members of the module object. In this way, we can access the functions and variables
that were part of the module using dot notation, like primes.l and
primes.nextprime.
By making script objects members of the module, Python gives us a powerful way to write
reusable code, i.e., code with generic functions and variables that we can import into programs.
Modules can also import other modules, so that we can have hierarchies of code with variable
degrees of generality.
Module objects can be created and modified just like any other object in Python:
>>> primes.l = []
>>> primes.l
[]
>>> primes.k = 5 #create new object in primes module
>>> primes.k
5
Sometimes we want scripts to behave differently when we execute them at the command line
versus import them into other programs. Commonly we want the script to execute certain
commands when run from the command line, but need to suppress this behavior when imported.
To achieve this, we need to test whether or not the program has actually been run from the
command line. Consider the following program:
test.py
if __name__ == "__main__":
    #only executed if run directly from the command line
    print(multiply(4, 5))
In the penultimate line, we test to see if the script test.py has been run from the command
line. The variable __name__ is a special variable that Python creates which tells us the name of
the current module. (There are many such special variables, and they are always identified by
preceding and trailing double-underscores.) Python gives the value of "__main__" to the
variable __name__ if and only if that program is the main program and has been called from
the command line (i.e., not imported). Here is the behavior of our program at the command line:
Notice that Python does not execute the multiply(4, 5) command when we import, but
we still have access to any functions or objects defined in test.py.
Standard modules
Python has a "batteries included" philosophy and therefore comes with a huge library of pre-
written modules that accomplish a tremendous range of possible tasks. It is beyond the scope of
this tutorial to cover all but a small few of these. However, here is a brief list of some of these
modules that can come in handy for scientific programming:
A complete listing of all of the modules that come with Python is given in the Python Library
Reference in the Python Documentation. In addition to these modules, scientific computing
makes extensive use of two add-on modules, numpy and scipy, that are discussed in a separate
tutorial. There are also many other add-on modules that can be downloaded from open-source
efforts and installed into the Python base.
data.txt
The following example creates an open file object to data.txt and reads all of its contents into
a string:
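A minimal sketch of opening, reading, and closing a file. Since the contents of data.txt are not shown here, the sketch first writes a small stand-in file to a temporary directory; its header line and values are assumptions.

```python
import os
import tempfile

# Create a stand-in for data.txt (contents are assumed)
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as out:
    out.write("#pressure temperature energy\n1.0 1.0 -50.0\n1.0 1.5 -27.8\n")

f = open(path, "r")     # "r" opens the file for reading
s = f.read()            # entire contents as a single string
f.close()               # release the operating-system handle

print(s)
```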
Here, the open function took two arguments, the name of the file followed by a string "r"
indicating that we are opening the file for reading. It is possible to omit the second argument, in
which case Python defaults to "r"; however, it is usually a good idea to include it explicitly for
programming clarity.
When we have read the contents, we invoke the close function, which terminates the
operating system link to our file. It is good to close files after using them, as open file objects
consume system resources. In any case, Python automatically closes any open files if there are
no more objects pointing to them using its garbage collecting routines. Consider the following:
This command accomplishes the same result as the last example, but does not create a file object
that persists after execution. That is, open first creates the object, then read() extracts its
contents and places them into the variable s. After this operation, there are no variables pointing
to the file object anymore, and so it is automatically closed. File objects created within functions
(that are not returned) are also always closed upon exiting the function, since variables created
within functions are deleted upon exit.
We don’t have to read all of the contents into a string. We can also read the contents into a list
of lines in the file:
If the file is large, it might not be efficient to read all of its contents at one time. Instead, we can
read one line at a time using the readline function:
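A minimal sketch of a readline loop. It writes its own small stand-in for data.txt, whose contents are assumptions, so that it can run anywhere.

```python
import os
import tempfile

# Create a stand-in for data.txt (contents are assumed)
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as out:
    out.write("#pressure temperature energy\n1.0 1.0 -50.0\n1.0 1.5 -27.8\n")

kept = []
f = open(path, "r")
line = f.readline()
while len(line) > 0:                # an empty string signals end of file
    if not line.startswith("#"):
        print(line, end="")         # line already ends with a line break
        kept.append(line)
    line = f.readline()
f.close()
```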
This example prints out all of the lines that do not start with "#". The while loop continues as
long as the last readline() command returns a string of length greater than zero. When
Python reaches the end of a file, readline will return an empty string. It is important to know
The read and readline functions can also take an optional argument size that sets the
maximum number of characters (bytes) that Python will read in at a time. Subsequent calls move
through the file until the end of the file is reached, at which point Python will return an empty
string:
The seek function can be used to move to a specific byte location in a file. Similarly, the tell
function will indicate the current byte position within the file:
We end with an example that illustrates some of the elegant ways in which Python can handle
files. Imagine we would like to parse the data in the file above into the list called Data such that:
Data = [[1.0, 1.0, -50.0], [1.0, 1.5, -27.8], [2.0, 1.0, -14.5], [2.0, 1.5, -11.2]]
Here, we need to read the data (ignoring the comment), convert it to floats, and structure it into
a list. New Python programmers might take an approach similar to the manner in which this
would be accomplished in other languages:
>>> Data = []
>>> for line in open("data.txt", "r").readlines():
...     if not line.startswith("#"):
...         l = line.split()
...         Pres = float(l[0])
...         Temp = float(l[1])
...         Ene = float(l[2])
...         Data.append([Pres, Temp, Ene])
... <hit return>
Ultimately, however, we can make these operations much more compact using Python's list
comprehensions:
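A sketch of the nested list comprehension. The stand-in file is built from the values of Data quoted above; its comment line is an assumption.

```python
import os
import tempfile

# Create a stand-in for data.txt from the values quoted in the text
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "w") as out:
    out.write("#pressure temperature energy\n")
    out.write("1.0 1.0 -50.0\n1.0 1.5 -27.8\n2.0 1.0 -14.5\n2.0 1.5 -11.2\n")

with open(path, "r") as f:
    # Inner comprehension: columns in a line; outer: lines, minus comments
    Data = [[float(x) for x in line.split()]
            for line in f
            if not line.startswith("#")]

print(Data)
```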
Here, we use two nested list comprehensions: the inner one loops over columns in each line, and
the outer one over lines in the file with a filter established by the if statement.
Writing to files
Writing data to a file is very simple. To begin writing to a new file, open a file object with the "w"
flag:
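A minimal sketch of writing to a new file. The file name and the text written are illustrative, and a temporary directory is used so that no existing file is overwritten.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.txt")    # hypothetical file

f = open(path, "w")             # "w" creates (or overwrites) the file
f.write("first line\n")         # line breaks must be written explicitly
f.write("second ")
f.write("line\n")               # continues where the last write stopped
f.close()

with open(path, "r") as check:
    contents = check.read()
print(contents, end="")
```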
The write flag "w" tells Python to create a new file ready for writing, and the function write
will write a string verbatim to the current position within the file. Subsequent write statements
therefore append data to the file. Notice that write writes the string text explicitly and so line
breaks must be specified in the strings if desired in the file.
If the "w" flag is used on a file that already exists, Python will overwrite it completely.
Alternatively, one can append data to an existing file using the "a" flag:
The write function only accepts strings. That means that numeric values must be converted to
strings prior to writing to the file. This can be accomplished using the str function, which
formats values into a default precision, or using string formatting:
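Appending and numeric conversion can be sketched together as follows; the file name and values are illustrative assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "values.txt")   # hypothetical file

f = open(path, "w")
f.write("x = " + str(0.5) + "\n")   # str() uses a default representation
f.close()

f = open(path, "a")                 # "a" appends to the existing file
value = 3.14159265
f.write(f"{value:.3f}\n")           # string formatting controls precision
f.close()

with open(path, "r") as check:
    contents = check.read()
print(contents, end="")
```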
The first approach is not to store values in a legible format but to write them in a way similar to
their representation in memory. To do so, we must convert a value to a binary representation in
string format. The struct module can be used for this purpose. However, there are some
subtleties to the different data types (struct uses C, rather than Python, types) that can make
this approach a bit confusing.
The second approach is to write to, and subsequently also read from, compressed files. In this
way, numeric data written in human-readable form can be compressed to take up much less
space on disk. This approach is sometimes more convenient because numeric values can still be
read by human eyes when data files are decompressed by various utilities outside of Python.
Conveniently, Python comes with modules that enable one to read and write a number of popular
compressed formats in an almost completely transparent manner. Two formats are
recommended: the Gzip format, which achieves reasonable compression and is fast, and the
Bzip2 format, which achieves higher compression but at the expense of speed. Both formats are
standardized, open, can be read by most common decompression programs, and are single-file
based, meaning they compress a single file, not cabinets or archives of multiple files, which
complicates things.
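A minimal sketch of writing and reading a Gzip file with gzip.GzipFile. The file name and contents are assumptions. Note that in Python 3 a GzipFile object reads and writes bytes, so text is encoded and decoded explicitly here.

```python
import gzip
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.txt.gz")   # hypothetical file

f = gzip.GzipFile(path, "w")
f.write("1.0 1.0 -50.0\n".encode())     # bytes, not str, in Python 3
f.write("1.0 1.5 -27.8\n".encode())
f.close()

f = gzip.GzipFile(path, "r")
s = f.read().decode()                   # decompressed transparently
f.close()

print(s, end="")
```

If working with strings directly is preferred, gzip.open(path, "wt") and gzip.open(path, "rt") open the file in text mode.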
Here, Python takes care of compression (and decompression) entirely behind the scenes. The
only difference from our earlier efforts is that we have replaced the open function with the
gzip.GzipFile call and we have given the extension ".gz" to the file we create, in order to
indicate that it is a compressed file. In fact, gzip objects behave exactly like file objects, and
implement all of the same functions (read, readline, readlines, write). This makes
it very easy and transparent for storing data in a compressed format. One minor exception,
however, is that the seek and tell functions do not work exactly the same and should be
avoided with compressed files.
In general, compression is only recommended for datasets on disk that are large (e.g., > 1MB)
and that are read or written only a few times during a program. For disk-intensive programs that
are speed-limited by the rate at which they can read and write to disk, compression will incur a
considerable computational overhead and it is probably best to work with an uncompressed file,
and probably in a binary (non-readable) format. In these latter cases, the large datasets can
ultimately be compressed by outside utilities after all programs and analyses have been
performed. For complex datasets, there are many good Python modules available to manage
them, such as pandas.
>>> print("c:/temp/file.txt")
c:/temp/file.txt
>>> print("c:\\temp\\file.txt")
c:\temp\file.txt
The os module contains a large number of useful file functions. In particular, the sub-module
os.path provides a number of functions for manipulating path and file names. For example, a
filename with a path can be split into various parts:
>>> import os
>>> p = "c:/temp/file.txt"
>>> os.path.basename(p)
'file.txt'
>>> os.path.dirname(p)
'c:/temp'
>>> os.path.split(p)
('c:/temp', 'file.txt')
The opposite of the split function is the join function. It is a good idea to always use join
when combining pathnames with other pathnames or files, since join takes care of any
operating-system specific actions. join can take any number of arguments:
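A short sketch of join with several arguments; the path components are illustrative. The separator in the result depends on the operating system.

```python
import os

# Any number of components can be joined
p = os.path.join("temp", "sub", "file.txt")
print(p)    # temp/sub/file.txt on Linux; temp\sub\file.txt on Windows

# split reverses the last join
head, tail = os.path.split(p)
```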
If the path name is not absolute but relative to the current directory, there is a function for
returning the absolute version:
>>> os.path.abspath("/temp/file.txt")
'C:\\temp\\file.txt'
Several functions enable testing the existence and type of files and directories:
>>> p = 'c:/temp/file.txt'
>>> os.path.exists(p)
True
>>> os.path.isfile(p)
True
>>> os.path.isdir(p)
False
Here, the isfile and isdir functions test both the existence of the object and its type.
Several functions in the main os module allow interrogating and changing the current working
directory:
>>> os.getcwd()
'C:\\temp'
>>> os.chdir("..")
>>> os.getcwd()
'C:\\'
Note that the notation ".." signifies the containing directory one level up.
To create a directory:
>>> os.mkdir("c:/temp/newdir")
To delete a file:
>>> os.remove("c:/temp/deleteme.txt")
To delete a directory:
>>> os.rmdir("c:/temp/newdir")
To rename a file:
The shutil module provides methods for copying and moving files:
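Renaming, copying, and moving can be sketched as follows. All file names are hypothetical, and the sketch works inside a temporary directory so that no existing files are touched.

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
old = os.path.join(work, "old.txt")
with open(old, "w") as f:
    f.write("data\n")

# Rename a file
new = os.path.join(work, "new.txt")
os.rename(old, new)

# Copy a file, then move the copy into a subdirectory
copy_path = os.path.join(work, "copy.txt")
shutil.copy(new, copy_path)
subdir = os.path.join(work, "sub")
os.mkdir(subdir)
moved = os.path.join(subdir, "copy.txt")
shutil.move(copy_path, moved)

print(sorted(os.listdir(work)))     # ['new.txt', 'sub']
```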
Finally, the glob module provides wildcard matching routines for finding files and directories
that match a specification. Matches are placed in lists:
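A minimal sketch of wildcard matching. It creates a few empty, hypothetically named files in a temporary directory and then searches for them.

```python
import glob
import os
import tempfile

work = tempfile.mkdtemp()
for name in ("1.dat", "2.dat", "notes.txt"):
    open(os.path.join(work, name), "w").close()     # create empty files

# glob returns a list of matching paths; sort for a stable order
matches = sorted(glob.glob(os.path.join(work, "*.dat")))
names = [os.path.basename(m) for m in matches]
print(names)    # ['1.dat', '2.dat']
```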
Here the "*" wildcard matches anything of any length. The "?" wildcard will match anything of
length one character. Multiple wildcards can appear in a glob specification:
>>> glob.glob("c:\\*\\?.dat")
['c:\\temp\\1.dat', 'c:\\temp\\2.dat', 'c:\\temp\\3.dat', 'c:\\dat\\0.dat']
In Windows, if Python is associated with files ending in '.py', we can just write instead:
In Linux, we can accomplish the same behavior by including in the very first line of our program
a comment directive that tells the system to use Python to execute the file:
#!/usr/bin/env python
Either way, we would like to capture the arguments in.txt and out.txt. To do this, we use
the sys module and its member variable argv:
program.py
#!/usr/bin/env python
import sys
print(sys.argv)
InputFile = sys.argv[1]
OutputFile = sys.argv[2]
Notice that argv is a list that contains the (string) arguments in order. The first argument, with
index 0, is the name of the program that we are executing. Subsequent arguments correspond
to space-separated items that we input on the command line when running the program. The
form of argv is exactly the same whether or not we call Python directly, since the Python
executable is ignored:
There are much more sophisticated ways to process command-line arguments. The argparse
module is particularly extensive and enables easy creation of command line interfaces.
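A minimal sketch of the same two-argument interface written with argparse. The argument names, help strings, and the --verbose flag are assumptions for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description="Process an input file.")
parser.add_argument("infile", help="name of the input file")
parser.add_argument("outfile", help="name of the output file")
parser.add_argument("--verbose", action="store_true", help="print progress")

# An explicit list is parsed here so the sketch runs anywhere;
# a real script would call parser.parse_args() to read sys.argv
args = parser.parse_args(["in.txt", "out.txt", "--verbose"])

print(args.infile, args.outfile, args.verbose)   # in.txt out.txt True
```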
Classes
So far, we have only dealt with built-in object types like floats and ints. Python, however, allows
us to create new object types called classes. We can then use these classes to create new objects
of our own design. In the following example, we create a new class that describes an atom type.
atom.py
class AtomClass:
    def __init__(self, Velocity, Element = 'C', Mass = 12.0):
        self.Velocity = Velocity
        self.Element = Element
        self.Mass = Mass
    def Momentum(self):
        return self.Velocity * self.Mass
We can import the atom.py module and create a new instance of the AtomClass type:
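A sketch of creating and using instances. The class is repeated inline so that the sketch runs without atom.py; with the module on disk, one would write import atom and atom.AtomClass instead.

```python
class AtomClass:
    def __init__(self, Velocity, Element='C', Mass=12.0):
        self.Velocity = Velocity
        self.Element = Element
        self.Mass = Mass

    def Momentum(self):
        return self.Velocity * self.Mass

# Optional arguments given explicitly
a = AtomClass(2.0, Element='O', Mass=16.0)
# Optional arguments left at their defaults
b = AtomClass(1.0)

print(a.Element, a.Momentum())   # O 32.0
print(b.Element, b.Momentum())   # C 12.0
```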
In this example, the class statement indicates the creation of a new class called AtomClass;
all definitions for this class must be indented underneath it. The first definition is for a special
function called __init__ that is a constructor for the class, meaning this function is
automatically executed by Python every time a new object of type AtomClass is created. There
are actually many special functions that can be defined for a class; each of these begins and ends
with two underscore marks.
Notice that the first argument to the __init__ function is the object self. This is a generic
feature of any class function. This syntax indicates that the object itself is automatically sent to
the function upon calls to it. This allows modifications to the object by manipulating the variable
self.
The __init__ function gives the form of the arguments that are used when we create a new
object with atom.AtomClass(2.0, Element = 'O', Mass = 16.0). Like any other
function in Python, this function can include optional arguments.
Object members can be accessed using dot notation, as shown in the above example. Each new
instance object of a class acquires its own object members, separate from other instances.
Functions can also be defined as object members, as shown with the Momentum function above.
The first argument to any function in this definition must always be self; calls to functions
through object instances, however, do not supply this variable since Python sends the object
itself automatically as the first argument.
Many special functions can be defined for objects that tell Python how to use your new type with
existing operations. Below is a selected list of some of these:
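The original list is not reproduced here. As an illustration, the following sketch defines a few commonly used special functions on a hypothetical two-component vector class.

```python
class Vec2:     # hypothetical class for illustration
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):             # used when the object is displayed
        return f"Vec2({self.x}, {self.y})"

    def __add__(self, other):       # used by the + operator
        return Vec2(self.x + other.x, self.y + other.y)

    def __eq__(self, other):        # used by the == operator
        return self.x == other.x and self.y == other.y

    def __len__(self):              # used by the len function
        return 2

v = Vec2(1, 2) + Vec2(3, 4)
print(v)                    # Vec2(4, 6)
print(len(v))               # 2
print(v == Vec2(4, 6))      # True
```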
Classes can be an extremely convenient way for organizing data in scientific programs. However,
this benefit does not come without a cost: oftentimes stratifying data across a class will slow your
program considerably. Consider the atom class defined above. We could put a separate position
or velocity vector inside each atom instance. However, when we perform calculations that make
intense use of these quantities—such as a pairwise loop that computes all interatomic
distances—it is inefficient for Python to jump around in memory accessing individual position
variables in each class.
Rather, it would be much more efficient to store all positions for all atoms in a single large array
that occupies one location in memory. In this case, we would consider those quantities that
appear in the slowest step of our calculations (typically the pairwise loop) and keep them outside
of the classes as large, easily manipulated arrays and then put everything else that is not accessed
frequently (such as the element name) inside the class definitions. Such a separation may seem
messy, but ultimately it is essential if we are to achieve reasonable performance in numeric
computations.
test.py
Here we have defined a function that performs multiplication that we can call for any type. If
multiplication is not defined for a particular type, an error is thrown that is caught by the except
statement. Rather than stop our program, this error causes our own error-handling code to be
executed. The try statement defines the range of code in which we are testing for this error.
Consider this example:
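A minimal sketch of such a function might look like the following. The name Multiply, the printed message, and the choice to return None on failure are assumptions made for illustration. The broad Exception category is caught here to match the discussion; catching the more specific TypeError would also work.

```python
def Multiply(x, y):
    """Return x * y, or None if multiplication is not defined for the types."""
    try:
        # Code inside the try block is monitored for errors.
        return x * y
    except Exception as err:
        # Exception is a broad category that includes TypeError.
        print(f"Could not multiply {x!r} and {y!r}: {err}")
        return None


r1 = Multiply(2, 3.5)        # numbers can be multiplied
r2 = Multiply("ab", 3)       # a string times an integer repeats the string
r3 = Multiply("ab", "cd")    # not defined: our error-handling code runs
print(r1, r2, r3)
```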
In the example above, we caught the kind of error called Exception, which is a broad
category that includes the specific kind of error TypeError. (Python 2 grouped most built-in
errors under an intermediate class, StandardError, which no longer exists in Python 3.) Python
has a large hierarchy of errors that can be caught. An abbreviated excerpt, adapted from the
Python manual:
BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- ArithmeticError
      |    +-- FloatingPointError
      |    +-- OverflowError
      |    +-- ZeroDivisionError
      +-- AssertionError
      +-- AttributeError
      +-- OSError (IOError and EnvironmentError are aliases of OSError)
In addition to catching errors, we can also throw errors using the raise statement:
The ability to raise errors is convenient for adding user-defined information when improper calls
to our functions or objects are made. Ultimately this helps us locate bugs in our code.
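As a short illustration, the sketch below raises a ValueError with a descriptive message when a hypothetical function KineticEnergy (a name and signature assumed for this example) receives an invalid mass, and then shows the message being caught by the caller.

```python
def KineticEnergy(mass, speed):
    """Return the kinetic energy 0.5 * m * v**2."""
    if mass <= 0:
        # Stop here and report exactly what was wrong with the call.
        raise ValueError(f"mass must be positive, got {mass}")
    return 0.5 * mass * speed ** 2


energy = KineticEnergy(2.0, 3.0)

try:
    KineticEnergy(-1.0, 2.0)
except ValueError as err:
    message = str(err)
    print(f"Caught an error: {message}")
```

If the error is not caught, the program stops and the message appears in the traceback, pointing directly at the improper call.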
The time() function of the time module gives the time in seconds as measured from a
reference date called the epoch. Ultimately, we are interested in time differences between two
points in our program and so this exact date is unimportant. Consider the following code snippet
from a script that computes the time required for a particular function ComputeEnergies()
to finish:
import time
t1 = time.time()
ComputeEnergies()
t2 = time.time()
print(f"The time required was {t2-t1:.2f} sec")
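Alternatively, the timeit function of the timeit module can perform the timing for us:
import timeit
elapsed_sec = timeit.timeit(ComputeEnergies, number=1)
print(f"The time required was {elapsed_sec:.2f} sec")
Notice that we pass the function itself, without parentheses, so that timeit can call it. Here, we
can set the number argument to a much larger number; timeit then returns the total time for all
calls, which we divide by number to obtain an average time per call.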
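A self-contained version of this idea is sketched below. The body of ComputeEnergies is a made-up stand-in (a simple double loop) so that the example can run on its own, and n_runs is an arbitrary choice.

```python
import timeit


def ComputeEnergies(n=100):
    """Stand-in workload: a simple pairwise-style double loop."""
    total = 0.0
    for i in range(1, n):
        for j in range(i):
            total += 1.0 / (i - j)
    return total


n_runs = 20

# Pass the function object itself; timeit calls it n_runs times
# and returns the total elapsed time in seconds.
total_sec = timeit.timeit(ComputeEnergies, number=n_runs)

# Divide by the number of calls to get an average per call.
average_sec = total_sec / n_runs
print(f"Average time per call: {average_sec:.6f} sec")
```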
For long programs, adding such statements for each function execution would be very tedious.
Python includes a profiling module that enables you to examine timings throughout your code.
There are two modules: profile and cProfile. These modules offer the same interface, except
that cProfile has been written mostly in C and adds much less overhead. cProfile is always
recommended unless you have an older version of Python that doesn't include it.
import cProfile
cProfile.run("ComputeEnergies()")
Notice that we send to the run function in cProfile a string that we want to execute. After
ComputeEnergies() finishes, cProfile will print out a long list of statistics about timings
for that function and the functions it calls.
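When the default listing is too long, one option is to collect the statistics with a cProfile.Profile object and print them sorted with the standard pstats module. The sketch below assumes a stand-in ComputeEnergies function and limits the report to the five most expensive entries by cumulative time.

```python
import cProfile
import io
import pstats


def ComputeEnergies(n=200):
    """Stand-in workload so the example can run on its own."""
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += 1.0 / (j - i)
    return total


# Profile only the code between enable() and disable().
profiler = cProfile.Profile()
profiler.enable()
ComputeEnergies()
profiler.disable()

# Write the report to a string, sorted by cumulative time,
# showing at most five entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Using a Profile object also avoids passing code as a string, which can be convenient when the function is not defined in the main script.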
To profile a complete script, we can run cProfile on it from the command line:
After running, we get a report that looks something like this (abbreviated):
To the right are names of the modules and functions called by our program. Some might not look
familiar; this is usually the case when the modules we use contain functions that call other
functions in underlying modules. The numbers in columns give statistics about the program timing: