L1 Chapters 1 2
for
Bioinformatics Research
E-Learning Edition
Pouria Hadjibagheri
Developed by SysMIC®
2017-19
Contents
1 Getting Started
1.1.3.3 Installation
1.2.1 Documentation
2.1.1.2 Receiving an input
2.3 Operations
2.3.1.1 Shorthands
2.3.1.2 Precedence
2.3.2.1 Negation
3 Conditional Statements
3.3.2 Hierarchy
4 Introduction to Arrays
4.1 List
4.1.1 Implementation
4.1.2 Indexing
4.1.3 Slicing
4.1.5 Mutability
4.1.6.2 Length
4.1.10.1 Implementation
4.1.10.2 Indexing
4.1.11 Dimensions
4.1.12 Summary
4.2 Tuple
4.2.2 Immutability
4.2.4 Summary
5.2 for-loops
5.3.2 Breaking a while-loop
7.1.2.3 String preparation
8 Functions
End of Chapter Exercises
Index
1 Getting Started
Programming: why Python?
In addition, the 2018 developer survey conducted by StackOverflow found that Python is
the second most popular general-purpose programming language after Java. This means
that it enjoys a vibrant community that supports and maintains its development, and the
development of the libraries associated with it. This is confirmed by the fact that the language
is ranked as the most popular language on GitHub®, which is the primary code-sharing
platform amongst programmers.
The language is the fastest growing major programming language in the world, and is voted
the most wanted programming language by employers. So if you are not convinced
by its popularity or features, you can always look into the employment prospects. Not to
mention that programming in general is ranked amongst the top starting salaries for graduates,
and is an in-demand skill for jobs in biosciences.
Learning how to code as a beginner is akin to learning how to walk as a baby. You cannot
learn programming by memorising directives, no matter how well you memorise them.
To learn how to code, one must learn how to think algorithmically; that is, how to break
different tasks down into logical procedures. The only way to learn how to think that way
is to practise, make mistakes, and learn how to overcome those mistakes. It is common to make
the same mistake more than once, especially in different contexts, and that can become
frustrating at times. However, once you get it, you will have it for life.
There are lots of materials on the web, both free and paid, to assist with your learning. Use
them to your advantage. Great programmers are not the ones who know every technical
detail by heart; rather, they are the ones who know what they are looking for, and where
they can find the answer.
Although we focus on programming in the context of bioinformatics, throughout this course,
you will acquire many transferable skills that you may employ for other purposes or to learn
other specialities.
Compiled: These are languages whose code is translated (compiled) into machine language
en masse, and in advance, using a designated compiler programme, e.g. C, Rust, Haskell.
Interpreted: These rely on, and must always be accompanied by, an interpreter, whose job
is to translate the source code into machine language one line at a time, e.g. Python, R,
MATLAB®.
Interesting fact: Although the default behaviour of Python is as an interpreted language,
its code may also be compiled using special tools and techniques. See Cython and PyPy to
find out more.
In this subsection, we will learn about different versions of Python, and why we should use
the Anaconda distribution of Python 3.
You may have been told that you should learn Python 2. You may even have seen your
colleagues using or teaching Python 2. It is sometimes claimed that biologists should use
Python 2, because most biology-related libraries in Python are written for that version.
This is wrong. The reason why Python 2 is still out there is that following the release
of Python 3.0 in December 2008, the CPython interpreter sustained several problems, and
was not backward compatible. This meant that any code written in Python 2 could not be
run using Python 3 without modifications.
In 2008, Python was already a relatively popular language amongst professional program-
mers, and had a lot of open source libraries available, and maintained by the community.
This change meant that the existing libraries could no longer be used. This caused a little
bit of disruption, and led to some resistance in the community. Additionally, the release
of Python 2.7 coincided with the release of Python 3.1, and shared its new features; so the
community did not feel the need to move on.
Finally, in 2014, the Foundation announced that support for Python 2.7 would cease in
2020, and encouraged users to move on as soon as possible. The results of the Python
Developer’s Survey 2018 revealed that 84% of the community, and 90% of data scientists,
have already adopted Python 3. It is considerably faster, offers more features, and is the
future of the language. Python 2 is obsolete. Do not use it.
For the purpose of this book, we will be using the Anaconda release of CPython, the
default Python interpreter released by the Python Software Foundation, and maintained
by the Anaconda Cloud.
The Anaconda distribution of CPython automatically installs almost every package you
would need for scientific purposes (over 720 open source packages). It is easier to
install, and it takes care of all the dependencies. This is particularly important because
some of Python’s scientific libraries have Fortran- and C-based dependencies, which may
be somewhat challenging to install for beginners, especially on Microsoft® Windows® and
Mac OS X®.
1.1.3.3 Installation
To install the Anaconda distribution of Python, please visit the installation instructions as
outlined in the Anaconda Cloud documentation, and follow the instructions for your
operating system. Ensure that you use the Python 3.x graphical installer for Microsoft® Windows® and
Mac OS X® (there is no graphical installer for Linux). Once downloaded, you can proceed
to install the distribution as you would any other application on your computer.
There are two types of programming languages: compiled and interpreted. In a compiled
language, such as the C programming language, the compiler translates our entire code into
a machine language programme, which we can then run. On the other hand, an interpreted
language would adopt one of the following three strategies:
• Parse the code (resolve into meaningful sections) and perform the instructions di-
rectly — e.g. early versions of the LISP programming language.
• Translate the code into a more efficient alternative before executing it — e.g. Python,
MATLAB® , Perl, Java, Julia.
• Explicitly execute a pre-compiled code, e.g. Pascal, JIT compilers (Python, Julia, ...),
Java.
It is, however, important to appreciate that whilst compilation and interpretation are the
primary methods for the implementation of programming languages, they are not mutually
exclusive. In other words, an interpreted language such as Python or Java may perform
code translation in a manner that would imitate a compiled language.
To create a Python programme, we write our code in a simple text file and save it with
a .py extension. We then use the Python interpreter, which is an independent programme,
to run our code. The Python interpreter first checks our code to ensure that it does not
contain any syntactic mistakes. Once confirmed, it starts to execute the code from the
beginning, one line at a time. An important implication of this principle is that we cannot
reference something (e.g. a variable or a function) before its definition. We shall review
such implications in more depth later in the book.
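The consequence of line-by-line execution can be sketched as follows. This is a minimal illustration and not part of the original text: the variable name total is hypothetical, and the try/except construct is used here only so that the script can continue after the error.

```python
# Referencing a name before its definition raises a NameError.
try:
    print(total)          # 'total' has not been defined at this point
except NameError as error:
    message = str(error)
    print('Error:', message)

total = 2 + 3             # the definition arrives too late for the line above
print(total)              # works now, because 'total' exists
```

Moving the definition of total above the first print() call removes the error.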
1.2.1 Documentation
The source code of a software application must be accompanied by a text containing
certain information about the software. The text may be embedded within the code, be
placed in a separate document, or both.
How we embed the text within our code depends on which programming language we are
using. In Python, embedded documentation must be placed inside triple quotation marks
("""...""") at the very beginning of the file.
At the very least, our documentation must contain the following information:
• A very brief description of the code.
• The date on which the code was last modified.
• The copyright notice.
• The license.
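As a rough sketch, a file that carries the four items listed above might begin as follows. The description, date, author name, and choice of license are all placeholders invented for illustration.

```python
"""
Calculates the sum of two numbers and displays the result.

Last modified: 1 January 2019
Copyright (c) 2019 Jane Doe (hypothetical author)
License: MIT (hypothetical choice)
"""

total = 2 + 3
print(total)
```

The text inside the triple quotation marks is not executed as code; it only documents the file.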
Remember
A code whose license is not explicitly indicated must always be considered to be protected under exclusive
copyright. There are lots of reasons why you should elect to release your code as free software. Visit
the Open Source Initiative (OSI) website to find out more about open source, or drop by
ChooseALicense.com to choose a license.
In Python, documentation that is embedded in the code using triple quotation marks is
referred to as a docstring.
It is always important to write a code that is explicit, legible, and easy to follow. However,
more often than not and due to a variety of reasons, the logic of the code may be somewhat
difficult to understand for someone else. To that end, it would be useful — if not essential
— to explain certain parts of the code and provide additional comments. Such explanations
will help us and other fellow programmers understand our code more easily in the future.
To provide comments on a specific line of code in Python, we use the octothorpe sign (#)
followed by a space:
# This is a comment - it is not a part of our code,
# and will be ignored by the interpreter.
x = 2 + 3
It is also possible to write comments on the same line as our code; however, we should do
so sparingly. To write such comments, we conventionally add two space characters before
the octothorpe.
x = 2 + 3  # This is an in-line comment.
Due to certain (primarily historical) issues, when we create a Python file, it is important
that we adhere to two conventional principles:
• Each line in a Python file may contain a maximum of 79 characters. It is also
advised that we should limit docstrings and comments to a maximum of 72 characters
per line. For additional details, please visit the documentation for Maximum Line
Length.
• Certain editors and operating systems require two linebreaks at the end of a file. It
is therefore important to always leave one, and only one, blank line at the very end
of a Python file. The blank line combined with the last line of our actual code would
produce two linebreaks at the end of our file.
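The first of these conventions can be checked with a few lines of code. The sketch below is only an illustration: the source text is a hypothetical stand-in for the contents of a file, and it uses constructs (loops and lists) that are covered later in the book.

```python
MAX_LINE_LENGTH = 79  # limit for lines of code, as stated above

# Hypothetical file contents, stored as text for the sake of the example.
source = (
    "# A short comment.\n"
    "x = 2 + 3  # This is an in-line comment.\n"
    "print(x)\n"
)

# Collect the numbers of the lines that exceed the limit.
long_lines = []
for number, line in enumerate(source.splitlines(), start=1):
    if len(line) > MAX_LINE_LENGTH:
        long_lines.append(number)

print('Lines exceeding the limit:', long_lines)
```

In practice, most IDEs display a vertical guide at the chosen column, so a manual check of this kind is rarely needed.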
Remember
Only what we define within the environment of our application and store in the memory is directly con-
trolled by our application. We may access or take control over other environments through certain media;
however, such interactions are classified as I/O operations. An example of this is interacting with a file
on our computer, which we discuss in chapter 7. Whilst we have complete control over the file while we
are working on it (e.g. reading from it or writing to it), our access to the file and the transmission of data
is in fact controlled and managed by the operating system.
Advanced topic
If you are interested in learning more about I/O systems and how they are handled at
operating system level, you might benefit from chapter 13 of Operating System Concepts, 8th
ed. by Abraham Silberschatz, Greg Gagne, and Peter Galvin.
In programming, I/O operations include, but are certainly not limited to:
• displaying the results of a calculation to the user;
• asking the user to enter a value;
• any operation that allows our programme to communicate the data stored in the memory.
In this section, we learn about two very basic but most fundamental methods of I/O oper-
ations in Python. We will be using these methods throughout the course, so it is essential
that you feel comfortable with them and the way they work before moving on. We shall
return to I/O operations later in the course, especially in chapter 7.
Documentation: Print
The term “output” in reference to an application typically refers to the data that has either
been generated, or manipulated by that application.
For example, suppose we have two numbers and we would like to calculate their sum. The action
of calculating the sum is itself a mathematical operation (discussed in section 2.3.1). The
result of our calculation is called an output. Once we obtain the result, we might want to
save it in a file or display it on the screen, in which case we will be performing an I/O
operation.
The simplest, most frequently used method for generating an output in almost every modern
programming language is to display something on the screen.
We normally run our Python scripts and programmes from the terminal; so the typical (and
the easiest) method for producing an output is to display it in the terminal. The way we do
this is by calling a dedicated built-in function called print().
Remember
In programming, a function is essentially an isolated piece of code. It usually takes some inputs, does
something to or with them, and produces an output. The pair of parentheses that follow a function are
there so that we can provide the function with the input arguments it needs when we call it, so that it can
do what it is supposed to do using our data. We will explore functions in more detail in chapter 8.
The print() function can take several inputs and perform different tasks. Its primary
objective, however, is to take some values as input and display them in the terminal. Here
is how it works:
Suppose we want to display some text in the terminal. To do so, we write:
In [1]: print('Hello world!')
in our favourite editor or IDE and save it in a file as script_a.py. This is now a fully
functioning Python programme that we can run using the Python interpreter.
Remember
To name a Python script, we must adhere to the file naming protocol:
• The file name must not start with a number.
• The file name should ideally not contain capital letters.
• Except for underscores ( _ ), the file name must not contain any special characters or spaces.
• Conventionally, the file name should not be written in camel case, e.g. scriptA.py
• The file name should be kept short, ideally less than 30 characters.
• The file must be given a .py extension, e.g. script_a.py
If you are using an Integrated Development Environment (IDE), e.g. Visual Studio Code,
you may execute your code using the internal tools provided by that IDE. The specifics of
how you do so depend on the IDE that you are using.
Alternatively, we may always execute Python scripts manually. To do so, we open the
terminal or the command prompt (CMD) in Microsoft® Windows® and navigate to the directory
where we saved script_a.py.
Note
If you don’t know how to navigate in the terminal, see example 1 at the end of this section.
Once in the correct directory, we run the script by typing python3 script_a.py in our
terminal as follows:
python3 script_a.py
This will call the Python 3 interpreter to execute the code we wrote in script_a.py. Once
executed, which in this case should be instantaneous, we should see the output:
Out [1]: Hello world!
Congratulations! You have now successfully written and executed your first programme
in Python.
Remember
We know that print() is a function because it ends with a pair of parentheses, and it is written
entirely in lowercase characters. Conventionally, functions in Python should always be defined using
lowercase characters (PEP-8: Function Names). Some IDEs change colour when they encounter built-in
functions in the code so that we won’t accidentally overwrite them. We shall discuss functions in more
detail in chapter 8.
Note We never overwrite a built-in function. Doing so would mean that we lose that function
during the execution of our programme. For instance, if we overwrite print(), we
would no longer be able to use it for displaying the outputs of that programme.
This is because as far as Python is concerned, we have redefined the name
print, which means that it no longer represents the original definition for displaying
outputs.
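What overwriting print looks like in practice can be sketched as below. The example is illustrative only: the function name broken_script is hypothetical, and the overwrite is deliberately confined to that function so that the rest of the script is left unharmed (functions are covered in chapter 8).

```python
def broken_script():
    # Careless assignment: within this function, the name print now
    # refers to a piece of text instead of the built-in function.
    print = 'Hello world!'
    try:
        print('This call cannot work')
    except TypeError as error:
        return str(error)


failure = broken_script()

# Outside the function, print still refers to the built-in definition.
print('Calling the overwritten name failed with:', failure)
```

The error arises because Python tries to call a str value as if it were a function.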
Python is known amongst its programmers as an adults’ language. This means that
it does not impose many of the restrictions that exist in other languages. One example
of such restrictions is that in most programming languages, the programmer is unable to
overwrite built-in functions. Conversely, in Python, it is up to the programmer to make
a decision. This makes Python a very powerful language, because it allows the programmer
to micro-manage the behaviour of the language at its core as and when necessary;
however, it must be used very cautiously, for with great power comes great responsibility.
To that end, Python relies on a set of conventions (PEP-8: Function Names) to let the
programmer know that, for instance, a specific name is reserved for a built-in function.
IDEs tend to interpret these conventions in different ways (e.g. by changing the colour).
This is in fact one of the reasons why we should use an IDE instead of a simple text editor. If
you are interested, changes to the language or its conventions are proposed and then put
to vote by the Python developers community in a process that is facilitated by the Python
Software Foundation. You are now a member of the Python developers community, so feel
free to contribute, vote, or voice your opinion.
We can pass more than a single value to the print() function, provided that they are
separated by commas.
Notice that there is a space between 'Hello' and 'John' even though we did not include
a space in our text. This is the default behaviour of the print() function when it
receives more than one value.
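The behaviour described above can be sketched as follows. The values used here ('John', 'Jane', 21, 'London') and the variable names are chosen for illustration; sep is a genuine keyword argument of print().

```python
first_name = 'John'

# Values separated by commas are displayed with a single space between them.
print('Hello', first_name)              # Hello John

# The keyword argument sep replaces the default separator (a single space).
print('Jane', 21, 'London', sep=', ')   # Jane, 21, London

# For comparison: the first line assembled by hand with the default separator.
default_line = ' '.join(['Hello', first_name])
print(default_line)                     # Hello John
```

Leaving sep out is equivalent to writing sep=' '.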
Remember
In Python, values that do not require a name when passed to a function (e.g. Jane, 21, or London in the
last example) are known as positional arguments. On the other hand, values that require their name to be
specified, such as sep=..., are referred to as keyword arguments. Values that do not need to be specified
when calling a function and have a default value (e.g. sep=' ') are called default arguments.
[Figure: annotated call to a function, with a label marking the input arguments]
The inputs and the outputs of a function are referred to as the function signature.
Do It Yourself 2.1.1
Write a script that displays the following output:
Documentation: Input
Inputs are I/O operations that involve receiving some data from the outside world. This
might include reading the contents of a file, downloading something from the Internet, or asking
the user to enter a value.
Note
In a Unix system (Mac OS X® or Linux), a tilde is an alias for a user’s home directory.
The simplest way to acquire an input in Python is to ask the user to enter a value in the
terminal. To do so, we use a dedicated built-in function called input(). The function
takes a single argument called prompt. Prompt is the text displayed in the terminal to ask
the user for an input. Figure 2.1.1 illustrates a screenshot of my personal computer’s
prompt, where it displays my user name (i.e. pouria) followed by a tilde (~). A terminal
prompt may be different in each computer and operating system.
Here is how we implement the input() function:
In [6]: input('Please enter your name: ')
If we save the above in a file called script_b.py, and execute the file using the
Python 3 interpreter as described previously, we shall see the following:
python3 script_b.py
Note Boxes of code that do not have a number (i.e. there is no Out [X]:) represent
the intermediary output of the preceding line in the terminal, where the
programme requires an input by the user before it can continue to produce
an output.
Figure 2.1.1 Terminal window on a Linux computer (left) and a Mac (right).
The terminal cursor, displayed as an underscore in our example, will be in front of the
prompt (i.e. Please enter your name: ) waiting for a response. Once it receives a re-
sponse, it will proceed to run the rest of the code (if any), or terminate the execution.
Remember
Python is an interpreted language; that is, the code we write is executed by the Python interpreter one
line at a time. The input() function performs a blocking process. This means that the execution of
the code by the Python interpreter is halted upon encountering an input() function until the user
enters a value. Once a value is entered, the interpreter then proceeds to execute the next line.
We may store the user’s response in a variable. Variables are the topic of the next section,
where we shall also review more examples on input() and how we can use it to
produce results based on the responses we receive from the user.
Do It Yourself 2.1.2
Write a script that asks the user to enter the name of a protein in the terminal.
Terminal is a command-line interface emulator. It provides an interface to a computer programme in which a user
can issue commands in successive lines to perform specific operations.
How to navigate the terminal in Mac OS X and Linux?
The Unix terminal (i.e. the one used by Mac OS X® and Linux) uses a command language known as Bash, also
referred to as the Unix Shell. This is not the topic of this book, so we won’t discuss its properties and features in
much detail.
Microsoft® Windows®, on the other hand, relies on DOS (Microsoft® Disk Operating System®) commands to
navigate the terminal. The terminal environment in Microsoft® Windows® is sometimes referred to as the command
prompt or CMD.
The environment in which we can use Bash or DOS commands is known as the terminal. To use a terminal, we
need a terminal emulator. Every operating system has a default terminal emulator; however, you can always use
an alternative emulator if you so wish. Some IDEs (e.g. Visual Studio Code, Sublime Text, PyCharm®) also offer a
terminal emulator that is embedded within the IDE.
To navigate the terminal environment, we should first find out where we are. Here is how we do that:
Unix: pwd
Windows: pwd (PowerShell; in CMD, cd on its own prints the current directory)
Remember that the path to an environment is displayed (and supplied) differently in Unix and Microsoft® Windows®.
To that end, the response (output) to these commands may vary slightly:
Unix: /home/UserName/Documents
Windows: C:\Documents
Notice that in Microsoft® Windows®, we separate directories using a backslash ( \ ), whilst in Unix, we use a forward slash
( / ) for that purpose.
Note Paths that contain one or more space characters must be encapsulated by double quotation marks
("...").
If we are already in the Documents directory, and want to navigate onto Python_Scripts, we can use the following
shorthand versions of the above method:
Parent directory
To navigate to the parent directory (i.e. from Python_Scripts to Documents in this example), we do:
Unix: cd ..
Windows: cd ..
Home directory
We can also navigate all the way back to the home directory in one go:
Unix: cd
Windows: cd \
Contents of a directory
Unix: ls
Windows: dir
We use variables to store data in the memory. Each variable has 3 characteristics: scope,
name, and type. Scope and name must be mutually unique. Starting with name, we will
discuss each of these characteristics in more detail throughout this chapter.
The name of a variable is in fact an alias for a location in the memory. You can think of it as a
postbox, which is used as a substitute for an address. Comparably, we use variable names so
that we do not have to use the actual address of the location we want in the memory, because
it would look something like 0x106fb8348.
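To see what such an address looks like, the built-in functions id() and hex() can be combined as in the sketch below. Note the assumption: in CPython (the interpreter used in this book), id() happens to return the memory address of the object; other interpreters may return a different kind of identifier. The actual number differs from one run to the next.

```python
total_items = 2

# id() returns the identity of the object that the name refers to.
address = id(total_items)

print(address)        # a large integer
print(hex(address))   # the same number in hexadecimal, starting with 0x
```

We never need to use this number directly; the name total_items does the job for us.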
There are some relatively simple rules to follow when defining variable names, which ulti-
mately boil down to:
Rating | Example | Comment
Invalid | 5mins = 5 * SEC_IN_MIN | Variables cannot start with numbers; raises a SyntaxError.
Invalid | if = 2 | Attempts to overwrite an internal syntax; raises a SyntaxError.
Worst | j = 2 | Overwrites the internal identifier for complex numbers.
Worst | int = 2 | Overwrites the internal definition for integer numbers.
Bad | a = 4 | Meaningless name + hard to identify in code.
Okay | var1 = 0 | Meaningless name + hard to distinguish.
Better | var_1 = 0 | Meaningless name.
Good | current_state = 0 | Good name + easy to distinguish.
Okay | five_mins_in_sec = 5 * 60 | Good name, vague value.
Good | SEC_IN_MIN = 60 | Good name + good value + distinguished as a constant (conventionally, names made exclusively of capital letters signify constants).
Good | five_mins2sec = 5 * SEC_IN_MIN | Good name + good value.
Good | five_mins2sec = 5 * 60  # 5 min * 60 sec | Good name + vague value clarified by a comment.
Good | current_protein = 'DNA Polymerase III' | Good name + good value.
Remember
We should never overwrite an existing, built-in definition or identifier (e.g. int or j). We will be learning
many such definitions and identifiers as we make progress. Nonetheless, a good IDE would highlight
syntax and built-in identifiers in a different colour. That is why internal identifiers are displayed in a
different colour in table 2.2.1. The exact colouring scheme depends on the IDE and the theme.
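Some of the rules above can be checked programmatically. The following is a rough sketch rather than a complete validator: it relies on str.isidentifier(), keyword.iskeyword(), and the builtins module from the standard library, and the list of candidate names is taken from the examples above. It cannot judge whether a name is meaningful.

```python
import builtins
import keyword

candidates = ['5mins', 'if', 'int', 'var_1', 'current_state', 'SEC_IN_MIN']

verdicts = {}
for name in candidates:
    if not name.isidentifier() or keyword.iskeyword(name):
        # Starts with a number, contains illegal characters, or is reserved.
        verdict = 'invalid'
    elif hasattr(builtins, name):
        # Legal, but it would hide a built-in definition such as int.
        verdict = 'valid, but overwrites a built-in'
    else:
        verdict = 'valid'
    verdicts[name] = verdict
    print(name, '->', verdict)
```

A name that passes these checks is merely legal; whether it is a good name remains a matter of judgement.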
In [1]: total_items = 2

        print(total_items)
Out [1]: 2
In [2]: total_items = 3

        print(total_items)
Out [2]: 3
Variables containing integer numbers are known as int, and those containing decimal num-
bers are known as float in Python.
In [3]: total_items = 2

        print(total_items)
Out [3]: 2
print(total_values)
print(temperature)
Variables can contain characters as well; but to prevent Python from confusing them with
meaningful commands, we use quotation marks. So long as we remain consistent, it doesn’t
matter whether we use single or double quotations. These variables are known as string or
str:
In [6]: forename = 'John'
        surname = "Doe"
Do It Yourself 2.2.1
Oxidised low-density lipoprotein (LDL) receptor 1 mediates the recognition, internalisation and degradation of
oxidatively modified low density lipoprotein by vascular endothelial cells. Using the Universal Protein Resource (UniProt)
website, find this protein for humans, and identify:
• UniProt entry number.
• Length of the protein for isoform 1 (under Sequences).
• Gene name (under Names & Taxonomy).
Store the information you retrieved, including the protein name, in 4 separate variables.
Display the values of these 4 variables in one line, and separate the items with 3 spaces, as follows:
Figure 2.2.1
Now that we know how to create variables to store values, we can also use them to retain
the value (response) entered by the user when using the input() function:
In [7]: name = input('Please enter your name: ')

        print('Hi,', name)
If we save this in a file called script_c.py and run it, the output will appear in the terminal
as displayed in figure 2.2.1.
Remember
The input() function displays a prompt and waits for the user to enter a value. The value entered
by the user is then returned by the function as a str type. The value returned by a function may also be
referred to as the output of that function. The output of a function may be stored in a variable.
Do It Yourself 2.2.2
1. Write a script that upon execution, asks the user to enter the name of an enzyme and then retains the response in
an appropriately named variable.
2. Use the variable to display an output similar to the following:
ENZYME_NAME is an enzyme.
Remember
In a dynamically typed language, it is the value of a variable that determines the type. This is because
the types are determined on the fly by the Python interpreter as and when it encounters different variables
and values.
Why learn about types in a dynamically typed programming language?
Now that we know how to define variables in Python, you may have noticed that depending
on the value, there are different forms of variables such as int, float, and str. These are
built-in definitions known as variable types. In essence, types tell the computer how much
space in the memory should be reserved for the value of a specific variable, and clarify the
operations that may be applied to it.
Note
In a dynamically typed language, the values determine the type of a variable. This
is in contrast with statically typed languages, where a variable must be initialised using a
specific type before a value, whose type is consistent with the initialised variable, can
be assigned to it.
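Dynamic typing can be observed by assigning values of different kinds to one and the same name, as in the sketch below. The name value and the assigned values are chosen for illustration, and the sketch relies on the built-in function type(), which reports the type of a value.

```python
value = 2
first_type = type(value)     # the value 2 makes this an int

value = 2.5
second_type = type(value)    # the same name now refers to a float

value = 'two'
third_type = type(value)     # ... and now to a str

print(first_type, second_type, third_type)
```

In a statically typed language, the second and third assignments would typically be rejected, because the type of the variable would have been fixed when it was first declared.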
Python enjoys a powerful type system out of the box. Table 2.2.2 provides a comprehensive
reference for the built-in types in Python. Built-in types are the types that exist in the
language and do not require any third party libraries to implement or use.
Sometimes we might want to know the type of a variable. To do so, we use
the built-in function type() as follows:
In [1]: total_items = 2

        print(type(total_items))
print(type(total_values))
print(type(temperature))
Figure 2.2.2 A comprehensive (but non-exhaustive) reference of built-in (native) types in Python 3.
Notes:
• Some types are not discussed in this course; they are included for reference only.
• dict is not an iterable by default, however, it is possible to iterate through its keys.
• Mutability is an important concept in programming. For the purpose of this tutorial, a mutable object is an object whose value(s) may be altered.
This will become clearer once we study list and tuple. Find out more about mutability in Python from the documentation.
• Complex numbers refer to a set of numbers that have a real part and an imaginary part, where the imaginary unit is defined as √−1. These
numbers are very useful in the study of oscillatory behaviours and flow (e.g. heat, fluid, electricity). To learn more about complex numbers, watch
this Khan Academy video tutorial.
3 print(type(phase))
3 print(type(full_name))
Remember
In Python, a variable / value of a certain type may be referred to as an instance of that type. For instance,
an integer value whose type in Python is defined as int is said to be an instance of type int.
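As a minimal sketch of the idea above, type() reports the type of a value, and the built-in isinstance() tests whether a value is an instance of a given type. The variable names and values below are illustrative assumptions and are not taken from the course material:

```python
# Hypothetical example values, chosen only for illustration.
total_items = 2
temperature = 36.6
full_name = 'Jane Doe'

print(type(total_items))   # <class 'int'>
print(type(temperature))   # <class 'float'>
print(type(full_name))     # <class 'str'>

# isinstance() checks whether a value is an instance of a type.
is_integer = isinstance(total_items, int)
is_text = isinstance(temperature, str)

print(is_integer)
print(is_text)
```

Here total_items is an instance of int, so the first test succeeds, whereas temperature is an instance of float rather than str, so the second test fails.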
Do It Yourself 2.2.3
Determine and display the type for each of these values:
• 32
• 24.3454
• 2.5 + 1.5
• "RNA Polymerase III"
• 0
• .5 - 1
• 1.3e-5
• 3e5
The result for each value should be represented in the following format:
i.e. the user’s response, in other types. Imagine the following scenario:
We ask our user to enter the total volume of their purified protein, so that we can work out the
amount of assay they need to conduct a specific experiment. To calculate this assay volume
using the volume of the purified protein, we need to perform mathematical calculations based
on the response we receive from our user. It is not possible to perform mathematical operations
on non-numeric values. Therefore, we ought to somehow convert the type from str to a numeric
type.
The possibility of converting from one type to another depends entirely on the value, the
source type, and the target type. For instance, we can convert an instance of type str
(source type) to one of type int (target type) only if the source value represents a whole
number, e.g. '25'. A value that contains any other characters, e.g. '25 ml', cannot be converted.
Remember
To convert a variable from one type to another, we use the Type Name of the target type (as described in
table 2.2.2) and treat it as a function.
For instance, to convert a variable to integer, we:
• Look up the Type Name for integer from table 2.2.2
• Then use the type function, whose name is the same as the Type Name, e.g. int() for integers
• Use the function to convert our variable: new_var = int(old_var)
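The steps above can be sketched as follows. The variable names and values are hypothetical, and the try / except construct is used here only to show the failure without stopping the script; it is an assumption of this sketch rather than something introduced at this point in the course:

```python
# Hypothetical value, as it might be received from a user.
volume_text = '25'

volume_int = int(volume_text)      # str -> int
volume_float = float(volume_text)  # str -> float
volume_back = str(volume_int)      # int -> str

print(volume_int, type(volume_int))
print(volume_float, type(volume_float))
print(volume_back, type(volume_back))

# A string that does not represent a whole number cannot be converted to int.
try:
    int('25 ml')
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)
```

Whether a conversion succeeds therefore depends on the value itself, not only on the source and target types.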
3 print(value_a, type(value_a))
3 print(value_b, type(value_b))
3 print(value_a, type(value_a))
Remember
In programming, we routinely face errors, also referred to as exceptions, resulting from different
mistakes. The process of finding and correcting such mistakes in the code is referred to as debugging.
Do It Yourself 2.2.4
We have been given the following snippet written in Python 3:
In [1]: value_a = 3
        value_b = '2'

3 + 2 = 5
Figure 2.2.3 Type evaluation against four different responses to the input() function. An input() function always stores the response as a
str value, no matter what the user enters.
When we use input() to obtain a value from the user, the results are by default an
instance of type str. So for the code we explored in relation to figure 2.2.1, we could
assess the input type as follows:
In [10]: name = input('Please enter your name: ')

         print(name, type(name))
If we save this script in a file called script_d.py and run it, the output will be as displayed
in figure 2.2.3.
Remember
The input() function always returns a value of type str regardless of the user’s response. In other
words, if a user’s response to an input() request is numeric, Python will not automatically recognise
it as a numeric type.
We may use type conversion in conjunction with the values returned by the input()
function:
In [11]: response = input('Please enter a numeric value: ')

         response_numeric = float(response)

         print('response:', response)
         print('response type:', type(response))
         print('response_numeric:', response_numeric)
         print('response_numeric type:', type(response_numeric))
If we save this as script_e.py and run it, we will be directed to enter numeric values. The
first two attempts displayed in figure 2.2.4 demonstrate the results when we enter numeric
values as directed. If, however, we supply a non-numeric response as demonstrated in the
third attempt, a ValueError will be raised.
Figure 2.2.4
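One possible way to cope with a non-numeric response is sketched below. The helper to_number() is hypothetical, and the list of responses merely simulates what input() might return; try / except is an assumption of this sketch and is not part of the script discussed above:

```python
def to_number(response):
    """Convert a str response to float; return None if it is not numeric."""
    try:
        return float(response)
    except ValueError:
        return None

# Simulated responses stand in for what input() would return.
for response in ['12', '3.5', 'twelve']:
    print(response, '->', to_number(response))
```

The first two responses are converted to float, whilst the third one yields None instead of stopping the script with a ValueError.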
Do It Yourself 2.2.5
We know that each amino acid in a protein is encoded by a triplet of mRNA nucleotides.
With that in mind, alter the script you wrote for DIY 2.2.2 and use the number of amino acids entered by the user to
calculate the number of mRNA nucleotides.
Display the results in the following format:
where NUCLEOTIDES is the total number of mRNA nucleotides that you calculated.
Note: Multiplication is represented using the asterisk ( * ) sign.
When defining a variable, we should always consider where in our programme we intend
to use it. The more localised our variables, the better. This is because local variables are
easier to distinguish, and thus reduce the chance of making mistakes, e.g. unintentionally
redefining or altering the value of an existing variable.
To that end, the scope of a variable defines the ability to reference a variable from different
locations in our programmes. The concept of local variables becomes clearer once we
explore functions in programming.
As displayed in figure 2.2.5, the location at or from which a variable can be referenced
depends on the location where the variable is defined.
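A minimal sketch of this idea is given below. It relies on a function definition, which is only explored later in the course, and all names in it are hypothetical. A variable defined at the top level of a script can be referenced from inside a function, whereas a variable defined inside a function cannot be referenced from outside it:

```python
total_items = 2  # Defined at the top level: visible throughout the script.

def count_new_items():
    new_items = 3  # Local: only exists inside this function.
    return total_items + new_items

combined = count_new_items()
print(combined)

# new_items was defined inside the function, so it cannot be
# referenced from out here.
try:
    print(new_items)
    local_is_visible = True
except NameError:
    local_is_visible = False

print(local_is_visible)
```

Keeping new_items local means that no other part of the script can alter it by accident.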
Although scope and hierarchy appear at first glance as theoretical concepts in programming,
their implications are entirely practical. The definition of these principles tends to vary from
2.3 Operations
Through our experimentations with variable types, we already know that variables may be
subject to different operations.
When assessing type conversions, we also established that the operations we can apply to
each variable depend on the type of that variable. To that end, we learned that although
it is sometimes possible to mix variables from different types to perform an operation —
e.g. multiplying a floating point number with an integer, there are some logical restrictions
in place.
Throughout this section, we will take a closer look into different types of operations in
Python. This will allow us to gain a deeper insight into the concept and familiarise our-
selves with the underlying logic.
To recapitulate on what we have done so far, we start off by reviewing additions — the
most basic of all operations.
Given the variable total_items:
In [1]: total_items = 2

        print(total_items)
Out [1]: 2
In [2]: total_items = total_items + 1

        print(total_items)
Out [2]: 3
Given 2 different variables, each containing a different value, we can perform an operation
on these values and store the result in another variable without altering the original
variables in any way:
In [3]: old_items = 4
        new_items = 3

        total_items = old_items + new_items

        print(total_items)
Out [3]: 7
We can change the value of an existing variable using the value stored in another variable:
In [4]: new_items = 5
        total_items = total_items + new_items

        print(total_items)
Out [4]: 12
There is also a shorthand method for applying the operation on an existing variable:
In [5]: total_items = 2
        print(total_items)
Out [5]: 2
In [6]: total_items += 1

        print(total_items)
Out [6]: 3
In [7]: new_items = 5
        total_items += new_items

        print(total_items)
Out [7]: 8
Remember
There are 2 general categories of operations in programming, which are used frequently in all
programming languages:
• Mathematical operations
• Logical operations
Naturally, we use mathematical operations to perform calculations, and logical operations to perform
tests.
Remember
As far as mathematical operations are concerned, variables a and b may be an instance of any numeric
type. See table 2.2.2 to find out more about numeric types in Python.
Values of type int have been chosen in our examples to facilitate the understanding of the results.
Operation                 Expression      Result
Addition                  a + b           22
Subtraction               a - b           12
Multiplication            a * b           85
Division                  a / b           3.4
Floor quotient            a // b          3
Remainder                 a % b           2
Quotient and remainder    divmod(a, b)    (3, 2)
Power                     a ** b          1419857
Absolute value            abs(b - a)      12
(The results shown correspond to a = 17 and b = 5.)
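The operations in the table can be reproduced with the short sketch below. It assumes a = 17 and b = 5, which are the values consistent with the results listed; the names of the result variables are illustrative:

```python
a = 17
b = 5

addition = a + b            # 22
subtraction = a - b         # 12
multiplication = a * b      # 85
division = a / b            # 3.4 (always a float in Python 3)
floor_quotient = a // b     # 3
remainder = a % b           # 2
quotient_and_remainder = divmod(a, b)  # (3, 2)
power = a ** b              # 1419857
absolute_value = abs(b - a) # 12

print(addition, subtraction, multiplication, division)
print(floor_quotient, remainder, quotient_and_remainder)
print(power, absolute_value)
```

Note that the division operator / produces a float even when both operands are of type int, whereas // and % keep the result as int.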
Do It Yourself 2.3.1
1. Calculate the following and store the results in appropriately named variables:
a. $5.8 \times 3.3$
b. $\frac{180}{6}$
c. $35 - 3.0$
d. $35 - 3$
e. $2^{1000}$
Display the result of each calculation, including the type, in the following format:
Convert the result for $2^{1000}$ from int to str, then use the aforementioned function to work out the length of
the number, i.e. how many digits is it made of?
If you feel adventurous, you can try this for $2^{10000}$ or higher; but beware that you might overwhelm your computer and need
to restart it if you go too far (i.e. above $2^{1000000}$). Just make sure you save everything beforehand, so you don’t accidentally
step on your own foot.
Hint: We discuss len() in subsection 4.1.6.2. However, at this point, you should be able to use the official
2.3.1.1 Shorthands
Interesting Fact
As of Python 3.6, you can use underscores ( _ ) within large numbers as a separator to
make them easier to read in your code. For instance, instead of x = 1000000, you can
write x = 1_000_000.
When it comes to mathematical operations in Python, there is a frequently used shorthand
method that every Python programmer should be familiar with.
Suppose we have a variable defined as total_residues = 52 and want to perform a mathematical
operation on it. However, we would like to store the result of that operation in
total_residues instead of a new variable. In such cases, we can do as follows:
In [2]: total_residues = 52
In [3]: # Addition:
        total_residues += 8

        print(total_residues)
Out [3]: 60
In [4]: # Subtraction:
        total_residues -= 10

        print(total_residues)
Out [4]: 50
In [5]: # Multiplication:
        total_residues *= 2

        print(total_residues)
Out [5]: 100
In [6]: # Division:
        total_residues /= 4

        print(total_residues)
Out [6]: 25.0
In [7]: # Floor quotient:
        total_residues //= 2

        print(total_residues)
Out [7]: 12.0
In [8]: # Remainder:
        total_residues %= 5

        print(total_residues)
Out [8]: 2.0
In [9]: # Power:
        total_residues **= 3

        print(total_residues)
Out [9]: 8.0
5 total_residues += new_residues
6
7 print(total_residues)
Out [10]: 60
3 print(total_residues)
Out [11]: 84
Do It Yourself 2.3.2
1. Given:
• Circumference: C = 18.84956
• Radius: R = 3
and considering that the properties of a circle are defined as follows:
[Figure: a circle with the labels R (radius), D (diameter), and O]
calculate $\pi$ using the following equation and store it in a variable named pi:
$\pi = \frac{C}{D}$
Then round the results to 5 decimal places and display the result in the following format:
Note: To round floating point numbers in Python, we use round(). This is a built-in function that
takes 2 input arguments: the first is the variable/value to be rounded, and the second is the number
of decimal places. Read more about round() in the official documentation.
$\text{pi} = \frac{\text{pi}}{(3 \bmod 2) - 1}$
where the expression “3 mod 2” represents the remainder for the division of 3 by 2.
Remember
Always leave a single space between the operators (e.g. +, /, *) and the operands (e.g. 3, 4.5, 2.1j, or
any variable). It is good practice to do so, and makes your code more legible.
2.3.1.2 Precedence
$x = 2 + 3 \times 9$
Such an expression can only be evaluated correctly if we do the multiplication first and then
perform the addition. This means that the evaluation is done as follows:
given: $3 \times 9 = 27$
$\Rightarrow x = 2 + 27 = 29$
Remember
Operator precedence in mathematical operations may be described as follows:
1. Exponents and roots
2. Multiplication and division
3. Addition and subtraction
If there are any parentheses () in the expression, the expression is evaluated from the innermost
parentheses outwards.
$x = 2 \times [3 + (5 - 1)^2]$
$x = 2 \times (3 + 4^2)$
$= 2 \times (3 + 16)$
$= 2 \times 19$
$= 38$
The same principle applies in Python. This means that if we use Python to evaluate the
above expression, the result would be identical:
In [12]: result = 2 * (3 + (5 - 1) ** 2)

         print(result)
Out [12]: 38
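As a brief illustration of how parentheses change the order of evaluation, the sketch below evaluates the expressions discussed in this subsection in Python. The variable names are illustrative:

```python
# Multiplication takes precedence over addition: 3 * 9 first, then + 2.
without_parentheses = 2 + 3 * 9

# Parentheses are evaluated first: 2 + 3 first, then * 9.
with_parentheses = (2 + 3) * 9

# Innermost parentheses first, then the exponent, then the outer operations.
nested = 2 * (3 + (5 - 1) ** 2)

print(without_parentheses)
print(with_parentheses)
print(nested)
```

The three results are 29, 45, and 38 respectively, which shows that the same operands may produce different results depending on where the parentheses are placed.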
Do It Yourself 2.3.3
Display the result of each item in the following format:
EXPRESSION = RESULT
For example:
2 + 3 = 5
c. 5 + 4×3
2
3. Given
1 a = 2
2 b = 5
Remember
Where the precedence of operations in an expression is identical, the expression is evaluated from left
to right.
For instance, consider the following statement:
The highest precedence in the above statement belongs to the part written in parentheses. Therefore, the
expression is initially simplified to result = 3 / 0.5 * 3.
At this point, we are left with a division and a multiplication. We know that both of these operations enjoy
the same level of precedence. So at this point, the result of the expression is calculated by moving from
the leftmost operation to the rightmost one, i.e. result = (3 / 0.5) * 3.
In [14]: 1 print(result)
3 print(result)
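[Figure: order of precedence in the evaluation of result = 3 / 0.5 * (1 + 2 * 3) ** 3 - 1]
1st: 2 * 3 = 6
2nd: 1 + 6 = 7
3rd: 7 ** 3 = 343
4th: 3 / 0.5 = 6
5th: 6 * 343 = 2058
6th: 2058 - 1 = 2057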
3 print(result)
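The order of evaluation may be verified step by step, as sketched below. The sketch assumes that the expression in question is 3 / 0.5 * (1 + 2 * 3) ** 3 - 1; the step variables are hypothetical names introduced for illustration only:

```python
result = 3 / 0.5 * (1 + 2 * 3) ** 3 - 1

step_1 = 2 * 3            # Inside the parentheses: multiplication first.
step_2 = 1 + step_1       # Inside the parentheses: then addition.
step_3 = step_2 ** 3      # Exponent.
step_4 = 3 / 0.5          # Leftmost of the division and multiplication.
step_5 = step_4 * step_3  # Then the multiplication.
step_6 = step_5 - 1       # Subtraction comes last.

print(result)
print(step_6)

# Equal precedence is resolved from left to right.
left_to_right = 3 / 0.5 * 3
forced_order = 3 / (0.5 * 3)

print(left_to_right)
print(forced_order)
```

Because division in Python 3 always produces a float, the final result is displayed as 2057.0 rather than 2057. The last two lines show that 3 / 0.5 * 3 is evaluated as (3 / 0.5) * 3, and that forcing a different order with parentheses changes the result.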
5 forename = 'Jane'
6 surname = 'Doe'
7 birthday = '01/01/1990'
8
13 print(data)
Remember
New line character or '\n' is a universal directive to induce a line-break in Unix based operating systems
(Mac OS X® and Linux). In Microsoft® Windows®, line-breaks are conventionally written as '\r\n' instead. These are
known as Escape Sequences, which we explore in more depth under String Operations in chapter 7.
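A minimal sketch of how the new line character may be used when joining str values is given below. The variables forename, surname, and birthday are those defined earlier; the way in which data is assembled is an assumption made for illustration:

```python
forename = 'Jane'
surname = 'Doe'
birthday = '01/01/1990'

# The + operator joins (concatenates) str values;
# '\n' starts a new line at that position when the value is displayed.
data = forename + ' ' + surname + '\n' + birthday

print(data)
```

When displayed, the name appears on the first line and the date on the second, even though data is a single str value.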
Do It Yourself 2.3.4
The risk of Huntington’s disease appears to increase in proportion to the continuous repetition of CAG nucleotides
(glutamine codon) once they exceed 35 near the beginning of the Huntingtin (IT15) gene. The CAG repeats are also
referred to as a polyglutamine or polyQ tract.
Given:
glutamine_codon = 'CAG'
1. Create a polynucleotide chain representing 36 glutamine codons. Store the result in a variable called polyq_codons.
Display the result as:
2. Use len() to work out the length of polyq_codons, and store the result in a variable called polyq_codons_length.
3. Use len() to work out the length of glutamine_codon, and store the result in a variable called amino_acids_per_codon.
4. Divide polyq_codons_length by amino_acids_per_codon to prove that the chain contains the codon for exactly
36 amino acids. Store the result in a variable called polyq_peptide_length.
Display the result in the following format:
An operation may involve a comparison. The result of such operations is either True or
False. This is known as the Boolean or bool data type. In reality, however, computers
record True and False as 1 and 0 respectively.
Interesting Fact
The Boolean data type is named after the English mathematician and logician George
Boole (1815–1864).
Operations with Boolean results are referred to as logical operations. Testing the results of
such operations is known as truth value testing.
Given the two variables a and b as follows:
In [1]: a = 17
        b = 5
Operation                 Expression    Result
Equivalence               a == b        False
Non-equivalence           a != b        True
Greater                   a > b         True
Either greater or equal   a >= b        True
Smaller                   a < b         False
Either smaller or equal   a <= b        False
Between                   5 < a < 20    True
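The comparisons above may be reproduced as sketched below, using a = 17 and b = 5 as defined. The names of the result variables are illustrative. The last part of the sketch shows that True and False behave as 1 and 0 respectively:

```python
a = 17
b = 5

is_equal = a == b          # False
is_not_equal = a != b      # True
is_greater = a > b         # True
is_greater_or_equal = a >= b   # True
is_smaller = a < b         # False
is_smaller_or_equal = a <= b   # False
is_between = 5 < a < 20    # True

print(is_equal, is_not_equal, is_greater, is_greater_or_equal)
print(is_smaller, is_smaller_or_equal, is_between)
print(type(is_equal))

# True and False are recorded as 1 and 0.
print(True == 1, False == 0)
print(True + True)
```

Each comparison produces an instance of type bool, which may be stored in a variable like any other value.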
Do It Yourself 2.3.5
We know that in algebra, the first identity (square of a binomial) is:
$(a + b)^2 = a^2 + 2ab + b^2$
now given:
a = 15
b = 4
1. Calculate
• $y_1 = (a + b)^2$
• $y_2 = a^2 + 2ab + b^2$
Display the results in the following format:
y1 = XX
y2 = XX
2. Determine whether or not y_1 is indeed equal to y_2. Store the result of your test in another variable called
equivalence. Display the results in the following format:
2.3.2.1 Negation
We can also use negation in logical operations. Negation means that the reverse of the
statement is to be considered as True. So given:
In [2]: value = 2

        result = value == 2

        print(result)
3 print(negated_result)
As you can see from the code, negation in Python is implemented using not:
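A minimal sketch of negation is given below. It reuses the variables value and result defined above; the definition of negated_result is an assumption made for illustration:

```python
value = 2

result = value == 2          # The statement itself.
negated_result = not result  # The reverse of the statement.

print(result)
print(negated_result)
```

Since value is equal to 2, result is True, and its negation is therefore False.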
L���� S�������� R�����
Do It Yourself 2.3.6
Using the information from DIY 2.3.5:
1. Without using not, determine whether or not y_1 is not equal to y_2. Display the result of your test and store
Logical operations may be combined using conjunction with and and disjunction with or.
Do It Yourself 2.3.7
Given
In [1]: a = True
        b = False
        c = True
1. a == b
2. a == c
3. a or b
4. a and b
5. a or b and c
6. (a or b) and c
7. not a or (b and c)
8. not a or not(b and c)
9. not a and not(b and c)
10. not a and not(b or c)
1. [True/False]
2. [True/False]
...
It may help to break down more complex operations, or use parentheses to make them easier
to both read and write:
Notice that in the last example, all notations are essentially the same and only vary in terms
of their collective results as defined using parentheses. Always remember that in a logical
statement:
• A statement in parentheses defines an independent part of the operation whose result is
evaluated separately, before it is combined with the rest of the statement.
• Where there are no parentheses, not is evaluated first, followed by and, and finally or.
Operators with the same level of precedence are evaluated from left to right.
• In disjunctive statements, e.g. a > 5 or b > 5, if the first part is True, the second
part is not checked. In other words, if a is greater than 5, the computer does not
proceed to check whether or not b is greater than 5.
• In conjunctive statements, e.g. a > 5 and b > 5, the statement proceeds to the
second part if and only if the first part is True. In other words, the result of a
conjunctive statement is True if and only if both a and b are greater than 5. If the
first part is False, the entire statement will inevitably be False.
• The longer the statement, the more difficult it is to understand properly, and
by extension, the more likely it is to cause problems.
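The behaviour of disjunctive and conjunctive statements described above can be observed with the sketch below. The helper is_greater_than_5() and the list checked are hypothetical; they exist only to record which parts of each statement were actually tested:

```python
checked = []  # Records which tests were actually carried out.

def is_greater_than_5(label, value):
    """Note that the test ran, then return its result."""
    checked.append(label)
    return value > 5

a, b = 17, 2

# Disjunction: the first part is True, so the second part is skipped.
either = is_greater_than_5('a', a) or is_greater_than_5('b', b)
checked_for_or = list(checked)
print(either, checked_for_or)

checked.clear()

# Conjunction: the first part is True, so the second part must be checked too.
both = is_greater_than_5('a', a) and is_greater_than_5('b', b)
checked_for_and = list(checked)
print(both, checked_for_and)
```

In the disjunctive statement only a is tested, whereas in the conjunctive statement both a and b are tested before the result is known.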
In [4]: a, b, c = 17, 5, 2  # Alternative method to define variables.
These are only a few examples. There are endless possibilities, try them yourself and see
how they work.
Remember
Some logical operations may be written in different ways. However, we should always use the notation
that is most coherent in the context of our code. If in doubt, use the simplest / shortest notation.
To that end, you may want to use variables to split complex statements down to smaller
portions:
In [14]: 1 age_a, age_b = 15, 35
2
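One possible way of splitting a complex statement into smaller portions is sketched below. It reuses age_a and age_b as defined above; the tests themselves and the remaining variable names are hypothetical and serve only as an illustration:

```python
age_a, age_b = 15, 35

# Each portion of the statement is stored in a descriptively named variable.
a_is_teenager = 12 < age_a < 20
b_is_teenager = 12 < age_b < 20

# The portions are then combined in short, readable statements.
both_teenagers = a_is_teenager and b_is_teenager
either_teenager = a_is_teenager or b_is_teenager

print(both_teenagers)
print(either_teenager)
```

Naming each portion makes the intention of the final statement clear without the need for long, nested expressions.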
Do It Yourself 2.3.8
Given
In [1]: a = 3
        b = 13
• $a^2 < b$
• $3 - a^3 < b$
• $|25 - a^2| > b$
• $25 \bmod a^2 > b$
• $25 \bmod a^2 > b$ or $25 \bmod b < a$
• $25 \bmod a^2 < b$ and $25 \bmod b > a$
• $\frac{12}{a}$ and $a \times 4 < b$
where “|. . . |” represents the absolute value, and “n mod m” represents the remainder for the division of n by m.
1. [True/False]
2. [True/False]
...
$v = \frac{V_{max}[S]}{K_m + [S]}$