Practical Python For Beginners - Course Guide

This document provides an overview of a training course on practical Python for beginners with a focus on biochemists. The course aims to teach learners how to use Python for tasks like data analysis and visualization. It covers topics like installing Anaconda and Spyder, using basic data types, creating and manipulating lists, importing and using libraries, reading data files, and performing complex analyses in Jupyter notebooks.

Biochemical Society Online Training Course

Practical Python for beginners: a Biochemist's guide

Course overview - Practical Python for beginners: a Biochemist's guide
Please do not share this document. This course is © the Authors with an exclusive licence
to publish/reuse ‘Practical Python for beginners: a Biochemist's guide’ belonging to the
Biochemical Society. The course is only supplied for personal use. By registering, delegates
agree not to: copy the material for distribution, reuse it (for purposes other than personal
learning), or share it with third parties without permission from the Biochemical Society.

Learning objectives
After completion of the course, the successful learner will be able to:
• Explain the rationale for scripting
• Install Anaconda and navigate Spyder
• Use the IPython command line as a calculator and to assign variables
• Use the basic data types and some simple functions
• Create lists and select elements from them
• Use For Loops to perform operations iteratively
• Explain what a library is, import a Python library and use functions it contains
• Read biochemical data from a file into Python
• Understand computing concepts such as what is meant by the working directory, absolute
and relative paths and be able to apply these concepts to data import
• Analyse and visualise biochemical data using powerful Python packages such as NumPy,
Pandas, Sklearn and Matplotlib
• Run examples of more complex analyses in Jupyter notebooks

Definitions
Fill in the table of definitions below as you complete the course.

Command
Script
Comment
Syntax Highlighting
Operator
Operand
Assignment
Function
Integer
Float
String
Reproducibility
List
Indexing a list
Method
For loops

If
Else
Break
Continue
Elif
Library
Alias
Element-wise
NumPy
Array
Pandas
DataFrame
Working directory
Csv file

Modules
Module 0 - Welcome to the course

The analysis and reporting phase should be entirely reproducible - given the data generated, we
should be able to reproduce every aspect of its processing, analysis and reporting. Scripting, in which
each step is explicitly articulated, is the best way to achieve this.

The more reproducible your analysis, the faster you, and others, will be able to apply similar
analyses in the future and the more transparent and open your work; this leads to better science.

Notes:

Module 1 - Getting set up

By the end of this module, you will be able to:

• Install Anaconda and open a Python IDE called Spyder
• Describe the different parts of Spyder
• Customise the appearance of Spyder
• Understand how to use the Interactive Python console and a script

1.2 - Anaconda

Q. What is Anaconda?

Download and install Anaconda following the instructions on the course site. Start Anaconda
Navigator as you would start any other program, i.e. from the Start menu (Windows) or the Dock (Mac).

1.3 - Spyder

Start Anaconda Navigator and click on Spyder to launch it.

The Spyder window is divided into three panes, each of which has different tabs.
The top right pane has a Help tab, a File Explorer tab and a Variable Explorer tab. Click on the File
Explorer tab and navigate to the folder you want to work in.

1.5 - Typing commands in the console

The bottom right pane is the Interactive Python (IPython) console. This is where commands are
executed.
• In [#]: is the prompt; a command is typed after it.
• Pressing Enter sends the command and returns the result on a line starting Out[#]:
• For the rest of this course, we won't show In [#] or Out [#].

1.6 - Using a script

The pane on the left is the editor, which holds a script file.

• To run a selection of code, highlight what you want to run and press the "Run selection or
current line" button (or F9).
• Make sure you save your Python scripts with the file extension .py

Q. What are three advantages of using a script?


-
-
-

1.7 – Commenting

It is good practice to comment your code. You can write as much information in comments as you
like by adding a hash, #, to the start of the line.

Example:

# Using Python as a calculator
3 + 4
# I get an answer of 7

Q. Why comment your code?

Notes:

Module 2 - Basic data types and simple functions

By the end of this module, you will be able to:


• Describe the integer, float and string data types for Python variables
• Assign values to variables and explain why this is good practice
• Use functions to discover the type of a variable and print it

2.3 - IPython as a calculator and operators

Example:

In [ ]: 4 + 3
Out[ ]: 7

Here the ‘+’ is known as an operator and both ‘4’ and ‘3’ are known as operands.

Other operators include:


• - subtraction
• / division
• * multiplication
• ** exponentiation ("to the power of")
• % modulus (remainder after division)
• < is less than
• > is greater than
• == is equal to
• != is not equal to
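
As a quick illustration of the operators listed above, the sketch below tries each one in turn. The numbers 7 and 2 and the variable names are arbitrary choices for this example, not values from the course.

```python
# Two arbitrary example numbers (assumed for illustration only)
a = 7
b = 2

difference = a - b      # subtraction
quotient = a / b        # division always gives a float
product = a * b         # multiplication
power = a ** b          # exponentiation, "7 to the power of 2"
remainder = a % b       # modulus, the remainder after division

# Comparison operators give True or False
is_less = a < b
is_greater = a > b
is_equal = a == b
is_not_equal = a != b

print(difference, quotient, product, power, remainder)
print(is_less, is_greater, is_equal, is_not_equal)
```

Note that the comparison operators return the values True or False rather than a number.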

2.4 – Assignment

The = is the assignment operator. To assign the numbers 4 and 3 to variables we can use:

num1 = 4
num2 = 3

2.5 - Your first function

A function has a name followed by brackets. Inside the brackets go arguments which tell the
function what to act on and often how to act on it.
For example, the function type() returns the data type of an object:

type(num1)
Out[ ]: int

int is short for integer.

If you use a decimal point, even for a whole number, the type will be a float:

num3 = 4.0
type(num3)
Out[ ]: float

2.6 - Basic data types

Integers and floats are two different kinds of numerical data. Words are of the data type ‘string’.

We can add strings together:

word1 = 'hello'
word2 = 'Emma'
word1 + word2
Out[ ]: 'helloEmma'

You would need to add a space explicitly:

word1 + ' ' + word2
Out[ ]: 'hello Emma'

In the console, typing word1 on its own displays its value, but in other situations, for example
inside a For loop, we must tell Python to print it using the print() function:

print(word1)
Out[ ]: hello

2.7 - Why we assign to variables

Q. Why do we use variable assignment?

Example:

pi = 3.14
radius = 5

Using the variables to calculate the circumference:

2 * pi * radius

And the area:

pi * radius**2
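
One possible illustration of why assignment helps: when a value changes, only the line that assigns it needs editing. The sketch below reuses pi and radius from the example above; the names circumference and area and the second radius of 10 are assumptions made for this example.

```python
# Values are assigned once and then reused by name
pi = 3.14
radius = 5

circumference = 2 * pi * radius
area = pi * radius**2
print(circumference, area)

# Changing the radius in one place changes every result calculated from it
radius = 10
circumference = 2 * pi * radius
area = pi * radius**2
print(circumference, area)
```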

Notes:

Module 3 - Lists and selection, more functions and some methods

By the end of this module, you will be able to:


• Create lists and be able to index them for selection
• Use functions and methods to work with lists
• Appreciate the difference between functions and methods in Python

3.3 - Data structures: list

A list is denoted by using square brackets with a comma between each element.
To make a list called values:

values = [2, 5, 3, 7]

We can use the function type() on a list to reveal its data structure:

type(values)
Out[ ]: list

Some other useful functions for finding the number of elements of a list and the biggest element:

len(values)
Out[ ]: 4
max(values)
Out[ ]: 7

3.5 - List indexing

One important fact to remember is that the index starts at 0.

index:    0  1  2  3
values = [2, 5, 3, 7]

We denote the index of an element with square brackets. Thus, the first element of values is
extracted with:

values[0]
Out[ ]: 2

And the third element by:

values[2]
Out[ ]: 3
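
Because counting starts at 0, the last valid index is always one less than the length of the list. The following sketch, which uses the same values list, shows this and what happens when an index that does not exist is requested. The variable names first, third and last are chosen only for this illustration.

```python
values = [2, 5, 3, 7]

first = values[0]                 # index 0 is the first element
third = values[2]                 # index 2 is the third element
last = values[len(values) - 1]    # the last valid index is the length minus one

print(first, third, last)

# Asking for an index that does not exist raises an IndexError
try:
    values[4]
except IndexError as error:
    print('IndexError:', error)
```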

3.7 - Lists: data types

Lists can contain other data types, such as strings (str):

names = ['maria', 'isaac', 'sam', 'jamie']
type(names)
Out[ ]: list

type(names) tells us what type of object names is, whereas

type(names[0])
Out[ ]: str

tells us the type of the first element of the names list.

3.9 - Methods

To use an object’s methods, you use the dot notation.

Here are some examples of methods that do require an argument.

Lists have a count() method, used like this:

names.count('maria')
Out[ ]: 1

We can also get the index of 'maria' using the index() method:

names.index('maria')
Out[ ]: 0

An example of a method that doesn’t require an argument and that changes a list object as well as
returning a value is the pop method.

names.pop()
Out[ ]: 'jamie'

pop() returns the last element of names, but it also removes that element from names!

print(names)
Out[ ]: ['maria', 'isaac', 'sam']
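
To make the contrast between functions and methods concrete, here is a small sketch. A function is called with the object inside the brackets, whereas a method is attached to the object with a dot. The extra name 'alex' and the variable names are invented for this example; sorted() and append() are standard Python but may not be covered in the course videos.

```python
names = ['maria', 'isaac', 'sam', 'jamie']

# A function takes the object as an argument: function(object)
number_of_names = len(names)
ordered_copy = sorted(names)     # returns a new list, names itself is unchanged

# A method belongs to the object and uses dot notation: object.method()
names.append('alex')             # changes names in place ('alex' is a made-up extra name)
position = names.index('sam')

print(number_of_names, ordered_copy)
print(names, position)
```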

Notes:

Module 4 - For Loops and other control statements

By the end of this module, you will be able to:

• Create For loops to iterate through different data structures
• Use If, Elif and Else statements to change how data is treated
• Understand the difference between Continue and Break statements and when to use them

4.2 - Making your code loopy

For Loops take the general structure:

for element in sequence:
    do something to the current element

Q. What is a For loop? And what data structures do they work with?

Examples:

for i in 'Spam':
    print(i)
Out[ ]: S
Out[ ]: p
Out[ ]: a
Out[ ]: m

for letter in 'Spam':
    print(letter)
Out[ ]: S
Out[ ]: p
Out[ ]: a
Out[ ]: m

With For Loops you can also repeat an action a specified number of times.

for i in range(3):
    print('Hello')
Out[ ]: Hello
Out[ ]: Hello
Out[ ]: Hello
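
As a sketch of how a For loop might be applied to laboratory data, the example below loops over a list of numbers in two ways: directly over the elements, and over their indices using range(). The concentrations and variable names are hypothetical, invented purely for illustration.

```python
# Hypothetical concentrations in millimolar (mM), invented for this example
concentrations_mM = [0.5, 1.0, 2.5]

# Loop over the elements and convert each value to micromolar (uM)
for conc in concentrations_mM:
    print(conc * 1000, 'uM')

# range(len(...)) gives the indices 0, 1, 2 so elements can be reached by position
total = 0
for i in range(len(concentrations_mM)):
    total = total + concentrations_mM[i]
print('Total:', total, 'mM')
```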

4.5 - Combining loops with methods

For Loops are also powerful in their ability to access data from inside data structures in an iterative
fashion, which you then could manipulate, transform, store, or do anything else you can think of.

present = ['kick', 'lick', 'chuck']
past = []
for verb in present:
    past.append(verb + 'ed')
print(past)
Out[ ]: ['kicked', 'licked', 'chucked']

4.8 - If Statements

If statements specify that the code should only be run if a condition is satisfied. They take the
general form:

if condition:
    code

Example:

present = ['kick', 'lick', 'chuck', 'tie']
past = []
for verb in present:
    if 'ck' in verb:
        past.append(verb + 'ed')
print(past)
Out[ ]: ['kicked', 'licked', 'chucked']

4.9 - If and Else Statements

Else specifies what code to run if the condition of the If statement is not satisfied.

present = ['kick', 'lick', 'chuck', 'tie']
past = []
for verb in present:
    if 'ck' in verb:
        past.append(verb + 'ed')
    else:
        past.append(verb + 'd')
print(past)
Out[ ]: ['kicked', 'licked', 'chucked', 'tied']

4.11 - Other control statements: Skip or Stop?

Break will end a loop entirely, whereas continue will end the current iteration of the loop and move
on to the next.

Examples:

for number in range(4):
    if number == 1:
        break
    print(number)
Out[ ]: 0

for number in range(4):
    if number == 1:
        continue
    print(number)
Out[ ]: 0
Out[ ]: 2
Out[ ]: 3
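
A sketch of how skipping and stopping might be combined when cleaning data is given below. It assumes a made-up convention in which a reading of 0 means a missing value and a negative reading marks the end of the run; neither convention nor the numbers come from the course.

```python
# Hypothetical plate readings: 0 marks a missing value, a negative number marks the end
readings = [0.12, 0, 0.30, -1, 0.45]

kept = []
for reading in readings:
    if reading == 0:
        continue          # skip the missing value and move on to the next reading
    if reading < 0:
        break             # stop the loop entirely, later readings are ignored
    kept.append(reading)

print(kept)
```

Here continue discards only the single missing value, while break discards everything after the end marker, including the final 0.45.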

4.12 - It’s Elif all the way down

Elif (else if) statements can be stacked multiple times, but they need to come after an if statement.

for number in range(5):
    if number == 0:
        print('Zero')
    elif number % 2 == 1:
        print('Odd')
    else:
        print('Even')
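
The same If, Elif, Else pattern can be used to sort measurements into categories. The following is a minimal sketch with hypothetical pH values; the list, the labels and the variable names are assumptions made for illustration.

```python
# Hypothetical pH measurements, invented for this example
ph_values = [2.5, 7.0, 9.1]

labels = []
for ph in ph_values:
    if ph < 7:
        labels.append('acidic')
    elif ph == 7:
        labels.append('neutral')
    else:
        labels.append('basic')

print(labels)
```

Only the first condition that is satisfied runs, so the order of the If and Elif statements matters.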

Notes:

Module 5 – NumPy

By the end of this module, you will be able to:


• Understand what is meant by the term library
• Import the NumPy library into Python
• Use some functions of the NumPy library
• Use some NumPy functions to handle simple data arrays

5.3 - Why use libraries?

Q. Why do we use libraries?

5.4 - Common libraries

NumPy
It is the fundamental package for scientific computing in Python. It enables the use and manipulation
of data arrays (really important for data analysis), basic algebra and statistics.

Pandas
Pandas is a specialist data analysis library which enables you to import data from .csv files easily and
put them into a DataFrame.

Seaborn
This is a data visualisation library which enables you to draw some spectacularly pretty graphs within
Python.

5.7 - Importing NumPy

In order to use an object defined within the NumPy library, we will have to import it. We can do this
by typing import <name of library>.

Example:

import numpy
a = numpy.array([1,2,3])

5.9 - Aliases for library names

Example:

import numpy as np
b = np.array([1,2,3])

5.11 - NumPy array methods

Making a 2-dimensional array


Example:

a = np.array([(1,2,3), (4,5,6)])
print(a)

Adding arrays together (element-wise):

c = a + b

Other operations:

cc = c * 10
np.log(c)
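
To show what element-wise means in practice, the sketch below repeats the addition and multiplication from this section and contrasts them with a plain Python list. The arrays a and b are those defined earlier in the module; the name joined is an assumption for this example.

```python
import numpy as np

a = np.array([(1, 2, 3), (4, 5, 6)])   # 2 rows, 3 columns
b = np.array([1, 2, 3])                # a single row of 3 values

# b is added to each row of a, element by element
c = a + b
print(c)

# Multiplying by a single number scales every element
cc = c * 10
print(cc)

# A plain Python list behaves differently: + joins the lists instead of adding values
joined = [1, 2, 3] + [1, 2, 3]
print(joined)
```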

5.15 - Multidimensional NumPy arrays

Example:
z = np.array([(1,2,3), (4,5,6), (7,8,9), (10,11,12)])

5.16 - NumPy array methods: creating patterns and empty arrays

Creating an array with a pattern of numbers, example:

x = np.arange(0,8,1)

5.18 - Creating a placeholder array

Example:
x = np.zeros((3,3))
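
A short sketch of how these two functions behave, and of one way a placeholder array might be filled in afterwards. The name placeholder and the values written into it are assumptions for this example.

```python
import numpy as np

# arange(start, stop, step) counts from start up to, but not including, stop
x = np.arange(0, 8, 1)
print(x)

# zeros takes the shape as (rows, columns) and fills the array with 0.0
placeholder = np.zeros((3, 3))
print(placeholder.shape)

# A placeholder can be filled in later, for example one row at a time
placeholder[0] = [1, 2, 3]
print(placeholder)
```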

Notes:

Module 6 - Data import using Pandas


By the end of this module, you will be able to:
• Import the Pandas library
• Explain the difference between a Pandas DataFrame and Series
• Use os to find and set a working directory
• Use Pandas to import data from a csv file into a Pandas DataFrame
• Use pandas to select specific parts of the DataFrame using loc and iloc

6.2 - Introduction to Pandas

A Pandas DataFrame object can be made using python code, or by importing data from txt, csv and
excel files. Pandas can also create objects called Series.

6.3 - Making a DataFrame using Pandas

Example:

import pandas as pd
List1 = [('Maria', 98, 70, 11), ('Isaac', 20, 87, 34),
         ('Sam', 93, 60, 100), ('Jamie', 100, 68, 0)]
df = pd.DataFrame(data = List1)
print(df)

6.5 - DataFrame indices

df is a common name for a variable that holds a DataFrame. The extra column on the left is the index.
Each row has a unique index value, which is used to access the data in that row. By default the index
is a row number, but you can also set one of the columns as the index.

Example:

df = pd.DataFrame(data = List1, columns =['student',
                  'maths', 'chemistry', 'biology'])
df = df.set_index('student')
print(df)

6.7 - Working Directories

Before importing any files, use the os library to find the working directory.

Import the os library:

import os

And use one of its functions to display (get) the current working directory:

os.getcwd()

6.8 - Creating a new directory

Making a folder/directory:

os.mkdir('python_work')

Change directory:

os.chdir('python_work')
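
The sketch below puts these os functions together and shows the difference between a relative and an absolute path. So that it leaves nothing behind, it works inside a temporary folder created with the standard tempfile module, which is an assumption of this example and not part of the course. The names start_dir, scratch, absolute_path and moved are also invented here.

```python
import os
import tempfile

# Remember where we started so we can return afterwards
start_dir = os.getcwd()

# Work inside a temporary folder so the example leaves nothing behind
with tempfile.TemporaryDirectory() as scratch:
    os.chdir(scratch)

    # A relative path is interpreted from the current working directory
    os.mkdir('python_work')

    # The same folder written as an absolute path, starting from the top of the file system
    absolute_path = os.path.abspath('python_work')
    print(absolute_path)

    # After changing directory, the new folder is the working directory
    os.chdir('python_work')
    moved = os.path.samefile(os.getcwd(), absolute_path)
    print(moved)

    # Step back out before the temporary folder is deleted
    os.chdir(start_dir)
```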

6.9 - Pandas can import different file formats

A common format is Comma Separated Values (csv).

To import from a .csv file we would use:

df = pd.read_csv('filename.csv')

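We can also import from text or Excel files. read_csv expects comma-separated values by default; for
tab-separated text, add the argument sep='\t'. Examples: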

df = pd.read_csv('filename.txt')
df = pd.read_excel('filename.xlsx', sheet_name='Sheet1')
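
To try read_csv without needing a file on disk, the following sketch feeds it text held in memory using io.StringIO from the standard library. This stand-in, the column names and the values are all assumptions for illustration; with a real file you would pass the file name instead.

```python
import io
import pandas as pd

# io.StringIO stands in for a file on disk so the example is self-contained.
# The column names and values are invented for illustration.
csv_text = """sample,absorbance
A,0.12
B,0.30
C,0.45
"""

df = pd.read_csv(io.StringIO(csv_text))
print(df)

# Tab-separated text needs the sep argument
tsv_text = "sample\tabsorbance\nA\t0.12\nB\t0.30\n"
df_tab = pd.read_csv(io.StringIO(tsv_text), sep='\t')
print(df_tab.shape)
```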

6.13 - Viewing a DataFrame

For bigger DataFrames, Pandas shortens the printed output, so you will need to tell it explicitly to show all of the data (here df2 is the DataFrame being viewed).

pd.set_option('display.max_rows', df2.shape[0])
pd.set_option('display.max_columns', df2.shape[1])
print(df2)

6.14 - Using loc and iloc

You can print specific rows and/or columns of the DataFrame using loc and iloc (location and
integer-location).

Example:
print(df.loc['Maria':'Isaac','maths'])
print(df.iloc[0:2,0])
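
The two selections above can be compared side by side in the sketch below, which rebuilds the student DataFrame from section 6.5 so that it runs on its own. The names by_label and by_position are chosen for this example.

```python
import pandas as pd

List1 = [('Maria', 98, 70, 11), ('Isaac', 20, 87, 34),
         ('Sam', 93, 60, 100), ('Jamie', 100, 68, 0)]
df = pd.DataFrame(data=List1, columns=['student', 'maths', 'chemistry', 'biology'])
df = df.set_index('student')

# loc selects by label, and a label slice includes both end points
by_label = df.loc['Maria':'Isaac', 'maths']

# iloc selects by integer position, and the slice stops before position 2
by_position = df.iloc[0:2, 0]

print(by_label)
print(by_position)

# A single cell can be picked out either way
print(df.loc['Sam', 'biology'], df.iloc[2, 2])
```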

Q. What is the difference between iloc and loc?

Notes:

Module 7 - Using Jupyter notebooks


By the end of this module, you will be able to:
• Launch a Jupyter notebook
• Add code to a notebook using cells and run that code
• Add cells for text to a notebook

7.2 - Opening Jupyter notebooks

1. Launch Jupyter Notebook from Anaconda Navigator.
2. You will find a browser window opens with a web-based file explorer. The files seen are
specific to you as the notebook is running on your machine (not the internet).
3. Start a new notebook using the New button and choosing Python 3. This opens a new,
currently untitled, notebook with a single code cell.

7.3 – 7.6 Notebooks have cells

• A notebook has one code ‘cell’ at first. You can type Python code and comments into it.
• Run the code by pressing the Run button, which is on the toolbar at the top.
• The output will appear underneath along with an additional code cell.
• A notebook can have as many cells as you like to lay out your work. Cells can be code or text.
You can use a text cell to write about your work.
• To turn a cell from code to text (or vice versa), we use the dropdown menu on the right of
the toolbar and choose the 'markdown' option.
• If you want to alter one of the cells, click on it, edit as you require and run it again.

Notes:

Module 8 - Summarising, analysing and visualising biochemical data


By the end of this module, you will be able to:
• Subset a Pandas DataFrame
• Use the seaborn package to create scatterplots and regression plots
• Use the scikit-learn package to carry out a linear regression
• Access regression model estimates for use

Notes:

Module 9 – Case study, Exploring Metagenomics Data

By the end of this module, you will be able to:


• Use your understanding of data types and control statements to follow a complex example
• Run complex code in a Jupyter notebook
• Appreciate how to compare genes in a metagenomic dataset to those in specific pathways
using Python

Notes:
