Python Basics: Recycling Nicole's Slides From Year 2016
What is Python?
• Python is a widely used programming language
• First implemented in 1989 by Guido van Rossum
• Free, open-source software with community-based
development
• Trivia: Python is named after the BBC show “Monty Python’s Flying Circus” and has nothing to do with reptiles
• Van Rossum is known as a "Benevolent Dictator For Life" (BDFL)
Which Python?
• There are 2 widely used versions of Python: Python 2.7 and Python 3.x
• We’ll use Python 3
• Many help forums still refer to Python 2, so make sure you’re aware which version is being referenced
Interacting with Python
There are 2 main ways of interacting with Python:
Variables (cont.)
• To save a variable, use =
>>> x = 2
• x is the name of the variable
• 2 is the value of the variable
Collections of things
• Why is this concept useful?
• We often have collections of things, e.g.,
• A list of genes in a pathway
• A list of gene fusions in a cancer cell line
• A list of probe IDs on a microarray and their intensity value
• We could store each item in a collection in a separate variable, e.g.,
gene1 = 'SUCLA2'
gene2 = 'SDHD'
...
• A better strategy is to put all of the items in one container
• Python has several types of containers
• List (similar to arrays)
• Set
• Dictionary
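A minimal sketch of the three container types named above. The variable names and the probe intensity values are made up for illustration; only the two gene names come from the slide.

```python
# A list keeps items in a specified order and allows duplicates
pathway_genes = ['SUCLA2', 'SDHD', 'SDHD']

# A set keeps only the unique items and has no guaranteed order
unique_genes = set(pathway_genes)

# A dictionary maps keys to values (here: probe ID -> intensity value)
probe_intensity = {'probe_1': 7.2, 'probe_2': 3.9}

print(len(pathway_genes))          # 3
print(len(unique_genes))           # 2
print(probe_intensity['probe_1'])  # 7.2
```

Putting everything in one container means a single variable name can stand for the whole collection, instead of gene1, gene2, and so on.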
Lists: what are they?
• Lists hold a collection of things in a specified order
• The things do not have to be the same type
• Many methods can be used to manipulate lists.
• Index a list: <listname>[<position>] returns the item at that position, e.g., 'SDHD'
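A short sketch of list indexing and a couple of common list operations. The list contents are illustrative. Note that positions start at 0, so position 1 is the second item.

```python
genes = ['SUCLA2', 'SDHD']

first_gene = genes[0]    # position 0 is the first item
second_gene = genes[1]   # position 1 is the second item
print(second_gene)       # SDHD

# Lists can hold things of different types
mixed = ['SDHD', 11, 2.5]

# Methods such as append() change the list in place
genes.append('SDHD')
print(len(genes))        # 3
```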
Lists: where can I learn more?
• Python.org tutorial:
https://docs.python.org/3.4/tutorial/datastructures.html#more-on-lists
• Python.org documentation:
https://docs.python.org/3.4/library/stdtypes.html#list
Doing stuff to variables
• There are 3 common tools for manipulating variables
• Operators
• Functions
• Methods
Operators
• Operators are a special type of function:
• Operators are symbols that perform some mathematical or logical operation
• Basic mathematical operators:
Operator   Description      Example
+          Addition         >>> 2 + 3
                            5
-          Subtraction      >>> 2 - 3
                            -1
*          Multiplication   >>> 2 * 3
                            6
/          Division         >>> 2 / 3
                            0.6666666666666666
Operators (cont.)
You can also use operators on strings!
Operator   Description                      Example
+          Combine strings together         >>> 'Bio' + '5488'
                                            'Bio5488'
*          Repeat a string multiple times   >>> 'Marsha' * 3
                                            'MarshaMarshaMarsha'
Is it a bird? Is it a plane? No, it’s a string!
Strings and ints cannot be combined:
>>> 'Bio' + 5488
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't convert 'int' object to str implicitly
Relational operators
• Relational operators compare 2 things
• Return a boolean
• == is used to test for equality; = is used to assign a value to a variable
Operator   Description                Example
<          Less than                  >>> 2 < 3
                                      True
<=         Less than or equal to      >>> 2 <= 3
                                      True
>          Greater than               >>> 2 > 3
                                      False
>=         Greater than or equal to   >>> 2 >= 3
                                      False
==         Equal to                   >>> 2 == 3
                                      False
!=         Not equal to               >>> 2 != 3
                                      True
Logical operators
• Perform a logical function on 2 things
• Return a boolean
Operator   Description                                Example
and        Return True if both arguments are true     >>> True and True
                                                      True
                                                      >>> True and False
                                                      False
or         Return True if either argument is true     >>> True or False
                                                      True
                                                      >>> False or False
                                                      False
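Relational and logical operators are often combined. The sketch below assumes a made-up expression value and made-up thresholds, purely for illustration.

```python
expression = 7.5  # hypothetical gene expression value

# Both comparisons must be true for "and" to return True
in_range = expression > 1 and expression < 100

# Only one comparison needs to be true for "or" to return True
out_of_range = expression < 1 or expression > 100

print(in_range)      # True
print(out_of_range)  # False
```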
Functions: what are they?
• Why are functions useful?
• Allow you to reuse the same code
• Programmers are lazy!
• A block of reusable code used to perform a specific task
Take in arguments (optional) → Do something → Return something (optional)
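The flow of taking in arguments, doing something, and returning something can be sketched with a small user-defined function. The function name gc_content and the example sequence are assumptions made for illustration, not part of the course material.

```python
def gc_content(sequence):
    """Return the fraction of bases in sequence that are G or C."""
    if len(sequence) == 0:
        return 0.0  # avoid dividing by zero for an empty sequence
    gc_count = sequence.count('G') + sequence.count('C')
    return gc_count / len(sequence)

# The same block of code can now be reused for any sequence
print(gc_content('ACGT'))  # 0.5
print(gc_content('GGCC'))  # 1.0
```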
Python functions: where can I learn more?
• Python.org tutorial
• User-defined functions:
https://docs.python.org/3/tutorial/controlflow.html#defining-functions
• Python.org documentation
• Built-in functions: https://docs.python.org/3/library/functions.html
Methods: what are they?
• First a preamble...
• Methods are a close cousin of functions
• For this class we’ll treat them as basically the
same
• The syntax for calling a method is different
than for a function
• If you want to learn about the differences,
google object oriented programming (OOP)
• Why are methods useful?
• Allow you to reuse the same code
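The difference in calling syntax can be seen side by side in this small sketch, which uses the built-in function len() and the string method upper():

```python
x = "Genomics"

# Function call: <function>(<argument>)
length = len(x)

# Method call: <variable>.<method>()
shouted = x.upper()

print(length)   # 8
print(shouted)  # GENOMICS
```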
String methods
Syntax                  Description                                            Example
<str>.upper()           Returns the string with all letters uppercased         >>> x = "Genomics"
                                                                               >>> x.upper()
                                                                               'GENOMICS'
<str>.lower()           Returns the string with all letters lowercased         >>> x.lower()
                                                                               'genomics'
<str>.find(<pattern>)   Returns the first index of <pattern> in the string;    >>> x.find('nom')
                        returns -1 if <pattern> is not found                   2
Conditional statement syntax
If
    if <condition>:
        # Do something
    Example output: x is positive

If/else
    if <condition>:
        # Do something
    else:
        # Do something else
    Example output: x is NOT positive

If/else if/else
    if <condition1>:
        # Do something
    elif <condition2>:
        # Do something else
    else:
        # Do something else
    Example output: x is negative

Indentation matters!!!
• Indent the lines of code that belong to the same code block
• Use 1 tab
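A runnable sketch of the if/elif/else pattern. The function name describe_sign and the test values are hypothetical; the original example code from the slide is not reproduced here.

```python
def describe_sign(x):
    """Return a short description of whether x is positive, negative, or zero."""
    if x > 0:
        return 'x is positive'
    elif x < 0:
        return 'x is negative'
    else:
        return 'x is zero'

print(describe_sign(5))   # x is positive
print(describe_sign(-2))  # x is negative
print(describe_sign(0))   # x is zero
```

Only the first branch whose condition is true runs; the else branch runs when none of the conditions are true.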
Commenting your code
• Why is this concept useful?
• Makes it easier for you, your future self, TAs :), and anyone unfamiliar with your code to understand what your script is doing
• Comments are human readable text. They are ignored by Python.
• Add comments for
The how                        The why
• What the script does         • Biological relevance
• How to run the script        • Rationale for design and methods
• What a function does         • Alternatives
• What a block of code does
Commenting your code (cont.)
• Commenting is extremely important!
• Points will be deducted if you do not comment your code
Comment syntax
Block comment
    # <your_comment>
    # <your_comment>
In-line comment
    <code>  # <your_comment>
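A small illustration of both comment styles. The sequence and variable names are made up for the example.

```python
# Count the G and C bases in a DNA sequence.
# GC content is a common summary statistic for a sequence.
sequence = 'ACGTTGAT'

gc_count = sequence.count('G') + sequence.count('C')  # G and C counted separately, then added

print(gc_count)  # 3
```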
Python modules
• A module is a file containing Python definitions and statements for a
particular purpose, e.g.,
• Generating random numbers
• Plotting
• Modules must be imported before they are used, by convention at the beginning of the script
• This loads the variables and functions from the module into your script, e.g.,
import sys
import random
• To access a module’s features, type <module>.<feature>, e.g.,
sys.exit()
Random module
• Contains functions for generating random numbers for various
distributions
• TIP: will be useful for assignment 1
Function        Description
random.choice   Return a random element from a list
https://docs.python.org/3.4/library/random.html
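A minimal sketch of importing the random module and calling random.choice. The list of nucleotides is an assumption chosen for illustration. The call to random.seed is optional and only makes the result repeatable between runs.

```python
import random

random.seed(0)  # optional: makes the "random" choice repeatable

nucleotides = ['A', 'C', 'G', 'T']

# Access a module's feature with <module>.<feature>
base = random.choice(nucleotides)
print(base)
```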
How to repeat yourself
(for loops)
• Why is this useful?
• Often, you want to do the same thing over
and over again
• Calculate the length of each chromosome in a
genome
• Look up the gene expression value for every gene
• Align each RNA-seq read to the genome
• A for loop takes out the monotony of doing
something a bazillion times by executing a
block of code over and over for you
• Remember, programmers are lazy!
• A for loop iterates over a collection of things
• Elements in a list
• A range of integers
• Keys in a dictionary
For loop syntax
    for <counter> in <collection_of_things>:
        # Do something
• The <counter> variable is the value of the current item in the collection of things
• You can ignore it (example output: Hello! printed 10 times)
• You can use its value in the loop (example output: the integers 0 through 9, one per line)
• All code in the for loop’s code block is executed at each iteration
• TIP: If you find yourself repeating something over and over, you can probably convert your code to a for loop!
Indentation matters!!!
• Indent the lines of code that belong to the same code block
• Use 1 tab
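A sketch of for loops that would produce the two outputs described above, plus a loop over a list. The use of range(10) and the gene list are assumptions, since the original example code is not reproduced here.

```python
# Ignore the counter: print the same thing 10 times
for i in range(10):
    print('Hello!')

# Use the counter: print the integers 0 through 9
for i in range(10):
    print(i)

# Iterate over the elements in a list
genes = ['SUCLA2', 'SDHD']
name_lengths = []
for gene in genes:
    name_lengths.append(len(gene))  # runs once for each gene

print(name_lengths)  # [6, 4]
```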
How to repeat yourself (cont.)
• For loops have a close cousin called while loops
• The major difference between the 2
• For loops repeat a block of code a predetermined number of times (really, once for each item in a collection of things)
• While loops repeat a block of code as long as an expression is true
• e.g., while it’s snowing, repeat this block of code
• While loops can turn into infinite while loops → the expression is never false so the loop never exits. Be careful!
• See http://learnpythonthehardway.org/book/ex33.html for a tutorial on while loops
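A minimal while loop sketch with made-up numbers. The loop keeps running as long as the expression total < 10 is true. Because total grows on every pass, the expression eventually becomes false and the loop exits.

```python
count = 0
total = 0

while total < 10:
    count += 1      # count goes 1, 2, 3, 4
    total += count  # total goes 1, 3, 6, 10

print(count, total)  # 4 10
```

If the line that changes total were left out, the expression would never become false and the loop would run forever.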
Command-line arguments
• Why are they useful?
• Passing command-line arguments to a Python script allows a script to be
customized
• Example
• make_nuc.py can create a random sequence of any length
• If the length wasn’t a command-line argument, the length would be hard-coded
• To make a 10bp sequence, we would have to 1) edit the script, 2) save the script, and 3)
run the script.
• To make a 100bp sequence, we’d have to 1) edit the script, 2) save the script, and 3) run
the script.
• This is tedious & error-prone
• Remember: be a lazy programmer!
Command-line arguments
• Python stores the command-line arguments as a list called sys.argv
• sys.argv[0] # script name
• sys.argv[1] # 1st command-line argument
• …
• IMPORTANT: arguments are passed as strings!
• If the argument should not be a string, convert it, e.g., with int() or float()
• sys.argv is a list of variables
• The values of the variables, e.g., the A frequency, are not “plugged in” until the script is run
• Use a variable such as A_freq to stand for the A frequency that was passed as a command-line argument
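A sketch of reading and converting command-line arguments. The argument order (length first, then the A frequency) and the function name parse_arguments are assumptions made for illustration; the real make_nuc.py may take different arguments. A hand-written list stands in for sys.argv so the example runs without a command line.

```python
import sys

def parse_arguments(argv):
    """Convert an argv-style list of strings into (length, A_freq)."""
    length = int(argv[1])    # 1st command-line argument, converted to an integer
    A_freq = float(argv[2])  # 2nd command-line argument, converted to a float
    return length, A_freq

# In a real script this would be: parse_arguments(sys.argv)
example_argv = ['make_nuc.py', '10', '0.25']
length, A_freq = parse_arguments(example_argv)

print(length, A_freq)  # 10 0.25
```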
Reading (and writing) to files in Python
Why is this concept useful?
• Often your data is much larger than just a
few numbers:
• Billions of base pairs
• Millions of sequencing reads
• Thousands of genes
• It may not be feasible to write all of this data in your Python script
• Memory
• Maintenance
How do we solve this problem?
Reading (and writing) to files in Python
The solution:
• Store the data in a separate file
• Then, in your Python script
• Read in the data (line by line)
• Analyze the data
• Write the results to a new output file or print them to the terminal
• When the results are written to a file, other scripts can read in the results file to do more analysis
Input file → Python script 1 → Output file 1 → Python script 2 → Output file 2
Reading a file syntax
Syntax
    with open(<file>, 'r') as <file_handle>:
        for <current_line> in <file_handle>:
            <current_line> = <current_line>.rstrip()
            # Do something
Example output
    >chr1
    ACGTTGAT
    ACGTA
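A self-contained sketch of the read, analyze, write pattern. The file names and the temporary directory are assumptions; the script first creates a small input file so that it can run anywhere, then reads it line by line and writes the length of each sequence line to a new output file.

```python
import os
import tempfile

# Create a small input file so the example is self-contained
work_dir = tempfile.mkdtemp()
input_path = os.path.join(work_dir, 'example.fa')
with open(input_path, 'w') as out_handle:
    out_handle.write('>chr1\nACGTTGAT\nACGTA\n')

# Read in the data line by line
lines = []
with open(input_path, 'r') as in_handle:
    for line in in_handle:
        line = line.rstrip()  # remove the trailing newline
        lines.append(line)
        print(line)

# Write the results to a new output file
output_path = os.path.join(work_dir, 'lengths.txt')
with open(output_path, 'w') as out_handle:
    for line in lines:
        if not line.startswith('>'):  # skip the header line
            out_handle.write(str(len(line)) + '\n')
```

Using with closes the file automatically when the block ends, even if an error occurs inside it.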
The anatomy of a (simple) script
• The first line should always be
#!/usr/bin/env python3
• This special line is called a shebang
• The shebang tells the computer
how to run the script
• It is NOT a comment
• This is a comment
• Comments help the reader better
understand the code
• Always comment your code!