Python Lectures 4rev
STRUCTURES
PYTHON FOR GENOMIC DATA SCIENCE
Lists
A list is an ordered set of values:
>>> gene_expression=['gene', 5.16e-08, 0.000138511, 7.33e-08]
You can access individual list elements:
>>> print(gene_expression[2])
0.000138511
>>> print(gene_expression[-1])
7.33e-08
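The indexing rules above can be sketched as a short script. It reuses the gene_expression list from the slide; the names first, third, and last are illustrative and not part of the lecture.

```python
# Same example list as on the slide: a label followed by expression values.
gene_expression = ['gene', 5.16e-08, 0.000138511, 7.33e-08]

first = gene_expression[0]    # indices start at 0
third = gene_expression[2]    # the third element has index 2
last = gene_expression[-1]    # negative indices count from the end

print(first, third, last)
```

An index outside the range of the list, such as gene_expression[4] here, raises an IndexError.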
Modifying Lists
Unlike strings, which are immutable, lists are a mutable type! For example, element 0 of gene_expression can be changed from 'gene' to 'Lif'.
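A minimal sketch of the difference between mutable lists and immutable strings. The string motif and the variable string_error are hypothetical names introduced only for this illustration.

```python
gene_expression = ['gene', 5.16e-08, 0.000138511, 7.33e-08]

# Lists are mutable: assigning to an index changes the list in place.
gene_expression[0] = 'Lif'

# Strings are immutable: the same kind of assignment raises a TypeError.
motif = 'gggcgg'  # hypothetical example string
string_error = None
try:
    motif[0] = 'a'
except TypeError as err:
    string_error = err

print(gene_expression)
print('string assignment failed:', string_error)
```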
Slicing Lists
The following special slice returns a new copy of the list:
>>> gene_expression[:]
['Lif', 5.16e-08, 0.000138511, 7.33e-08]
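To see why the copy matters, the sketch below compares a slice copy with a plain second name for the same list. The names copy_of_list and alias are illustrative, not from the lecture.

```python
gene_expression = ['Lif', 5.16e-08, 0.000138511, 7.33e-08]

copy_of_list = gene_expression[:]   # new list object with the same elements
alias = gene_expression             # second name for the SAME list object

gene_expression[0] = 'gene'         # modify the original list

print(copy_of_list[0])   # the copy is unaffected
print(alias[0])          # the alias sees the change
```

The slice makes a shallow copy: the new list is independent, but any mutable objects stored inside it would still be shared.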
Assignment to slices is also possible, and this can change the list:
>>> gene_expression[1:3]=[6.09e-07]
>>> gene_expression
['Lif', 6.09e-07, 7.33e-08]
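Slice assignment can be sketched as follows. The first step repeats the slide's example; the second step, clearing the list through a full slice, is an extra illustration and an assumption about what is useful to show, not something taken from the slides.

```python
gene_expression = ['Lif', 5.16e-08, 0.000138511, 7.33e-08]

# Replace the two elements at indices 1 and 2 with a single new value.
gene_expression[1:3] = [6.09e-07]
after_replace = gene_expression[:]      # keep a snapshot for inspection

# Assigning an empty list to the full slice removes every element in place.
gene_expression[:] = []

print(after_replace, len(after_replace))
print(gene_expression)
```

The replacement does not need to have the same length as the slice, which is why the list shrinks from four elements to three.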
The built-in function len() also applies to lists:
>>> len(gene_expression)
3
Lists As Objects
The list data type has several methods. Among them is a method to extend a list by appending all the items in a given list:
>>> gene_expression.extend([5.16e-08, 0.000138511])
>>> gene_expression
['Lif', 7.33e-08, 5.16e-08, 0.000138511]
The count method returns the number of times a value appears in the list:
>>> print(gene_expression.count('Lif'),gene_expression.count('gene'))
1 0
You can find all the methods of the list object using the help() function:
>>> help(list)
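A short sketch of extend and count, with append shown for contrast. It assumes gene_expression holds ['Lif', 7.33e-08] before the call, which is the starting state implied by the output shown above; the list nested is a hypothetical name used only to show what append does with a list argument.

```python
# Assumed starting state, implied by the output shown on the slide.
gene_expression = ['Lif', 7.33e-08]

# extend adds each item of the given list as a separate element.
gene_expression.extend([5.16e-08, 0.000138511])

# append, by contrast, adds its argument as ONE element (here a nested list).
nested = ['Lif', 7.33e-08]
nested.append([5.16e-08, 0.000138511])

n_lif = gene_expression.count('Lif')    # occurrences of 'Lif'
n_gene = gene_expression.count('gene')  # 'gene' is no longer in the list

print(gene_expression, len(gene_expression))
print(nested, len(nested))
print(n_lif, n_gene)
```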
Lists As Stacks
The list methods append and pop make it very easy to use a list as a stack, where the last element added is the first element retrieved (last-in, first-out).
>>> stack=['a','b','c','d']
[Diagram: stack holding a, b, c, d from bottom to top, and a variable elem]
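The last-in, first-out behaviour can be sketched with append and pop. The pushed value 'e' and the name second are illustrative choices; stack and elem follow the names on the slide.

```python
stack = ['a', 'b', 'c', 'd']

stack.append('e')     # push: 'e' goes on top of the stack
elem = stack.pop()    # pop: removes and returns the last element added
second = stack.pop()  # the next pop returns the element below it

print(elem, second)
print(stack)
```

Calling pop() on an empty list raises an IndexError, so code that drains a stack usually checks that the list is non-empty first.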
Sorting Lists
There are two ways to sort lists. One way uses the sorted() built-in function, which returns a new sorted list and leaves the original unchanged:
>>> mylist=[3,31,123,1,5]
>>> sorted(mylist)
[1, 3, 5, 31, 123]
>>> mylist
[3, 31, 123, 1, 5]
The other way uses the list method sort(), which sorts the list in place:
>>> mylist.sort()
>>> mylist
[1, 3, 5, 31, 123]
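The difference between the two approaches can be sketched as below. The names new_list, unchanged, result, and descending are illustrative; the reverse=True argument is a standard option of sorted() that the slides do not mention.

```python
mylist = [3, 31, 123, 1, 5]

new_list = sorted(mylist)    # returns a new sorted list
unchanged = mylist[:]        # snapshot: mylist itself is still unsorted here

result = mylist.sort()       # sorts in place and returns None

descending = sorted(mylist, reverse=True)  # optional: largest value first

print(new_list, unchanged)
print(mylist, result)
print(descending)
```

Because sort() returns None, writing mylist = mylist.sort() is a common mistake that discards the list.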
Tuples
A tuple consists of a number of values separated by commas, and is another standard sequence data type, like strings and lists.
>>> t=1,2,3
>>> t
(1, 2, 3)
We may input tuples with or without surrounding parentheses.
>>> t=(1,2,3)
>>> t
(1, 2, 3)
Tuples have many common properties with lists, such as indexing and slicing operations, but while lists are mutable, tuples are immutable, and usually contain a heterogeneous sequence of elements.
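A brief sketch of the tuple properties described above: indexing and slicing work as for lists, assignment to an element fails, and a heterogeneous tuple can be unpacked into separate names. The tuple record and the names name, level, and tuple_error are hypothetical, introduced only for this example.

```python
t = 1, 2, 3          # parentheses are optional

first = t[0]         # indexing works as for lists
tail = t[1:]         # slicing returns a new tuple

# Tuples are immutable: assigning to an element raises a TypeError.
tuple_error = None
try:
    t[0] = 10
except TypeError as err:
    tuple_error = err

# A heterogeneous tuple (a label and a number) unpacked into two names.
record = ('Lif', 5.16e-08)   # hypothetical record
name, level = record

print(first, tail, name, level)
```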
Sets
A set is an unordered collection with no duplicate elements. Set objects support mathematical operations like union, intersection, and difference.
>>> brca1={'DNA repair','zinc ion binding','DNA binding','ubiquitin-protein transferase activity','DNA repair','protein ubiquitination'}
>>> brca1
{'DNA repair', 'zinc ion binding', 'DNA binding', 'ubiquitin-protein transferase activity', 'protein ubiquitination'}
>>> brca2={'protein binding','H4 histone acetyltransferase activity','nucleoplasm','DNA repair','double-strand break repair','double-strand break repair via homologous recombination'}
[Figure: union, intersection, and difference of two sets]
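The three set operations can be sketched with the brca1 and brca2 sets defined above, using Python's set operators. The result names union, common, and only_brca1 are illustrative.

```python
brca1 = {'DNA repair', 'zinc ion binding', 'DNA binding',
         'ubiquitin-protein transferase activity', 'DNA repair',
         'protein ubiquitination'}
brca2 = {'protein binding', 'H4 histone acetyltransferase activity',
         'nucleoplasm', 'DNA repair', 'double-strand break repair',
         'double-strand break repair via homologous recombination'}

union = brca1 | brca2        # terms annotated to either gene
common = brca1 & brca2       # terms annotated to both genes
only_brca1 = brca1 - brca2   # terms annotated to brca1 but not brca2

print(len(brca1), len(union))
print(common)
print(sorted(only_brca1))
```

The duplicate 'DNA repair' in the brca1 literal is stored only once, so len(brca1) is 5 rather than 6.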
Dictionaries
A dictionary is an unordered set of key and value pairs, with the requirement that the keys are unique (within one dictionary).
>>> TF_motif = {'SP1':'gggcgg',
... 'C/EBP':'attgcgcaat',
... 'ATF':'tgacgtca',
... 'c-Myc':'cacgtg',
... 'Oct-1':'atgcaaat'}
keys: can be any immutable type.
values: can be any type.
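Looking up values by key can be sketched as follows, using the TF_motif dictionary defined above. The fallback string 'unknown' and the result names are assumptions made for this illustration.

```python
TF_motif = {'SP1': 'gggcgg',
            'C/EBP': 'attgcgcaat',
            'ATF': 'tgacgtca',
            'c-Myc': 'cacgtg',
            'Oct-1': 'atgcaaat'}

sp1_motif = TF_motif['SP1']      # access a value by its key
has_ap1 = 'AP-1' in TF_motif     # membership test checks the keys

# get() returns a default instead of raising KeyError for a missing key.
ap1_motif = TF_motif.get('AP-1', 'unknown')

print(sp1_motif, has_ap1, ap1_motif)
```

Indexing with a missing key, as in TF_motif['AP-1'], raises a KeyError, which is why the membership test or get() is often used first.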
Updating A Dictionary
>>> TF_motif={'SP1':'gggcgg','C/EBP':'attgcgcaat','ATF':'tgacgtca','c-Myc':'cacgtg'}
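One way the dictionary above might be updated is sketched below. The exact statements used in the lecture are not shown here, so this is a reconstruction under assumptions: the 'AP-1' motif is the value that appears later in the sorted values, and the 'Oct-1' motif is taken from the first TF_motif definition.

```python
TF_motif = {'SP1': 'gggcgg', 'C/EBP': 'attgcgcaat',
            'ATF': 'tgacgtca', 'c-Myc': 'cacgtg'}

# Assigning to a new key adds a key and value pair.
TF_motif['AP-1'] = 'tga(g/c)tca'

# update() merges another dictionary: new keys are added,
# and existing keys get the new value.
TF_motif.update({'Oct-1': 'atgcaaat', 'SP1': 'gggcgg'})

# del removes a pair by key (done on a copy so TF_motif keeps six entries).
smaller = dict(TF_motif)
del smaller['ATF']

print(len(TF_motif), len(smaller))
```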
The number of key and value pairs in a dictionary is given by the built-in function len():
>>> len(TF_motif)
6
The lists of keys and values are in arbitrary order, but if you want them sorted, use the sorted() function:
>>> sorted(TF_motif.keys())
['AP-1', 'ATF', 'C/EBP', 'Oct-1', 'SP1', 'c-Myc']
>>> sorted(TF_motif.values())
['atgcaaa', 'attgcgcaat', 'cacgtg', 'gggcgg', 'tga(g/c)tca', 'tgacgtca']
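Sorted keys are often used to walk a dictionary in a predictable order. The sketch below assumes a six-entry TF_motif with the 'Oct-1' motif taken from the first definition; the names keys_in_order and pairs are illustrative.

```python
TF_motif = {'SP1': 'gggcgg', 'C/EBP': 'attgcgcaat', 'ATF': 'tgacgtca',
            'c-Myc': 'cacgtg', 'AP-1': 'tga(g/c)tca', 'Oct-1': 'atgcaaat'}

# Sorting a dictionary sorts its keys; uppercase letters sort before
# lowercase ones, which is why 'c-Myc' comes last.
keys_in_order = sorted(TF_motif)

# Pair each key with its value, in sorted key order.
pairs = [(name, TF_motif[name]) for name in keys_in_order]

for name, motif in pairs:
    print(name, motif)
```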
Operation                     Strings                                   Lists                             Dictionaries
Creation                      "text"                                    [a, b, ..., n]                    {keya: a, ..., keyn: n}
Access to an element          s[i]                                      L[i]                              D[key]
Membership                    c in s                                    e in L                            key in D
Remove an element             Not possible; s = s[:i] + s[i+1:]         del L[i]                          del D[key]
Change an element             Not possible; s = s[:i] + new + s[i+1:]   L[i] = new                        D[key] = new
Add an element                Not possible; s = s + new                 L.append(e)                       D[newkey] = val
Remove consecutive elements   Not possible; s = s[:i] + s[k:]           del L[i:k]                        Not possible
Change consecutive elements   Not possible; s = s[:i] + news + s[k:]    L[i:k] = Lnew                     Not possible
Add consecutive elements      Not possible; s = s + news                L.extend(newL) or L = L + Lnew    D.update(newD)
Strings are immutable, so "Not possible" means the string cannot be changed in place; the expression next to it builds a new string and rebinds s.
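The string entries in the summary can be sketched next to their list counterparts. The string s, the index i, and the replacement letter are arbitrary choices made for this illustration.

```python
s = 'cacgtg'
i = 2

# Strings: build a new string from the parts before and after index i.
removed = s[:i] + s[i+1:]          # drop the character at index i
changed = s[:i] + 't' + s[i+1:]    # replace the character at index i

# Lists: the same operations modify the list in place.
L = list(s)
del L[i]

print(removed, changed)
print(''.join(L))
```

The original string s is untouched by both string expressions; only the new names removed and changed refer to the edited text.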