
Harvard Python For Research

The document provides an overview of Python programming concepts, including the Fibonacci function, mutable vs immutable objects, and the use of modules. It explains various data types such as lists, tuples, sets, and dictionaries, along with their characteristics and operations. Additionally, it covers basic programming structures like loops and statements, emphasizing the importance of object identity, value, and type in Python.

Uploaded by

mavagoncalves


Video 1.0.

The Fibonacci function, as you may have guessed, computes the first
terms of the Fibonacci sequence.

The code underneath the function calls the function 10,000 times, asking
Python to compute the first 10,000 numbers in the Fibonacci sequence.

Then, finally, it adds those numbers up.

For mutable objects, = makes the name on the left refer to the same object
as the one on the right; no copy is made. In contrast, slicing with ":"
makes a new copy of a mutable sequence containing the same elements.
Therefore, a and b are different objects, but each with the same elements.

x = "Hello, world!"
y = x[5:]

What is the value of y?

', world!' (correct)

Explanation
This indexing returns all characters in the position 5 or later.

Video 1.1.1 Python basics

The interactive mode is meant for experimenting with your code one line or
one expression at a time.

In contrast, the standard mode is ideal for running your programs from
start to finish.
Video 1.1.2

The value of some objects can change in the course of program execution.

Objects whose value can change are said to be mutable objects, whereas
objects whose value is unchangeable after they've been created are
called immutable.

The bulk of the Python library consists of modules.

In order for you to be able to make use of modules in your own code, you
first need to import those modules using the import statement.

Each object in Python has three characteristics. These characteristics are
called object type, object value, and object identity.

Object value is the data value that is contained by the object. This could
be a specific number, for example.

Finally, you can think of object identity as an identity number for the
object. Each distinct object in the computer's memory will have its own
identity number.

Most Python objects have either data or functions or both associated with
them.

These are known as attributes. The name of the attribute follows the
name of the object, separated by a dot.

The two types of attributes are called either data attributes or methods.

A data attribute is a value that is attached to a specific object.

In contrast, a method is a function that is attached to an object. In other
words, depending on the type of the object, different methods may be
available to you as a programmer. For example, you could have two strings.
They may have different values stored in them, but they nevertheless
support the same set of methods.

Syntax: statistics.mean(data)

import statistics

# list of positive integers
data1 = [1, 3, 4, 5, 7, 9, 2]

x = statistics.mean(data1)

# Printing the mean
print("Mean is :", x)

For a NumPy array x:

x.mean()  -- method (a function attached to the object)

x.shape   -- data attribute

Video 1.1.3

Python modules are libraries of code, and you can import Python
modules using the import statement.

What is a namespace?

A namespace is a container of names shared by objects that typically
go together. Its intention is to prevent naming conflicts.

What exactly happens when you run the Python import statement?

Three things happen. The first thing that happens is Python creates a new
namespace for all the objects which are defined in the new module.

So, in abstract sense, this is our new namespace. That's the first step. The
second step that Python does is it executes the code of the module and it
runs it within this newly created namespace.

The third thing that happens is Python creates a name-- let's say np for
numpy-- and this name references this new namespace object.

You can find out which methods are available in two different ways. We can
use the dir function on an object to get a directory of its methods, or we
can use dir on the object's type, for example dir(str).

We're then going to import the numpy module as np. Now, the math
module has a square root method, sqrt, but numpy also has a square root
method, sqrt. What is the difference between these two functions? Well,
let's try an example. If I type math.sqrt, I can ask Python to calculate the
value of the square root of 2. I can do the same exact thing using the
square root function from the numpy module. So far, it appears that these
two functions are identical, but actually these two functions are quite
separate and they exist in different namespaces. It turns out that the
numpy square root function can do things that the math square root
function doesn't know how to do.

1.1.4 Numbers and basic calculations

And Python, in fact, provides three different numeric types.

These are called integers, floating point numbers, and complex numbers.
Python integers have unlimited precision.
That means your integer will never be too long to fit into Python's integer
type.
We can also raise a number to a power using **.
Floor division, or integer division, is accomplished by using two slash
signs (//).

It rounds the result down to the closest integer that is less than or equal
to the actual floating point answer. In interactive mode, if I hit
underscore, Python returns the value of the latest operation.
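
The arithmetic described above can be tried out with a few lines. This is a small sketch with made-up numbers, not code from the course; the variable names are illustrative only.

```python
# Integers have unlimited precision: this result needs far more than 64 bits.
big = 2 ** 100

# Exponentiation and floor division.
power = 3 ** 4            # 81
floor_pos = 15 // 7       # 2, since 15 / 7 is about 2.14
floor_neg = -15 // 7      # -3: floor division rounds down, not toward zero

print(big)
print(power, floor_pos, floor_neg)
```

The negative case is the one worth remembering: // always rounds toward the smaller integer.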

math.factorial

import math

def fact(n):
    # use the argument n rather than a hard-coded value
    return math.factorial(n)

num = int(input("Enter the number:"))

f = fact(num)
print("Factorial of", num, "is", f)

1.1.5 Random Choice

I can’t figure it out, check later

Video 1.1.6

An expression is a combination of objects and operators that computes a
value.
Many expressions involve what is known as the boolean data type. Objects
of the boolean type have only two values. These are called True and False.
There are only three boolean operations, which are "or", "and", and "not".

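There are a total of eight different comparison operations in Python.

Although these are commonly used for numeric types, they can be applied to
other types as well.

==  tests whether two objects are identical in content
is  tests whether two names refer to the same object
!=  tests whether two objects differ in content
2   integer
2.0 floating point (2 == 2.0 is True, although the types differ)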
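
The difference between equality of content and identity can be checked directly. The lists below are illustrative values chosen for this sketch, not taken from the course.

```python
a = [2, 4, 6]
b = [2, 4, 6]   # a second list with the same content
c = a           # another name for the same object as a

same_content = (a == b)      # True: equal element by element
same_object = (a is b)       # False: two distinct list objects
alias = (a is c)             # True: both names refer to one object
int_vs_float = (2 == 2.0)    # True: equal in value although the types differ

print(same_content, same_object, alias, int_vs_float)
```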
1.2.1 Sequences

A sequence is a collection of objects ordered by their position.

In Python, there are three basic sequences, which are lists, tuples, and so-
called "range objects".

S[0:2] 0: start location 2:stop location

Python is going to return a slice which consists of the objects in locations
0 and 1, but it will not return to you the object at location 2.

1.2.2 Lists

Lists are mutable sequences of objects of any type. And they're typically
used to store homogeneous items. If we compare a string and a list, one
difference is that strings are sequences of individual characters, whereas
lists are sequences of any type of Python objects.


In Python, indexes start at zero.

numbers[-1]  last element of the list

numbers.append(10)

To show the content of the list: numbers

Another operation we commonly would like to do is to concatenate two or
more lists together:
numbers + x

list + list

Reverse the content of the list: numbers.reverse()

Sort the content: names.sort()

Sorted: we're actually asking Python to construct a completely new list.

It will construct this new list using the objects in the previous list in such a
way that the objects in the new list will appear in a sorted order

sorted_names = sorted(names)
Finally, if you wanted to find out how many objects our list contains, we can
use a generic sequence function, len. So we can type len(names), and
Python tells us that our list contains four objects.

Q:Consider a list x=[1,2,3]. Enter the code below for how you
would use the append method to add the number 4 to the end
of list x.
A: x.append(4)

1.2.3 Tuples

Tuples are immutable sequences typically used to store

heterogeneous data.

The best way to view tuples is as a single object that consists of

several different parts.

Because tuples are sequences, the way you access different objects
within a tuple is by their position.
T = (1,3,5,7)
>>> len(T)
4
>>> T + (9,11)
(1, 3, 5, 7, 9, 11)

1. how to pack tuples

>>> x = 35
>>> y = 78
>>> coordinate=(x,y)
>>> type(coordinate)
<class 'tuple'>

2. how you unpack a tuple.

>>> coordinate
(35, 78)
>>> (x,y) = coordinate
>>> x
35
But what if you just have one object within your tuple? To construct a
tuple with just one object, we have to use the following syntax.

We start by saying c is equal to. We put our tuple parentheses. We put
in our number 2. And we add the comma.

>>> c=(2,3)
>>> type(c)
<class 'tuple'>
>>> c=(2,)
>>> type(c)
<class 'tuple'>

1.2.4 Ranges

Ranges are immutable sequences of integers,

and they are commonly used in for loops.

>>> range(5)
range(0, 5)
>>> list(range(5))
[0, 1, 2, 3, 4]

Ranges require less memory than lists, so don't turn them into lists before using them.
1.2.5 Strings

Strings are immutable sequences of characters.

In Python, you can enclose strings in either single quotes, double quotes,
or triple quotes.

Operations with strings

>>> s = "Python"
>>> len(s)  # len() function
6
>>> s[0]
'P'
>>> s[-1]
'n'

Slicing

>>> s[0:5]
'Pytho'

>>> s[-3]
'h'
>>> s[-3:]
'hon'

>>> "y" in s  # membership
True

Polymorphism means that what an operator does depends on the type
of objects it is being applied to.

I can also add two strings together. In that case, the operation is not
called addition, but concatenation.

>>> "hello" + "world"
'helloworld'

>>> s = "python"
>>> 3 * s
'pythonpythonpython'

dir(str): Python gives me a long list of different attributes that are
available for strings.
str.replace?: in IPython, this gives a short definition of the method.

Because strings are immutable objects, Python doesn't actually modify
your string. Instead, it returns a new string to you.

The split method takes a string and breaks it down into substrings.

1.2.6 sets

Sets are unordered collections of distinct hashable objects. But what

does it mean for an object to be hashable?

In practice, what that means is you can use sets for immutable objects
like numbers and strings, but not for mutable objects like lists and
dictionaries.

One type of set is called just "a set". And the other type of set is called
"a frozen set". The difference between these two is that a frozen set is
not mutable once it has been created. In other words, it's immutable. In
contrast, your usual, normal set is mutable.

One of the key ideas about sets is that they cannot be indexed. So the
objects inside sets don't have locations.

Another key feature about sets is that the elements can never be
duplicated

>>> ids
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> males = set([1,3,5,7,8])
>>> females= ids- males
>>> type(females)
<class 'set'>
>>> females
{0, 2, 4, 6, 9}
>>> males
{1, 3, 5, 7, 8}
>>> everyone = males | females
>>> everyone
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> everyone & set([1,2,3])
{1, 2, 3}
>>> word="antidisestablishmentarianism"

>>> letter = set(word)

>>> len(letter)
12

>>> x.symmetric_difference(y)
{1, 4}
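
The set operations shown in the session above can be collected in one place. The sets x and y here are assumed example values (the notes do not show how they were defined); they are chosen so that the symmetric difference matches the output above.

```python
# x and y are assumed example sets; the notes do not show their definition.
x = {1, 2, 3}
y = {2, 3, 4}

union = x | y
intersection = x & y
difference = x - y
sym_diff = x.symmetric_difference(y)   # elements in exactly one of the sets

# Duplicates are dropped when a set is built from a sequence.
letters = set("antidisestablishmentarianism")

print(union, intersection, difference, sym_diff, len(letters))
```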

1.2.7 Dictionaries

Dictionaries are mappings from key objects to value objects.

Dictionaries consist of key:value pairs, where the keys must be
immutable. Dictionaries themselves are mutable, so once you create your
dictionary, you can modify its contents on the fly.

An important point about dictionaries is that they are not sequences, and
therefore cannot be indexed by position. (Since Python 3.7, dictionaries
do preserve insertion order, but access is still by key.)

Uses
age = {"Tim": 24, "Jenna": 37, "Jim": 3}
age["Jim"]
age["Tim"] += 3

The type of the object returned by age.keys() or age.values() is what's
called a "view object". View objects do precisely what you might imagine
that they do. They provide a dynamic view of the keys or values in the
dictionary.

A key point here is that as you update or modify your dictionary, the
views will also change correspondingly.
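
A short sketch of this behaviour, reusing the age dictionary from above. The added entry "Tom" is a made-up value for illustration.

```python
age = {"Tim": 24, "Jenna": 37, "Jim": 3}

names = age.keys()      # a view object, not a list
ages = age.values()

age["Tom"] = 50         # modify the dictionary after creating the views

# The views reflect the new entry without being recreated.
print(names)
print(ages)
print("Tom" in names)
```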

1.3.1 Dynamic Typing

What a type does is two things. First, it tells a program that it should be
reading these sequences in chunks of, let's say, 32 bits. The second
thing that it tells the computer is what this sequence of bits represents.

Does it represent a floating-point number, or a character, or a piece of

music, or something else?

If you move data from one variable to another, if the types of these
variables do not match, you could potentially lose information.

Static typing means that type checking is performed during compile

time, whereas dynamic typing means that type checking is performed
at run time.

A key point to remember here is that variable names always link to

objects, never to other variables.

Remember, mutable objects, like lists and dictionaries, can be modified

at any point of program execution.

In contrast, immutable objects, like numbers and strings, cannot be

altered after they've been created in the program.

Remember, a variable cannot reference another variable. A variable

can only reference an object.

Each object in Python has a type, value, and an identity. Mutable

objects in Python can be identical in content and yet be actually
different objects.

Another way to create a copy of a list is to use the slicing syntax.

M = L[:]

1.3.2 Copies

The copy module can be used for creating identical copies of an
object. There are two types of copies that are available.

A shallow copy constructs a new compound object and then inserts
references into it to the original objects.

In contrast, a deep copy constructs a new compound object and then
recursively inserts copies into it of the original objects.
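
The difference only shows up with nested objects. The following is a minimal sketch using a made-up nested list; copy.copy and copy.deepcopy are the two functions the copy module provides for this.

```python
import copy

L = [[1, 2], [3, 4]]

alias = L                    # same object, no copy at all
shallow = copy.copy(L)       # new outer list, inner lists shared (like L[:])
deep = copy.deepcopy(L)      # new outer list and new inner lists

L[0].append(99)              # modify an inner list in place

print(alias is L)            # the alias is the very same object
print(shallow is L)          # the shallow copy is a different outer list
print(shallow[0])            # but it shares the modified inner list
print(deep[0])               # the deep copy is fully independent
```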
1.3.3 Statements

Statements are used to compute values, assign values, and modify
attributes. The return statement is used to return values from a
function.

Another example is the import statement, which is used to import

modules.

Finally, the pass statement is used to do nothing in situations where we

need a placeholder for syntactical reasons.

Compound statements contain groups of other statements, and they

affect or control the execution of those other statements in some way.
A compound statement consists of one or more clauses, where a clause
consists of a header and a block or a suite of code.

The clause headers of a particular compound statement start with a
keyword, end with a colon, and are all at the same indentation level.

A block or a suite of code of each clause, however, must be indented to

indicate that it forms a group of statements that logically fall under
that header.

Remember, the absolute value of the difference of two numbers tells us how
far they are from one another.

1.3.4 For and While Loops

The For Loop is a sequence iteration that assigns items in sequence to

target one at a time and runs the block of code for each item.

Unless the loop is terminated early with the break statement, the block
of code is run as many times as there are items in the sequence.
However, remember that the key value pairs themselves don't follow
any particular ordering inside the dictionary.

The Python while is used for repeated execution of code as long as a

given expression is true.

For a While Loop you're testing some condition some number of times.
When you enter that loop you don't know how many times exactly
you'll be running through that loop. This is in contrast with For Loops
where when beginning the loop, you know exactly how many times you
would like to run through the block of code.

for bear in bears:
    if bears[bear] == ...:  # the value compared against is missing in these notes
        print("Hello, " + bear + " bear!")
    else:
        print("odd")

1.3.5 List comprehensions

A common task is to take an existing list, apply some operation to all of
the items on the list, and then create a new list that contains the results.

In Python, there is a construct for this task known as a "list
comprehension".

>>> numbers=range(10)
>>> squares=[]
>>> for number in numbers:
...     square = number**2
...     squares.append(square)
...
>>>
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> squares2=[number**2 for number in numbers]
>>> squares2
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

sum(x for x in range(1,10) if x % 2)

1.3.6

I'm going to do a reassignment to line by typing "line = line.rstrip()". I
now have line.rstrip().split().

Inside the split, as an argument, I provide the character that I
want to use for splitting the line. Note that the string split method
returns not a string but a list.

It has split every line wherever there is whitespace, and it returns a
list.

When writing a file, we need a second argument, which tells
Python that we would like to create a file object for writing, not for
reading.

We indicate this by providing that second argument as a string, and
the content of the string is simply "w". What this does is it creates a file
object for writing.
F.write()
F.close()
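
The write and read steps can be put together as follows. This is a sketch under assumptions: the file name example.txt and its two lines are made up, and the file is created in a temporary directory so the example does not touch existing files.

```python
import os
import tempfile

# Hypothetical file name, created in a temporary directory for this sketch.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Open for writing by passing "w" as the second argument.
F = open(path, "w")
F.write("first line here\n")
F.write("second line\n")
F.close()

# Read the file back line by line: strip the trailing newline,
# then split on whitespace, which yields a list of words per line.
words_per_line = []
with open(path) as infile:
    for line in infile:
        line = line.rstrip().split()
        words_per_line.append(line)

print(words_per_line)
```

Using a with statement for reading closes the file automatically; calling F.close() by hand, as in the notes, works as well.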

1.3.7 Functions

Functions are devices for grouping statements so that they can be

easily run more than once in a program.

Functions maximize code reuse and minimize code redundancy.

Functions enable dividing larger tasks into smaller chunks, an approach
that is called procedural decomposition.

Functions are written using the def statement. You can send the result
object back to the caller using the return statement
>>> def add(a, b):
...     mysum = a + b
...     return mysum
...
>>> add(12,15)
27

To modify the value of a global variable from inside a function, you can
use the global statement.
Arguments to Python functions are matched by position.
>>> def add_and_sub(a, b):
...     mysum = a + b
...     mydiff = a - b
...     return (mysum, mydiff)
...
>>> add_and_sub(20,15)
(35, 5)

A function is not executed until the given function is called using the
function name followed by parentheses syntax.

The def statement creates an object and assigns it to a name. This

means that we can later in the code reassign the function object to
another name.

>>> newadd= add

>>> def modify(mylist):
...     mylist[0] += 10
...
>>> L=[1,2,4,7]
>>> modify(L)
>>> L
[11, 2, 4, 7]

1.3.8

>>> intersect([1,2,3,4,5],[3,4,5,6,7])
[3, 4, 5]
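
The notes show a call to intersect but not its definition. One possible definition that reproduces the output above is sketched here; the parameter names and the loop-based approach are assumptions, not necessarily what the course used.

```python
def intersect(s1, s2):
    """Return the items of s1 that also appear in s2, in the order of s1."""
    res = []
    for x in s1:
        if x in s2:
            res.append(x)
    return res

print(intersect([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))
```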

# Program that creates passwords

import random

def password(length):
    pw = str()
    characters = "abcdefghijklmnopqrstuvwxyz" + "1234567890"
    for i in range(length):
        pw = pw + random.choice(characters)
    return pw

1.3.9 Common mistake

Whenever you're accessing objects in a sequence, make sure you know

how long that sequence is.
Remember in a dictionary, a given key object is always coupled with its
value object, but the key value pairs themselves can appear in any
order inside the dictionary.

So, the lesson here is, make sure you know the type of the object you
are working with, and you know what are the methods that the object
supports.

Therefore, whenever accessing dictionaries, make sure you know the

type of your key objects.

The fundamental problem here is strings are immutable objects.

Therefore their content cannot be modified.

Therefore, make sure that you always know the type of your objects.

2.1.1 Scope Rules

Python looks up names following the LEGB rule: L stands for "local," E
stands for "enclosing function," G for "global," and B stands for "built-in."

In other words, local is the current function you're in. Enclosing function
is the function in which the current function is defined, if any. Global
refers to the module in which the function was defined. And built-in refers
to Python's built-in namespace.

If a name is not found in any of these scopes, Python raises a NameError.
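
The four scopes can be seen in a small experiment. All function and variable names below are made up for this sketch.

```python
x = "global"

def outer():
    x = "enclosing"
    def inner():
        x = "local"
        return x
    def inner_no_local():
        return x          # no local x, so the x of outer() is used
    return inner(), inner_no_local()

def uses_global():
    return x              # no local or enclosing x, so the module-level x is used

def uses_builtin():
    return len("abc")     # len is found in the built-in namespace

def undefined_name():
    try:
        return some_undefined_name   # found in none of the four scopes
    except NameError as err:
        return type(err).__name__

print(outer(), uses_global(), uses_builtin(), undefined_name())
```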

Video 2.1.2: Classes and Object-Oriented Programming

Inheritance means that you can define a new object type, a new class, that
inherits properties from an existing object type.

list.sort()
class Name(list):

So another way to state what I just said is that the class statement doesn't
create any instances of the class.

min(list)
list.remove()
dir(x)  lists the methods available
x.remove_min()
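
The pieces listed above (a class that inherits from list, min, remove, dir) fit together roughly as follows. The class name MyList, the remove_max method, and the sample values are assumptions made for this sketch.

```python
class MyList(list):
    """A list with two extra methods; everything else is inherited from list."""

    def remove_min(self):
        self.remove(min(self))

    def remove_max(self):
        self.remove(max(self))

# The class statement alone creates no instances; this line creates one.
x = MyList([10, 3, 5, 1, 2, 7, 6, 4, 8])
x.remove_min()
x.remove_max()
print(x)
print("remove_min" in dir(x))
```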

Video 2.2.1: Introduction to NumPy Arrays

NumPy arrays are n-dimensional array objects and they are a core component
of scientific and numerical computation in Python.

NumPy arrays are an additional data type provided by NumPy, and they are
used for representing vectors and matrices.

Unlike dynamically growing Python lists, NumPy arrays have a size that is
fixed when they are constructed.

Elements of NumPy arrays are also all of the same data type, leading to
more efficient and simpler code than using Python's standard data types.

By default, the elements are floating point numbers.

>>> import numpy as np
>>> zero_vector = np.zeros(5)
>>> zero_matrix = np.zeros((5, 3))
>>> zero_vector
array([0., 0., 0., 0., 0.])
>>> zero_matrix
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

>>> x = np.array([1,2,3])
>>> y = np.array ([2,4,6])
>>> [[1,3],[5,9]]
[[1, 3], [5, 9]]
>>> np.array([[1, 3], [5, 9]])
array([[1, 3],
[5, 9]])
>>> A = np.array([[1, 3], [5, 9]])
>>> A.transpose()
array([[1, 5],
[3, 9]])

2.2.2 Slicing Numpy arrays

With one-dimensional arrays, we can index a given element by its position,
keeping in mind that indices start at 0.

With two-dimensional arrays, the first index specifies the row of the array
and the second index specifies the column of the array.

2.2.3 Indexing numpy arrays

ind = [elements]

array_name[ind]

NumPy arrays can also be indexed using logical indices: a Boolean array
can be used as an index.

When you slice an array using the colon operator, you get a view of the
object. This means that if you modify it, the original array will also be
modified. This is in contrast with what happens when you index an array, in
which case what is returned to you is a copy of the original data.

In summary, for all cases of indexed arrays, what is returned is a copy of
the original data, not a view as one gets for slices.
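
The view versus copy distinction is easy to check. This is a small sketch with made-up values; it assumes NumPy is installed.

```python
import numpy as np

z = np.array([1, 3, 5, 7, 9])

w = z[0:3]          # slice: a view onto z
w[0] = 100          # also changes z[0]

ind = [0, 1, 2]
v = z[ind]          # index list: a copy of the selected elements
v[1] = -1           # z is unaffected

mask = z > 6        # Boolean (logical) index array
big = z[mask]       # elements where the mask is True, also a copy

print(z)
print(w, v, big)
```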

2.2.4 Building and Examining NumPy Arrays

np.linspace(starting point, ending point, number of points I want
to have in my array)
np.logspace(log of the starting point, log of the ending point,
number of elements in our array)

arrayname.shape
arrayname.size
np.random.random(10)
np.any(x < 0.9)
np.all()

x % i == 0 tests whether x is divisible by i, that is, whether dividing x
by i leaves no remainder. If this is false for all values of i strictly
between 1 and x, then x must be prime!

2.3.1 introduction to matplotlib and Pyplot

Matplotlib is a Python plotting library that produces publication-quality
figures. It can be used both in Python scripts and when using Python's
interactive mode.

Pyplot is a collection of functions that make matplotlib work like Matlab,
which you may be familiar with.

Pyplot is especially useful for interactive work, for example, when you'd
like to explore a dataset or visually examine your simulation results.

import matplotlib.pyplot as plt

plt.plot([list])

plt.show() in the terminal

In short, a keyword argument is an argument which is supplied to the
function by explicitly naming each parameter and specifying its value.

plt.plot(x, y, "bo-", linewidth=2, markersize=12)

The first letter is the color: b for blue, g for green.
The second letter is the marker shape: o for circle, s for square.

red_patch = mpatches.Patch(color='red', label='The red data')

ax.legend(handles=[red_patch])

2.3.2 customizing your plots

You can use LaTeX syntax in labels

loc = location of the legend

The working directory is the directory where you have launched your Python.
2.3.3 Plotting using Logarithmic axes

In some plots, it's helpful to have one or both axes be logarithmic.
This means that for any given point to be plotted, its x or y-coordinate,
or both, are transformed using the log function.

semilogx()
semilogy()
loglog()

semilogx() plots the x-axis on a log scale and the y in the original scale;
semilogy() plots the y-axis on the log scale and the x in the original scale;
loglog() plots both x and y on logarithmic scales.

So the lesson here is that functions of the form y is equal to x to the power
alpha show up as straight lines on a loglog() plot.
The exponent alpha is given by the slope of the line.

2.3.4 generating histograms

x = np.random.normal(size=1000)
plt.hist(x, density=True, bins=np.linspace(-5, 5, 21));  # density replaces the removed normed argument

To have 20 bins you need 20 + 1 bin edges.

plt.subplot(number of rows, number of columns, plot number)
The first integer describes the number of subplot rows, the
second integer describes the number of subplot columns, and the
third integer describes the location index of the subplot to be
created, where indices are laid out along the rows and columns in
the same order as reading Latin characters on a page.

np.random.gamma(2, 3, 100000)
cumulative=True, histtype="step"  (keyword arguments to plt.hist)
plt.figure()

import matplotlib.pyplot as plt # importing matplotlib

import numpy as np # importing numpy
# see plot in Jupyter notebook
%matplotlib inline
x=np.arange(0,10,0.5) # define x
y1=2*x+3 # define y1
y2=3*x # define y2
plt.figure(1,figsize=(12,12)) # create a figure object
plt.subplot(331) # divide the figure 3*3, 1st item
plt.plot(x,y1,'r-') # Functional plot
plt.title('Line graph') # define title of the plot
plt.subplot(332) # divide the figure 3*3, 2nd item
plt.plot([1,3,2,7],[10,20,30,40]) # Functional plot
plt.title('Two-dimensional data') # define title of the plot
plt.subplot(333) # divide the figure 3*3, 3rd item
plt.plot([10,20,30,40],[2,6,8,12], label='price') # set label name
plt.axis([15,35,3,10]) # axis scale - x:15-35, y:3-10
plt.legend() # show legends on plot
plt.title('Axis scaled graph') # define title of the plot
plt.subplot(334) # divide the figure 3*3, 4th item
plt.plot(x,y1,'r-',x,y2,'b--') # y1 - red line, y2 - blue line
plt.title('Multi line plot') # define title of the plot
plt.subplot(335) # divide the figure 3*3, 5th item
x=np.random.normal(5,3,1000) # normal distribution - mean: 5, standard deviation: 3, number of data: 1000
y=np.random.normal(5,3,1000) # normal distribution - mean: 5, standard deviation: 3, number of data: 1000
plt.scatter(x,y,c='k') # Functional plot
plt.title('Scatter plot')
plt.subplot(336) # divide the figure 3*3, 6th item
player=('Ronaldo','Messi','Son')
goal=[51,48,25]
plt.bar(player,goal,align='center',color='red',width=0.5)
plt.title('Bar plot') # define title of the plot
plt.show() # show plot

import numpy as np # importing numpy

import matplotlib.pyplot as plt # importing matplotlib
from mpl_toolkits.mplot3d import Axes3D # importing Axes3D, 3d plot
fig = plt.figure() # create a figure object
ax = fig.add_subplot(projection='3d') # defining 3d figure ax (Axes3D(fig) no longer attaches itself in recent matplotlib)
X = np.arange(-4, 4, 0.25) # defining X
Y = np.arange(-4, 4, 0.25) # defining Y
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X ** 2 + Y ** 2)
Z = np.sin(R)
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=plt.cm.hot)#plot
ax.contourf(X, Y, Z, zdir='z', offset=-2, cmap=plt.cm.hot)
ax.set_zlim(-2, 2) # set z-axes limit
plt.show()

import matplotlib.pyplot as plt # importing matplotlib

# see plot in Jupyter notebook
%matplotlib inline
x = [0, 2, 4, 6] # define x
y = [1, 3, 4, 8] # define y
plt.plot(x,y) # functional plot
plt.xlabel('x values') # define x label
plt.ylabel('y values') # define y label
plt.title('plotted x and y values') # define title
plt.legend(['line 1']) # define legend
# save the figure
plt.savefig('plot.png', dpi=300, bbox_inches='tight')
plt.savefig(r'E:\Bioinformatics/foo.png') # raw string keeps the backslash literal
plt.show()

import matplotlib.pyplot as plt # importing matplotlib
import numpy as np # importing numpy

# see plot in Jupyter notebook
%matplotlib inline
x = np.array([0, 2, 4, 6]) # define x
fig = plt.figure() # create a figure object
ax = fig.add_axes([0,0,1,1]) # axes with (L,B,W,H) value
ax.plot(x, x**2, label = 'X2', color='red') # functional plot
ax.plot(x, x**3, label = 'X3', color='black') # functional plot
ax.set_xlim([0, 5]) # set x-axes limit
ax.set_ylim([0,100]) # set y-axes limit
ax.legend() # show legend
plt.show() # show plot
2.4.1 simulating randomness

We often use randomness when modeling complicated systems to abstract away
those aspects of the phenomenon for which we do not have useful simple
models.

import random
random.choice([list])
random.choice(range(1, 7))

Choosing a die at random, then rolling it:
random.choice(random.choice([range(1, 7), range(1, 9), range(1, 11)]))

2.4.2 examples involving randomness

Our first example is to roll the die 100 times and plot a histogram of the
outcomes, meaning a histogram that shows how frequently the numbers from
1 to 6 appeared in the 100 samples.

import random
import numpy as np
import matplotlib.pyplot as plt

rolls = []
for k in range(100):
    rolls.append(random.choice([1, 2, 3, 4, 5, 6]))

plt.hist(rolls, bins=np.linspace(0.5, 6.5, 7))

# sum of 10 rolls, repeated 100 times
ys = []
for rep in range(100):
    y = 0
    for k in range(10):
        x = random.choice([1, 2, 3, 4, 5, 6])
        y = y + x
    ys.append(y)
The central limit theorem states that if you have a population with mean μ and
standard deviation σ and take sufficiently large random samples from the
population with replacement , then the distribution of the sample means will be
approximately normally distributed.
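
The theorem can be checked numerically without plotting. The sketch below sums 10 die rolls (the sum is just 10 times the sample mean, so the same theorem applies) and compares the simulated mean and standard deviation with the theoretical values. The function name roll_sum, the seed, and the number of repetitions are choices made for this illustration.

```python
import random
import statistics

random.seed(1)   # fixed seed so the sketch is repeatable

def roll_sum(n_dice):
    """Sum of n_dice rolls of a fair six-sided die."""
    return sum(random.choice(range(1, 7)) for _ in range(n_dice))

ys = [roll_sum(10) for _ in range(10000)]

mean_y = statistics.mean(ys)
sd_y = statistics.stdev(ys)

# Theory: mean = 10 * 3.5 = 35, standard deviation = sqrt(10 * 35 / 12), about 5.4
print(mean_y, sd_y)
```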

2.4.3 numpy random module

import numpy as np
np.random.random(size of the 1d array)
np.random.random((number of rows, number of columns))  (the shape is passed as a tuple)

np.random.normal(mean, standard deviation, size)

The first argument is the mean of the distribution, in this case 0. And the
second argument is the standard deviation, which is equal to 1. To generate
several numbers from the same distribution, we can specify the length of
the 1d array as the third argument.

Finally, we can use the same function to generate 2d, or even 3d arrays of
random numbers. In that case, we need to insert another pair of parentheses
because the dimensions of the array will be added as a tuple.

The only problem is that we don't know how to generate an array of random
integers in NumPy.

np.random.randint(low, high=, size=)

X = np.random.randint(1, 7, (100, 10))
X.shape

np.sum(X)
np.sum(X, axis=1)

Y = np.sum(X, axis=1)
plt.hist(Y);

1. And we can see that the histogram looks smoother.

As we increase

2.4.4 measuring time

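import time
start_time = time.perf_counter()  # time.clock() was removed in Python 3.8
end_time = time.perf_counter()
print(end_time - start_time)

time_1 / time_2
The ratio of the two running times tells us how many times faster the second one is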
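
A complete timing run might look like this. The sketch uses time.perf_counter(), since time.clock() is not available in Python 3.8 and later; the function sum_of_rolls and the number of rolls are made up for illustration.

```python
import random
import time

def sum_of_rolls(n):
    """Roll a six-sided die n times in a plain Python loop and add up the results."""
    total = 0
    for _ in range(n):
        total += random.choice([1, 2, 3, 4, 5, 6])
    return total

start_time = time.perf_counter()
total = sum_of_rolls(100000)
end_time = time.perf_counter()

elapsed = end_time - start_time
print(elapsed)
```

Timing a second implementation the same way and dividing the two elapsed times gives the speed-up factor.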

2.4.5 random walks

x(t = k) = x_0 + \Delta x(t = 1) + \dots + \Delta x(t = k)

delta_X = np.random.normal(0, 1, (2, 5))

plt.plot(delta_X[0], delta_X[1], "go")

# cumulative sum
X = np.cumsum(delta_X, axis=1)

# prepend the starting point X_0 to the cumulative sum
X_0 = np.array([[0], [0]])
X = np.concatenate((X_0, np.cumsum(delta_X, axis=1)), axis=1)

plt.plot(X[0], X[1], "ro-")

plt.savefig("name.pdf")

Example of a cumulative sum:
2 4 6 8
2 6 12 20

3.1.1 DNA translation

Adenine
Cytosine
Guanine
Thymine

The so-called central dogma of molecular biology describes the flow of
genetic information in a biological system. Instructions in the DNA are
first transcribed into RNA and the RNA is then translated into proteins.

3.1.2 ncbi
3.1.3 import dna data into python

inputfile = "dna.txt"

f = open(inputfile, "r")

seq = f.read()
seq
print(seq)

To remove \n:

seq = seq.replace("\n", "")

To remove visible or non-visible extra characters:

seq = seq.replace("\r", "")
3.1.4 translating the dna sequence

table = {
'ATA':'I', 'ATC':'I', 'ATT':'I', 'ATG':'M',
'ACA':'T', 'ACC':'T', 'ACG':'T', 'ACT':'T',
'AAC':'N', 'AAT':'N', 'AAA':'K', 'AAG':'K',
'AGC':'S', 'AGT':'S', 'AGA':'R', 'AGG':'R',
'CTA':'L', 'CTC':'L', 'CTG':'L', 'CTT':'L',
'CCA':'P', 'CCC':'P', 'CCG':'P', 'CCT':'P',
'CAC':'H', 'CAT':'H', 'CAA':'Q', 'CAG':'Q',
'CGA':'R', 'CGC':'R', 'CGG':'R', 'CGT':'R',
'GTA':'V', 'GTC':'V', 'GTG':'V', 'GTT':'V',
'GCA':'A', 'GCC':'A', 'GCG':'A', 'GCT':'A',
'GAC':'D', 'GAT':'D', 'GAA':'E', 'GAG':'E',
'GGA':'G', 'GGC':'G', 'GGG':'G', 'GGT':'G',
'TCA':'S', 'TCC':'S', 'TCG':'S', 'TCT':'S',
'TTC':'F', 'TTT':'F', 'TTA':'L', 'TTG':'L',
'TAC':'Y', 'TAT':'Y', 'TAA':'_', 'TAG':'_',
'TGC':'C', 'TGT':'C', 'TGA':'_', 'TGG':'W',
}

def translate(seq):
    protein = ""
    # translate only if the length is a whole number of codons
    if len(seq) % 3 == 0:
        for i in range(0, len(seq), 3):
            codon = seq[i : i+3]
            protein += table[codon]
    return protein

Slicing a string:
seq[0:3]
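
To see the codon loop at work on a small input, here is a minimal sketch. The three-entry mini_table and the function name translate_sketch are hypothetical, chosen only to keep the example short; the full table above would be used in practice.

```python
# Hypothetical three-entry subset of the codon table, for illustration only.
mini_table = {"ATG": "M", "GCC": "A", "TAA": "_"}

def translate_sketch(seq, table):
    """Translate seq three letters at a time using the given codon table."""
    protein = ""
    # Only translate when the sequence splits evenly into codons.
    if len(seq) % 3 == 0:
        for i in range(0, len(seq), 3):
            codon = seq[i:i + 3]  # slice out one codon
            protein += table[codon]
    return protein

print(translate_sketch("ATGGCCTAA", mini_table))    # MA_
print(repr(translate_sketch("ATGGC", mini_table)))  # ''
```

The second call returns an empty string because the length, 5, is not divisible by 3.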

3.1.5 comparing your translation

How to use a with statement to read a full file:

def read_seq(inputfile):
    """Reads and returns the input sequence with special characters removed."""
    with open(inputfile, "r") as f:
        seq = f.read()
    seq = seq.replace("\n", "")
    seq = seq.replace("\r", "")
    return seq

dna = read_seq("filename")
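
A self-contained way to try read_seq without the course data file is sketched below. The temporary file and its contents are assumptions made for illustration.

```python
import os
import tempfile

def read_seq(inputfile):
    """Reads and returns the input sequence with special characters removed."""
    with open(inputfile, "r") as f:
        seq = f.read()
    seq = seq.replace("\n", "")
    seq = seq.replace("\r", "")
    return seq

# Write a small hypothetical sequence, with mixed line endings, to a temp file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as tmp:
    tmp.write("ATG\r\nGCC\nTAA\n")
    path = tmp.name

dna = read_seq(path)
os.remove(path)  # clean up the temporary file

print(dna)  # ATGGCCTAA
```

The with statement closes the file automatically, even if reading raises an error.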

3.2.1 language processing

Project Gutenberg
3.2.2 counting words

Before we start coding the function itself,
it's helpful to create a test string.
I'm going to call that text, and I'll just
copy paste a short string that I wrote previously.
The purpose behind having a text string like this
is to be able to test our function as we make progress with it.
Since this function will keep track of all unique words
and count their frequencies, I'm going to call this function count_words.
It's a function so we'll need the def statement
and the input argument is going to be just simply text in this case.
We agreed that using a dictionary would be
a good solution for this specific task.
I'm going to create an empty dictionary called word_counts.
The next step for us is to break the text down into words.
To accomplish that we'll be using the split method and the character
we want to use for splitting is just an empty space.
This will give us a list that we can loop over so this calls for a for loop.
Because the items in the sequence or list are words,
I'm going to be using word as my loop variable.
So for word in text.split(" ").
And now we're ready to loop.
There are two possible things that can happen as we loop over our text.
We can either come across a word that we've seen before, in which case,
we have to increase the counter associated with that word by one.
In case we see a word we haven't seen before,
we have to establish that entry in the dictionary
and initialize the counter to be equal to 1.
So let's divide this into two subtasks.
We have the case where we have a known word
and the second case is where we have an unknown word.
Let's deal with the known word case first.
What we'd like to test for is, whether this word
has appeared in this dictionary before.
This calls for an if statement.
If word in word_counts.
So now we know that we have seen this word before.
What we need to do is we need to access our dictionary word_counts
and we want to increase the counter that's
associated with this specific word.
We want to increase that by 1.
And I'm using the shorthand operation += here.
This is the known word case.
The other situation is where we come across a previously unseen word.
We can use the else statement here.
In this case we'd still like to access the word_counts dictionary.
But in this case we have to set that counter to be equal to 1,
because this is the first instance of the word that we're seeing.
That deals with the second case, the unknown word.
At this point, we are ready to return the dictionary
to whoever called the function.
We need one more statement in our code, in our function,
which is the return statement.
So we need to return word_counts.
Before we move on, let's make sure to add a docstring in our function.
Now that we have defined the function, let's run it.
And the function has now been defined.
We also want to make sure we have our test text defined
and now we can try running the function and see what happens.
And as expected Python returns a dictionary to us
where the keys are words, unique words,
and the values associated with these keys
are the number of times each word occurs in the text.
Having some test data handy is very useful.
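
The steps described so far can be sketched as follows. The test string here is a hypothetical stand-in, not the one used in the video.

```python
text = "This is a test. This test is short."  # hypothetical test string

def count_words(text):
    """Count the number of times each word occurs in text (str).
    Return a dictionary where keys are unique words and values are word counts."""
    word_counts = {}
    for word in text.split(" "):
        if word in word_counts:
            # known word: increase its counter by one
            word_counts[word] += 1
        else:
            # unknown word: create the entry and start counting at 1
            word_counts[word] = 1
    return word_counts

print(count_words(text))
# {'This': 2, 'is': 2, 'a': 1, 'test.': 1, 'test': 1, 'short.': 1}
```

Note that 'test.' and 'test' are counted as different words in this version.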
Looking at the dictionary, one obvious shortcoming of our current routine
is that it includes punctuation like periods, or full stops,
as part of the word.
This would lead to an inflation of the word count
because, for example, a word that appears in the middle of the sentence
will be counted separately from the same word
if it appears at the end of a sentence and is immediately
followed by a period.
Another problem is that if the word appears at the beginning of a sentence,
its first letter is capitalized, again giving rise
to double counting of some words.
To address these issues, we're first going to turn the text into lower case.
This means that any word, whether capitalized or not,
will count as one word.
Addressing punctuation is a bit more complex.
Our strategy is to first specify all the punctuation marks
that we'd like to skip, and then loop over that container
and replace every occurrence of a punctuation mark with an empty string.
As the first task we need to turn the text into lower case.
We can do that using the lower method and then
we just have to recapture that new text.
So we're typing text = text.lower().
The second thing we need to do is, we need
to define the characters that we will be skipping
as we're looping over the text.
We'll construct a list for this purpose, and we
can include a few of the most common punctuation marks
here that we'd like to skip.
For example we can include period, comma, semi-colon,
colon, and single quote, and we can also include double quote.
In this case we have to use single quotes for Python's own string.
The reason we cannot use double quotes for the last string is because double
quotes are also used to begin and end a string.
This is why we'll be using single quotes, which
surround the character that we really want to represent,
which is a double quote.
The next step for us is to loop over all of the skip characters
and replace them with an empty string.
This calls for a for loop.
We'll be taking our text and we will replace ch, the skip character
in question, with an empty string.
We then also want to capture the modified string that the replace method
returns, and this part is done.
Finally, to complete this modification to our function,
we want to make sure to update the docstring
to reflect the change we just made.
We'll just say skip punctuation.
Let's then run the definition of our function.
And now we can try running the function using our test
string that we had defined before.
In this case, looking at the output, it's a dictionary as before,
but you'll notice that all of the keys are lowercase, which is what we wanted.
We also got rid of the punctuation marks that we included in the skips list.
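
A sketch of the modified function, with lowercasing and punctuation skipping added, might look like this. The test string is again a hypothetical stand-in.

```python
text = "This is a test. This test is short."  # hypothetical test string

def count_words(text):
    """Count the number of times each word occurs in text (str).
    Return a dictionary where keys are unique words and values are word counts.
    Skip punctuation."""
    text = text.lower()
    # The double quote is written inside single quotes so Python can parse it.
    skips = [".", ",", ";", ":", "'", '"']
    for ch in skips:
        text = text.replace(ch, "")

    word_counts = {}
    for word in text.split(" "):
        if word in word_counts:
            word_counts[word] += 1  # known word
        else:
            word_counts[word] = 1   # unknown word
    return word_counts

print(count_words(text))
# {'this': 2, 'is': 2, 'a': 1, 'test': 2, 'short': 1}
```

With the period removed and the text lowercased, both occurrences of "test" and of "this" now fall under a single key each.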
It's useful to be able to write your own counting routine like we just did.
However, counting the frequency of objects
is such a common operation that Python provides
what is known as a Counter tool to support rapid tallies.
We first need to import it from the collections module, which
provides many additional high performance data types.
The object returned by Counter behaves much like a dictionary,
although strictly speaking it's a subclass of the Python dictionary
object.
Let's modify our function to use the Counter object.
In this case, I would like to retain both my original function and the one
that uses the Counter object.
Our first step is going to be to import that,
so from collections import Counter.
To start the function I'm going to take my previous function
and I'm just going to copy paste it here underneath.
This is the code that I'll be working with.
Because this is a different function because it's
using the Counter object from collections,
I'm going to call this something else.
I'm going to add the word fast at the end.
The counting takes place in the last few lines of the code.
We don't change the first part where we simply convert the text to lowercase,
and we also want to keep the part that skips over punctuation characters.
The only thing that will be changed is the looping
over individual words in our text string.
The last several lines of the code can all simply
be replaced with a single expression.
We will define word_counts on this line, which is the first time we're using it.
The input to our Counter object will be the text
that we would like to use for counting.
We'll take our text, we'll split it to get the words, and we're done.
Before we run the function let's first do the import.
We can now run the definition of the function
and then we can test it on our test dataset.
In this case, again as expected, the function
returns a Counter object which looks essentially identical to the dictionary
object.
Let's see if the objects returned by these two different functions
are actually the same.
We'll first call the count_words function using our text.
And we want to ask Python if that's equal to the object which is returned
by count_words_fast on that same input.
In this case, the answer is True, therefore
we know that these two different implementations of the same function
return identical objects.
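
Both versions and the comparison between them can be sketched as below. The helper clean_text is an assumption introduced here to avoid repeating the preprocessing in both functions, and the test string is hypothetical.

```python
from collections import Counter

text = "This is a test. This test is short."  # hypothetical test string

def clean_text(text):
    """Lowercase the text and strip the punctuation marks we want to skip."""
    text = text.lower()
    skips = [".", ",", ";", ":", "'", '"']
    for ch in skips:
        text = text.replace(ch, "")
    return text

def count_words(text):
    """Count words with a plain dictionary. Skip punctuation."""
    word_counts = {}
    for word in clean_text(text).split(" "):
        if word in word_counts:
            word_counts[word] += 1
        else:
            word_counts[word] = 1
    return word_counts

def count_words_fast(text):
    """Count words with collections.Counter. Skip punctuation."""
    # The whole counting loop is replaced by a single expression.
    word_counts = Counter(clean_text(text).split(" "))
    return word_counts

print(count_words(text) == count_words_fast(text))  # True
```

The comparison works because Counter is a subclass of dict, so the two objects are equal when they hold the same keys and counts.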

3.2.4 computing word frequency statistics

def word_stats(word_counts):
    """Return the number of unique words and the word frequencies."""
    num_unique = len(word_counts)
    counts = word_counts.values()
    return (num_unique, counts)
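
As a short usage sketch, word_stats can be applied to any word-count dictionary. The dictionary below is a hypothetical example, not data from the course.

```python
def word_stats(word_counts):
    """Return the number of unique words and the word frequencies."""
    num_unique = len(word_counts)
    counts = word_counts.values()
    return (num_unique, counts)

# Hypothetical word counts, for illustration only.
word_counts = {"this": 2, "is": 2, "a": 1, "test": 2, "short": 1}

num_unique, counts = word_stats(word_counts)
print(num_unique)   # 5
print(sum(counts))  # 8
```

Summing the counts gives the total number of words in the text, while num_unique gives the size of its vocabulary.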
3.2.5 reading multiple files
read_book
3.2.6 plotting book statistics

Pandas
>>> import pandas as pd
>>> pd.Series([1,2,3],index = ["q","w","e"])
q 1
w 2
e 3
dtype: int64
>>> x= pd.Series([1,2,3],index = ["q","w","e"])
>>> x["w"]
2
>>> age = {"Tim":29, "Jim":31, "Pam":27, "Sam":35}
>>> x = pd.Series(age)
>>> x
Tim 29
Jim 31
Pam 27
Sam 35
dtype: int64
>>> #dataframe
>>> data = {"name" : ["Tim", "Jim", "Pam", "Sam"],
... "age" : [29, 31, 27,35],
... "ZIP" : ["02115","02130","67700","00100"]}
>>> x = pd.DataFrame(data, columns = ["name","age","ZIP"])
>>> x
name age ZIP
0 Tim 29 02115
1 Jim 31 02130
2 Pam 27 67700
3 Sam 35 00100
>>> x.name
0 Tim
1 Jim
2 Pam
3 Sam
Name: name, dtype: object
>>> x= pd.Series([1,2,3,4],index = ["q","w","e","r"])
>>> x
q 1
w 2
e 3
r 4
dtype: int64
>>> x.index
Index(['q', 'w', 'e', 'r'], dtype='object')
>>> sorted(x.index)
['e', 'q', 'r', 'w']
>>> x.reindex(sorted(x.index))
e 3
q 1
r 4
w 2
dtype: int64
>>> x= pd.Series([1,2,3,4],index = ["q","w","e","r"])
>>> y= pd.Series([5,6,7,8],index = ["q","w","t","w"])
>>> x + y
e NaN
q 6.0
r NaN
t NaN
w 8.0
w 10.0
dtype: float64
>>>
