Python Tutorial Completed - Michigan.pdf

The EECS 445 Python Tutorial covers essential steps for setting up Python, including checking the version, installing Python and libraries, and running Python code through various methods such as .py files, Jupyter notebooks, and Google Colab. It also introduces NumPy, a fundamental package for scientific computing, explaining how to create and manipulate arrays, perform mathematical operations, and utilize linear algebra functions. The tutorial emphasizes using virtual environments to manage dependencies effectively.

Uploaded by

Omar D. Vazquez
EECS 445 Python Tutorial

In this tutorial we will walk through:

1. Making sure you have the correct version of Python installed


2. How to install libraries
3. Three different ways to run Python code
4. Using numpy to efficiently make calculations in Python

1. Checking Python version


First things first, we need to make sure you are using a new-enough version of Python. To check if
Python is already installed, and if so, its version:

python --version
If this reports Python 2.x, install Python 3.7+ and run it with the python3 command instead.
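
The same check can also be made from inside Python. The following is a minimal sketch, assuming the 3.7 minimum stated above; the helper name python_is_new_enough is made up for illustration.

```python
import sys

# Minimum version taken from the text above (Python 3.7+).
MIN_VERSION = (3, 7)

def python_is_new_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given interpreter version is at least `minimum`."""
    # Compare only (major, minor), e.g. (3, 8) >= (3, 7).
    return tuple(version_info[:2]) >= minimum

print("Python", sys.version.split()[0])
print("new enough:", python_is_new_enough())
```

If this prints False when run with python, try running the same file with python3.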

Installing Python 3 on WSL and Linux


If you're on Windows, make sure you're using Windows Subsystem for Linux (WSL).

This assumes you are using Ubuntu or another Linux distribution that uses apt for package
management.

sudo apt-get install software-properties-common


sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.8
sudo apt-get install python3.8-venv

Installing Python 3 on MacOS


Make sure you have Homebrew installed (this should be true if you took EECS 280).

brew install python

2. How to install libraries


We will use virtual environments (Python's built-in venv module) and pip to manage Python dependencies.

To create a virtual environment named env in your current folder, run this (only once):

python3 -m venv env

This will create an env/ folder which contains the files and binaries for your virtual environment.

To activate the virtual environment:

source env/bin/activate

To deactivate the virtual environment once you're done:

deactivate

Note!

Install all packages and do your work inside the (activated) virtual environment —
this will make your life a lot easier since you will not have to keep track of outside
dependencies!

To install dependencies, first make sure your virtualenv is activated. You should see (env) to the
left of your shell prompt.

Then, run
pip install <package names>

For example, for HW1,

pip install numpy matplotlib jupyter notebook
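
If you are unsure whether the virtual environment is really active, you can ask the interpreter itself. This is a small sketch that assumes the environment was created with python3 -m venv as above; the helper name in_virtualenv is made up for illustration.

```python
import sys

def in_virtualenv():
    """Return True when running inside an environment created by venv."""
    # Inside a venv, sys.prefix points at the env/ folder while
    # sys.base_prefix still points at the system-wide installation.
    return sys.prefix != sys.base_prefix

print("interpreter:", sys.executable)
print("inside a virtual environment:", in_virtualenv())
```

When the environment is activated, the interpreter path should point inside your env/ folder.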

3. Running Python code


Method 1: Running a .py file
Python is an interpreted language, which means you provide a .py file as input to the Python
interpreter (usually python3), and the interpreter will execute that code.

So, just create a file (e.g. helloworld.py) and write some code (e.g. print("hello, world")).
Then, to run this code, just run the following in your terminal:

python3 helloworld.py

Method 2: Running a local Jupyter notebook


Jupyter notebooks are used frequently in the machine learning and data science world to combine
notes and code. In fact, you're reading a Jupyter notebook right now! Assuming you have Jupyter
Notebook installed in your virtual environment (pip install jupyter notebook), it's really
easy to launch a local Jupyter notebook server.

Just run:

jupyter notebook

in the same directory as the notebook(s) you want to open. This should launch Jupyter Notebook in
a new browser tab.

Note!

If you are on WSL, you may get an error and the browser tab won't open. This is
because your browser is not installed through WSL, and it's totally okay! Just copy
and paste the URL it specifies in this message into your browser:

Or copy and paste one of these URLs:
http://localhost:8888/?token=...
or http://127.0.0.1:8888/?token=...

You can click open the notebooks and run the cells inside by using the buttons on the top bar or
pressing shift+enter to run a single cell.

Method 3: Running a Jupyter notebook on Google Colab

Google Colaboratory (https://colab.research.google.com/notebooks/intro.ipynb#recent=true) is an
online Jupyter notebook instance. You can use it to run notebooks that you upload or notebooks
you store on Google Drive. To use Colab, you can navigate to the website, then select a notebook
from your Google Drive or upload a notebook from your local filesystem.

The interface should be similar to (but not exactly the same as) the local Jupyter notebook.

Note!

Loading external data on Colab is slightly different from the other methods,
since notebooks are stored in the cloud. See this Colab notebook
(https://colab.research.google.com/notebooks/io.ipynb) for how to load datasets.

4. NumPy Basics
From the NumPy homepage (https://numpy.org/):

NumPy is the fundamental package for scientific computing with Python. It contains
among other things:

a powerful N-dimensional array object


sophisticated (broadcasting) functions
tools for integrating C/C++ and Fortran code
useful linear algebra, Fourier transform, and random number capabilities

We will focus mostly on the first item (the array object), with some coverage of other functions
numpy provides.

In [1]: # import numpy, alias as np for convenience.


import numpy as np

The Humble NumPy Array


The numpy array is an N-dimensional grid of items of the same type (in our case, usually floats or
ints).

We can create numpy arrays from Python lists, or in any number of ways described in the
documentation (https://numpy.org/devdocs/user/quickstart.html#array-creation).

In [2]: # creating numpy array from python list using np.array() function
a = np.array([1, 2, 3, 4, 5, 6])

# these can be multiple dimensions -- just use nested lists!


b = np.array([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6]])
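
Besides np.array(), a few creation functions come up often. The sketch below shows four of them; the variable names are chosen only for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))          # 2x3 array filled with 0.0
ones = np.ones(4)                 # length-4 array filled with 1.0
counts = np.arange(5)             # integers 0, 1, 2, 3, 4
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values from 0 to 1 inclusive

print(zeros.shape, ones.shape)
print(counts)
print(grid)
```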

In [3]: l = [1, 2, 3, 4]
In [5]: type(l)

Out[5]: list

Properties:

Homogeneous
All the elements of a numpy array are of the same type. You can see what type our values are
stored as with dtype.

In [6]: [a.dtype, b.dtype]

Out[6]: [dtype('int64'), dtype('float64')]
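
One consequence of homogeneity is that mixing ints and floats in one array converts everything to a single type. A short sketch (variable names are illustrative):

```python
import numpy as np

# A single float is enough to make the whole array floating point.
mixed = np.array([1, 2.5, 3])
print(mixed, mixed.dtype)

# astype makes an explicit conversion; float -> int truncates toward zero.
as_int = mixed.astype(int)
print(as_int)
```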

N-dimensional
Numpy arrays don't need to be 1-dimensional. They can be 2-dimensional (like a matrix), or even
more. Each dimension is called an axis and the shape is a tuple of sizes along each axis.

Remember these terms: axis and shape. These are essential in using numpy.

In [8]: b

Out[8]: array([[1.1, 2.2, 3.3],


[4.4, 5.5, 6.6]])

In [7]: [a.shape, b.shape]

Out[7]: [(6,), (2, 3)]

a only has 1 axis. b has 2 axes.

Caution!

In math, we like to say a length-$N$ vector is the same as an $N \times 1$ matrix. This is
not exactly true with numpy.

In [9]: x = np.array([1, 2, 3, 4])


y = x.reshape(4, 1)

In [10]: (x, x.shape)

Out[10]: (array([1, 2, 3, 4]), (4,))


In [11]: (y, y.shape)

Out[11]: (array([[1],
[2],
[3],
[4]]),
(4, 1))

You can think of x as being 1D (you index once to get to a value) and y as being 2D (you have to
index twice to get to a value).

In [12]: x[0]

Out[12]: 1

In [13]: print(y[0])
print(y[0][0])

[1]
1
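
The difference between shape (4,) and shape (4, 1) also shows up when transposing. A small sketch, reusing the same values of x:

```python
import numpy as np

x = np.array([1, 2, 3, 4])   # shape (4,): one axis
col = x.reshape(4, 1)        # shape (4, 1): a column

# Transposing a 1-D array does nothing, because there is only one axis.
print(x.T.shape)

# Transposing a 2-D column swaps its two axes and gives a row.
print(col.T.shape)

# reshape(-1) flattens back to a single axis.
flat = col.reshape(-1)
print(flat.shape)
```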

Indexing
Like vanilla Python lists, we can access data stored in a numpy array by indexing into it with
integers. Slices work, too.

In [15]: a

Out[15]: array([1, 2, 3, 4, 5, 6])

In [14]: # gives us the data at index 0


a[0]

Out[14]: 1

In [18]: # gives us data from index 2 inclusive to the end.


a[2:]

Out[18]: array([3, 4, 5, 6])

Unlike Python lists, we have some more options when indexing into Numpy arrays.

Multiple Indices
You can pass a list (or a numpy array) as the index to get data from multiple indices.

In [19]: # gives us data from indices 2, 3, and 5


a[[2, 3, 5]]

Out[19]: array([3, 4, 6])
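
The index can itself be a numpy array, in any order and with repeats. A minimal sketch (idx and picked are illustrative names):

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

idx = np.array([5, 0, 0, 2])   # any order, repeats allowed
picked = a[idx]
print(picked)

# Indexing with a list or array returns a copy,
# so changing the result leaves `a` untouched.
picked[0] = 100
print(a)
```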


Indexing across dimensions
b[0] gives us the first row: [1.1, 2.2, 3.3]

What if, instead of the first row, we want just the second value of the first row? We can use
commas to separate along axes.

In [20]: b

Out[20]: array([[1.1, 2.2, 3.3],


[4.4, 5.5, 6.6]])

In [21]: # index 0 of axis 0 gives us the first row: [1.1, 2.2, 3.3].
# index 1 of axis 1 gives us the second column: [[2.2], [5.5]]
# together, we get the value at the first row, second column.
b[0, 1]

Out[21]: 2.2

We can also use slices to get, for instance, the entire second column.

In [22]: # slice : of axis 0 gives us all rows.


# index 1 of axis 1 gives us the second column.
# together, we get the second column of all rows.
b[:, 1]

Out[22]: array([2.2, 5.5])
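
Indices and slices can be mixed freely along each axis. The sketch below shows a few more combinations on the same b; note how slicing keeps an axis that integer indexing would drop.

```python
import numpy as np

b = np.array([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6]])

last_row = b[1, :]       # second row, all columns -> shape (3,)
right_block = b[:, 1:]   # all rows, columns 1 onward -> shape (2, 2)
col_2d = b[:, 1:2]       # a slice (not an index) keeps the axis -> shape (2, 1)

print(last_row.shape, right_block.shape, col_2d.shape)
```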

Doing math with numpy


Numpy also provides lots of built-in functions for us to use. We will cover just a few relevant ones
here:

In [29]: np.random.seed(42)
# let's create two 3x3 matrices filled with random integers in the range [0, 10)
x = np.random.randint(0, 10, size=(3, 3))
y = np.random.randint(0, 10, size=(3, 3))

In [30]: x

Out[30]: array([[6, 3, 7],


[4, 6, 9],
[2, 6, 7]])

In [31]: y

Out[31]: array([[4, 3, 7],


[7, 2, 5],
[4, 1, 7]])
Basic Arithmetic
We can use the operators we're already familiar with to do element-wise math.

In [32]: x + y # addition

Out[32]: array([[10, 6, 14],


[11, 8, 14],
[ 6, 7, 14]])

In [33]: x * y # multiplication

Out[33]: array([[24, 9, 49],


[28, 12, 45],
[ 8, 6, 49]])

In [34]: x - y # subtraction

Out[34]: array([[ 2, 0, 0],


[-3, 4, 4],
[-2, 5, 0]])

In [35]: x / y # division

Out[35]: array([[1.5 , 1. , 1. ],
[0.57142857, 3. , 1.8 ],
[0.5 , 6. , 1. ]])

In [36]: x // y # floor (integer) division

Out[36]: array([[1, 1, 1],


[0, 3, 1],
[0, 6, 1]])

In [37]: x ** y # raising to the power

Out[37]: array([[ 1296, 27, 823543],


[ 16384, 36, 59049],
[ 16, 6, 823543]])
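
Keep in mind that the same operators mean something different on plain Python lists. A quick comparison, with illustrative names:

```python
import numpy as np

list_a, list_b = [1, 2, 3], [4, 5, 6]
arr_a, arr_b = np.array(list_a), np.array(list_b)

print(list_a + list_b)   # lists: concatenation
print(arr_a + arr_b)     # arrays: element-wise sum
print(list_a * 2)        # lists: repetition
print(arr_a * 2)         # arrays: element-wise product
```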

Math within an array


We can also call numpy functions to sum across an array or find the mean across an array (and
much more!)

In [39]: x

Out[39]: array([[6, 3, 7],


[4, 6, 9],
[2, 6, 7]])
In [40]: y

Out[40]: array([[4, 3, 7],


[7, 2, 5],
[4, 1, 7]])

In [38]: np.sum(x)

Out[38]: 50

In [41]: np.mean(x)

Out[41]: 5.555555555555555

We can also sum along a single axis. For example, to get the sum of each row, we sum
across the columns (axis 1).

In [42]: g = np.array([[1, 2, 3], [4, 5, 6]])

In [43]: g

Out[43]: array([[1, 2, 3],


[4, 5, 6]])

In [44]: g.shape

Out[44]: (2, 3)

In [45]: # what will the shape of this output be?


np.sum(g, axis=1)

Out[45]: array([ 6, 15])

In [47]: # what will the shape of this output be?


np.sum(g, axis=0)

Out[47]: array([5, 7, 9])

In [48]: np.sum(g, axis=0).shape

Out[48]: (3,)

In [49]: p = np.random.randint(0, 4, size=(2, 3, 4))


In [51]: p

Out[51]: array([[[3, 1, 1, 1],


[3, 3, 0, 0],
[3, 1, 1, 0]],

[[3, 0, 0, 2],
[2, 2, 1, 3],
[3, 3, 3, 2]]])

In [52]: p.shape

Out[52]: (2, 3, 4)

In [53]: np.sum(p, axis=1)

Out[53]: array([[9, 5, 2, 1],


[8, 5, 4, 7]])

In [54]: # if we don't want to remove the axis, set keepdims=True


np.sum(g, axis=0, keepdims=True)

Out[54]: array([[5, 7, 9]])

In [55]: np.sum(g, axis=0, keepdims=True).shape

Out[55]: (1, 3)
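
A useful way to remember what axis does: the axis you sum over is the one that disappears from the shape (or becomes size 1 with keepdims=True). The sketch below checks this on a stand-in array of shape (2, 3, 4); the values are arbitrary.

```python
import numpy as np

p = np.arange(24).reshape(2, 3, 4)   # stand-in array with shape (2, 3, 4)

for axis in range(p.ndim):
    dropped = np.sum(p, axis=axis)                 # the summed axis is removed
    kept = np.sum(p, axis=axis, keepdims=True)     # the summed axis becomes size 1
    print(axis, dropped.shape, kept.shape)
```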

Linear Algebra
There are also linear algebra functions you can use.

In [56]: # get the transpose of a matrix


x.T

Out[56]: array([[6, 4, 2],


[3, 6, 6],
[7, 9, 7]])

In [57]: # get the L2 norm of a vector


np.linalg.norm(x[0])

Out[57]: 9.695359714832659

In [58]: # get the dot product of two vectors


np.dot(x[0], x[1])

Out[58]: 105
In [59]: # multiply two matrices
np.matmul(x, y)

Out[59]: array([[ 73, 31, 106],


[ 94, 33, 121],
[ 78, 25, 93]])
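
A few relationships between these functions are worth checking for yourself. The sketch below rebuilds the same x and y by hand (so it does not depend on the random seed) and compares np.matmul with the @ operator and with element-wise *.

```python
import numpy as np

x = np.array([[6, 3, 7], [4, 6, 9], [2, 6, 7]])
y = np.array([[4, 3, 7], [7, 2, 5], [4, 1, 7]])

# `@` is the matrix-multiplication operator, equivalent to np.matmul here.
assert np.array_equal(x @ y, np.matmul(x, y))

# `*` is element-wise, so for these matrices it gives a different result.
assert not np.array_equal(x * y, x @ y)

# The L2 norm is the square root of the sum of squares.
manual_norm = np.sqrt(np.sum(x[0] ** 2))
assert np.isclose(manual_norm, np.linalg.norm(x[0]))
```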

argmax and argmin


In math notation, $\arg\max_x f(x)$ returns the $x$ that maximizes $f(x)$.
In the same vein, np.argmax(a) returns the index of the maximum element of a, and
np.argmin(a) returns the index of the minimum element of a.

In [63]: g

Out[63]: array([[1, 2, 3],


[4, 5, 6]])

In [64]: g.shape

Out[64]: (2, 3)

In [65]: # index of minimum value for each row


np.argmin(g, axis=1)

Out[65]: array([0, 0])
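
The axis argument works the same way for np.argmax, and leaving it out gives an index into the flattened array. A short sketch on the same g:

```python
import numpy as np

g = np.array([[1, 2, 3], [4, 5, 6]])

row_max = np.argmax(g, axis=1)   # index of the maximum within each row
col_min = np.argmin(g, axis=0)   # index of the minimum within each column
print(row_max, col_min)

# Without an axis, the index refers to the flattened array.
flat = np.argmax(g)
# np.unravel_index converts it back to a (row, column) position.
row, col = np.unravel_index(flat, g.shape)
print(flat, row, col)
```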

Broadcasting
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic
operations.

Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they
have compatible shapes.

This prevents making needless copies of data and usually leads to efficient algorithm
implementations.

General Broadcasting Rules


When operating on two arrays, NumPy compares their shapes element-wise. It starts with the
trailing dimensions and works its way forward. Two dimensions are compatible when

1. they are equal, or


2. one of them is 1

If these conditions are not met, a ValueError: operands could not be broadcast together exception
is raised, indicating that the arrays have incompatible shapes.

Along each axis, the resulting array takes the input size that is not 1 (or 1 if both inputs have size 1).
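
To make the rules concrete, here is a sketch that applies them by hand to two shapes. The function broadcast_shape is written for illustration only and is not part of numpy; the final line compares it against what numpy itself computes via np.broadcast.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Apply the two rules above and return the resulting shape."""
    result = []
    # Walk both shapes from the trailing dimension forward; a missing
    # leading dimension is treated as size 1.
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        dim_a = shape_a[-i] if i <= len(shape_a) else 1
        dim_b = shape_b[-i] if i <= len(shape_b) else 1
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))

print(broadcast_shape((4, 1), (5,)))
print(broadcast_shape((3, 4), (4,)))

# Compare with numpy's own answer for the first pair of shapes.
print(np.broadcast(np.empty((4, 1)), np.empty(5)).shape)
```

Calling broadcast_shape((4,), (5,)) raises a ValueError, mirroring the error numpy raises for incompatible shapes.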
In [66]: # The simplest broadcasting example occurs when an array
# and a scalar value are combined in an operation:

# We can think of the scalar b being stretched during the arithmetic
# operation into an array with the same shape as a

a = np.array([1.0, 2.0, 3.0])


b = 2.0
a * b

Out[66]: array([2., 4., 6.])

In [67]: x = np.arange(4) #[0,1,2,3]

y = np.ones(5) #[1,1,1,1,1]

z = np.ones((3,4)) #([[1., 1., 1., 1.], [1., 1., 1., 1.], [1., 1., 1., 1.]])

In [68]: x.shape

Out[68]: (4,)

In [69]: y.shape

Out[69]: (5,)

In [70]: x + y

#Unable to broadcast

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-70-dbfbbd7bce8d> in <module>
----> 1 x + y
      2
      3 #Unable to broadcast

ValueError: operands could not be broadcast together with shapes (4,) (5,)

In [71]: xp = np.arange(5)

In [72]: xp

Out[72]: array([0, 1, 2, 3, 4])

In [78]: xp.dtype

Out[78]: dtype('int64')
In [79]: y.dtype

Out[79]: dtype('float64')

In [73]: xp + y

Out[73]: array([1., 2., 3., 4., 5.])

Going back to x + y: if we check the broadcasting rules, we'll see numpy tries to match 4 with 5.
They're not equal, and neither of them is 1, so we get an error. How can we fix this?

We have two options:

1. Add or remove an element to make the shapes the same (as with xp above).


2. Add an axis with size 1.

In [75]: xx = x.reshape(4, 1)

In [76]: xx.shape

Out[76]: (4, 1)

In [77]: y.shape

Out[77]: (5,)

Now, what comparisons is numpy making? Remember, it "starts with the trailing dimensions and
works its way forward".

4, 1, <- shape of xx
5, <- shape of y

So what does numpy do? It stretches xx across its size-1 axis 5 times, giving an array with
shape (4, 5). It likewise treats y (shape (5,)) as shape (1, 5) and stretches it across 4 rows,
so both operands line up as (4, 5).

What is the shape of the final result?

In [81]: xx

Out[81]: array([[0],
[1],
[2],
[3]])

In [95]: a[np.newaxis, :].shape

Out[95]: (1, 3)

In [94]: a.reshape((1, 3)).shape

Out[94]: (1, 3)
In [82]: y

Out[82]: array([1., 1., 1., 1., 1.])

In [ ]: # xx stretched to shape (4, 5):
# [[0 0 0 0 0],
#  [1 1 1 1 1],
#  [2 2 2 2 2],
#  [3 3 3 3 3]]
#
# y stretched to shape (4, 5):
# [[1 1 1 1 1],
#  [1 1 1 1 1],
#  ...]

In [80]: xx + y

Out[80]: array([[1., 1., 1., 1., 1.],


[2., 2., 2., 2., 2.],
[3., 3., 3., 3., 3.],
[4., 4., 4., 4., 4.]])
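
If you want to see the stretched arrays explicitly, np.broadcast_to produces them without copying any data. A small sketch, rebuilding xx and y as defined above:

```python
import numpy as np

xx = np.arange(4).reshape(4, 1)   # shape (4, 1)
y = np.ones(5)                    # shape (5,)

# broadcast_to returns a read-only view with the requested shape.
xx_full = np.broadcast_to(xx, (4, 5))
y_full = np.broadcast_to(y, (4, 5))
print(xx_full)

# Adding the explicitly stretched arrays matches the broadcast sum.
assert np.array_equal(xx_full + y_full, xx + y)
```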

In [83]: (x + z).shape

Out[83]: (3, 4)

In [84]: x+z

Out[84]: array([[1., 2., 3., 4.],


[1., 2., 3., 4.],
[1., 2., 3., 4.]])

In [85]: xpp = xx.reshape((4, 1, 1))

In [87]: xpp.shape

Out[87]: (4, 1, 1)

In [ ]: # expected shape of xpp + y: (4, 1, 5)

In [90]: y.shape

Out[90]: (5,)
In [88]: xpp + y

Out[88]: array([[[1., 1., 1., 1., 1.]],

[[2., 2., 2., 2., 2.]],

[[3., 3., 3., 3., 3.]],

[[4., 4., 4., 4., 4.]]])

In [89]: (xpp + y).shape

Out[89]: (4, 1, 5)

Broadcasting is tricky, and there is room for bugs! If you have buggy numpy code, a safe bet is to
check that:

1. your broadcasting is working as intended, and
2. you are applying each function to the correct axis.
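
One way to carry out both checks is to assert the shapes you expect at each step. The sketch below does this for a hypothetical helper, center_rows, which subtracts each row's mean; the function name and the example data are made up for illustration.

```python
import numpy as np

def center_rows(data):
    """Subtract each row's mean from that row (data has shape (n, d))."""
    assert data.ndim == 2, f"expected a 2-D array, got shape {data.shape}"
    # axis=1 averages across the columns of each row; keepdims=True keeps
    # the result as shape (n, 1) so it broadcasts against (n, d).
    row_means = np.mean(data, axis=1, keepdims=True)
    assert row_means.shape == (data.shape[0], 1)
    centered = data - row_means
    assert centered.shape == data.shape
    return centered

g = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(center_rows(g))
```

Without keepdims=True the means would have shape (n,), and the subtraction would either fail or, when n happens to equal d, silently subtract along the wrong axis.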
