Round - 0 - Jupyter Notebook

This document introduces a Jupyter notebook for an introductory machine learning course. It discusses Python programming language, Jupyter notebooks, basic Python concepts like data types, sequences, iterations and functions. It includes examples of printing output, creating lists and loops. It outlines tasks for students to create integer sequences and loops. It also introduces NumPy and plotting with Matplotlib.

Uploaded by

JanoAravena
Copyright
© © All Rights Reserved
01/06/2021 Round_0 - Jupyter Notebook

Welcome to "CS-EJ3211 - Machine Learning with Python" online course!



This is an introductory Jupyter notebook, in which you will get familiar with the Python programming language,
Jupyter Notebook, and some mathematical notation. If you are already comfortable with these topics, please
skip this notebook.

Content

What is Python?

Jupyter Notebooks

Printing output

Basic data types in Python

Creating a sequence of integers

Iterations

User defined functions

Python Libraries

Pandas Data Frames

Plotting with Matplotlib

NumPy Arrays

Vectors and Matrices

Student task. Create sequences of integers.

Student task. Loops.

Student task. Power of two function.

Student task. Numpy Arrays

Student task. Vector and Matrix operations

What is Python?

Python is a programming language used to "give instructions" to a computer to produce the desired actions or
output. Like many other programming languages, such as Ruby, PHP, C++, and Java, Python is a high-level
programming language (https://computersciencewiki.org/index.php/Higher_level_and_lower_level_languages),
which makes it easy to learn and use.

There are plenty of resources for learning basic Python, and we recommend using them if you are new
to Python or programming in general. Here (https://wiki.python.org/moin/BeginnersGuide/NonProgrammers),
you can find a comprehensive list of books and courses for beginners. For example, the books "Automate the
Boring Stuff with Python" (https://automatetheboringstuff.com) by Al Sweigart and "Think Python: How to
Think Like a Computer Scientist" (http://greenteapress.com/thinkpython/html/index.html) by Allen B. Downey
are freely available online and are excellent places to start. You do not need to read the whole books, as the
chapters about data types, indexing, loops, and functions will provide sufficient knowledge for this course. If
you prefer more interactive learning, you can find short tutorials and Python exercises that you can run
in-browser here (https://www.w3schools.com/python/python_intro.asp), amongst other places.

https://jupyter.cs.aalto.fi/user/rojasa3/notebooks/notebooks/mlpython2021b/R0_Intro/Round_0.ipynb 1/35

Jupyter Notebooks

Jupyter Notebook is an interactive environment for running Python code in the browser. You can run notebooks
locally on your computer (provided Python and Jupyter Notebook are installed), but we will be using Jupyter Hub
in this course. If you are reading this, you have probably logged in to Jupyter Hub and fetched a
notebook successfully. After completing the notebook exercises, you will need to submit the latest notebook version.

A Jupyter notebook consists of blocks/cells containing text (markdown) or code (Python in our case). Below
you can see an example for both types of cells:

<<<< This is markdown cell. >>>>

In [1]:

# This is a code cell


# Lines which start with '#' are comments and they are ignored during code run

print("Hello world!")

Hello world!

To insert or delete a cell, go to the 'Edit' tab or use keyboard shortcuts.

To run cells, go to the 'Cell' tab or use keyboard shortcuts.

You can find a more elaborate introduction to Jupyter notebooks here (https://realpython.com/jupyter-notebook-introduction/).

Printing output

In Python, you can print output by using the print() function:

In [1]:

# Assign value 42 to variable myvar


myvar = 42

# Display output
print("The answer is =", myvar)
print(f"The answer is = {myvar+2}")
print("The answer is = {}".format(myvar*0.5))

The answer is = 42

The answer is = 44

The answer is = 21.0

Basic data types in Python

In [2]:

# Numeric: integers
myint = 42
print(myint)

# Numeric: floating-point numbers


myfloat = 42.5
print(myfloat)

# Boolean
mybool = 40+2 == 42
print("The statement 40+2 equals 42 is", mybool)

# Strings
mystr = "forty two"
print(mystr)

# Lists
mylist = [1, 2, "cat", 0.5, False]
print(mylist)

# Print out data types of variables


print(type(myint), type(myfloat), type(mybool), type(mystr), type(mylist))

42

42.5

The statement 40+2 equals 42 is True

forty two

[1, 2, 'cat', 0.5, False]

<class 'int'> <class 'float'> <class 'bool'> <class 'str'> <class 'list'>

Creating a sequence of integers

You can create a sequence of integers using the built-in functions range(start, stop[, step]) and
list() . See https://docs.python.org/3/library/stdtypes.html#range for more information. The function range()
creates the sequence [start,start+step,start+2*step,...], which ends before it reaches stop. If the argument
step is omitted, it defaults to 1. If the start argument is omitted, it defaults to 0.

Caution!
In range(stop) the sequence starts from 0 and does not include the stop value.
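
The defaults and the excluded stop value described above can be sketched as follows. This is an illustrative addition; the variable names are chosen here and are not part of the course material.

```python
# range(start, stop, step): the stop value is never part of the sequence
evens = list(range(0, 10, 2))        # start=0, stop=10, step=2
from_three = list(range(3, 8))       # step is omitted, so it defaults to 1
first_five = list(range(5))          # start is omitted, so it defaults to 0
countdown = list(range(10, 0, -3))   # a negative step counts downwards

print(evens)       # [0, 2, 4, 6, 8]
print(from_three)  # [3, 4, 5, 6, 7]
print(first_five)  # [0, 1, 2, 3, 4]
print(countdown)   # [10, 7, 4, 1]
```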

Below is a blue markdown cell with an example of a "Demo" coding exercise. It explains the task and is followed
by a code cell that implements it. "Demo" exercises also help with the "Student task" exercises, which
are in yellow.

For student tasks you need to fill out the part after the ### STUDENT TASK ### expression. Often the variable
names are already provided, for example:

### STUDENT TASK ###

# Create lists

# list1 = ...

# list2 = ...

In this case # Create lists is a comment to clarify the task, and # list1 = ... and # list2 = ...
are the lines you need to first uncomment (remove # ) and then complete. In addition, you need to
remove the raise NotImplementedError() line.

You will also see "Sanity check" cells after the student tasks. These cells are used to catch really obvious
mistakes, such as returning a string instead of a float, or a list with the wrong number of elements (length). If
your answer passes these tests, "Sanity checks passed!" will be printed out.

Caution!
Passing sanity checks does NOT mean that the task is solved correctly. You will
know whether the student tasks were solved correctly only after the deadline.
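
To illustrate why passing a sanity check is not the same as being correct, here is a minimal sketch of what such a check might look like. The function sanity_check and both answers are hypothetical and are not taken from the course's actual test cells.

```python
# Hypothetical sanity check: it only tests the data type and the length
def sanity_check(candidate):
    return isinstance(candidate, list) and len(candidate) == 3

# Assume the task asked for the list [0.5, 1.5, 2.5]
correct_answer = [0.5, 1.5, 2.5]
wrong_answer = [0.0, 0.0, 0.0]     # right type and length, wrong values

correct_passes = sanity_check(correct_answer)
wrong_passes = sanity_check(wrong_answer)

if correct_passes:
    print("Sanity checks passed!")

# The wrong answer passes as well, so the check proves very little
print(wrong_passes)
```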

Demo. Create a sequence of integers.


Here you can see an example of creating the sequence of integers from 0 to 10 with the range() function.
Note that if only one number is passed as input, as in range(int) , the function creates a sequence that starts
at 0, ends at int (not included) and has step size 1.

In [3]:

# create a sequence (list) 0,1,...,10


mylist = list(range(11))

# print mylist variable and its data type


print("mylist=", mylist, "data type =", type(mylist))

mylist= [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] data type = <class 'list'>

Student task. Create sequences of integers.


Your task is to create:

1. list list1 which stores a sequence of integers from 1 to 10 (including 10) with step size=1.
2. list list2 which stores a sequence of integers from 0 to 10 (including 10) with step size=2.

Lists should be created using the list() and range() functions.

In [23]:

list1 = list(range(1, 11))


list2 = list(range(0, 11, 2))

print("list1 = ", list1,"\n" ,"list2 = ",list2)

list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

list2 = [0, 2, 4, 6, 8, 10]

In [ ]:

# %load solutions/student-task-1.py
list1 = list(range(1,11))
list2 = list(range(0,11,2))

Iterations

In [39]:

# create a sequence consisting of four words


some_sequence = ["hi","how","are","you"]

# loop over the sequence of elements


for word in some_sequence:
    print(word)

hi

how

are

you

One of the main uses of range() is to create loops that iterate over a sequence of values.

Caution!
Indexing in Python starts by default at 0 (and not at 1!)

In [37]:

# create a sequence consisting of four words


some_sequence = ["hi","how","are","you"]
# find the length of the list
length = len(some_sequence)

# loop over the sequence of indices (0,1,2,3)


for i in range(length):
    print("index: {} value: {}".format(i, some_sequence[i]))

index: 0 value: hi

index: 1 value: how

index: 2 value: are

index: 3 value: you

In [26]:

# Nested for-loops

# create a list
mylist = [[1,2,3],[4,5,6],[7,8,9]]

# outer loop
for i in range(len(mylist)):
    print("\nouter loop, iteration: {} values: {}\n ".format(i, mylist[i]))

    # inner loop
    for j in range(len(mylist[0])):
        print("inner loop, iteration: {} value: {} ".format(j, mylist[i][j]))

outer loop, iteration: 0 values: [1, 2, 3]

inner loop, iteration: 0 value: 1

inner loop, iteration: 1 value: 2

inner loop, iteration: 2 value: 3

outer loop, iteration: 1 values: [4, 5, 6]

inner loop, iteration: 0 value: 4

inner loop, iteration: 1 value: 5

inner loop, iteration: 2 value: 6

outer loop, iteration: 2 values: [7, 8, 9]

inner loop, iteration: 0 value: 7

inner loop, iteration: 1 value: 8

inner loop, iteration: 2 value: 9

In [27]:

# Iterating with enumerate() Python function


# It takes an iterable object as input and yields tuples of the form (index, element)

# create a list
some_sequence = ["hi","how","are","you"]

# loop over elements of a list


for index, value in enumerate(some_sequence):
    print("index: {} value: {}".format(index, value))

index: 0 value: hi

index: 1 value: how

index: 2 value: are

index: 3 value: you

If you need to iterate over two sequences of the same size, you can use the built-in function zip() .

In [28]:

# Iterating multiple lists with zip()

# create lists
some_sequence = ["one","two","three","four"]
another_sequence = ["eins","zwei","drei","vier"]

# loop over two lists at the same time


for val1, val2 in zip(some_sequence, another_sequence):
    print(val1, val2)

one eins

two zwei

three drei

four vier

In [29]:

# Iterating multiple lists with zip() and enumerate()

# create lists
some_sequence = ["one","two","three","four"]
another_sequence = ["eins","zwei","drei","vier"]

# loop over two lists at the same time


for ind, (val1, val2) in enumerate(zip(some_sequence, another_sequence)):
    print("index: {} \nvalue mylist1: {}, value mylist2: {}".format(ind, val1, val2))

index: 0

value mylist1: one, value mylist2: eins

index: 1

value mylist1: two, value mylist2: zwei

index: 2

value mylist1: three, value mylist2: drei

index: 3

value mylist1: four, value mylist2: vier

Student task. Loops.


Write a Python program to count the number of even and odd numbers in a list of numbers. Store the
results in the variables odd_count and even_count .
Hints: (1) use for-loops and if-else (https://www.w3schools.com/python/python_conditions.asp)
statements, (2) the operator % is the modulo operator
(https://www.freecodecamp.org/news/the-python-modulo-operator-what-does-the-symbol-mean-in-python-solved/)
in Python. It is used to get the remainder of a division.
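
As a brief illustration of the second hint, the sketch below shows how the % operator behaves on a few integers. The variable names are chosen here for illustration only.

```python
# a % b gives the remainder left over after dividing a by b
remainder_odd = 7 % 2     # 7 = 3*2 + 1, so the remainder is 1
remainder_even = 10 % 2   # 10 = 5*2 + 0, so the remainder is 0

# a number is even exactly when its remainder modulo 2 is zero
n = 9
n_is_even = (n % 2 == 0)

print(remainder_odd, remainder_even, n_is_even)   # 1 0 False
```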

In [31]:

# create a sequence of numbers


numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# initialize variables
odd_count = 0
even_count = 0

### STUDENT TASK ###


#
#
# remove the line raise NotImplementedError() before testing your solution and submitting
for num in numbers:
    if not num%2:
        even_count+=1
    else:
        odd_count+=1

print("Number of even numbers :", even_count)


print("Number of odd numbers :", odd_count)

Number of even numbers : 4

Number of odd numbers : 5

In [ ]:

# %load solutions/student-task-2.py
for num in numbers:
    if not num%2:
        even_count+=1
    else:
        odd_count+=1

User-Defined Functions

As in other programming languages, users can define their own functions in Python. The basic syntax of a
Python function consists of the def and return statements.

The code snippet below shows how to define a function multiply() which reads in two arguments x and
y . This function computes the product of the arguments and returns it.

In [2]:

# define a function
def multiply(x,y):
    '''
    this function takes input x and y
    and returns multiplication of x and y

    '''

    # perform computation
    out = x*y

    return out

# apply the function


y = multiply(2,3)

# print the result


print(y)
# print the data type of the result
print(type(y))

6

<class 'int'>

Student task. Power of two function.


Your task is to write a Python function power_of_two() , which takes an integer as input and returns a list
out of powers of two, e.g. for power_of_two(3) the variable out should be the list out=[2,4,8] .
The length of the list should be equal to the input integer.

In [3]:

### STUDENT TASK ###


def power_of_two(n):
    out = [ 2**i for i in range(1,n+1)]
    return out
# remove the line raise NotImplementedError() before testing your solution and submitting

out = power_of_two(5)
print(out)

[2, 4, 8, 16, 32]

In [ ]:

# %load solutions/student-task-3.py
def power_of_two(n):
    out = [ 2**i for i in range(1,n+1) ]
    return out

Python Libraries

Python programs can import functions from libraries or so-called packages. Some of the most commonly used
Python libraries are:

NumPy - (Numerical Python) for operations involving arrays of numbers. One-dimensional NumPy arrays are
used to represent Euclidean vectors. Two-dimensional NumPy arrays can represent matrices, and
higher-dimensional arrays represent tensors.

https://numpy.org/

Pandas - A library for loading, analyzing, and manipulating data.

https://pandas.pydata.org/docs/

Matplotlib - A library for data visualization containing many useful tools, e.g., for plotting time series or images.

https://matplotlib.org/3.1.1/contents.html

Scikit-learn - A library containing implementations of several traditional machine learning methods, such as
linear regression, decision trees, and clustering methods.

https://scikit-learn.org/stable/

How To Use A Python Library

In order to use functions/classes provided by a library, it must first be imported via the command

import <library name> as <short name> .

For example, the statement

import numpy as np

imports the main numpy module under the name np .

Missing imports of libraries are a common cause of the error message

NameError: name '<short name>' is not defined

For example, the error message

NameError: name 'np' is not defined

arises if a function of the library with the short name "np" is used before the library has been imported.
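
The error can be reproduced and fixed in a few lines. The sketch below uses the standard-library module math as a stand-in for numpy, and the alias undefined_alias is a made-up name that is deliberately never imported.

```python
# Using an alias that was never imported raises a NameError
try:
    result = undefined_alias.sqrt(4)   # hypothetical alias, not imported anywhere
except NameError as err:
    message = str(err)
    print("NameError:", message)

# Importing the library under a short name first makes the call work
import math as m

root = m.sqrt(4)
print(root)   # 2.0
```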

Pandas Data Frames

The library Pandas provides the class (object type) DataFrame . A DataFrame is a two-dimensional (with
rows and columns) tabular structure. DataFrames are convenient for storing and manipulating heterogeneous
data, such as mixtures of numeric and text data.

In [6]:

# import 'pandas' library


import pandas as pd

# create dictionary
mydict = {'animal':['cat', 'dog','mouse','rat', 'cat'],
'name':['Fluffy','Chewy','Squeaky','Spotty', 'Diablo'],
'age, years': [3,5,0.5,1,8]}

# create dataframe from dictionary


df = pd.DataFrame(mydict, index=['id1','id2','id3','id4','id5'])
print (df)

    animal     name  age, years
id1    cat   Fluffy         3.0
id2    dog    Chewy         5.0
id3  mouse  Squeaky         0.5
id4    rat   Spotty         1.0
id5    cat   Diablo         8.0

In [7]:

# Accessing DataFrame elements

# access row by name with .loc


print(df.loc['id1'])

# access row by index with .iloc


print('\n', df.iloc[0])

animal cat

name Fluffy

age, years 3

Name: id1, dtype: object

animal cat

name Fluffy

age, years 3

Name: id1, dtype: object

In [8]:

# access column by name with .loc


print(df.loc[:,'animal'])

# access column by name without .loc


print('\n', df['animal'])

# access column by index with .iloc


print('\n', df.iloc[:,0])

id1 cat

id2 dog

id3 mouse

id4 rat

id5 cat

Name: animal, dtype: object

id1 cat

id2 dog

id3 mouse

id4 rat

id5 cat

Name: animal, dtype: object

id1 cat

id2 dog

id3 mouse

id4 rat

id5 cat

Name: animal, dtype: object

In [11]:

# access specific row and columns by name with .loc


print(df.loc['id1',['animal','name']])

# access specific row and columns by index with .iloc


print('\n', df.iloc[0,[0,1]])

animal cat

name Fluffy

Name: id1, dtype: object

animal cat

name Fluffy

Name: id1, dtype: object

In [12]:

# select subset of data with boolean indexing

# select pets with age <= 4


print(df[df["age, years"]<=4])
print("\n") # print empty line

# select only cats


print(df[df["animal"]=="cat"])
print("\n") # print empty line

# select cats and dogs by using "|" operator (equivalent to `OR` operator)
print(df[(df["animal"]=="cat") | (df["animal"]=="dog")])

    animal     name  age, years
id1    cat   Fluffy         3.0
id3  mouse  Squeaky         0.5
id4    rat   Spotty         1.0


    animal    name  age, years
id1    cat  Fluffy         3.0
id5    cat  Diablo         8.0


    animal    name  age, years
id1    cat  Fluffy         3.0
id2    dog   Chewy         5.0
id5    cat  Diablo         8.0

In [18]:

# loading from .csv file by using pandas DataFrame structure


df = pd.read_csv('../../../coursedata/R0_Intro/Data.csv')

# check the shape of the dataframe


print("Shape of the dataframe: ",df.shape)
print("Number of dataframe rows: ",df.shape[0])
print("Number of dataframe columns: ",df.shape[1])

# print first 5 rows


df.head()

Shape of the dataframe: (600, 2)

Number of dataframe rows: 600

Number of dataframe columns: 2

Out[18]:

          0         1
0  0.471435 -1.190976
1  1.432707 -0.312652
2 -0.720589  0.887163
3  0.859588 -0.636524
4  0.015696 -2.242685

In [16]:

# Convert dataframe to numpy array

# DataFrame.values returns a NumPy representation of the DataFrame.


X = df.values
X

Out[16]:

array([[ 0.47143516, -1.19097569],

[ 1.43270697, -0.3126519 ],

[-0.72058873, 0.88716294],

...,
[ 3.16009399, 3.83897138],

[ 3.28939313, 3.68964166],

[ 3.39549918, 4.36393359]])

With pd.read_<format name> it is also possible to read Excel, JSON, HTML, SQL and many other types of
files:

https://pandas.pydata.org/pandas-docs/stable/reference/io.html
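
As a small illustration of the pd.read_<format name> pattern, the sketch below reads the same two records once as CSV and once as JSON. The data is made up for this example and is wrapped in io.StringIO so that no files are needed.

```python
import io
import pandas as pd

# Made-up example data in two text formats
csv_text = "animal,age\ncat,3\ndog,5\n"
json_text = '[{"animal": "cat", "age": 3}, {"animal": "dog", "age": 5}]'

# Each reader returns a DataFrame, whatever the source format was
df_csv = pd.read_csv(io.StringIO(csv_text))
df_json = pd.read_json(io.StringIO(json_text))

print(df_csv.shape)    # (2, 2)
print(df_json.shape)   # (2, 2)
```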

Plotting with Matplotlib

Matplotlib is a library that provides plotting functionality for Python. Good introductory tutorials for Matplotlib
can be found at https://matplotlib.org/tutorials/index.html.
A useful command for creating a plot in Python is

fig, axes = plt.subplots()

plt.subplots() returns a figure and axes (an Axes object or an array of Axes objects).

In [19]:

# Plotting line and scatter plot

# the library "pyplot" provides functions for plotting data


import matplotlib.pyplot as plt
# numpy provides the arrays and random numbers used below
import numpy as np

# create data to plot (numpy arrays)


x1 = np.linspace(10,100,50)
y1 = x1**2

# set random state for reproducibility


np.random.seed(42)

# generate 100 realizations of a random variable uniformly distributed on [0, 1)


x2 = np.random.rand(100,)
y2 = np.random.rand(100,)

# create figure and axes objects


fig, axes = plt.subplots(1,2)
# plot a line in 1st subplot
axes[0].plot(x1,y1,c='r')
# plot scatter in 2nd subplot
axes[1].scatter(x2,y2)

# set axes labels for 1st subplot


axes[0].set_xlabel("x1")
axes[0].set_ylabel("y1")
# set axes labels for 2nd subplot
axes[1].set_xlabel("x2")
axes[1].set_ylabel("y2")
# set titles
axes[0].set_title('plot 1')
axes[1].set_title('plot 2')

# adjust subplots so the labels of different axes are not overlapping


fig.tight_layout()
# display plot
plt.show()

In [20]:

# Plotting 2D plot with meshgrid

# create numpy arrays


x = np.arange(-5, 5, 1)
y = np.arange(-5, 5, 1)

# create the grid


xx, yy = np.meshgrid(x, y)

# plot the grid


plt.plot(xx,yy,marker='.', color='k', linestyle='none')

# set axes labels


plt.xlabel("x")
plt.ylabel("y")
# set title
plt.title('xy grid', fontweight='bold')

# display the plot


plt.show()

Numpy Arrays

The Python library numpy provides implementations of many matrix operations as well as other useful
features, such as random number generators. Many functions of this library are based on the data type "numpy
array". A numpy array is an object that stores $N$-dimensional arrays of numbers, where $N$ is the number of
dimensions. The shape of a numpy array is given by a sequence of $N$ integers that indicate the number of
"elements" in each dimension. Maybe the most important special case of numpy arrays is when $N=1$,
corresponding to vectors, or when $N=2$ for matrices. A matrix with 5 rows and 2 columns is represented by
a numpy array of shape (5,2).
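
The relation between the number of dimensions and the shape can be checked directly. This is an illustrative sketch; the array names are chosen here and do not appear elsewhere in the notebook.

```python
import numpy as np

vector = np.zeros(5)           # N = 1: a vector with 5 entries
matrix = np.zeros((5, 2))      # N = 2: a matrix with 5 rows and 2 columns
tensor = np.zeros((5, 2, 3))   # N = 3: a higher-dimensional array

# .ndim is the number of dimensions N, .shape holds N integers
for name, arr in [("vector", vector), ("matrix", matrix), ("tensor", tensor)]:
    print(name, "ndim =", arr.ndim, "shape =", arr.shape)
```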
Some additional resources to learn more about numpy arrays and related operations can be found here:

NumPy library page - https://numpy.org/

Learn NumPy in 5 minutes video - https://www.youtube.com/watch?v=xECXZ3tyONo

Using NumPy arrays allows for vectorized computation, which in turn allows faster code execution:

https://www.pythonlikeyoumeanit.com/Module3_IntroducingNumpy/VectorizedOperations.html

https://www.oreilly.com/library/view/python-for-data/9781449323592/ch04.html
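
To make "vectorized computation" concrete, the sketch below squares the same numbers once with an explicit Python loop and once with a single NumPy expression. It only compares the results and makes no timing claim; the variable names are chosen here for illustration.

```python
import numpy as np

values = np.arange(1000)

# Loop version: Python visits one element at a time
squares_loop = [int(v) ** 2 for v in values]

# Vectorized version: one expression applied to the whole array
squares_vectorized = values ** 2

# Both approaches give the same numbers
same_result = np.array_equal(squares_vectorized, np.array(squares_loop))
print(same_result)   # True
```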

Different ways to create NumPy Arrays

In [22]:

# convert a sequence 0,1,..,9 to a numpy array `myarray1`


mylist = [0,1,2,3,4,5,6,7,8,9]
myarray1 = np.array(mylist)

# use range() to create a numpy array


myarray2 = np.array(range(10))

# use np.arange() function to create a numpy array


myarray3 = np.arange(10)

# print values of the arrays


myarray1, myarray2, myarray3

Out[22]:

(array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]))

In [23]:

# create an array (4 rows, 3 columns) with zeros


zeroarray = np.zeros((4,3))

# create an array (4 rows, 3 columns) with ones


onesarray = np.ones((4,3))

print(zeroarray,'\n')
print(onesarray)

[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]

[[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]
[1. 1. 1.]]

In [1]:

# Pass lists directly to create 2D array


myarray = np.array([[1,2,3], [4,5,6]])

# Check the array dimensions with .shape attribute (rows, columns)


print("Number of rows: {} \nNumber of columns: {}".format(myarray.shape[0], myarray.shape[1]))
print(myarray)
print (myarray.shape)

Number of rows: 2

Number of columns: 3

[[1 2 3]

[4 5 6]]

(2, 3)

Caution!
A numpy array of shape (n,1) is different from a numpy array of shape (n,)!

In [7]:

# Note! Array of shape (n,1) is not equal to the array of shape (n,)
# Use .shape attribute to check the array's dimensions
# Use .reshape() function to get the array with desired dimensions

myarray1 = np.array(range(10))
myarray2 = np.array(range(10)).reshape(-1,1)

print("`myarray1` is", myarray1.ndim,"- dimensional np.array of shape", myarray1.shape)
print("`myarray2` is", myarray2.ndim,"- dimensional np.array of shape", myarray2.shape)
print(myarray1,myarray2)

`myarray1` is 1 - dimensional np.array of shape (10,)

`myarray2` is 2 - dimensional np.array of shape (10, 1)

[0 1 2 3 4 5 6 7 8 9] [[0]

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]]

Slicing and Combining NumPy Arrays

In [12]:

# Access element of the array by index


# Note! Indexing starts with 0

# 1D array
myarray = np.arange(10,0,-1)
print(myarray)
print("First element of the array: {}\n".format(myarray[0]))

# 2D array
myarray = np.array([[1,2,3],[4,5,6]])
print(myarray)
print("2nd row, 3rd column element of the array: {}\n".format(myarray[1,2]))

# Conditional indexing - print values of the array larger than 2


myarray = np.array([[1,2,3],[4,5,6]])
print(myarray)
print("Values >2: {}\n".format(myarray[myarray>2]))

[10 9 8 7 6 5 4 3 2 1]

First element of the array: 10

[[1 2 3]

[4 5 6]]

2nd row, 3rd column element of the array: 6

[[1 2 3]

[4 5 6]]

Values >2: [3 4 5 6]

In [15]:

# Slicing numpy array


# create numpy array with shape=(4,5)
myarray = np.arange(20).reshape(4,5)

# print the values of the array


print('\n',myarray, " array shape is ", myarray.shape)
# print the values located in the first two rows (indices 0,1) and in columns 2,3 (indices 1,2)
print("\nSliced array:\n", myarray[:2,1:3])

[[ 0 1 2 3 4]

[ 5 6 7 8 9]

[10 11 12 13 14]

[15 16 17 18 19]] array shape is (4, 5)

Sliced array:

[[1 2]

[6 7]]

Some more examples of numpy array slicing
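
A few further slicing patterns, sketched on the same kind of (4,5) array as in the cell above. This is an illustrative addition, and the variable names are chosen here.

```python
import numpy as np

myarray = np.arange(20).reshape(4, 5)

first_row = myarray[0]            # a single index selects a whole row
first_column = myarray[:, 0]      # ':' keeps all rows, 0 picks the first column
last_row_tail = myarray[-1, -2:]  # negative indices count from the end
every_second = myarray[::2, ::2]  # a step of 2 along both dimensions

print(first_row)       # [0 1 2 3 4]
print(first_column)    # [ 0  5 10 15]
print(last_row_tail)   # [18 19]
print(every_second)
```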

Stacking NumPy Arrays

In [16]:

# Stack arrays vertically (row wise)


myarray = np.zeros((2,5))
print(np.vstack([myarray, myarray+2]),'\n')

# Stack arrays horizontally (column wise)


myarray = np.zeros((2,5))
print(np.hstack([myarray, myarray+2]))

[[0. 0. 0. 0. 0.]

[0. 0. 0. 0. 0.]

[2. 2. 2. 2. 2.]

[2. 2. 2. 2. 2.]]

[[0. 0. 0. 0. 0. 2. 2. 2. 2. 2.]

[0. 0. 0. 0. 0. 2. 2. 2. 2. 2.]]

Viewing and Copying NumPy Arrays


Consider a numpy array a of shape (5,1). Assume you create a slice b which consists of the first two
elements of a : b=a[:2] . It is then important to be aware that the variable b is merely a pointer (or
reference) to the first two entries of a . Thus, when you modify the slice by b[0] = 10 , you will
simultaneously modify the first entry of a . If you want the slice to become a new object you need to copy the
slice using the function copy() .

Caution!
Modification of an array slice will modify the original array

In [17]:

# Slicing creates a view of the array, and any modification of the view will update the original array

# create the array


myarray = np.arange(10)
# print values of the original array
print("Original array: ", myarray)

# assign the slice (view of the array) to a new variable 'myslice'


myslice = myarray[5:]
# print values of variable 'myslice'
print("\nSlice of the array: ", myslice)

# modify variable 'myslice' - assign value zero to all entries of the array
myslice[:] = 0

# print values of the original array and modified variable 'myslice'


print("\nModified slice of the array: ", myslice)
print("\nOriginal array: ", myarray)

Original array: [0 1 2 3 4 5 6 7 8 9]

Slice of the array: [5 6 7 8 9]

Modified slice of the array: [0 0 0 0 0]

Original array: [0 1 2 3 4 0 0 0 0 0]

In [18]:

# Copying array, creates a different object, original array is not modified.

# create the array


myarray = np.arange(10)
# print values of the original array
print("Original array: ", myarray)

# assign the slice (copy of the array) to a new variable 'myslice'


myslice = np.copy(myarray[5:])
# print values of variable 'myslice'
print("\nCopy of the array: ", myslice)

# modify variable 'myslice'


myslice[:] = 0

# print values of the original array and modified variable 'myslice'


print("\nModified copy of the array: ", myslice)
print("\nOriginal array: ", myarray)

Original array: [0 1 2 3 4 5 6 7 8 9]

Copy of the array: [5 6 7 8 9]

Modified copy of the array: [0 0 0 0 0]

Original array: [0 1 2 3 4 5 6 7 8 9]

You can find further reading about views and copies of NumPy arrays here:

https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html
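One way to check whether a variable is a view or a copy is to ask NumPy whether two arrays share memory. The sketch below uses `np.shares_memory` and the `.base` attribute; the variable names `view_slice` and `copy_slice` are illustrative and not part of the course material.

```python
import numpy as np

myarray = np.arange(10)

# basic slicing returns a view that shares memory with the original array
view_slice = myarray[5:]
# np.copy allocates a new, independent array
copy_slice = np.copy(myarray[5:])

print(np.shares_memory(myarray, view_slice))  # True
print(np.shares_memory(myarray, copy_slice))  # False

# a view keeps a reference to the array it was sliced from in .base
print(view_slice.base is myarray)  # True
```

If the first check prints True, any in-place modification of the slice will also show up in the original array.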

Operations on NumPy Arrays

In [19]:

# create two numpy arrays

x = np.arange(10)
y = np.arange(20,30)
print(x, y)

# elementwise addition and subtraction

print(x + y)
print(x - y)

# elementwise multiplication and division

print(x * y)
print(x / y)

# elementwise power

print(x**2)

[0 1 2 3 4 5 6 7 8 9] [20 21 22 23 24 25 26 27 28 29]

[20 22 24 26 28 30 32 34 36 38]

[-20 -20 -20 -20 -20 -20 -20 -20 -20 -20]

[ 0 21 44 69 96 125 156 189 224 261]

[0. 0.04761905 0.09090909 0.13043478 0.16666667 0.2

0.23076923 0.25925926 0.28571429 0.31034483]

[ 0 1 4 9 16 25 36 49 64 81]


In [20]:

# create numpy array


x = np.arange(10,0,-1)

# useful numpy array functions:


# sum of elements
x_sum = x.sum()

# maximum and minimum values


x_max = x.max()
x_min = x.min()

# indices of maximum and minimum values


x_indmax = x.argmax()
x_indmin = x.argmin()

print(x)
print("\nSum of the array: ", x_sum)
print("\nMaximum and minimum values: {}, {} \nIndices of maximum and minimum values: {}, {}".format(
    x_max, x_min, x_indmax, x_indmin))

[10 9 8 7 6 5 4 3 2 1]

Sum of the array: 55

Maximum and minimum values: 10, 1

Indices of maximum and minimum values: 0, 9

Broadcasting

Sometimes we need to add the same constant value to all entries of a numpy array. Consider a numpy array
a of arbitrary size and a numpy array b containing a single number. We would like to be able to write a+b
to get a numpy array whose entries are given by adding the value in b to all entries in a . The concept of
"broadcasting" for numpy arrays makes this possible!

Find more information here:

https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html

https://numpy.org/devdocs/user/theory.broadcasting.html
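The single-number case described above can be sketched as follows. The arrays `a` and `b` match the names used in the text, and the concrete values are made up for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # array of shape (2, 3)
b = np.array([10])               # array containing a single number

# b is "stretched" to the shape of a, so 10 is added to every entry of a
result = a + b
print(result)

# adding the plain Python scalar 10 gives the same result
print(np.array_equal(result, a + 10))
```

No copy of `b` with the shape of `a` needs to be created by hand; NumPy handles the stretching internally.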


In [21]:

# It is possible to do operations with different size arrays - broadcasting


# create two numpy arrays
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10,11,12]])
y = np.ones((1,3))

# display the values of two arrays


print("x = \n", x)
print("\n", "y = ", y)

# print the result of arrays addition


print("\n\n x+y = \n", x+y)

x =

[[ 1 2 3]

[ 4 5 6]
[ 7 8 9]
[10 11 12]]

y = [[1. 1. 1.]]

x+y =

[[ 2. 3. 4.]

[ 5. 6. 7.]

[ 8. 9. 10.]

[11. 12. 13.]]

Student task. Numpy Arrays


1. Create numpy array x1 of shape (3,4) with np.arange(12).
2. Store the values of the first two columns and all rows in numpy array x2.
3. Multiply x2 by the scalar 5.
4. Append two columns of zeros as the last two columns of x2.
5. Add up x1 and x2 and store the result in x3.


In [22]:

import numpy as np
### STUDENT TASK ###
x1 = np.arange(12).reshape(3,4)
x2 = np.copy(x1[:,:2])
x2 = x2*5   # store the scaled values back in x2 (step 3)
x2 = np.hstack([x2, np.zeros((x2.shape[0],2))])
x3 = x1+x2
# remove the line raise NotImplementedError() before testing your solution and submi
print(x1)
print(x2)
print(x3)

[[ 0 1 2 3]

[ 4 5 6 7]

[ 8 9 10 11]]

[[ 0. 5. 0. 0.]

[20. 25. 0. 0.]

[40. 45. 0. 0.]]

[[ 0. 6. 2. 3.]

[24. 30. 6. 7.]

[48. 54. 10. 11.]]

In [ ]:

# %load solutions/student-task-4.py
x1 = np.arange(12).reshape(3,4)
x2 = np.copy(x1[:,:2])
x2 = x2*5
x2 = np.hstack([x2, np.zeros((x2.shape[0],2))])
x3 = x1+x2

Vectors and Matrices

It is often useful to represent data in a numerical format as vectors or matrices. For example, suppose we have
collected weather data (daily minimum, maximum, and average temperatures) for many days. In that case, we
can represent the observations for one day as a vector (a NumPy array in Python code) and stack all
observations in a matrix. Each row of this matrix contains the weather observations for one day, and each
column contains the minimum, maximum, or average temperatures across all days.
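The weather example can be sketched in code as follows. The temperature values and the names `day1` to `day4` and `weather` are hypothetical and only serve as an illustration.

```python
import numpy as np

# hypothetical observations (min, max, average temperature) for four days
day1 = np.array([-2.0, 5.0, 1.5])
day2 = np.array([0.0, 7.0, 3.0])
day3 = np.array([1.0, 9.0, 4.5])
day4 = np.array([-1.0, 6.0, 2.0])

# stack the day vectors as rows of a matrix
weather = np.vstack([day1, day2, day3, day4])
print(weather.shape)   # (4, 3): 4 days (rows), 3 measurements (columns)

print(weather[1])      # all observations for the second day (a row)
print(weather[:, 1])   # maximum temperatures across all days (a column)
```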

We will soon present the mathematical notation and the basic operations commonly used when working with
vectors and matrices. If the concepts seem difficult to grasp, you can start by watching the animated video
series "Essence of linear algebra" (https://www.youtube.com/watch?v=kjBOesZCoqc&list=PL0-GT3co4r2y2YErbmuJw2L5tW4Ew2O5B)
from 3Blue1Brown. For more detailed but still accessible explanations, you can check the book
"Mathematics for Machine Learning" (https://mml-book.github.io) by
M. P. Deisenroth, A. A. Faisal, and C. S. Ong (a PDF is available on the website).

Vectors
We denote vectors with lower-case bold letters, e.g. vector 𝐱 consisting of 𝑛 elements:

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

Traditionally, vectors are represented as column vectors (elements of the vector stacked vertically). For
convenience, a vector is sometimes written as $\mathbf{x} = (x_1, \ldots, x_n)^T$, i.e. as the transpose
(see below) of a row vector.

Below, you can see how to create a vector 𝐱 consisting of 𝑛 elements, where 𝑛 = 5, with the Python NumPy library.

In [25]:

# import Python library


import numpy as np

# create numpy array


x = np.array([1,2,3,4,5])
print(x)

[1 2 3 4 5]

The 𝑖-th entry of vector 𝐱 is denoted as $x_i$, e.g. the first element of vector 𝐱 is $x_1$.
Note! Indexing in Python starts from zero!

In [26]:

# print first element of vector x


print("First element of vector x =", x[0])

First element of vector x = 1

Dot Product of two vectors


The dot product (https://en.wikipedia.org/wiki/Dot_product) between two vectors, i.e., one-dimensional NumPy
arrays, of the same length is defined as

$$\mathbf{x}^T \mathbf{y} = (x_1, x_2, \ldots, x_m) \cdot \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} = x_1 y_1 + x_2 y_2 + \ldots + x_m y_m$$

Geometrically, it is the product of the Euclidean lengths (magnitudes) of the two vectors and the cosine of the angle
between them.

The dot product is also defined for NumPy arrays with more than one dimension (see the numpy documentation
(https://numpy.org/doc/stable/reference/generated/numpy.dot.html?highlight=dot#numpy.dot) for more info).
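The geometric interpretation can be checked numerically. The sketch below recovers the angle between two vectors from their dot product and their Euclidean lengths; the example vectors are chosen for illustration so that the angle is 45 degrees.

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

dot = x.dot(y)                # 1*1 + 0*1 = 1
len_x = np.linalg.norm(x)     # Euclidean length of x
len_y = np.linalg.norm(y)     # Euclidean length of y

# dot = len_x * len_y * cos(angle), so the cosine follows by division
cos_angle = dot / (len_x * len_y)
angle_deg = np.degrees(np.arccos(cos_angle))
print(angle_deg)              # approximately 45 degrees
```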


In [27]:

# create two numpy arrays


x = np.arange(3)
y = np.arange(3,6)

# display the values of two arrays


print(x,y)

# dot product 0*3+1*4+2*5


x.dot(y)

[0 1 2] [3 4 5]

Out[27]:

14

Outer product of two vectors


The outer product (https://en.wikipedia.org/wiki/Outer_product) between two vectors, i.e., one-dimensional
NumPy arrays, of the same length is defined as

$$\mathbf{x}\mathbf{y}^T = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix} \cdot (y_1, y_2, \ldots, y_m) = \begin{pmatrix} x_1 y_1 & x_1 y_2 & \ldots & x_1 y_m \\ x_2 y_1 & x_2 y_2 & \ldots & x_2 y_m \\ \vdots & \vdots & \ddots & \vdots \\ x_m y_1 & x_m y_2 & \ldots & x_m y_m \end{pmatrix}$$

As you can see, the result of the outer product is a matrix, whereas the result of the dot product is a scalar.

In [28]:

# create two numpy arrays


x = np.arange(3)
y = np.arange(3,6)

# display the values of two arrays


print(x,y)
# outer product
np.outer(x,y)

[0 1 2] [3 4 5]

Out[28]:

array([[ 0, 0, 0],

[ 3, 4, 5],

[ 6, 8, 10]])
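The definition of the outer product can also be checked entry by entry. The loop below is only a sketch to illustrate the formula (the name `manual` is illustrative); in practice `np.outer` is the simpler choice.

```python
import numpy as np

x = np.arange(3)
y = np.arange(3, 6)

# build the outer product entry by entry: entry (i, j) is x[i] * y[j]
manual = np.zeros((len(x), len(y)), dtype=int)
for i in range(len(x)):
    for j in range(len(y)):
        manual[i, j] = x[i] * y[j]

print(np.array_equal(manual, np.outer(x, y)))  # True
```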

Matrices

In many applications, it is natural to represent data as a matrix
(https://en.wikipedia.org/wiki/Matrix_(mathematics)), which can be represented as a two-dimensional NumPy
array.

We will discuss how to represent our data as a matrix for further analyses in the next round.

Matrices are denoted by bold capital letters, e.g. a matrix 𝐗 with 𝑚 rows and 𝑛 columns (an 𝑚 × 𝑛 matrix):

$$\mathbf{X} = \begin{pmatrix} x^{(1)}_1 & x^{(1)}_2 & \ldots & x^{(1)}_n \\ x^{(2)}_1 & x^{(2)}_2 & \ldots & x^{(2)}_n \\ \vdots & \vdots & \ddots & \vdots \\ x^{(m)}_1 & x^{(m)}_2 & \ldots & x^{(m)}_n \end{pmatrix}$$

Below, you can see how to create a matrix 𝐗 with 𝑚 = 3 and 𝑛 = 4, containing the range of numbers
0, 1, …, 11.

In [32]:

X = np.arange(12).reshape(3,4)
print(X)

[[ 0 1 2 3]

[ 4 5 6 7]

[ 8 9 10 11]]

Transpose of vectors and matrices


The transpose of a matrix and a vector are denoted 𝐗𝑇 or 𝐱𝑇 respectively.
In [33]:

print(X.T,"\n") # "\n" prints empty line


print(x.T)

[[ 0 4 8]
[ 1 5 9]
[ 2 6 10]
[ 3 7 11]]

[0 1 2]
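Note that `x.T` printed the same one-dimensional array as `x`: transposing a one-dimensional NumPy array has no effect. A minimal sketch of how to obtain explicit column and row vectors, assuming a reshape into two dimensions is acceptable (the names `col` and `row` are illustrative):

```python
import numpy as np

x = np.arange(3)          # one-dimensional array, shape (3,)
print(x.T.shape)          # (3,): transposing a 1-D array changes nothing

col = x.reshape(-1, 1)    # explicit column vector, shape (3, 1)
row = col.T               # its transpose is a row vector, shape (1, 3)
print(col.shape, row.shape)
```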

Dimension (vector space)


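Often you will see notations such as $\mathbf{x} \in \mathbb{R}^n$ or $\mathbf{X} \in \mathbb{R}^{m \times n}$, where ℝ denotes the real numbers
(https://en.wikipedia.org/wiki/Real_number) and $\mathbb{R}^n$ is a coordinate space
(https://en.wikipedia.org/wiki/Real_coordinate_space) consisting of length-𝑛 lists of real numbers. For example,
$\mathbb{R}^2$ is a plane and $\mathbf{x} \in \mathbb{R}^2$ means that vector 𝐱 is in the vector space $\mathbb{R}^2$, i.e. in the plane.

A vector $\mathbf{x} \in \mathbb{R}^2$ consists of two elements and a matrix $\mathbf{X} \in \mathbb{R}^{3 \times 4}$ consists of 3 × 4 = 12 elements.

Generally, we say that the matrix 𝐗 with 𝑚 rows and 𝑛 columns belongs to $\mathbb{R}^{m \times n}$:

$$\mathbf{X} = \begin{pmatrix} x_{1,1} & x_{1,2} & \ldots & x_{1,n} \\ x_{2,1} & x_{2,2} & \ldots & x_{2,n} \\ \vdots & \vdots & \vdots & \vdots \\ x_{m,1} & x_{m,2} & \ldots & x_{m,n} \end{pmatrix} \in \mathbb{R}^{m \times n}$$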
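In NumPy, the dimensions in this notation correspond to the `shape` of an array. A small sketch with arrays of zeros used as placeholders:

```python
import numpy as np

x = np.zeros(2)          # a vector in R^2
X = np.zeros((3, 4))     # a matrix in R^(3x4)

print(x.shape, x.size)   # (2,) 2
print(X.shape, X.size)   # (3, 4) 12
print(X.ndim)            # 2, the number of array axes
```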
Matrix multiplication
Now we will use numpy arrays to create a matrix 𝐗 with 𝑚 rows and 𝑛 columns

$$\mathbf{X} = \begin{pmatrix} x_{1,1} & x_{1,2} & \ldots & x_{1,n} \\ x_{2,1} & x_{2,2} & \ldots & x_{2,n} \\ \vdots & \vdots & & \vdots \\ x_{m,1} & x_{m,2} & \ldots & x_{m,n} \end{pmatrix} \in \mathbb{R}^{m \times n},$$

a matrix 𝐘 with 𝑛 rows and 𝑚 columns

$$\mathbf{Y} = \begin{pmatrix} y_{1,1} & y_{1,2} & \ldots & y_{1,m} \\ y_{2,1} & y_{2,2} & \ldots & y_{2,m} \\ \vdots & \vdots & & \vdots \\ y_{n,1} & y_{n,2} & \ldots & y_{n,m} \end{pmatrix} \in \mathbb{R}^{n \times m},$$

and perform matrix multiplication to compute the product 𝐗𝐘.

Matrix multiplication (https://en.wikipedia.org/wiki/Matrix_multiplication) is a binary operation that produces a
matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to
the number of rows in the second matrix. The resulting matrix, known as the matrix product, has the number of
rows of the first and the number of columns of the second matrix.

In Python, matrix multiplication can be performed using NumPy with the @ operator, which is equivalent to the
function numpy.matmul() .
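To see what the @ operator computes, recall that entry (𝑖, 𝑗) of the product 𝐗𝐘 is the dot product of row 𝑖 of 𝐗 with column 𝑗 of 𝐘. The sketch below spells this out with explicit loops on small example matrices; the loop version and the name `manual` are for illustration only, and @ should be preferred in practice.

```python
import numpy as np

X = np.arange(6).reshape(2, 3)    # shape (2, 3)
Y = np.arange(12).reshape(3, 4)   # shape (3, 4)

# entry (i, j) of the product is the dot product of row i of X and column j of Y
manual = np.zeros((X.shape[0], Y.shape[1]), dtype=int)
for i in range(X.shape[0]):
    for j in range(Y.shape[1]):
        manual[i, j] = np.sum(X[i, :] * Y[:, j])

print(manual.shape)                    # (2, 4): rows of X, columns of Y
print(np.array_equal(manual, X @ Y))   # True
```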


In [35]:

# create an array of length m*n


m = 4
n = 3
array = np.arange(m*n)

# create matrix X represented as a numpy array of shape (m,n)


X = array.reshape(m,n)
dimension=np.shape(X) # determine dimensions of matrix X
rows = dimension[0] # first element of "dimension" is the number of rows
cols = dimension[1] # second element of "dimension" is the number of cols
print("the matrix X has", rows, "rows and", cols, "columns \n")

# create matrix Y represented as a numpy array of shape (n,m)


Y = array.reshape(n,m)
dimension=np.shape(Y) # determine dimensions of matrix Y
rows = dimension[0] # first element of "dimension" is the number of rows
cols = dimension[1] # second element of "dimension" is the number of cols
print("the matrix Y has", rows, "rows and", cols, "columns \n")

# matrix multiplication of X and Y


XY = X @ Y
# print the result of matrix multiplication
print("the product XY=X@Y is XY = \n", XY)
# print the shape of the XY matrix
print("\n the matrix XY has", XY.shape[0], "rows and", XY.shape[1], "columns \n")

the matrix X has 4 rows and 3 columns

the matrix Y has 3 rows and 4 columns

the product XY=X@Y is XY =

[[ 20 23 26 29]

[ 56 68 80 92]

[ 92 113 134 155]

[128 158 188 218]]

the matrix XY has 4 rows and 4 columns

Note! (1) For matrix multiplication, the number of columns in the first matrix must be equal to the number of
rows in the second matrix. The resulting matrix has the number of rows of the first and the number of columns of
the second matrix. (2) The order of matrix multiplication is important: in general, np.matmul(A,B) != np.matmul(B,A). (3) A*B is
element-wise multiplication in NumPy, not matrix multiplication.


In [36]:

# For matrix multiplication A.dot(B) or A@B can be used


print("\nMatrix multiplication X@Y:\n\n", X @ Y)

# Order is important in matrix multiplication - A@B != B@A


print("\nMatrix multiplication Y@X:\n\n", Y @ X)

# Square of the matrix element-wise


Z = np.arange(9).reshape(3,3)
print("\nMatrix Z:\n\n", Z)
print("\nSquare - element-wise Z*Z:\n\n", Z**2)

# Square of the matrix by matrix multiplication


print("\nSquare - matrix multiplication Z@Z:\n\n", Z @ Z)

Matrix multiplication X@Y:

[[ 20 23 26 29]

[ 56 68 80 92]

[ 92 113 134 155]

[128 158 188 218]]

Matrix multiplication Y@X:

[[ 42 48 54]

[114 136 158]

[186 224 262]]

Matrix Z:

[[0 1 2]

[3 4 5]

[6 7 8]]

Square - element-wise Z*Z:

[[ 0 1 4]

[ 9 16 25]
[36 49 64]]

Square - matrix multiplication Z@Z:

[[ 15 18 21]

[ 42 54 66]

[ 69 90 111]]

L1-norm of a vector or matrix


Norms of vectors are instrumental when we need to measure the similarity between data points represented as
vectors.

The L1-norm of a vector 𝐱 is defined as the sum of the absolute values of its elements, and is denoted by

$$\|\mathbf{x}\|_1 = |x_1| + \ldots + |x_n|$$

L2-norm of a vector or matrix


The L2-norm of a vector or matrix is defined as the square root of the sum of its squared components,
which corresponds to the intuitive notion of distance. It is denoted by

$$\|\mathbf{x}\|_2 = \sqrt{x_1^2 + \ldots + x_n^2},$$

although the subscript is often omitted since the L2-norm is the standard norm in $\mathbb{R}^n$. It is often useful to
calculate the squared L2-norm

$$\|\mathbf{x}\|_2^2 = x_1^2 + \ldots + x_n^2,$$

which is equivalent to the inner (dot) product of the vector with itself.
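Both norms can be computed with `np.linalg.norm`. The example vector below is chosen for illustration so that the results are easy to verify by hand.

```python
import numpy as np

x = np.array([3.0, -4.0])

l1 = np.linalg.norm(x, ord=1)   # |3| + |-4| = 7
l2 = np.linalg.norm(x)          # sqrt(3^2 + (-4)^2) = 5
l2_squared = x.dot(x)           # 3^2 + (-4)^2 = 25

print(l1, l2, l2_squared)       # 7.0 5.0 25.0
```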

In the picture below, you can see the L1 and L2 norms between two points in the Euclidean plane:

Summation
The sum of all elements (from 1 to 𝑛) of an indexed collection $(x_1, x_2, \ldots, x_n)$ (e.g., a vector 𝐱) is denoted by

$$\sum_{i=1}^{n} x_i = x_1 + \ldots + x_n$$

For example, we can re-write the formula for the squared L2-norm as

$$\|\mathbf{x}\|_2^2 = \sum_{i=1}^{n} x_i^2 = x_1^2 + \ldots + x_n^2$$

Product
Product notation ∏ is used to indicate repeated multiplication.
For example,

$$\prod_{k=3}^{7} k = 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7,$$

or

$$\prod_{i=1}^{n} x_i = x_1 \cdot \ldots \cdot x_n.$$
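In NumPy, the summation and product notations correspond to `np.sum` and `np.prod`. A short sketch with a small example vector (the variable names are illustrative):

```python
import numpy as np

x = np.array([1, 2, 3, 4])

total = np.sum(x)                        # 1 + 2 + 3 + 4 = 10
squared_norm = np.sum(x**2)              # 1 + 4 + 9 + 16 = 30
product = np.prod(x)                     # 1 * 2 * 3 * 4 = 24
prod_3_to_7 = np.prod(np.arange(3, 8))   # 3 * 4 * 5 * 6 * 7 = 2520

print(total, squared_norm, product, prod_3_to_7)
```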

Student task. Vector and Matrix operations.


Consider the following vectors:

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad \mathbf{w} = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}$$

Your task is to perform the following calculations:

1. sum all elements of vector 𝐱: $\sum_{i=1}^{n} x_i$ and store the result in variable vector_sum
2. compute the square of the L2-norm of vector 𝐱: $\|\mathbf{x}\|_2^2$ and store the result in vector_norm
3. compute the dot product of two vectors 𝐰 and 𝐱: $\mathbf{w}^T \mathbf{x}$ and store it in variable vector_dotprod
4. compute the outer product of two vectors 𝐰 and 𝐱: $\mathbf{w}\mathbf{x}^T$ and store it in variable vector_outerprod
5. compute the matrix product of vector_outerprod and A (given below) and store it in mat_mult

In [37]:

np.random.seed(42)

x = np.arange(5).reshape(-1,)
w = np.random.rand(5).reshape(-1,)
A = np.arange(15).reshape(5,3)

### STUDENT TASK ###


vector_sum = x.sum()
vector_norm = sum(x**2)
vector_dotprod = w.dot(x)
vector_outerprod = np.outer(w,x)
mat_mult = vector_outerprod@A
# remove the line raise NotImplementedError() before testing your solution and submi

In [ ]:

# %load solutions/student-task-5.py
vector_sum = x.sum()
vector_norm = sum(x**2)
vector_dotprod = w.dot(x)
vector_outerprod = np.outer(w,x)
mat_mult = vector_outerprod@A
