Week2-1 Numpy

This document provides an overview of the NumPy Python library for scientific computing with multi-dimensional arrays. It discusses how NumPy allows for the creation, manipulation, and fast computation of arrays. Key points covered include creating arrays from lists, common array functions like zeros and random, indexing and slicing arrays, combining arrays through concatenation and stacking, and performing fast element-wise operations through vectorization and broadcasting.

Uploaded by

Jiaqi MEI
Copyright
© All Rights Reserved

IDAT7215

Computer Programming for Product Development and Applications

Lecture 2-1: Python Libraries: NumPy

Dr. Zulfiqar Ali


Outline

▪ NumPy Introduction

▪ Creation of Arrays

▪ Data Retrieval and Restructuring

▪ Fast Computation with NumPy


NumPy

▪ NumPy stands for Numerical Python and it is the fundamental package for
scientific computing with Python.

▪ NumPy is a Python library for handling multi-dimensional arrays.

▪ It contains both the data structures needed for storing and accessing
arrays, and the operations and functions for computing with these arrays.

▪ Unlike a list, an array must have the same data type for all its elements.

▪ The homogeneity of arrays allows highly optimized functions that use arrays
as their inputs and outputs.
Usages of high-dimensional arrays in data analysis

▪ Store matrices, solve systems of linear equations, compute
eigenvalues/eigenvectors, matrix decompositions, …

▪ Images and videos can be represented as NumPy arrays.

▪ A 2-dimensional table might store an input data matrix in data analysis,
where each row represents a sample and each column represents a feature
(a layout commonly used in scikit-learn).
Creation of arrays

▪ Import the NumPy library
– It is recommended to use the standard abbreviation np

▪ Give a (nested) list as a parameter to the array constructor
– One-dimensional array: list
– Two-dimensional array: list of lists
– Three-dimensional array: list of lists of lists
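
The three cases above can be sketched as follows. This is a minimal illustration, assuming NumPy is installed; the array contents are made up for the example and are not taken from the slides.

```python
import numpy as np

# One-dimensional array from a flat list
a1 = np.array([1, 2, 3])

# Two-dimensional array from a list of lists (2 rows, 3 columns)
a2 = np.array([[1, 2, 3], [4, 5, 6]])

# Three-dimensional array from a list of lists of lists
a3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

print(a1.ndim, a2.ndim, a3.ndim)  # 1 2 3
```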
One-dimensional array, two-dimensional array, three-dimensional array

▪ In a two-dimensional array, you have rows and columns. The rows are indexed
along “axis 0” and the columns along “axis 1”.

▪ The number of axes increases with the number of dimensions.
Creation of arrays

▪ Useful functions to create common types of arrays
– np.zeros(): all elements are 0s
– np.ones(): all elements are 1s
– np.full(): all elements are set to a specific value
– np.empty(): all elements are uninitialized
– np.eye(): identity matrix, with 1s on the diagonal and 0s elsewhere
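
A short sketch of these constructors, with shapes and the fill value chosen only for illustration:

```python
import numpy as np

zeros = np.zeros((2, 3))      # 2x3 array of 0.0
ones = np.ones((2, 3))        # 2x3 array of 1.0
sevens = np.full((2, 3), 7)   # 2x3 array where every element is 7
scratch = np.empty((2, 3))    # 2x3 array; contents are arbitrary until assigned
identity = np.eye(3)          # 3x3 identity matrix

print(identity)
```

Because np.empty() skips initialization, its contents should be overwritten before they are read.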
Creation of arrays

▪ Generate evenly spaced values within a given interval.

▪ np.arange(): works like Python's built-in range() function

▪ For non-integer ranges it is better to use np.linspace().

▪ With np.linspace() one does not have to compute the length of the step, but
instead one specifies the wanted number of elements. By default, the endpoint is
included in the result, unlike with arange.
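
The difference between the two functions can be seen in a small example. The intervals below are illustrative assumptions, not values from the slides.

```python
import numpy as np

# arange takes start, stop, step; the stop value is excluded
ints = np.arange(0, 10, 2)
print(ints)    # [0 2 4 6 8]

# linspace takes start, stop, number of elements; the endpoint is included
points = np.linspace(0.0, 1.0, 5)
print(points)  # [0.   0.25 0.5  0.75 1.  ]
```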
Creation of arrays with random elements

▪ We may need some randomly generated data to test our program.

▪ NumPy can easily produce arrays of a desired shape filled with random numbers.

▪ np.random.random(): uniformly distributed over [0.0, 1.0)

▪ np.random.normal(): normally distributed

▪ np.random.randint(): uniformly distributed integers
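
A minimal sketch of the three generators. The shapes and distribution parameters are assumptions chosen for illustration.

```python
import numpy as np

# 2x3 array of floats drawn uniformly from [0.0, 1.0)
uniform = np.random.random((2, 3))

# 2x3 array drawn from a normal distribution with mean 0 and standard deviation 1
gauss = np.random.normal(loc=0.0, scale=1.0, size=(2, 3))

# 10 integers drawn uniformly from 1..6 (the upper bound 7 is excluded)
dice = np.random.randint(1, 7, size=10)

print(uniform.shape, gauss.shape, dice.shape)  # (2, 3) (2, 3) (10,)
```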


Creation of arrays with random elements

▪ To debug our code, it is sometimes useful to re-create exactly the same
random data in every run of our program.

▪ We can create random numbers deterministically using a seed.

If you run the code multiple times, it will always give the same numbers.
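
The slide's code is not preserved in this text, so the following is only a sketch of the idea, using np.random.seed with an arbitrary seed value of 0.

```python
import numpy as np

np.random.seed(0)             # fix the starting point of the generator
first = np.random.random(3)

np.random.seed(0)             # resetting the seed restarts the same sequence
second = np.random.random(3)

print(np.array_equal(first, second))  # True
```

Newer NumPy code often creates a separate generator with np.random.default_rng(seed) instead of seeding the global state; either way, the same seed reproduces the same numbers.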
Array types and attributes

▪ An array has several attributes:
– ndim: the number of dimensions
– shape: the size in each dimension
– size: the total number of elements
– dtype: the type of the elements
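
These attributes can be inspected on a small array. The array here is an illustrative assumption.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

print(b.ndim)   # 2
print(b.shape)  # (2, 3)
print(b.size)   # 6
print(b.dtype)  # an integer dtype such as int64 (platform dependent)
```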
Indexing

▪ A one-dimensional array is indexed like a list.

▪ For a multi-dimensional array, the index is a comma-separated tuple
instead of a single integer.

▪ Note that if you give only a single index to a multi-dimensional array,
it indexes the first dimension of the array.
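
A brief sketch of the three indexing cases, using made-up arrays:

```python
import numpy as np

v = np.array([10, 20, 30])
m = np.array([[1, 2, 3], [4, 5, 6]])

print(v[1])     # 20
print(v[-1])    # 30, negative indices count from the end as with lists
print(m[1, 2])  # 6, row 1 and column 2
print(m[0])     # [1 2 3], a single index selects along the first dimension
```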
Slicing

▪ Slicing works similarly to lists, but now we can have slices in different
dimensions.

▪ We can even assign to a slice.

▪ Slices can extract rows or columns from an array.
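
The following sketch shows slicing in two dimensions on an illustrative 3x4 array (not the one from the slides).

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

sub = m[0:2, 1:3]   # rows 0..1, columns 1..2 -> [[1 2] [5 6]]
row = m[1, :]       # the whole of row 1      -> [4 5 6 7]
col = m[:, 2]       # the whole of column 2   -> [ 2  6 10]

m[2, :2] = -1       # assign to a slice: first two elements of row 2
print(m[2])         # [-1 -1 10 11]
```

Note that basic slices are views into the original array, so assigning through a slice changes the original data.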


Reshaping

▪ When an array is reshaped, its number of elements stays the same, but
they are reinterpreted into a different shape.

▪ E.g., a one-dimensional array into a two-dimensional array
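
A minimal sketch of reshaping, assuming an array of six elements:

```python
import numpy as np

flat = np.arange(6)          # [0 1 2 3 4 5]
grid = flat.reshape(2, 3)    # [[0 1 2] [3 4 5]]
auto = flat.reshape(3, -1)   # -1 lets NumPy infer the size: shape (3, 2)

print(grid.shape, auto.shape)  # (2, 3) (3, 2)
```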


Combining Arrays

▪ Combining several arrays into one bigger array: concatenate and stack

▪ Concatenate: takes n-dimensional arrays and returns an n-dimensional array.

▪ Stack: takes n-dimensional arrays and returns an (n+1)-dimensional array.
Concatenate

▪ By default, concatenate joins the arrays along axis 0.

▪ To join arrays horizontally, add the parameter axis=1.
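
Both directions can be sketched with two small 2x2 arrays (illustrative values):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

vertical = np.concatenate((a, b))             # axis 0 by default, shape (4, 2)
horizontal = np.concatenate((a, b), axis=1)   # side by side, shape (2, 4)

print(vertical.shape, horizontal.shape)  # (4, 2) (2, 4)
```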
Concatenate different dimensions

▪ If you want to concatenate arrays with different dimensions, you must first
reshape the arrays to have the same number of dimensions.
– E.g., add a new row (column) to a 2D array
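
One way to do this, sketched with illustrative data, is to reshape the 1D array into a 2D row or column before concatenating:

```python
import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

new_row = np.array([7, 8, 9])              # shape (3,)
with_row = np.concatenate((table, new_row.reshape(1, 3)))          # shape (3, 3)

new_col = np.array([10, 20])               # shape (2,)
with_col = np.concatenate((table, new_col.reshape(2, 1)), axis=1)  # shape (2, 4)

print(with_row.shape, with_col.shape)  # (3, 3) (2, 4)
```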
Stack

▪ Use stack to create higher-dimensional arrays from lower-dimensional arrays:
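
A small sketch, assuming two 1D arrays of equal length:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.stack((a, b))           # new axis 0: shape (2, 3)
cols = np.stack((a, b), axis=1)   # new axis 1: shape (3, 2)

print(rows.ndim, cols.shape)  # 2 (3, 2)
```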
Split

▪ split is the inverse operation of concatenate.

▪ The input argument to split can be the number of equal parts the array is
divided into.
Split

▪ The input argument to split can also be a sequence of indices that
explicitly specify the break points.

▪ The entries indicate where along the axis (default axis=0) the array is split.

▪ E.g., np.split(d, (2, 3, 5)) splits array d into
– d[:2]
– d[2:3]
– d[3:5]
– d[5:]
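
Both forms of split can be sketched on a six-element array. The contents of d are an assumption, since the slide's array is not preserved.

```python
import numpy as np

d = np.arange(6)                 # [0 1 2 3 4 5]

halves = np.split(d, 2)          # two equal parts: [0 1 2] and [3 4 5]
parts = np.split(d, (2, 3, 5))   # d[:2], d[2:3], d[3:5], d[5:]

print([p.tolist() for p in parts])  # [[0, 1], [2], [3, 4], [5]]
```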
Less Memory in NumPy

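▪ A NumPy array occupies less memory than a list holding the same numbers,
because it stores the raw values in one block instead of references to
separate Python objects.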
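
A rough comparison can be sketched as follows. The list estimate is specific to CPython and only approximate; the element count and the int64 dtype are assumptions for the example.

```python
import sys
import numpy as np

n = 1000
numbers = list(range(n))
array = np.arange(n, dtype=np.int64)

# Rough CPython estimate: the list's own storage plus every int object it refers to
list_bytes = sys.getsizeof(numbers) + sum(sys.getsizeof(x) for x in numbers)

# The array stores raw 8-byte integers in one contiguous buffer
array_bytes = array.nbytes  # 8000

print(list_bytes, array_bytes)
```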
Fast computation on arrays

▪ In addition to providing a way to store and access multi-dimensional arrays,
NumPy also provides several routines to perform computations on them.

▪ One of the reasons for the popularity of NumPy is that these computations
can be very efficient, much more efficient than what Python can normally do.

▪ The biggest bottlenecks in efficiency are the loops, which can be iterated
millions, billions, or even more times.

▪ What slows down loops in Python is the fact that Python is a dynamically typed
language: at each expression Python has to find out the types of the
arguments of the operations.
Fast computation examples

▪ Consider multiplying a collection of numbers by 2.

▪ At each iteration of the loop, Python has to find out the type of the
variable x, which in this example can be an int, a float or a string, and
depending on this type call a different function to perform the
“multiplication” by two.

▪ What makes NumPy efficient is the requirement that each element in an array
must be of the same type. This homogeneity of arrays makes it possible to
create vectorized operations, which don’t operate on single elements, but
on arrays (or subarrays).
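
The contrast can be sketched as follows. The slide's code is not preserved, so the values here are illustrative assumptions.

```python
import numpy as np

# Plain Python: the meaning of x * 2 depends on the type of each element
values = [1, 2.5, "ab"]
doubled_list = [x * 2 for x in values]
print(doubled_list)  # [2, 5.0, 'abab']

# NumPy: all elements share one type, so a single vectorized operation suffices
arr = np.array([1, 2, 3])
doubled_arr = arr * 2
print(doubled_arr)   # [2 4 6]
```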
Fast computation examples

▪ Because each iteration in NumPy uses identical operations and only the data
differs, the loop can run in precompiled machine code and be performed in
one go, hence avoiding Python’s dynamic typing.

▪ The name vector operation comes from linear algebra
– addition of two vectors $a = [a_1, a_2]$, $b = [b_1, b_2]$ is element-wise addition
$a + b = [a_1 + b_1, a_2 + b_2]$
Arithmetic Operations in NumPy

▪ The basic arithmetic operations in NumPy are defined in the vector form.
– +: addition
– -: subtraction
– *: multiplication
– /: division
– //: floor division
– **: power
– %: remainder
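
Each operator applies element by element, as this sketch with two illustrative arrays shows:

```python
import numpy as np

a = np.array([7, 8, 9])
b = np.array([2, 3, 4])

total = a + b        # [ 9 11 13]
difference = a - b   # [5 5 5]
product = a * b      # [14 24 36]
quotient = a / b     # true division, result is a float array
floored = a // b     # [3 2 2]
power = a ** b       # [  49  512 6561]
remainder = a % b    # [1 2 1]

print(quotient)
```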
Aggregations: max, min, sum, mean, standard deviation, …

▪ Aggregations allow us to describe the information in an array by using a few
numbers.
Aggregation over certain axes

▪ Instead of aggregating over the whole array, we can also aggregate over
certain axes only.
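
A sketch of whole-array and per-axis aggregation on an illustrative 2x3 array:

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])

# Aggregating over the whole array
print(m.sum(), m.min(), m.max(), m.mean())  # 21 1 6 3.5
print(m.std())                              # about 1.708

# Aggregating over one axis only
col_sums = m.sum(axis=0)    # collapses axis 0 (rows):    [5 7 9]
row_sums = m.sum(axis=1)    # collapses axis 1 (columns): [ 6 15]
row_means = m.mean(axis=1)  # [2. 5.]
```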
Python function, NumPy function, NumPy Method

▪ Most of the aggregation functions in NumPy have corresponding methods.

▪ The Python language has built-in functions sum, min, max, etc.

▪ Do not accidentally use Python's built-in functions on arrays, since they
will be significantly slower than NumPy’s functions and methods.
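
The three spellings can be compared on a small illustrative array. Besides being slower, the built-in sum also behaves differently on multi-dimensional arrays, because it iterates over the first axis.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

function_total = np.sum(m)   # NumPy function: 10
method_total = m.sum()       # NumPy method:   10
builtin_total = sum(m)       # Python built-in adds the rows: [4 6]

print(function_total, method_total, builtin_total)
```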
Efficiency of NumPy functions

▪ The speed of NumPy partly comes from the fact that its arrays must have the same
type for all the elements. This requirement allows some efficient optimizations.
Broadcasting

▪ We have seen that NumPy allows array operations that are performed
element-wise.

▪ NumPy also allows binary operations that do not require the two arrays to
have the same shape.

▪ E.g., add 4 to all elements of an array


Broadcasting

▪ How is a binary operation performed?

▪ NumPy tries to stretch the arrays to have the same shape, then performs the
element-wise operation.

▪ In NumPy, this stretching is called broadcasting.

NumPy first stretched the scalar 4 to the array np.array([4,4,4]) and then
performed the element-wise addition.
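
The equivalence described above can be sketched as follows. The array a is an assumption, since the slide's code is not preserved.

```python
import numpy as np

a = np.array([1, 2, 3])

with_scalar = a + 4                        # the scalar is broadcast
with_array = a + np.array([4, 4, 4])       # the explicitly stretched form

print(with_scalar)                               # [5 6 7]
print(np.array_equal(with_scalar, with_array))   # True
```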
Broadcasting

▪ In this example the second argument b was first broadcast to an array
matching the shape of the other operand.

▪ Then the addition was performed.
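
The arrays from the slide's example are not reproduced in this text. As an illustrative assumption, the sketch below adds a 1D array b to a 2D array a and uses np.broadcast_to to show the stretched form of b.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
b = np.array([10, 20, 30])             # shape (3,)

# What b looks like after being stretched to the shape of a
stretched = np.broadcast_to(b, a.shape)
print(stretched)
# [[10 20 30]
#  [10 20 30]]

result = a + b
print(result)
# [[11 22 33]
#  [14 25 36]]
```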


Comparisons and Masking

▪ Just as NumPy allows element-wise arithmetic operations between arrays, it
is also possible to compare two arrays element-wise.

▪ We can also count the number of comparisons that were True. This solution
relies on the interpretation that True corresponds to 1 and False
corresponds to 0.

▪ Broadcasting rules also apply to comparisons.
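
A sketch of element-wise comparison, counting, and a broadcast comparison, using illustrative arrays:

```python
import numpy as np

a = np.array([1, 5, 3])
b = np.array([2, 4, 3])

less = a < b              # [ True False False]
equal = a == b            # [False False  True]

count = np.sum(a < b)     # True counts as 1, False as 0 -> 1
above_two = (a > 2).sum() # the scalar 2 is broadcast: [False True True] -> 2

print(less, equal, count, above_two)
```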


Masking

▪ Another use of Boolean arrays is that they can be used to select a subset
of elements. This is called masking.

▪ It can also be used to assign a new value. For example, the following
zeroes out the negative numbers.
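
Since the slide's code is not preserved, here is a minimal sketch of both uses with an illustrative array c:

```python
import numpy as np

c = np.array([4, -1, 7, -3, 0])

mask = c < 0          # [False  True False  True False]
negatives = c[mask]   # select the elements where the mask is True: [-1 -3]

c[mask] = 0           # assign through the mask: zero out the negative numbers
print(c)              # [4 0 7 0 0]
```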
Fancy Indexing

▪ Using indexing we can get a single element from an array. If we wanted
multiple (not necessarily contiguous) elements, we would have to index
several times.

▪ That’s quite verbose. Fancy indexing provides a concise syntax for
accessing multiple elements.

▪ We can also assign to multiple elements through fancy indexing.
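
A short sketch of fancy indexing on a one-dimensional array (illustrative values): a list of indices is passed in place of a single index.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

picked = a[[0, 2, 4]]   # elements at positions 0, 2 and 4: [10 30 50]

a[[1, 3]] = 0           # assign to positions 1 and 3 at once
print(a)                # [10  0 30  0 50]
```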
Fancy Indexing

▪ Fancy indexing also works for higher-dimensional arrays.

▪ We can also combine normal indexing, slicing and fancy indexing.
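
For a two-dimensional case, the following sketch uses an illustrative 3x4 array:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

rows = m[[0, 2]]             # rows 0 and 2
pairs = m[[0, 2], [1, 3]]    # elements (0, 1) and (2, 3): [ 1 11]
mixed = m[1:, [0, 3]]        # slice of rows 1..2 combined with columns 0 and 3
single = m[1, [0, 2]]        # normal index for the row, fancy index for columns

print(mixed)
# [[ 4  7]
#  [ 8 11]]
```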
Sorting arrays

▪ Sorting a one-dimensional array is similar to sorting a list.

▪ We can also sort a higher-dimensional array along different axes.
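
A sketch using np.sort, which returns a sorted copy. The arrays are illustrative assumptions.

```python
import numpy as np

a = np.array([2, 4, 5, 1, 3])
sorted_a = np.sort(a)            # [1 2 3 4 5]

m = np.array([[9, 1, 8], [3, 7, 2]])
by_rows = np.sort(m, axis=1)     # sort within each row:    [[1 8 9] [2 3 7]]
by_cols = np.sort(m, axis=0)     # sort within each column: [[3 1 2] [9 7 8]]

print(by_rows)
print(by_cols)
```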
Sorting arrays

▪ A related operation is the argsort function, which doesn’t sort the elements
but returns the indices of the sorted elements.

These indices [3, 0, 4, 1, 2] say that the smallest element of the array is in position 3 of a,
the second smallest element is in position 0 of a, the third smallest is in position 4, and so on.
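
The array a from the slide is not preserved in this text. The array below is a hypothetical one chosen so that argsort produces the indices [3, 0, 4, 1, 2] quoted above.

```python
import numpy as np

# Hypothetical array: smallest value at position 3, next at position 0, and so on
a = np.array([2, 4, 5, 1, 3])

order = np.argsort(a)
print(order)      # [3 0 4 1 2]

# Indexing with these indices yields the sorted array
print(a[order])   # [1 2 3 4 5]
```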
Matrix operations

▪ NumPy supports a wide variety of matrix operations, such as matrix
multiplication, solving systems of linear equations, computing
eigenvalues/eigenvectors, matrix decompositions and other linear algebra
related operations.
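
A few of these operations can be sketched with the @ operator and the np.linalg module. The matrix A and vector b are illustrative assumptions.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

# Matrix multiplication
squared = A @ A                  # [[ 5.  5.] [ 5. 10.]]

# Solve the linear system A x = b
x = np.linalg.solve(A, b)        # approximately [0.8 1.4]

# Eigenvalues and eigenvectors (eigenvectors are the columns of the second result)
eigenvalues, eigenvectors = np.linalg.eig(A)

print(x)
print(np.allclose(A @ x, b))     # True
```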
Applications of NumPy

▪ Mathematics (MATLAB Replacement)

▪ Plotting (Matplotlib)

▪ Backend (Pandas, Digital Photography)

▪ Machine Learning

▪ Signal Processing
