Chapter 1 Numpy

This document is a Jupyter Notebook chapter on NumPy, a Python library for numerical processing, detailing its installation, array creation, and various operations like indexing, slicing, and reshaping. It emphasizes the importance of NumPy in data science and machine learning for efficient matrix operations. The chapter covers the creation of NDarrays, their dimensions, and methods for manipulating and accessing array data.

Uploaded by

Apples r on fire

3/4/23, 11:01 AM chapter_1_numpy - Jupyter Notebook

Contents
Introduction
Installing and Importing Numpy
Creating Arrays and the NDarray Object
Dimensions in Arrays
The shape of an array
Accessing Array Elements
Array Indexing
Array Slicing
Numpy Random Generators
Joining Numpy Arrays
Filtering & Searching Arrays
Sorting Arrays
Main Matrix Operations
Bonus Material - Not required for the exam

Introduction
NumPy is short for Numerical Python; it was created by Travis Oliphant in 2005.
It is a free, open-source Python library used for working with arrays.
It provides a high-performance multidimensional array object and tools for working with arrays.
NumPy provides an array object called ndarray that can be up to 50x faster than traditional Python lists.
NumPy is written partially in Python, but most of the parts that require fast computation are written in C or C++.

localhost:8888/notebooks/OneDrive - UNIVERSITY OF PETRA/Petra/20222/307301 Programming BI-Python II/Chapter 1 Numerical Processing usin… 1/42

Importance of Numpy in Data Science


Numpy can be used to perform matrix operations in a fast and efficient manner.
Matrix processing is important in the field of machine learning and predictive modelling.
In machine learning and predictive modelling, data is represented as arrays and matrices of numbers, which are stored and processed to generate results.
Many important machine learning models and techniques need matrix processing, for example:
Neural Networks
Linear Regression
Linear Algebra
Data Representation
Vector Embeddings
Principal Component Analysis and Dimensionality Reduction
...


Installing Numpy
Numpy is not part of the Python standard library; therefore, it must be installed on our system before we can use it.
Once NumPy is installed, we can import it in our applications by using the import keyword.
We can install numpy using the Python package manager (the pip command), as follows:

pip install numpy

Importing Numpy
After installing NumPy, we can import it and use it in our code.

In [4]:
import numpy

Now NumPy is imported and ready to use.

Numpy alias
alias: In Python, an alias is an alternate name for referring to the same thing.
Numpy is usually imported under the np alias.


Import numpy using an alias name e.g. np

In [5]:
import numpy as np

Now the NumPy package can be referred to as np instead of numpy.

We can check the version of the numpy library using the __version__ attribute.

In [6]:
import numpy as np

print(np.__version__)

1.21.5

Creating Arrays and the NDarray Object


The most important object defined in NumPy is an N-dimensional array type called ndarray.
It describes a collection of items of the same type.
To create an ndarray, we use the array() function.
We can pass different inputs to the array() function, such as Python lists and tuples.

Example: Create a Python list of 3 floating point values, then create an ndarray
from the list

In [7]:
# create array
import numpy as np

lst = [1.0, 2.0, 3.0]
arr = np.array(lst)
print(arr)

[1. 2. 3.]
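The statement that an ndarray holds items of the same type can be checked through the dtype attribute. The following is a minimal sketch; the variable names mixed and ints are illustrative and not part of the original notebook.

```python
import numpy as np

# Mixing integers and a float: NumPy stores every element with one common type.
mixed = np.array([1, 2.5, 3])
print(mixed)
print(mixed.dtype)   # float64

# The element type can also be requested explicitly when creating the array.
ints = np.array([1.0, 2.0, 3.0], dtype=np.int64)
print(ints)
print(ints.dtype)    # int64
```

Because all elements share one type, NumPy can store them in a compact block of memory, which is one reason array operations tend to be faster than the same work on Python lists.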

Get the Type of an Object in Python


In Python, we can use the built-in type() function to get the type of any object.


Let's confirm that the array we created is an object of type ndarray (N-dimensional array):

In [8]:
print(type(arr))

<class 'numpy.ndarray'>

Use a tuple to create a numpy array


Question: What is the tuple datatype in Python?

Example: Create a numpy array based on a tuple, similar to the list in the
previous example and print its values

In [9]:
import numpy as np

arr = np.array((1, 2, 3, 4, 5))
print(arr)

[1 2 3 4 5]

Dimensions in Arrays
A dimension is a direction in which you can vary the specification of an array's elements.
An array that holds the sales total for each day of the month has one dimension (the day of
the month).
Nested arrays are arrays that have arrays as their elements.


The last column (the daily sales amounts) represents a 1 dimensional array.

Arrays in NumPy can have different numbers of dimensions: 0, 1, 2, 3 or more.

A single number (a 0-D array) is called a Scalar
A one-dimensional array of numbers is called a Vector
A two-dimensional array of numbers is called a Matrix
A multi-dimensional array of numbers (three or more dimensions) is called a Tensor

0-D Arrays

An array with 0 dimensions is called a Scalar.
A scalar is a 0-D array that has only 1 element.

Example: Create a 0-D array with value 42

In [10]:
import numpy as np

arr = np.array(42)
print(type(arr))

<class 'numpy.ndarray'>

1-D Arrays
An array with 1 dimension is called a 1-D array or a Vector.


Example: Create a 1-D array containing the following numbers: 5, 2, 7, 9, 1000, 2450, 75

In [11]:
import numpy as np

arr = np.array([5,2,7,9,1000,2450,75])
print(arr)

[ 5 2 7 9 1000 2450 75]

2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array.

A 2-D array is commonly referred to as a Matrix.

Example: Create a 2-D array containing two arrays with the values 1,2,3 and
4,5,6:

In [12]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr[0,1])

3-D arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.

Example: Create a 3-D array with two 2-D arrays, both containing two arrays
with the values 1,2,3 and 4,5,6:

In [13]:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr[1,1,0])

Checking the Number of Dimensions


We can use the ndim attribute to get the number of dimensions for an array.


Example: Check how many dimensions each array has:

In [14]:
import numpy as np

a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

0
1
2
3

The shape of an array


The shape of an array is the number of elements in each dimension.
We can get the shape of any array using the shape attribute.
The shape attribute returns a tuple in which each entry gives the number of elements along the corresponding dimension.

Example: Print the shape of a 2-D array:

In [15]:
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr)
print(arr.shape)

[[1 2 3 4]
[5 6 7 8]]
(2, 4)

Question: What is the number of dimensions of the previous array?

In [16]:
print(arr.ndim)


Reshaping Arrays
We can change the shape of an array, e.g. from 1-D to 2-D or from 2-D to 5-D, using the reshape method.
The only condition that we need to enforce is:

The number of elements in the original array must equal the number of elements in the new array:

A 1-D array with 10 elements can be reshaped to a 2-D array of shape 2 x 5
A 2-D array of shape 4 x 2 can be reshaped to a 3-D array of shape 2 x 2 x 2
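The element-count condition can be tried out directly. The following is a minimal sketch; the names ok and failed are illustrative. It reshapes a 10-element array into a valid shape and then attempts an invalid one, which NumPy rejects with a ValueError.

```python
import numpy as np

arr = np.arange(10)          # 10 elements: 0..9

ok = arr.reshape(2, 5)       # 2 * 5 == 10, so the reshape succeeds
print(ok.shape)

try:
    arr.reshape(3, 4)        # 3 * 4 == 12, which does not match 10 elements
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```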

Example: Reshape a 1-d array with 12 elements into a 2-d array

In [17]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

print("..................................................")
print("Array shape: ", arr.shape)
print(arr)
print("..................................................")

newarr = arr.reshape(4, 3)
print("Array shape: ", newarr.shape, newarr.ndim)
print(newarr)
print("..................................................")

..................................................
Array shape: (12,)
[ 1 2 3 4 5 6 7 8 9 10 11 12]
..................................................
Array shape: (4, 3) 2
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
..................................................

Example: Reshape a 1-d array with 12 elements into a 3-D array


In [18]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

print("..................................................")
print(arr)
print("Array shape: ", arr.shape)
print("..................................................")

newarr = arr.reshape(2, 3, 2)
print(newarr)
print("Array shape: ", newarr.shape)
print("..................................................")

newarr = arr.reshape(1,1,12)
print(newarr)
print("Array shape: ", newarr.shape)
print("..................................................")

..................................................
[ 1 2 3 4 5 6 7 8 9 10 11 12]
Array shape: (12,)
..................................................
[[[ 1 2]
[ 3 4]
[ 5 6]]

[[ 7 8]
[ 9 10]
[11 12]]]
Array shape: (2, 3, 2)
..................................................
[[[ 1 2 3 4 5 6 7 8 9 10 11 12]]]
Array shape: (1, 1, 12)
..................................................

Example: Reshape a 16-element 1-D array into a 4-D array


In [19]:
import numpy as np

arr = np.arange(16)
arr = arr.reshape(2,2,2,2)
print(arr)

[[[[ 0 1]
[ 2 3]]

[[ 4 5]
[ 6 7]]]

[[[ 8 9]
[10 11]]

[[12 13]
[14 15]]]]

We can also reshape the previous array into a 2-D array:

In [20]:
print("2 x 8 array:")
print(arr.reshape(2,8))
print("")
print("8 x 2 array:")
print(arr.reshape(8,2))

2 x 8 array:
[[ 0 1 2 3 4 5 6 7]
[ 8 9 10 11 12 13 14 15]]

8 x 2 array:
[[ 0 1]
[ 2 3]
[ 4 5]
[ 6 7]
[ 8 9]
[10 11]
[12 13]
[14 15]]

Question: Can we reshape the previous array into a 3-d array?

In [ ]:

We can reshape an array into any new shape as long as the total number of elements is the same in both shapes.

Flattening arrays
Flattening an array means converting a multidimensional array into a 1-D array.
We can use reshape(-1) to flatten an array.

Example: Convert a multi-dimensional array into a 1D array:

In [21]:
import numpy as np

arr = np.arange(20).reshape(2,10)
print(arr)
print("\n")

arr = arr.reshape(-1)
print(arr)

[[ 0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]]

[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
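The value -1 in reshape asks NumPy to work out that dimension from the total number of elements. The following is a minimal sketch with illustrative variable names (flat, cols); it shows -1 used alone to flatten, and -1 combined with a fixed dimension.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

flat = arr.reshape(-1)       # -1 alone: a single dimension holding all 12 elements
cols = arr.reshape(-1, 6)    # -1 is inferred as 2, because 2 * 6 == 12

print(flat.shape)
print(cols.shape)
```

Only one dimension can be given as -1 in a single reshape call, since NumPy needs the other dimensions to compute it.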

Practice - Create basic numpy arrays


1. Part 1: Write a Python script to do the following:

create a new code cell in Jupyter or a new Python file in Spyder
import the numpy library
use an alias such as np during the import; np is the conventional alias, but you can use whatever alias name you find suitable
create a 1-dimensional array that has the values 100, 200, 300, 400
print out the values of the object you created
print out the type of the object you created; make sure you know the type of the object created
print out the number of dimensions of the array you created
print out the shape of the array you created
reshape the previous array into a 2 x 2 array
flatten the array

2. Part 2: Write a Python script to do the following:

create a new code cell in Jupyter or a new Python file in Spyder
import the numpy library
use an alias such as np during the import; np is the conventional alias, but you can use whatever alias name you find suitable
create a 2-dimensional array that has the arrays 100, 200, 300, 400 and 500, 600, 700, 800 as its two elements
print out the values of the object you created
print out the type of the object you created; make sure you know the type of the object created
print out the number of dimensions of the array you created
print out the shape of the array you created
reshape the previous array into a 4 x 2 array
flatten the array

In [22]:
# Part 1 Answer

In [23]:
# Part 2 Answer

Accessing Array Elements


Array Indexing
We can access an element of an array by referencing its index number.
The indexes in NumPy arrays start with 0, therefore, the first element has index 0, and the
second has index 1 etc.

Example: Get the first element from the following array:

In [24]:
import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr[0])

Example: Get the second element from the following array.

In [25]:
import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr[1])

Question: What is the output of the following code:


In [26]:
import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr[3])

Example: Get third and fourth elements from the following array and add them.

In [27]:
import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr[2] + arr[3])

Access 2-D array elements


To access elements from 2-D arrays, we use comma-separated integers: the first selects the element along the first dimension (the row) and the second selects the index within that row.

Example: Access number 6 in the following array

In [28]:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('1st element of the 2nd row: ', arr[1, 0])

1st element of the 2nd row: 6

Example: Access the 5th element of the 2nd element in the first dimension

In [29]:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('5th element (number 10) in the 2nd element ([6,7,8,9,10]) of the first dimension: ', arr[1, 4])

5th element (number 10) in the 2nd element ([6,7,8,9,10]) of the first dimension: 10


Access 3-D array elements


To access elements from 3-D arrays, we use comma-separated integers, one index for each dimension.

Example: Access number 11 in the following 3-D array:

In [30]:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr[1,1,1])

11

arr[1, 1, 1] returns the value 11.

Explanation:

The first number represents the first dimension, which contains two 2-D arrays: [[1, 2, 3], [4, 5, 6]] and [[7, 8, 9], [10, 11, 12]]
Since we selected 1 at the first index location [1, -, -], we are left with the second array: [[7, 8, 9], [10, 11, 12]]
The second number in our index represents the second dimension [-, 1, -], which also contains two arrays: [7, 8, 9] and [10, 11, 12]
Since we selected 1, we are left with the second array: [10, 11, 12]
The third number represents the third dimension [-, -, 1], which contains three values: 10, 11 and 12
Since we selected 1, we end up with the second element, which is 11
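The same selection can be followed one index at a time. The following is a minimal sketch; the intermediate names block, row and value are illustrative and not part of the original notebook.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

block = arr[1]     # first index: the second 2-D array
row = block[1]     # second index: the second row of that 2-D array
value = row[1]     # third index: the second element of that row

print(block)
print(row)
print(value)
print(value == arr[1, 1, 1])   # both routes reach the same element
```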

Question: What is the output of the following code?

In [31]:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr[0,1,0])

Negative Indexing
We can use negative indexing to access array elements in reverse order, counting from the end of the array.

The last element in an array has a negative index of -1


Print the last element in the 2nd element of the first dimension:

In [32]:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('Last element from the 2nd element in the first dimension: ', arr[1, -1])

Last element from the 2nd element in the first dimension: 10

Array Slicing
Slicing in Python refers to extracting a number of elements from a sequence data type (array, list, tuple, etc.).
To slice a number of elements, we use two integers to specify the beginning and the end of the slice: [start:end]. The element at index end is not included.
We can also define a step value to skip a number of elements while slicing array elements, e.g. [start:end:step].
If we don't provide the slice start value, the default value of 0 will be used.
If we don't provide the slice end value, the slice runs through the last element of the array (the default end is the length of the array).
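The start, end and step rules can be checked on a small array. The following is a minimal sketch with illustrative variable names; it also shows that writing -1 explicitly as the end stops before the last element, which is different from leaving the end out.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

head = arr[:3]            # start omitted: begins at index 0
tail = arr[3:]            # end omitted: runs through the last element
no_last = arr[:-1]        # explicit end of -1: stops before the last element
every_other = arr[1:6:2]  # indexes 1, 3 and 5

print(head)
print(tail)
print(no_last)
print(every_other)
```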

Slice elements from index 1 to index 5 (not included) from the following array:

In [33]:
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])
values = arr[1:5]
print(values)

[20 30 40 50]

Slicing with STEP value


The step value sets the distance between the selected elements; for example, a step of 3 takes every third element of the array.

In [34]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[::3])

[1 4 7]


Slicing 2-d arrays


From the second element in the first dimension of the array, slice elements from index 1 to index 4 (not included):

Note: Remember that second element has index 1.

In [35]:
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])

[7 8 9]

From both elements of the first dimension, return the element at index 2:

In [36]:
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[:, 2])

[3 8]

From both elements in the first dimension, slice from index 1 to the end of each row. The end value 50 is larger than the row length, so NumPy simply stops at the last element; this returns a 2-D array:

In [37]:
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[0:2, 1:50])

[[ 2 3 4 5]
[ 7 8 9 10]]

Negative Slicing
We can use negative indexes in a slice to count positions from the end of the array.

Slice the last 3 elements from the array


In [38]:
import numpy as np

arr = np.array([1,2,3,4,5,6,7])
slicedValues = arr[-3:]
print(slicedValues)

[5 6 7]

Question: What is the output of the following code?

In [39]:
import numpy as np

arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
slicedValues = arr[-2]
print(slicedValues)

[4 5 6]

We can use a negative step to reverse an array

In [40]:
import numpy as np

# arange creates an array of consecutive integers
x = np.arange(10)

print("original array: ", x)
print("Reversed array: ", x[::-1])

print(x)

original array: [0 1 2 3 4 5 6 7 8 9]
Reversed array: [9 8 7 6 5 4 3 2 1 0]
[0 1 2 3 4 5 6 7 8 9]

Numpy Random Number Generators

We can use the numpy random module to generate integer and floating-point arrays of different sizes.

Generate random integer arrays

Example: Generate a single random integer from 0 to 99 (the upper bound, 100, is not included):

In [41]:
import numpy as np

x = np.random.randint(100)
print(x)

The randint() method also accepts a size parameter where you can specify the shape of an array.

Example: Generate a 1-D array containing 5 random integers from 0 to 99:

In [42]:
from numpy import random

x = random.randint(100, size=5)
print(x)

[51 43 87 48 88]

Example: Generate a 2-D array with 3 rows, each row containing 5 random integers from 0 to 99:

In [43]:
from numpy import random

x = random.randint(100, size=(3, 5))
print(x)

[[18 3 24 49 71]
[38 39 96 5 21]
[16 8 3 89 25]]

Generate an array of random floating-point numbers between 0 and 1

Example: Use the random module of numpy to generate a 1-D array that contains 5 floating-point numbers

In [44]:
import numpy as np

arr = np.random.rand(5)
print(arr)

[0.56007468 0.72382166 0.90505211 0.03938398 0.21142969]


Example: Use the random module of numpy to create a 3 x 5 array of floating-point numbers

In [45]:
# Generate random floats
from numpy import random

x = random.rand(3,5)
print(x)

[[0.00296276 0.32350571 0.39272264 0.35255918 0.8508396 ]
 [0.07787988 0.24249263 0.2175688  0.7153107  0.1991786 ]
 [0.13902454 0.31083462 0.53086378 0.49882617 0.2545397 ]]
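The examples above give different numbers on every run. When repeatable results are needed, one option is NumPy's Generator interface, created with np.random.default_rng and a seed. The following is a minimal sketch, not the approach used in this notebook; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)      # same seed -> same sequence of numbers

ints = rng.integers(0, 100, size=(3, 5))  # integers from 0 to 99
floats = rng.random(5)                    # floats in the range [0.0, 1.0)

print(ints)
print(floats)

# A second generator with the same seed reproduces the same integers.
rng_again = np.random.default_rng(seed=42)
same = np.array_equal(ints, rng_again.integers(0, 100, size=(3, 5)))
print(same)
```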

Question: What will the following code do?

In [46]:
import numpy as np

for x in range(10):
    y = np.random.rand()

    if y > .5:
        print("Heads")
    else:
        print("Tails")

Heads
Heads
Heads
Tails
Heads
Heads
Heads
Tails
Heads
Heads

Joining Numpy Arrays


Joining means putting contents of two or more arrays in a single array.
In SQL we join tables based on a key, whereas in NumPy we join arrays by axes.
We pass a sequence of arrays that we want to join to the concatenate() function, along
with the axis. If axis is not explicitly passed, it is taken as 0.


Example: Joining two arrays

In [47]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.concatenate((arr1, arr2))
print(arr3)

[1 2 3 4 5 6]

Example: Vertically stacking arrays

In [48]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.vstack((arr1, arr2))
print(arr3)

[[1 2 3]
[4 5 6]]

Example: Horizontally stacking arrays

In [49]:
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.hstack((arr1, arr2))
print(arr3)

[1 2 3 4 5 6]
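The axis argument of concatenate() matters most for 2-D arrays. The following is a minimal sketch with illustrative names (a, b, rows, cols); it joins two 2 x 2 arrays along axis 0 and then along axis 1.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b))           # axis=0 by default: b is placed below a
cols = np.concatenate((a, b), axis=1)   # axis=1: b is placed to the right of a

print(rows)
print(rows.shape)
print(cols)
print(cols.shape)
```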

Filtering & Searching Arrays


We can filter elements and create new arrays.
In Numpy, we filter an array using a Boolean index list.
A Boolean index list is a list of Booleans corresponding to indexes in the array.


If the value at an index is True, that element is included in the filtered array; if the value at that index is False, that element is excluded from the filtered array.

Create a new array that contains the elements at index 0 and 2 only, and filter out all the rest.

In [50]:
import numpy as np

arr = np.array([41, 42, 43, 44])

filtersArray = [True, False, True, False]
newarr = arr[filtersArray]
print(newarr)

[41 43]

Creating a filter array

We can create a filter array by using comparison operators such as >, <, == and !=.

Create a filter array that will return only values higher than 42:

In [51]:
import numpy as np

arr = np.array([41, 42, 43, 44])
filter_arr = arr > 42
print(filter_arr)

[False False True True]

We can use the filter array to return array elements

In [52]:
newarr = arr[filter_arr]
print(newarr)

[43 44]
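Filter arrays can also be combined. The following is a minimal sketch, assuming we want values that are both greater than 42 and even; the element-wise & operator and the name mask are illustrative additions, not part of the original notebook.

```python
import numpy as np

arr = np.array([41, 42, 43, 44, 45, 46])

# Each comparison needs its own parentheses before combining with &.
mask = (arr > 42) & (arr % 2 == 0)

print(mask)
print(arr[mask])
```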

Create a filter array that will return only even elements from the original array:


In [53]:
import numpy as np

arr = np.array([1,2,3,4,5,6,7])
filter_arr = arr % 2 == 0
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)

[False  True False  True False  True False]
[2 4 6]

Searching for a single element in an Array

Search for the value 4 in the array

In [54]:
import numpy as np

arr = np.array([1,2,3,4,5,6,7])
filter_arr = arr == 4
resultArray = arr[filter_arr]

if len(resultArray) > 0:
    print("Value found")
else:
    print("Value not found")

Value found

We can use the in operator to search for a value in an array

In [55]:
import numpy as np

arr = np.array([1,2,3,4,5,6,7])

if 4 in arr:
    print("Value found")
else:
    print("Value not found")

Value found
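The two searches above report whether a value exists, but not where it is. One way to get the positions is np.where, shown in the following minimal sketch; the array contents and the name positions are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 4, 7])

# np.where returns a tuple with one index array per dimension;
# for a 1-D array we take the first (and only) entry.
positions = np.where(arr == 4)[0]

print(positions)
```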

Sorting Arrays
Sorting an array means putting its elements in an ordered sequence.
An ordered sequence is any sequence whose elements follow an order, such as numeric or alphabetical, ascending or descending.
NumPy provides a function called sort() that returns a sorted copy of a specified array.

Sort a numpy array


In [56]:
import numpy as np

arr = np.array([3, 2, 0, 1])
print(np.sort(arr))

[0 1 2 3]

Notice that np.sort() is not an in-place operation; the original array remains unchanged:

In [57]:
print(arr)

[3 2 0 1]
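For comparison, the sort() method of an array object does modify the array itself. The following is a minimal sketch with illustrative names; it also shows one common way to obtain a descending order by reversing the sorted copy.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

descending = np.sort(arr)[::-1]   # sorted copy, then reversed; arr is untouched
print(descending)
print(arr)

arr.sort()                        # the method sorts the array in place
print(arr)
```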

Main Matrix Operations

1. Addition, Subtraction, Multiplication, Division

Similar to how we perform operations on numbers, the same logic also works for matrices and vectors.
However, these operations require the two matrices to have the same shape (or shapes that NumPy can broadcast to each other). This is because they are performed element-wise, which is different from the matrix dot product.


In [58]:
import numpy as np

matrix1 = np.array([[1,1],
                    [1,1]])
matrix2 = np.array([[2,2],
                    [2,2]])

print("matrix addition\n", matrix2 + matrix1, "\n")
print("matrix subtraction\n", matrix2 - matrix1, "\n")
print("matrix multiplication\n", matrix2 * matrix1, "\n")
print("matrix division\n", matrix2 / matrix1, "\n")

matrix addition
[[3 3]
[3 3]]

matrix subtraction
[[1 1]
[1 1]]

matrix multiplication
[[2 2]
[2 2]]

matrix division
[[2. 2.]
[2. 2.]]

The Dot Product Operation

The dot product is often confused with element-wise matrix multiplication (demonstrated above); in fact, it is the more commonly used operation on matrices and vectors.
The dot product multiplies each row of the first matrix with each column of the second matrix, element by element, and sums the products. Therefore, the dot product between a j x k matrix and a k x i matrix is a j x i matrix.
Here is an example of how the dot product works between a 3x2 matrix and a 2x3 matrix.


The dot product requires the number of columns in the first matrix to match the number of rows in the second matrix. We use dot() to compute the dot product.
The order of the matrices in the operation is crucial: as shown below, matrix2.dot(matrix1) produces a different result from matrix1.dot(matrix2).
Therefore, unlike element-wise multiplication, the matrix dot product is not commutative.


In [59]:
import numpy as np

matrix1 = np.array([[1,1],
                    [2,2],
                    [3,3]])
matrix2 = np.array([[4,4,4],
                    [5,5,5]])

print("matrix1, matrix2 dot product\n", matrix1.dot(matrix2), "\n")
print("matrix2, matrix1 dot product\n", matrix2.dot(matrix1), "\n")

matrix1, matrix2 dot product
[[ 9 9 9]
[18 18 18]
[27 27 27]]

matrix2, matrix1 dot product
[[24 24]
[30 30]]
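To see where a single entry of the result comes from, one entry can be computed by hand. The following is a minimal sketch; the @ operator is used here as an alternative spelling of the same matrix product, and the names product and entry are illustrative.

```python
import numpy as np

matrix1 = np.array([[1, 1], [2, 2], [3, 3]])   # 3 x 2
matrix2 = np.array([[4, 4, 4], [5, 5, 5]])     # 2 x 3

product = matrix1 @ matrix2    # same result as matrix1.dot(matrix2)

# Entry (0, 0): row 0 of matrix1 times column 0 of matrix2, then summed.
entry = np.sum(matrix1[0, :] * matrix2[:, 0])  # 1*4 + 1*5

print(product.shape)
print(entry, product[0, 0])
```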

Matrix Transpose

Transpose swaps the rows and columns of a matrix, so that a j x k matrix becomes k x j.
To transpose a matrix, we use matrix.T.


In [60]:
import numpy as np

matrix = np.array([[1,1],
                   [2,2],
                   [3,3]])

print("Matrix before transpose\n", matrix)
print("Transposed matrix:\n", matrix.T)

Matrix before transpose
[[1 1]
[2 2]
[3 3]]
Transposed matrix:
[[1 2 3]
[1 2 3]]

Identity and Inverse Matrix

The inverse is an important transformation of matrices, but to understand the inverse matrix we first need to address what an identity matrix is.
An identity matrix is a square matrix (the same number of rows and columns) in which all the diagonal elements are 1 and all other elements are 0.
Additionally, a matrix or vector remains the same after being multiplied by its corresponding identity matrix.

To create a 3 x 3 identity matrix in Python, we use numpy.identity(3).

In [61]:
import numpy as np

ID3 = np.identity(3)
print("3 x 3 Identity Matrix:\n", ID3)

3 x 3 Identity Matrix:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]


The dot product of a matrix (denoted M below) and its inverse (M^-1) is the identity matrix I:

M . M^-1 = M^-1 . M = I

There are two things to take into consideration with the matrix inverse:

1. The order of the matrix and its inverse does not matter, even though most matrix dot products change when the order changes.
2. Not all matrices have an inverse. Only square matrices with a non-zero determinant are invertible.

To compute the inverse of a matrix, we can use np.linalg.inv().

In [62]:
import numpy as np

matrix = np.array([[1, 1, 1], [0, 0, 2], [2, 0, 3]])
print("The inverse of the matrix is:\n", np.linalg.inv(matrix))

The inverse of the matrix is:
 [[ 0.   -0.75  0.5 ]
 [ 1.    0.25 -0.5 ]
 [ 0.    0.5   0.  ]]
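
Both considerations about the inverse can be verified with a few lines. This is a sketch only: it reuses the matrix from the cell above, and the 2 x 2 matrix named singular is a made-up example of a matrix without an inverse.

```python
import numpy as np

M = np.array([[1, 1, 1],
              [0, 0, 2],
              [2, 0, 3]])
M_inv = np.linalg.inv(M)

# Both orders should give the 3 x 3 identity matrix (up to rounding error).
print(np.allclose(M.dot(M_inv), np.identity(3)))
print(np.allclose(M_inv.dot(M), np.identity(3)))

# The second row is twice the first, so the determinant is 0 and
# np.linalg.inv raises LinAlgError instead of returning a result.
singular = np.array([[1, 2],
                     [2, 4]])
try:
    np.linalg.inv(singular)
    has_inverse = True
except np.linalg.LinAlgError:
    has_inverse = False
print("singular matrix has an inverse:", has_inverse)
```

np.allclose is used instead of an exact comparison because floating-point rounding can leave tiny differences from the exact identity matrix.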

At this stage, we have covered only some basic linear algebra concepts that support data representation.
If you would like to go deeper, the free book Mathematics for Machine Learning (https://mml-book.github.io/book/mml-book.pdf) by Deisenroth, Faisal and Ong is a helpful resource.

BONUS Material - not required for the exam


Applications of Numpy Arrays in Linear Algebra and Machine Learning

We will briefly examine three straightforward use cases of vectors and matrices in Numpy:

1. Solving systems of linear equations.
2. Fitting linear regression models for prediction.
3. Neural networks, which are built on matrix operations.

1. Systems of Linear Equations


For example, suppose we have a factory that makes three products, A, B, and C, using three different resources.
Suppose that the production process can be represented by a system of linear equations in three variables x, y, and z.
These variables represent the number of units of each product made in a day, and each equation describes the amount of resources used in producing them:

2x + 4y + 6z = 18
4x + 5y + 6z = 24
3x + y - 2z = 4


The first equation states that if the factory produces 2 * x units of product A, 4 * y units of
product B, and 6 * z units of product C in a day, then the total resources used is 18.
The second equation states that if the factory produces 4 * x units of product A, 5 * y units
of product B, and 6 * z units of product C in a day, then the total resources used is 24.
The third equation states that if the factory produces 3 * x units of product A, 1 * y units of
product B, and -2 * z units of product C in a day, then the total resources used is 4.

This linear system could be used to find out:

How many units of each product were produced in a day given the total amount of
resources used?
How much of each resource was used to produce a certain number of units of each
product?

A typical way to compute the values of x, y, and z is to eliminate one variable at a time, which can take many steps for three variables.

An alternative is to represent the system as the dot product between a matrix and a vector.
We can package all the coefficients into a matrix and all the variables into a vector, which gives the following:

[[2, 4,  6],     [x]     [18]
 [4, 5,  6],  .  [y]  =  [24]
 [3, 1, -2]]     [z]     [ 4]

The matrix representation lets us solve the system in one step, as demonstrated below.

We represent the coefficient matrix as M, the variable vector as x, and the output vector as y:

M . x = y

Then we multiply both sides of the equation by the inverse of the matrix M:

M^-1 . M . x = M^-1 . y

Since the dot product between the inverse of the matrix and the matrix itself is the identity matrix, the solution of the linear system simplifies to the dot product between the inverse of the coefficient matrix M and the output vector y:

x = M^-1 . y


In [63]:
import numpy as np

M = np.array([[2, 4, 6], [4, 5, 6], [3, 1, -2]])
y = np.array([18, 24, 4])

# np.linalg.inv computes the inverse of a matrix
variables_vector = np.linalg.inv(M).dot(y)
print("Values of variable vector is:", variables_vector)
# round() before int(): int() alone truncates, so 2.9999999 would print as 2
print("Value of variable x is:", int(round(variables_vector[0])))
print("Value of variable y is:", int(round(variables_vector[1])))
print("Value of variable z is:", int(round(variables_vector[2])))

Values of variable vector is: [ 4. -2.  3.]
Value of variable x is: 4
Value of variable y is: -2
Value of variable z is: 3
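
The inverse-based approach above follows the derivation step by step. As an alternative (not the method used in this notebook), NumPy also provides np.linalg.solve, which solves M . x = y without forming the inverse explicitly. A minimal sketch with the same coefficients:

```python
import numpy as np

M = np.array([[2, 4, 6],
              [4, 5, 6],
              [3, 1, -2]])
y = np.array([18, 24, 4])

# Solve M . x = y directly, without computing the inverse of M.
solution = np.linalg.solve(M, y)
print("Solution:", solution)

# Substituting the solution back should reproduce the right-hand side.
print("M . x reproduces y:", np.allclose(M.dot(solution), y))
```

Checking the solution by substitution is a useful habit, because floating-point results such as 2.9999999 are easy to misread when printed or truncated.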

2. Linear Regression
The same principle shown in solving the linear equation system can be generalized to linear
regression models in machine learning.

Linear regression is a method for modeling the relationship between one or more independent
variables and a dependent variable.
Linear regression can be solved using several methods, all of which rely on matrix operations:

1. Matrix reformulation with the normal equations.
2. Using a QR matrix decomposition.
3. Using SVD and the pseudoinverse.

Further information can be found at this link (https://machinelearningmastery.com/solve-linear-regression-using-linear-algebra/).
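
The three methods listed above can be compared side by side. The sketch below is only an illustration: the data values are made up, and the column of ones (an intercept term) is an assumption added here. It uses np.linalg.inv, np.linalg.qr and np.linalg.pinv, and all three approaches should give the same coefficients up to rounding.

```python
import numpy as np

# Illustrative toy data (made up for this sketch).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
X = np.column_stack([np.ones_like(x), x])   # shape (5, 2): columns are [1, x]

# 1. Normal equations: w = (X^T . X)^-1 . X^T . y
w_normal = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

# 2. QR decomposition: X = Q . R, then solve R . w = Q^T . y
Q, R = np.linalg.qr(X)
w_qr = np.linalg.solve(R, Q.T.dot(y))

# 3. SVD-based pseudoinverse: w = pinv(X) . y
w_pinv = np.linalg.pinv(X).dot(y)

print("normal equations:", w_normal)
print("QR decomposition:", w_qr)
print("pseudoinverse:   ", w_pinv)
```

The QR and SVD routes are generally preferred for numerical stability when X^T . X is close to singular, although for small, well-behaved data such as this the results coincide.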

Example:

Suppose that we have a dataset with n features and m instances.
One of these features (e.g. y: Salary) depends on the other features (e.g. x1: Experience, x2: Education, x3: Skills, etc.).
We can represent the linear regression relationship between the dependent feature (y) and the independent features (x1, x2, x3) as the weighted sum of these features:

y = w0 + w1*x1 + w2*x2 + w3*x3

The objective of creating a linear regression model is to find the coefficient values (w0, w1, w2, etc.) that minimize the error in the prediction of the output variable y.


We can represent the formula in matrix form.

We can store the feature values in a 1 x (n+1) matrix and the weights in an (n+1) x 1 vector. Then we multiply each feature value by its matching weight and add the products together to get the weighted sum.

When the number of instances increases, we might naturally think of using a for loop to process one instance at a time, which can be time consuming.

By representing the algorithm in the matrix format, the linear regression optimization process
boils down to solving the coefficient vector [w0, w1, w2 … wn] through linear algebra
operations.
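
To make the contrast between the loop and the matrix form concrete, here is a small sketch. The feature values and weights are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical feature values for m = 3 instances and n = 2 features.
features = np.array([[2.0, 1.0],
                     [4.0, 3.0],
                     [6.0, 5.0]])
# Prepend a column of ones so that w0 acts as the intercept.
X = np.column_stack([np.ones(len(features)), features])   # shape (3, 3)
w = np.array([0.5, 2.0, -1.0])                             # hypothetical [w0, w1, w2]

# Loop version: one weighted sum per instance.
y_loop = np.array([sum(X[i, j] * w[j] for j in range(len(w)))
                   for i in range(len(X))])

# Matrix version: a single dot product covers every instance.
y_matrix = X.dot(w)

print(y_loop)
print(y_matrix)
```

Both versions compute the same weighted sums; the matrix version simply hands the iteration over to NumPy's compiled code.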


We can solve linear regression directly.
That is, given X, we can find the coefficients w that, when multiplied by X, will give y.

The normal equations, listed above as the first method, define how to calculate w directly:

w = (X^T . X)^-1 . X^T . y

This can be calculated directly in NumPy using the inv() function for calculating the matrix
inverse.

w = inv(X.T.dot(X)).dot(X.T).dot(y)

Once the coefficients are calculated, we can use them to predict outcomes given X.

yhat = X.dot(w)


In [64]:
# Linear Regression Example
from numpy import array
from numpy.linalg import inv
from matplotlib import pyplot

data = array([
    [0.05, 0.12],
    [0.18, 0.22],
    [0.31, 0.35],
    [0.42, 0.38],
    [0.5, 0.49],
])

X, y = data[:, 0], data[:, 1]
X = X.reshape((len(X), 1))
# linear least squares
w = inv(X.T.dot(X)).dot(X.T).dot(y)
print(w)
# predict using coefficients
yhat = X.dot(w)
# plot data and predictions
pyplot.scatter(X, y)
pyplot.plot(X, yhat, color='red')
pyplot.show()

[1.00233226]

Additionally, popular Python libraries such as Numpy and Pandas build upon matrix representation and use "vectorization" to speed up data processing.


3. Neural Network
A neural network is composed of multiple layers of interconnected nodes, where the outputs of nodes in the previous layer are weighted and then aggregated to form the input of the subsequent layer.

If we zoom into the interconnected layer of a neural network, we can see some components of
the regression model.

Take a simple example in which we visualize the inner workings of hidden layer i (with nodes i1, i2, i3) and hidden layer j (with nodes j1, j2) of a neural network.

w11 represents the weight of the input node i1 that feeds into node j1, and w21 represents the weight of input node i2 that feeds into node j1. In this case, we can package the weights into a 3 x 2 matrix.

This can be generalized to thousands or even millions of instances which forms the massive
training dataset of neural network models.

This process resembles how we represent the linear regression model, except that we use a matrix to store the weights instead of a vector. The principle remains the same.
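
The step from layer i to layer j can be sketched as a single dot product. All numbers below are hypothetical, and the activation function that a real network would apply afterwards is left out to keep the focus on the weight matrix.

```python
import numpy as np

# Hypothetical outputs of the three nodes i1, i2, i3 for a single instance.
layer_i = np.array([[0.2, 0.5, 0.1]])        # shape (1, 3)

# Hypothetical 3 x 2 weight matrix: row r, column c holds the weight from
# node i(r+1) to node j(c+1), so W[0, 0] is w11 and W[1, 0] is w21.
W = np.array([[0.4, -0.3],
              [0.1,  0.8],
              [-0.5, 0.2]])

# Weighted sums feeding nodes j1 and j2 (before any activation function).
layer_j_input = layer_i.dot(W)               # shape (1, 2)
print(layer_j_input.shape)
print(layer_j_input)

# Several instances can be processed at once by stacking them as rows.
batch = np.array([[0.2, 0.5, 0.1],
                  [1.0, 0.0, 0.0]])          # shape (2, 3)
print(batch.dot(W).shape)
```

Stacking instances as rows is what lets the same dot product scale from one instance to a whole training set.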


To take a step further, we can expand this to deep neural networks for deep learning.

This is where tensors come into play, representing data with more than two dimensions.

For example, in a convolutional neural network, we use a 3D tensor for image pixels, as they are often described by three color channels (red, green, and blue).
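
In NumPy, a tensor is simply an array with more than two dimensions. The sketch below builds a hypothetical 4 x 5 pixel image; the height x width x channels layout is an assumption made here, since some libraries put the channel axis first.

```python
import numpy as np

# A hypothetical 4 x 5 pixel RGB image: height x width x color channels.
image = np.zeros((4, 5, 3), dtype=np.uint8)
image[:, :, 0] = 255            # fill the red channel of every pixel

print(image.ndim)               # number of dimensions
print(image.shape)
print(image[0, 0])              # the three channel values of the top-left pixel

# A batch of 10 such images adds a fourth dimension.
batch = np.zeros((10, 4, 5, 3), dtype=np.uint8)
print(batch.shape)
```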

As you can see, linear algebra acts as a building block of machine learning and deep learning algorithms, and this is just one of the many use cases of linear algebra in data science.

A Neural Network in 11 lines of code in Python (Reference: http://iamtrask.github.io/2015/07/12/basic-python-network/)

Input features: [[0,0,1],[0,1,1],[1,0,1],[1,1,1]]
Output features: [[0,1,1,0]]

Objective: Build a neural network that can map the relationship between inputs and outputs.

In [65]:
import numpy as np

# Training data: four instances with three input features and one output each
X = np.array([[0,0,1],[0,1,1],[1,0,1],[1,1,1]])
y = np.array([[0,1,1,0]]).T
# Random initial weights in [-1, 1) for the two layers
syn0 = 2*np.random.random((3,4)) - 1
syn1 = 2*np.random.random((4,1)) - 1
for j in np.arange(60000):
    # Forward pass through the sigmoid hidden layer and output layer
    l1 = 1/(1+np.exp(-(np.dot(X,syn0))))
    l2 = 1/(1+np.exp(-(np.dot(l1,syn1))))
    # Backpropagate the error and update both weight matrices
    l2_delta = (y - l2)*(l2*(1-l2))
    l1_delta = l2_delta.dot(syn1.T) * (l1 * (1-l1))
    syn1 += l1.T.dot(l2_delta)
    syn0 += X.T.dot(l1_delta)

In [66]:
print(syn0, syn1)

[[ 0.89642903 -7.20395548 -4.91482837 -5.13488518]
 [-3.49949717  6.49883509 -4.46420599  6.16026857]
 [ 0.85995983 -3.16728359  1.3705138   2.30188465]] [[ 6.98670876]
 [12.75748353]
 [-5.98082173]
 [-7.0643544 ]]
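
The weight matrices on their own are hard to interpret, so one way to check the result is to run a forward pass and compare the predictions with y. The sketch below repeats the same training loop with a fixed seed. The seed value and the sigmoid helper function are assumptions added here for repeatability and readability, so the trained weights will differ from those printed above.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation used for both layers."""
    return 1 / (1 + np.exp(-z))

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

np.random.seed(1)                      # assumption: fixed seed for repeatability
syn0 = 2 * np.random.random((3, 4)) - 1
syn1 = 2 * np.random.random((4, 1)) - 1

for _ in range(60000):
    l1 = sigmoid(X.dot(syn0))
    l2 = sigmoid(l1.dot(syn1))
    l2_delta = (y - l2) * (l2 * (1 - l2))
    l1_delta = l2_delta.dot(syn1.T) * (l1 * (1 - l1))
    syn1 += l1.T.dot(l2_delta)
    syn0 += X.T.dot(l1_delta)

# Forward pass with the trained weights.
predictions = sigmoid(sigmoid(X.dot(syn0)).dot(syn1))
print(np.round(predictions, 3))
print("mean absolute error:", np.mean(np.abs(y - predictions)))
```

If training has worked, the predictions should be close to the target column y, and the mean absolute error should be small.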


Universal Functions in Numpy

Python's default implementation (known as CPython) does some operations very slowly.
Numpy provides an easy and flexible interface to optimized computation with arrays of
data.
The key to making it fast is to use vectorized operations, generally implemented through
NumPy's universal functions (ufuncs).

Let's see an example that compares a normal for loop with the ufunc alternative:
- The example will compute the reciprocal of the elements in an array using a for loop and then using universal functions.
- We will use timeit to determine the execution time.

Note: the reciprocal of a number is the quantity obtained by dividing the number one by that number.

In [67]:
# Create data array
import numpy as np

np.random.seed(0)
big_array = np.random.randint(1, 100, size=1000000)

# Define conventional function
def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

Run the conventional function

In [68]:
%timeit compute_reciprocals(big_array)

1.81 s ± 48.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Now let's try the Numpy universal function approach

In [*]:
%timeit (1.0 / big_array)

The universal function is much faster than the standard for loop approach.
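
Outside a notebook, where the %timeit magic is not available, the same comparison can be made with Python's standard timeit module. This is a sketch under a few assumptions: a smaller array than above is used so that it runs quickly, and the timings will vary from machine to machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
values = rng.integers(1, 100, size=100_000)   # smaller than the 1,000,000 above

def compute_reciprocals(values):
    """Loop-based reciprocal, mirroring the conventional function above."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

# Average time per call over three runs of each approach.
loop_time = timeit.timeit(lambda: compute_reciprocals(values), number=3) / 3
ufunc_time = timeit.timeit(lambda: 1.0 / values, number=3) / 3

print(f"for loop: {loop_time:.4f} s per call")
print(f"ufunc:    {ufunc_time:.4f} s per call")

# Both approaches should produce the same numbers.
same = np.allclose(compute_reciprocals(values), 1.0 / values)
print("same result:", same)
```

The final check matters as much as the timing: a faster version is only useful if it returns the same values.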

Another example, using the standard sum function and the np.sum function:


In [*]:
big_array = np.random.rand(1000000)

%timeit sum(big_array)
%timeit np.sum(big_array)

Exploring NumPy's UFuncs

Array arithmetic, Trigonometric, Exponents and logarithms

NumPy's ufuncs feel very natural to use because they make use of Python's native arithmetic
operators. The standard addition, subtraction, multiplication, and division can all be used:

In [*]:
x = np.arange(4)

print("x =", x)
print("x + 5 =", x + 5)
print("x - 5 =", x - 5)
print("x * 2 =", x * 2)
print("x / 2 =", x / 2)
print("x // 2 =", x // 2)  # floor division
print("-x = ", -x)
print("x ** 2 = ", x ** 2)
print("x % 2 = ", x % 2)
print(abs(x))

# Trigonometric functions

theta = np.linspace(0, np.pi, 3)
print("theta = ", theta)
print("sin(theta) = ", np.sin(theta))
print("cos(theta) = ", np.cos(theta))
print("tan(theta) = ", np.tan(theta))

# Exponents and logarithms
x = [1, 2, 3]
print("x =", x)
print("e^x =", np.exp(x))
print("2^x =", np.exp2(x))
print("3^x =", np.power(3, x))

x = [1, 2, 4, 10]
print("x =", x)
print("ln(x) =", np.log(x))
print("log2(x) =", np.log2(x))
print("log10(x) =", np.log10(x))

Each of these arithmetic operations is simply a convenient wrapper around a specific function built into NumPy; for example, the + operator is a wrapper for the add function:


In [*]:
np.add(x, 2)

The following table lists the arithmetic operators implemented in NumPy:

Operator    Equivalent ufunc    Description
+           np.add              Addition (e.g., 1 + 1 = 2)
-           np.subtract         Subtraction (e.g., 3 - 2 = 1)
-           np.negative         Unary negation (e.g., -2)
*           np.multiply         Multiplication (e.g., 2 * 3 = 6)
/           np.divide           Division (e.g., 3 / 2 = 1.5)
//          np.floor_divide     Floor division (e.g., 3 // 2 = 1)
%           np.mod              Modulus/remainder (e.g., 9 % 4 = 1)
**          np.power            Exponentiation (e.g., 2 ** 3 = 8)
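
The correspondence in the table can be confirmed by comparing each operator with its ufunc on the same array. The array values and the dictionary named pairs are illustrative choices for this sketch.

```python
import numpy as np

x = np.arange(1, 5)          # array([1, 2, 3, 4])

# Each entry pairs the operator form with the equivalent ufunc call.
pairs = {
    "+":   (x + 2,  np.add(x, 2)),
    "-":   (x - 2,  np.subtract(x, 2)),
    "neg": (-x,     np.negative(x)),
    "*":   (x * 2,  np.multiply(x, 2)),
    "/":   (x / 2,  np.divide(x, 2)),
    "//":  (x // 2, np.floor_divide(x, 2)),
    "%":   (x % 2,  np.mod(x, 2)),
    "**":  (x ** 2, np.power(x, 2)),
}

for symbol, (via_operator, via_ufunc) in pairs.items():
    print(symbol, np.array_equal(via_operator, via_ufunc))
```

Every line should report True, since the operators are implemented by calling these ufuncs.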

Aggregation Functions

Min, Max, and Everything In Between


A large number of statistical functions are available in numpy (mean, median, amin, amax, percentile, etc.).

For example:

In [*]:
arr = np.random.randint(20, size=(10))
print("unsorted: ", arr)
print("sorted: ", np.sort(arr))

print("Sum: ", np.sum(arr))
print("Mean: ", np.mean(arr))
print("Median: ", np.median(arr))
print("Standard Deviation: ", np.std(arr))
print("Variance: ", np.var(arr))
print("Minimum: ", np.amin(arr, 0))  # min along axis 0
print("Maximum: ", np.amax(arr, 0))  # max along axis 0

# The percentile() function computes the nth percentile
# of the given data (array elements) along the specified axis.
print("10% percentile: ", np.percentile(arr, 10))
print("50% percentile: ", np.percentile(arr, 50))
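
The axis argument used with amin and amax above becomes more meaningful on a two-dimensional array. The following sketch uses a small hypothetical table to show the difference between aggregating over the whole array, down the columns (axis=0), and across the rows (axis=1).

```python
import numpy as np

# Hypothetical 2 x 3 table of values: 2 rows, 3 columns.
table = np.array([[3, 7, 5],
                  [9, 1, 4]])

print(np.amin(table))            # smallest value in the whole array
print(np.amin(table, axis=0))    # minimum of each column
print(np.amax(table, axis=1))    # maximum of each row
print(np.mean(table, axis=0))    # mean of each column
```

A helpful way to remember this: the axis you name is the one that gets collapsed, so axis=0 leaves one value per column.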

Additional descriptive functions can be incorporated using other Python libraries, e.g. SciPy, statsmodels, etc.


In [*]:
from scipy.stats import skew, kurtosis, mode

print("Mode: ", mode(arr))
print("Skew: ", skew(arr))
print("Kurtosis: ", kurtosis(arr))

Broadcasting in NumPy Arrays


For arrays of the same size, binary operations are performed on an element-by-element basis

In [*]:
import numpy as np

a = np.array([0, 1, 2])
b = np.array([5, 5, 5])
a + b

Broadcasting allows these types of binary operations to be performed on arrays of different sizes.

For example, we can just as easily add a scalar (think of it as a zero-dimensional array) to an array:

In [*]:
a + 5

We can think of this as an operation that stretches or duplicates the value 5 into the array
[5, 5, 5], and adds the results.
The advantage of NumPy's broadcasting is that this duplication of values does not actually
take place, but it is a useful mental model as we think about broadcasting.
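
The mental model of stretching the value 5 can be made visible with np.broadcast_to, which returns a read-only view rather than a copy. This is a sketch for illustration; NumPy does not need this call when evaluating a + 5.

```python
import numpy as np

a = np.array([0, 1, 2])

# Explicitly build the "stretched" view that the mental model describes.
stretched = np.broadcast_to(5, a.shape)
print(stretched)

# The view reuses the single value: its stride is 0, so nothing was copied.
print(stretched.strides)

# Adding the scalar gives the same result as adding the stretched array.
print(np.array_equal(a + 5, a + stretched))
```

A stride of 0 means that stepping to the next element does not move in memory at all, which is how one stored value can appear as three.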

In [*]:
# create a 3 by 3 array filled with ones
c = np.ones((3, 3))
print(c)

In [*]:
c + a


Here the one-dimensional array a is stretched, or broadcast across the second dimension in
order to match the shape of array c.

While these examples are relatively easy to understand, more complicated cases can involve
broadcasting of both arrays.

Consider the following example:

In [*]:
a = np.arange(3)
b = np.arange(3)[:, np.newaxis]

print(a)
print(b)

In [*]:
a + b

Just as before we stretched or broadcast one value to match the shape of the other, here we've stretched both a and b to match a common shape, and the result is a two-dimensional array!
The geometry of these examples is visualized in the following figure (code to produce this plot can be found in the appendix, and is adapted from source published in the astroML documentation; used by permission).

[Figure: visualization of NumPy broadcasting]
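
The stretching of both arrays can also be inspected in code. The sketch below uses np.broadcast_arrays to show what a and b look like once expanded to the common shape, and adds a made-up example of two shapes that cannot be broadcast together.

```python
import numpy as np

a = np.arange(3)                    # shape (3,)
b = np.arange(3)[:, np.newaxis]     # shape (3, 1)

# Show the arrays as they look after both have been stretched.
a_stretched, b_stretched = np.broadcast_arrays(a, b)
print(a_stretched.shape, b_stretched.shape)
print(a_stretched)
print(b_stretched)

result = a + b
print(result.shape)

# Shapes that cannot be matched raise a ValueError.
try:
    np.arange(3) + np.arange(4)
    compatible = True
except ValueError:
    compatible = False
print("shapes (3,) and (4,) broadcast together:", compatible)
```

Broadcasting only stretches dimensions of size 1 (or missing dimensions), which is why lengths 3 and 4 cannot be combined.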

References

www.w3schools.org (http://www.w3schools.org)
www.tutorialspoint.com (http://www.tutorialspoint.com)
Python Data Science Handbook, Jake VanderPlas, 2017: Link 1 (https://jakevdp.github.io/PythonDataScienceHandbook/), Link 2 (https://github.com/jakevdp/PythonDataScienceHandbook)
How is Linear Algebra Applied for Machine Learning (https://medium.com/towards-data-science/how-is-linear-algebra-applied-for-machine-learning-d193bdeed268)
