Applied Machine Learning For Engineers: Introduction To Numpy
FS 2020 - B. Vennemann
Introduction to NumPy
What is NumPy
Python module for scientific computing
Provides an efficiency boost over Python's built-in datatypes
Offers matrix operations and linear algebra
Provides other useful mathematical operations (trigonometric functions, statistical computations, random numbers, ...)
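As a rough illustration of the efficiency point above, the following sketch runs the same element-wise squaring once as a Python list comprehension and once as a NumPy expression. The array size and repetition count are arbitrary assumptions, and the measured times depend on the machine, so no specific speedup is claimed.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary problem size, chosen only for illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)  # int64 avoids overflow when squaring

# Same computation, once element by element and once vectorized
squares_list = [x * x for x in py_list]
squares_arr = np_arr * np_arr

t_list = timeit.timeit(lambda: [x * x for x in py_list], number=10)
t_arr = timeit.timeit(lambda: np_arr * np_arr, number=10)
print(f"list comprehension: {t_list:.4f} s, NumPy: {t_arr:.4f} s")
```

Both variants produce the same values; only the execution time differs.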
In [1]:
import numpy as np
In [2]:
a = np.array([1, 2, 3])
print(a)
print(type(a))
print(a.dtype) # element datatype
print(a.shape) # shape
print(a.ndim) # number of dimensions
print(a.size) # total number of elements
[1 2 3]
<class 'numpy.ndarray'>
int64
(3,)
1
3
Unlike Python lists, all elements in a NumPy array must have the same datatype. The datatype can be set explicitly during array creation.
In [3]:
[1 2 3]
int16
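The source of cell In [3] is not preserved here. The following is a minimal sketch that would produce the same output, assuming the dtype was passed to np.array; the variable names are illustrative.

```python
import numpy as np

# Request 16-bit integers instead of the default integer type
a16 = np.array([1, 2, 3], dtype=np.int16)
print(a16)        # [1 2 3]
print(a16.dtype)  # int16

# Without an explicit dtype, mixed inputs are converted to one common datatype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```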
In [4]:
[[1 2 3]
[4 5 6]]
(2, 3)
In [5]:
z = np.zeros((3, 4))
print(z)
print(z.dtype)
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
float64
In [6]:
[[1 1 1 1]
[1 1 1 1]
[1 1 1 1]]
uint8
In [7]:
[[3 3]
[3 3]
[3 3]
[3 3]]
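The sources of cells In [6] and In [7] are missing. Outputs of this form can be produced with np.ones and np.full, as sketched below; the exact calls used in the original are an assumption.

```python
import numpy as np

# Array of ones with an explicit unsigned 8-bit datatype
ones = np.ones((3, 4), dtype=np.uint8)
print(ones)
print(ones.dtype)  # uint8

# Array of shape (4, 2) filled with a constant value
threes = np.full((4, 2), 3)
print(threes)
```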
Create a sequence of numbers using np.arange (similar to Python's range, but returns a NumPy array)
In [8]:
[10 15 20 25]
<class 'numpy.ndarray'>
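A minimal sketch of np.arange with start, stop and step. The arguments are chosen to match the output shown above and are an assumption about the original cell.

```python
import numpy as np

# start=10, stop=30 (exclusive), step=5
seq = np.arange(10, 30, 5)
print(seq)        # [10 15 20 25]
print(type(seq))  # <class 'numpy.ndarray'>

# With a single argument, the sequence starts at 0
first_five = np.arange(5)
print(first_five)  # [0 1 2 3 4]
```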
In [9]:
Basic operations
In [10]:
a = np.arange(0, 10)
print(a)
print(a + 2)
print(a - 2)
print(a * 2)
print(a / 2)
print(a ** 2)
[0 1 2 3 4 5 6 7 8 9]
[ 2 3 4 5 6 7 8 9 10 11]
[-2 -1 0 1 2 3 4 5 6 7]
[ 0 2 4 6 8 10 12 14 16 18]
[0. 0.5 1. 1.5 2. 2.5 3. 3.5 4. 4.5]
[ 0 1 4 9 16 25 36 49 64 81]
In [11]:
a = np.arange(0, 10)
b = np.arange(10, 20)
print(a)
print(b)
print(a + b)
print(a * b)
[0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[10 12 14 16 18 20 22 24 26 28]
[ 0 11 24 39 56 75 96 119 144 171]
The ndarray class provides some handy methods, e.g.
mean
max
min
std
...
In [14]:
A = np.array([[1, 5, 7],
[2, 3, 0],
[14, 12, 11]
])
print(A.min())
print(A.max())
print(A.mean())
print(A.std())
0
14
6.111111111111111
4.863570806275398
These operations can also be computed along a single axis using the axis keyword. For a 2D array, axis=0 gives one result per column and axis=1 gives one result per row.
In [15]:
print(np.min(A, axis=0))
print(np.min(A, axis=1))
[1 3 0]
[ 1 0 11]
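The axis keyword works the same way for the other reductions. A short sketch using the same matrix A as above, with both the method form and the function form:

```python
import numpy as np

A = np.array([[1, 5, 7],
              [2, 3, 0],
              [14, 12, 11]])

col_max = A.max(axis=0)     # one value per column
row_sum = np.sum(A, axis=1) # one value per row
print(col_max)  # [14 12 11]
print(row_sum)  # [13  5 37]
```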
In [16]:
[[1 2 3 4]
[5 6 7 8]]
In [17]:
Out[17]:
In [18]:
print(A[:, 0])
print(A[0, :])
[1 5]
[1 2 3 4]
3D example
In [19]:
Out[19]:
array([[[0.08324252, 0.15436984],
[0.64798569, 0.18131407],
[0.6383836 , 0.21887796],
[0.39368218, 0.98799556],
[0.22712382, 0.50594579]],
[[0.31433161, 0.59736486],
[0.91134491, 0.9936921 ],
[0.33073117, 0.00934313],
[0.65878591, 0.28912556],
[0.4903908 , 0.32505673]],
[[0.63503595, 0.24154275],
[0.32045564, 0.21570191],
[0.88073534, 0.8438529 ],
[0.47166065, 0.55325182],
[0.97349501, 0.12223283]]])
In [20]:
B[0,:,:]
Out[20]:
array([[0.08324252, 0.15436984],
[0.64798569, 0.18131407],
[0.6383836 , 0.21887796],
[0.39368218, 0.98799556],
[0.22712382, 0.50594579]])
In [21]:
B[0,...]
Out[21]:
array([[0.08324252, 0.15436984],
[0.64798569, 0.18131407],
[0.6383836 , 0.21887796],
[0.39368218, 0.98799556],
[0.22712382, 0.50594579]])
In [22]:
B[0]
Out[22]:
array([[0.08324252, 0.15436984],
[0.64798569, 0.18131407],
[0.6383836 , 0.21887796],
[0.39368218, 0.98799556],
[0.22712382, 0.50594579]])
In [23]:
[6 7]
Slicing with defined step
In [24]:
[5 7]
[5 7]
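The source of cell In [24] is not preserved. The sketch below shows two step slices on the 2x4 array from In [16] that give this output; whether the original used exactly these expressions is an assumption.

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# start:stop:step, here every second element of the second row
every_second = A[1, ::2]
print(every_second)  # [5 7]

# The same elements with explicit start and stop
explicit = A[1, 0:3:2]
print(explicit)      # [5 7]

# A negative step reverses the order
reversed_row = A[0, ::-1]
print(reversed_row)  # [4 3 2 1]
```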
In [25]:
8
[3 4]
In [26]:
columns = [0, 2, 3]
print(A[0, columns])
[1 3 4]
Quick exercise
Create the following matrix using numpy slicing operations
In [27]:
%%latex
\begin{bmatrix}
1 & 1 & 1 & 1 & 1\\
1 & 2 & 2 & 2 & 1\\
1 & 2 & 3 & 2 & 1\\
1 & 2 & 2 & 2 & 1\\
1 & 1 & 1 & 1 & 1\\
\end{bmatrix}
⎡1 1 1 1 1⎤
⎢1 2 2 2 1⎥
⎢1 2 3 2 1⎥
⎢1 2 2 2 1⎥
⎣1 1 1 1 1⎦
In [28]:
M = np.ones((5, 5))
M[1:-1, 1:-1] = np.full((3, 3), fill_value=2)
M[2, 2] = 3
M
Out[28]:
In [29]:
[4 5]
[4 5]
[4 5]
In [30]:
In [32]:
arr[0,0] = 20
arr
Out[32]:
array([[20, 4, 6],
[ 2, 5, 7]])
Matrix operations
In [33]:
[[60 12 18]
[ 6 15 21]]
In [34]:
print(arr + 3)
[[23 7 9]
[ 5 8 10]]
In [35]:
print(arr / 2)
[[10. 2. 3. ]
[ 1. 2.5 3.5]]
In [36]:
[[1 2]
[1 2]]
[[3 4]
[3 4]]
[[4 6]
[4 6]]
[[3 8]
[3 8]]
In [37]:
[[1 2]
[1 2]]
[[3 4]
[3 4]]
[[ 9 12]
[ 9 12]]
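The sources of cells In [36] and In [37] are missing. The outputs match element-wise addition and multiplication followed by a matrix product, which can be sketched as follows (an assumed reconstruction):

```python
import numpy as np

A = np.array([[1, 2], [1, 2]])
B = np.array([[3, 4], [3, 4]])

elementwise_sum = A + B    # adds matching entries
elementwise_prod = A * B   # multiplies matching entries, NOT a matrix product
matrix_prod = np.dot(A, B) # matrix product, also available as A @ B

print(elementwise_sum)   # [[4 6] [4 6]]
print(elementwise_prod)  # [[3 8] [3 8]]
print(matrix_prod)       # [[ 9 12] [ 9 12]]
```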
In [38]:
A = np.array([[1, 2, 3]])
B = np.array([[4], [5], [6]])
print(A.shape)
print(B.shape)
print(np.dot(A, B)) # Shapes must be consistent
(1, 3)
(3, 1)
[[32]]
In [39]:
# When arrays of different datatypes are combined, the result is promoted to a
# datatype that can represent both inputs (here uint8 and int64 give int64)
A = np.array([1, 2, 3], dtype='uint8')
B = np.array([[4], [5], [6]], dtype='int64')
C = A * B
print(C)
print(C.dtype)
[[ 4 8 12]
[ 5 10 15]
[ 6 12 18]]
int64
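To check which datatype a combination would produce without building the arrays, np.result_type can be used. The sketch below also shows that no promotion happens when both operands share a small datatype, so values can wrap around silently; the example values are illustrative.

```python
import numpy as np

# Promotion picks a datatype that can hold values of both inputs
print(np.result_type(np.uint8, np.int64))    # int64
print(np.result_type(np.uint8, np.int8))     # int16
print(np.result_type(np.int32, np.float32))  # float64

# Same datatype on both sides: the result stays uint8 and wraps modulo 256
x = np.array([200], dtype=np.uint8)
y = np.array([100], dtype=np.uint8)
wrapped = x + y
print(wrapped)  # [44]
```

Note that in the cell above, A has shape (3,) and B has shape (3, 1), so the product is broadcast to shape (3, 3).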
Reshaping arrays
In [40]:
A = np.array([1, 2, 3, 4, 5, 6, 7, 8])
print(np.reshape(A, (8, 1)))
print(np.reshape(A, (2, 4)))
[[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]]
[[1 2 3 4]
[5 6 7 8]]
In [41]:
B = A.reshape((2, 4))
print(B) # also works
[[1 2 3 4]
[5 6 7 8]]
In [42]:
[1 2 3 4 5 6 7 8]
In [43]:
print(B)
print(B.T) # Transpose the array
[[1 2 3 4]
[5 6 7 8]]
[[1 5]
[2 6]
[3 7]
[4 8]]
In [44]:
[1 2 3 4 5 6 7 8]
[[1 2 3 4]
[5 6 7 8]]
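The sources of cells In [42] and In [44] are not preserved. A minimal sketch of flattening a 2D array and of letting NumPy infer one dimension with -1 follows; it is an assumption that the original cells used these calls.

```python
import numpy as np

A = np.arange(1, 9)     # [1 2 3 4 5 6 7 8]

# -1 lets NumPy infer the missing dimension from the total size
B = A.reshape((2, -1))
print(B.shape)          # (2, 4)

# flatten returns a 1D copy of the array
flat = B.flatten()
print(flat)             # [1 2 3 4 5 6 7 8]
```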
Stacking arrays
In [45]:
[[3.16485187 5.50691504]
[8.19219751 2.46261689]]
[[0.57457897 0.8490203 ]
[0.70893401 0.2810346 ]]
[[3.16485187 5.50691504]
[8.19219751 2.46261689]
[0.57457897 0.8490203 ]
[0.70893401 0.2810346 ]]
In [46]:
print(np.hstack((A, B)))
For 2D arrays, np.vstack stacks along the first axis (rows) and np.hstack stacks along the second axis (columns).
np.concatenate lets you specify the axis explicitly: np.concatenate((a1, a2, ...), axis=0)
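The relationship between the three functions can be sketched with two small integer arrays (used here instead of random values so the result is reproducible):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

v = np.vstack((A, B))  # shape (4, 2): B is appended below A
h = np.hstack((A, B))  # shape (2, 4): B is appended to the right of A

# np.concatenate reproduces both by choosing the axis
same_as_v = np.concatenate((A, B), axis=0)
same_as_h = np.concatenate((A, B), axis=1)

print(v.shape, h.shape)  # (4, 2) (2, 4)
```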
In [47]:
[[3.16485187 5.50691504]
[8.19219751 2.46261689]
[0.57457897 0.8490203 ]
[0.70893401 0.2810346 ]]
In [48]:
Also see np.r_ and np.c_, which build arrays by concatenation (np.r_ along the first axis, np.c_ column-wise) and also accept slice notation such as 4:7
In [49]:
a = np.r_[1, 2, 3, 4:7, 9]
a
Out[49]:
array([1, 2, 3, 4, 5, 6, 9])
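For comparison, a short sketch of np.c_, which places 1D inputs side by side as columns. The input values are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

cols = np.c_[x, y]  # each 1D input becomes one column
print(cols)
# [[1 4]
#  [2 5]
#  [3 6]]

# np.r_ expands slice notation into a range, stop excluded
row = np.r_[0:3, 10]
print(row)  # [ 0  1  2 10]
```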
Splitting arrays
Split arrays horizontally with np.hsplit by specifying either the number of equally shaped arrays or the column indices at which to split
In [50]:
C = np.random.random(size=(6, 6))
C
Out[50]:
In [51]:
D, E, F = np.hsplit(C, 3)
print(D)
print('')
print(E)
print('')
print(F)
[[1.61868575e-01 4.13366875e-01]
[3.72698791e-02 4.71009944e-01]
[4.56622699e-01 8.28516837e-01]
[9.63742500e-02 9.32547885e-01]
[4.05271519e-01 6.24753424e-02]
[4.99657300e-04 7.05094876e-01]]
[[0.87395411 0.73092786]
[0.37242396 0.38790044]
[0.16809002 0.15013425]
[0.2254584 0.927528 ]
[0.62919061 0.64623755]
[0.89660832 0.11522103]]
[[0.49902899 0.86291296]
[0.73599948 0.25923562]
[0.91176474 0.4139083 ]
[0.37044447 0.79934959]
[0.58278234 0.57991 ]
[0.88734888 0.82004693]]
In [52]:
[[1.61868575e-01 4.13366875e-01]
[3.72698791e-02 4.71009944e-01]
[4.56622699e-01 8.28516837e-01]
[9.63742500e-02 9.32547885e-01]
[4.05271519e-01 6.24753424e-02]
[4.99657300e-04 7.05094876e-01]]
[[0.86291296]
[0.25923562]
[0.4139083 ]
[0.79934959]
[0.57991 ]
[0.82004693]]
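The source of cell In [52] is missing. The output (the first two columns and the last column) is consistent with splitting at column indices 2 and 5, sketched here on a deterministic array; the exact indices used in the original are an assumption.

```python
import numpy as np

C = np.arange(36).reshape((6, 6))

# Split before column 2 and before column 5
left, middle, right = np.hsplit(C, [2, 5])

print(left.shape)    # (6, 2) -> columns 0 and 1
print(middle.shape)  # (6, 3) -> columns 2, 3 and 4
print(right.shape)   # (6, 1) -> column 5
```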
Common pitfalls:
Be careful when assigning an existing array to a second variable!
In [53]:
a = np.arange(0, 10)
a
Out[53]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [54]:
Out[54]:
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [55]:
b[0] = 10
b
Out[55]:
array([10, 1, 2, 3, 4, 5, 6, 7, 8, 9])
So far, so good. But we also changed the original array a in the process, because b points to the same object in
memory.
In [56]:
Out[56]:
array([10, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [57]:
a = np.arange(0, 10)
b = a.copy() # this creates a separate object in memory
b[0] = 10
print(b)
print(a)
[10 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
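Whether two names refer to the same data can be checked directly. The sketch below uses the `is` operator and np.shares_memory, and also shows that a slice behaves like the assignment case because it is a view on the original data. The slice example goes slightly beyond the cells above and is included as an illustration.

```python
import numpy as np

a = np.arange(0, 10)

b = a          # plain assignment: a second name for the same object
print(b is a)  # True

v = a[2:5]     # a slice is a view on the same data
v[0] = 99
print(a[2])    # 99, the original changed as well

c = a.copy()   # an independent copy
c[0] = -1
print(a[0])    # 0, the original is unaffected

print(np.shares_memory(a, v))  # True
print(np.shares_memory(a, c))  # False
```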
Further reading
More info at https://docs.scipy.org/doc/