Numpy_Learning
May 13, 2025
Import numpy Dependency
[ ]: import numpy as np
creating array using np.array
[ ]: # 1-D array: printing a 1-D array using numpy
a = np.array([1,2,3,4,5])
print(a)
[1 2 3 4 5]
[ ]: # 2-D array
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(a)
[[1 2 3]
[4 5 6]
[7 8 9]]
[ ]: # 3-D array
a = np.array([[[1,2,3],[4,5,6]],[[7,8,9],[1,2,3]]])
print(a)
[[[1 2 3]
[4 5 6]]
[[7 8 9]
[1 2 3]]]
The dtype argument sets the data type of the array elements, so values can be converted from one data type to another when the array is created.
[ ]: # dtype
np.array([1,2,3],dtype = float)
[ ]: array([1., 2., 3.])
numpy.arange() function is used to create an array of evenly spaced values within a given interval
[ ]: # np.arange
print(np.arange(1,11,4))
print(np.arange(2,21,2))
[1 5 9]
[ 2 4 6 8 10 12 14 16 18 20]
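One detail the outputs above rely on is that the stop value is excluded, much like Python's built-in range. A minimal sketch (the variable names are illustrative only):

```python
import numpy as np

# stop is exclusive: 10 is not part of the result
evens = np.arange(2, 10, 2)
print(evens.tolist())       # [2, 4, 6, 8]

# a single argument is treated as stop, counting from 0
first_five = np.arange(5)
print(first_five.tolist())  # [0, 1, 2, 3, 4]
```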
reshape() function in NumPy is used to change the shape of a NumPy array without altering its
data
[ ]: # with reshape
np.arange(1,13).reshape(3,4)
[ ]: array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
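As a small illustration of the reshape rule, the new shape has to account for exactly the same number of elements. The sketch below (names are illustrative) also uses -1, which asks NumPy to infer one dimension from the total size:

```python
import numpy as np

values = np.arange(1, 13)          # 12 elements

# -1 lets NumPy work out that dimension: 12 / 2 = 6
inferred = values.reshape(2, -1)
print(inferred.shape)              # (2, 6)

# a shape whose product is not 12 is rejected
try:
    values.reshape(5, 3)
except ValueError as err:
    print("cannot reshape:", err)
```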
numpy.ones() function returns a new array of given shape and type, with ones.
[ ]: # np.ones
np.ones((3,4),dtype = int)
[ ]: array([[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]])
numpy.zeros() function returns a new array of given shape and type, filled with zeros.
[ ]: # np.zeros
np.zeros((3,4),dtype = int)
[ ]: array([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
[ ]: # np.random
np.random.random((3,3))
[ ]: array([[0.54522566, 0.97305955, 0.8949903 ],
[0.75715068, 0.30529836, 0.43230545],
[0.61907267, 0.35941802, 0.91634613]])
The linspace function from NumPy creates an array of evenly spaced numbers over a specified interval. It is useful when a specific number of points within a range is needed, unlike arange, which relies on a step size.
[ ]: # np.linspace ----> (the spacing between consecutive points is always the same)
np.linspace(-10,10,20)
[ ]: array([-10. , -8.94736842, -7.89473684, -6.84210526,
-5.78947368, -4.73684211, -3.68421053, -2.63157895,
-1.57894737, -0.52631579, 0.52631579, 1.57894737,
2.63157895, 3.68421053, 4.73684211, 5.78947368,
6.84210526, 7.89473684, 8.94736842, 10. ])
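To make the contrast with arange concrete, here is a small sketch on the interval from 0 to 1 (variable names are illustrative). linspace is given a number of points and includes both endpoints by default, while arange is given a step and excludes the stop value:

```python
import numpy as np

# linspace: ask for 5 points, both endpoints included
points = np.linspace(0, 1, 5)
print(points.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]

# arange: ask for a step of 0.25, stop excluded
steps = np.arange(0, 1, 0.25)
print(steps.tolist())    # [0.0, 0.25, 0.5, 0.75]
```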
np.identity() returns the identity matrix of the given size.
[ ]: # np.identity
np.identity(3,dtype = int)
[ ]: array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
1 Array Attribute
[ ]: a1 = np.arange(10) # ---> vector
a2 = np.arange(12,dtype = float).reshape(3,4) # ---> Matrix
a3 = np.arange(8,dtype = np.int64).reshape(2,2,2) # ---> tensor
ndarray.ndim in NumPy returns the number of dimensions of a NumPy array. It is an attribute of
the ndarray object, which represents an n-dimensional array. Scalars have ndim of 0, vectors have
ndim of 1, matrices have ndim of 2, and so on.
[ ]: # ndim (dimension)
print(a1.ndim)
print(a2.ndim)
print(a3.ndim)
1
2
3
shape returns the size of the array along each dimension; for a 2-D array this is (rows, columns).
[ ]: # shape
a2.shape
[ ]: (3, 4)
size is the total number of elements in the array.
[ ]: # Size
print(a2.size)
12
[ ]: # itemsize
a3.itemsize
[ ]: 8
[ ]: # dtype
print(a1.dtype)
print(a2.dtype)
print(a3.dtype)
int32
float64
int64
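The attributes are related to each other. A minimal sketch on a small float64 matrix (the name m is illustrative) shows that the total buffer size, nbytes, equals size times itemsize:

```python
import numpy as np

m = np.arange(12, dtype=np.float64).reshape(3, 4)

print(m.ndim)      # 2
print(m.shape)     # (3, 4)
print(m.size)      # 12
print(m.itemsize)  # 8 bytes per float64 element
print(m.nbytes)    # 96, i.e. size * itemsize
```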
2 Changing Datatype
The astype() method returns a new array (a copy) whose data type has been changed to the specified type.
[ ]: # astype
a3.astype(np.int32)
[ ]: array([[[0, 1],
[2, 3]],
[[4, 5],
[6, 7]]])
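A short sketch of two properties of astype that are easy to miss: the original array is left unchanged, and converting floats to integers drops the fractional part rather than rounding. The array prices is a made-up example:

```python
import numpy as np

prices = np.array([1.9, -2.7, 3.5])
as_int = prices.astype(np.int32)   # fractional part is dropped (truncation toward zero)

print(as_int.tolist())   # [1, -2, 3]
print(prices.dtype)      # float64, the original array is unchanged
```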
3 Array Operation
[ ]: a1 = np.arange(12).reshape(3,4)
a2 = np.arange(12,24).reshape(3,4)
[ ]: # scalar operation
# arithmetic
a1 ** 2
[ ]: array([[ 0, 1, 4, 9],
[ 16, 25, 36, 49],
[ 64, 81, 100, 121]])
[ ]: # Relational
a1 > 5
[ ]: array([[False, False, False, False],
[False, False, True, True],
[ True, True, True, True]])
[ ]: # Vector operation
# Arithmetic
a1 + a2
[ ]: array([[12, 14, 16, 18],
[20, 22, 24, 26],
[28, 30, 32, 34]])
4 Array Functions
[ ]: a1 = np.random.random((3,3))
a1 = np.round(a1*100)
a1
[ ]: array([[98., 67., 22.],
[99., 51., 83.],
[ 8., 18., 46.]])
[ ]: # min/max/sum/prod
# axis=0 --> one result per column and axis=1 --> one result per row
print(np.min(a1,axis = 1))
print(np.max(a1))
print(np.sum(a1))
print(np.prod(a1))
[22. 51. 8.]
99.0
492.0
400984279065216.0
[ ]: # mean/median/std/var
print(np.mean(a1,axis=1))
print(np.median(a1,axis=1))
print(np.std(a1,axis=1))
print(np.var(a1,axis=1))
[62.33333333 77.66666667 24. ]
[67. 83. 18.]
[31.2018518 19.95550606 16.08311744]
[973.55555556 398.22222222 258.66666667]
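Because the array above is random, the effect of axis is easier to verify on a small fixed example. The sketch below uses a hypothetical 2x3 array named grid:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

col_sums = np.sum(grid, axis=0)  # collapse the rows: one value per column
row_sums = np.sum(grid, axis=1)  # collapse the columns: one value per row

print(col_sums.tolist())  # [5, 7, 9]
print(row_sums.tolist())  # [6, 15]
```

The same axis convention applies to min, max, mean, median, std and var.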
[ ]: # trigonometric function
np.sin(a1)
[ ]: array([[-0.57338187, -0.85551998, -0.00885131],
[-0.99920683, 0.67022918, 0.96836446],
[ 0.98935825, -0.75098725, 0.90178835]])
[ ]: # dot product (the number of columns in the first matrix must equal the number of rows in the second)
a2 = np.arange(12).reshape(3,4)
a3 = np.arange(12,24).reshape(4,3)
[ ]: np.dot(a2,a3)
[ ]: array([[114, 120, 126],
[378, 400, 422],
[642, 680, 718]])
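The shape rule can be checked directly. This sketch rebuilds the same two matrices under the illustrative names left and right, uses the @ operator (equivalent to np.dot for 2-D arrays), and shows the error raised when the inner dimensions disagree:

```python
import numpy as np

left = np.arange(12).reshape(3, 4)        # shape (3, 4)
right = np.arange(12, 24).reshape(4, 3)   # shape (4, 3)

product = left @ right                    # (3, 4) @ (4, 3) -> (3, 3)
print(product.shape)                      # (3, 3)
print(np.array_equal(product, np.dot(left, right)))  # True

# (3, 4) @ (3, 4) fails: 4 columns on the left, 3 rows on the right
try:
    left @ left
except ValueError as err:
    print("shape mismatch:", err)
```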
[ ]: # log and exponents
np.log(a1)
[ ]: array([[4.58496748, 4.20469262, 3.09104245],
[4.59511985, 3.93182563, 4.41884061],
[2.07944154, 2.89037176, 3.8286414 ]])
[ ]: # round/floor/ceil
print(np.round(np.random.random((2,3))*100))
print()
print(np.floor(np.random.random((2,3))*100))
print()
print(np.ceil(np.random.random((2,3))*100))
[[83. 23. 20.]
[59. 57. 38.]]
[[55. 30. 90.]
[21. 47. 87.]]
[[54. 7. 60.]
[60. 63. 35.]]
5 Indexing And Slicing
[ ]: a1 = np.arange(10)
a2 = np.arange(12).reshape(3,4)
a3 = np.arange(8).reshape(2,2,2)
[ ]: a1
[ ]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
[ ]: # indexing (1-D array)
a1[-1]
[ ]: 9
[ ]: a2
[ ]: array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
[ ]: a2[1,2]
[ ]: 6
[ ]: a3
[ ]: array([[[0, 1],
[2, 3]],
[[4, 5],
[6, 7]]])
[ ]: a3[0,0,1]
[ ]: 1
[ ]: # slicing
a1[2:6]
[ ]: array([2, 3, 4, 5])
[ ]: a2
[ ]: array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
[ ]: a2[2,:]
[ ]: array([ 8, 9, 10, 11])
[ ]: a2[:,3]
[ ]: array([ 3, 7, 11])
[ ]: a2[::2,::3]
[ ]: array([[ 0, 3],
[ 8, 11]])
[ ]: a2[::2,1::2]
[ ]: array([[ 1, 3],
[ 9, 11]])
[ ]: a2[0:2,1:]
[ ]: array([[1, 2, 3],
[5, 6, 7]])
[ ]: for i in np.nditer(a3): # visits every element of the 3-D array one by one, in order
    print(i)
0
1
2
3
4
5
6
7
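As a quick check of what np.nditer does, the sketch below (the name cube is illustrative) collects the visited elements into a list and compares them with the flattened array. For an array built this way, the order matches ravel():

```python
import numpy as np

cube = np.arange(8).reshape(2, 2, 2)

# nditer yields each element in turn, regardless of the number of dimensions
visited = [int(x) for x in np.nditer(cube)]
print(visited)                 # [0, 1, 2, 3, 4, 5, 6, 7]
print(cube.ravel().tolist())   # [0, 1, 2, 3, 4, 5, 6, 7]
```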
[ ]: # Transpose
np.transpose(a2)
a2.T
[ ]: array([[ 0, 4, 8],
[ 1, 5, 9],
[ 2, 6, 10],
[ 3, 7, 11]])
[ ]: # ravel
a2.ravel()
[ ]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
6 Stacking
[ ]: a4 = np.arange(12).reshape(3,4)
a5 = np.arange(12,24).reshape(3,4)
[ ]: # horizontal stacking
np.hstack((a4,a5))
[ ]: array([[ 0, 1, 2, 3, 12, 13, 14, 15],
[ 4, 5, 6, 7, 16, 17, 18, 19],
[ 8, 9, 10, 11, 20, 21, 22, 23]])
[ ]: # vertical stacking
np.vstack((a4,a5))
[ ]: array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
7 Splitting
[ ]: # horizontal splitting
np.hsplit(a4,2)
[ ]: [array([[0, 1],
[4, 5],
[8, 9]]),
array([[ 2, 3],
[ 6, 7],
[10, 11]])]
[ ]: # horizontal splitting again, into 4 columns (np.vsplit would split along the rows)
np.hsplit(a5,4)
[ ]: [array([[12],
[16],
[20]]),
array([[13],
[17],
[21]]),
array([[14],
[18],
[22]]),
array([[15],
[19],
[23]])]
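For comparison, here is a sketch of splitting along the rows with np.vsplit, using a 3x4 array named table for illustration. Both hsplit and vsplit require an even division; np.array_split is shown as one way to allow unequal pieces:

```python
import numpy as np

table = np.arange(12).reshape(3, 4)

rows = np.vsplit(table, 3)        # 3 rows -> three pieces of shape (1, 4)
print(len(rows), rows[0].shape)   # 3 (1, 4)

# np.hsplit(table, 3) would raise an error: 4 columns cannot be divided into 3 equal parts
cols = np.array_split(table, 3, axis=1)   # array_split allows unequal pieces
print([c.shape for c in cols])    # [(3, 2), (3, 1), (3, 1)]
```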
8 NumPy Array vs Python List
[ ]: # speed
# list
import time
a = [i for i in range(10000000)]
b = [i for i in range(10000000,20000000)]
c = []
start = time.time()
# add the two lists element by element in a Python loop
for i in range(len(a)):
    c.append(a[i] + b[i])
print(time.time()-start)
2.9204180240631104
[ ]: # numpy
a = np.arange(10000000)
b = np.arange(10000000,20000000)
start = time.time()
c = a+b
print(time.time()-start)
0.14705801010131836
[ ]: 2.9204180240631104/0.14705801010131836
[ ]: 19.858952409671762
[ ]: # memory
# 1 --> when using a Python list
a = [ i for i in range(10000000)]
import sys
sys.getsizeof(a)
[ ]: 89095160
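[ ]: # 2 --> when using a NumPy array
# note: int8 takes 1 byte per element but only holds -128..127, so it cannot represent all of these values
a = np.arange(10000000,dtype = np.int8)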
sys.getsizeof(a)
[ ]: 10000112
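The two numbers above are not a like-for-like comparison: sys.getsizeof on a list counts only the list object and its pointers, not the integer objects it refers to, and the array uses a 1-byte dtype. A hedged sketch with dtypes large enough for the values (the names and the smaller n are assumptions chosen for illustration) uses nbytes to show the raw buffer size:

```python
import sys
import numpy as np

n = 1_000_000
as_list = list(range(n))
as_int64 = np.arange(n, dtype=np.int64)
as_int32 = np.arange(n, dtype=np.int32)

# size of the list object only; the int objects it points to are extra
print(sys.getsizeof(as_list))
# nbytes is the size of the element buffer: n * itemsize
print(as_int64.nbytes)   # 8000000
print(as_int32.nbytes)   # 4000000
```

Choosing the smallest dtype that still holds every value is what keeps the array compact.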
9 Advanced Indexing
[ ]: a = np.arange(12).reshape(4,3)
a
[ ]: array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
[ ]: # fancy indexing
a[[0,2,3]]
[ ]: array([[ 0, 1, 2],
[ 6, 7, 8],
[ 9, 10, 11]])
[ ]: a[:,[1,2]]
[ ]: array([[ 1, 2],
[ 4, 5],
[ 7, 8],
[10, 11]])
[ ]: # Boolean indexing ---------- (important)
a = np.random.randint(1,100,24).reshape(6,4)
a
[ ]: array([[92, 24, 10, 60],
[91, 44, 7, 6],
[82, 49, 61, 13],
[84, 15, 58, 48],
[84, 93, 50, 25],
[81, 39, 27, 77]])
[ ]: # find all numbers greater than 50
a[a > 50]
[ ]: array([92, 60, 91, 82, 61, 84, 58, 84, 93, 81, 77])
[ ]: # find out even numbers
a[a%2 == 0]
[ ]: array([92, 24, 10, 60, 44, 6, 82, 84, 58, 48, 84, 50])
[ ]: # find all numbers that are greater than 50 and even
a[(a > 50 ) & (a%2 ==0)]
[ ]: array([92, 60, 82, 84, 58, 84])
[ ]: # find all numbers not divisible by 7
a[a%7!=0]
[ ]: array([92, 24, 10, 60, 44, 6, 82, 61, 13, 15, 58, 48, 93, 50, 25, 81, 39,
27])
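A boolean mask can also be used to count or modify the selected elements, not only to extract them. The sketch below uses a small fixed array named scores (a made-up example, so the results are reproducible) and np.where for an element-wise choice:

```python
import numpy as np

scores = np.array([[92, 24, 10],
                   [ 7, 61, 49]])

mask = scores > 50
print(int(mask.sum()))            # 2 values satisfy the condition

capped = scores.copy()
capped[mask] = 50                 # assign through the mask
print(capped.tolist())            # [[50, 24, 10], [7, 50, 49]]

labels = np.where(scores % 2 == 0, "even", "odd")
print(labels.tolist())            # [['even', 'even', 'even'], ['odd', 'odd', 'odd']]
```

Conditions are combined with & and |, each wrapped in parentheses, because Python's and/or do not work element-wise on arrays.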
10 Broadcasting
The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations.
The smaller array is “broadcast” across the larger array so that they have compatible shapes.
[ ]: # same shape
a = np.arange(6).reshape(2,3)
b = np.arange(6,12).reshape(2,3)
print(a)
print("*"*50)
print(b)
print("*"*50)
print(a+b)
[[0 1 2]
[3 4 5]]
**************************************************
[[ 6 7 8]
[ 9 10 11]]
**************************************************
[[ 6 8 10]
[12 14 16]]
[ ]: # diff shape
a = np.arange(6).reshape(2,3)
b = np.arange(3).reshape(1,3)
print(a)
print("*"*50)
print(b)
print("*"*50)
print(a+b)
[[0 1 2]
[3 4 5]]
**************************************************
[[0 1 2]]
**************************************************
[[0 2 4]
[3 5 7]]
11 Broadcasting Rules
1. Make the two arrays have the same number of dimensions.
If the numbers of dimensions of the two arrays are different, add new dimensions with size 1
to the front of the shape of the array with fewer dimensions.
2. Make each dimension of the two arrays the same size.
If the sizes of a dimension do not match, a dimension with size 1 is
stretched to the size of the other array. If the sizes differ and neither of them is 1,
the arrays cannot be broadcast, and an error is raised.
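The two rules can be checked without doing any arithmetic. This sketch assumes a NumPy version that provides np.broadcast_shapes (1.20 or later) and also shows a pair of shapes that fails:

```python
import numpy as np

# (4, 3) with (3,): the 1-D shape is padded to (1, 3), then stretched to (4, 3)
print(np.broadcast_shapes((4, 3), (3,)))     # (4, 3)

# (1, 3) with (4, 1): both size-1 dimensions are stretched
print(np.broadcast_shapes((1, 3), (4, 1)))   # (4, 3)

# (4, 3) with (4,): padded to (1, 4); last dimension is 3 vs 4 and neither is 1
try:
    np.arange(12).reshape(4, 3) + np.arange(4)
except ValueError as err:
    print("not broadcastable:", err)
```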
[ ]: # More examples
a = np.arange(12).reshape(4,3)
b = np.arange(3)
print(a)
print(b)
print(a+b)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]]
[0 1 2]
[[ 0 2 4]
[ 3 5 7]
[ 6 8 10]
[ 9 11 13]]
[ ]: a = np.arange(3).reshape(1,3)
b = np.arange(3).reshape(3,1)
print(a)
print(b)
print(a+b)
[[0 1 2]]
[[0]
[1]
[2]]
[[0 1 2]
[1 2 3]
[2 3 4]]
[ ]: a = np.arange(3).reshape(1,3)
b = np.arange(4).reshape(4,1)
print(a)
print(b)
print(a + b)
[[0 1 2]]
[[0]
[1]
[2]
[3]]
[[0 1 2]
[1 2 3]
[2 3 4]
[3 4 5]]
[ ]: a = np.array([1])
# shape -> (1,), treated as (1,1) when broadcast against a 2-D array
b = np.arange(4).reshape(2,2)
# shape -> (2,2)
print(a)
print(b)
print(a+b)
[1]
[[0 1]
[2 3]]
[[1 2]
[3 4]]
12 Working With Mathematical Formulas
[ ]: a = np.arange(10)
np.sin(a)
[ ]: array([ 0. , 0.84147098, 0.90929743, 0.14112001, -0.7568025 ,
-0.95892427, -0.2794155 , 0.6569866 , 0.98935825, 0.41211849])
[ ]: # sigmoid (output lies between 0 and 1)
def sigmoid(array):
    return 1/(1 + np.exp(-(array)))
a = np.arange(10)
sigmoid(a)
[ ]: array([0.5 , 0.73105858, 0.88079708, 0.95257413, 0.98201379,
0.99330715, 0.99752738, 0.99908895, 0.99966465, 0.99987661])
[ ]: # mean squared error
actual = np.random.randint(1,50,25)
predicted = np.random.randint(1,50,25)
[ ]: predicted
[ ]: array([24, 22, 7, 33, 29, 34, 37, 48, 20, 41, 13, 41, 47, 33, 22, 36, 8,
18, 27, 19, 2, 36, 9, 15, 16])
[ ]: actual
[ ]: array([11, 43, 36, 45, 35, 19, 11, 35, 43, 47, 25, 25, 35, 7, 41, 17, 33,
7, 6, 22, 49, 47, 12, 11, 39])
[ ]: def mse(actual,predicted):
    return np.mean((actual - predicted)**2)
mse(actual,predicted)
[ ]: 371.52
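Since the arrays above are random, the result cannot be checked by hand. A small worked example with made-up numbers makes the arithmetic visible: subtract element-wise, square, then average.

```python
import numpy as np

def mse(actual, predicted):
    # mean of the squared element-wise differences
    return np.mean((actual - predicted) ** 2)

actual = np.array([3, 5, 7])
predicted = np.array([2, 5, 10])

# errors: 1, 0, -3 -> squares: 1, 0, 9 -> mean: 10 / 3
result = mse(actual, predicted)
print(result)
```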
13 Working With Missing Values -> np.nan
[ ]: # Working with missing values -> np.nan
a = np.array([1,2,3,4,np.nan,6])
a
[ ]: array([ 1., 2., 3., 4., nan, 6.])
[ ]: a[~np.isnan(a)]
[ ]: array([1., 2., 3., 4., 6.])
[ ]: np.isnan(a).sum()
[ ]: 1
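Besides filtering, NumPy offers NaN-aware reductions such as np.nanmean, and np.where can be used to fill the gaps. A minimal sketch on the same values (the name readings and the fill value 0.0 are illustrative choices):

```python
import numpy as np

readings = np.array([1, 2, 3, 4, np.nan, 6])

print(np.mean(readings))      # nan, the missing value propagates
nan_aware = np.nanmean(readings)
print(nan_aware)              # mean of the 5 valid entries, 16 / 5

filled = np.where(np.isnan(readings), 0.0, readings)
print(filled.tolist())        # [1.0, 2.0, 3.0, 4.0, 0.0, 6.0]
```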
14 Some other methods
np.sort() returns a sorted copy of the array in increasing order. For decreasing order, reverse the result, for example np.sort(a)[::-1].
[ ]: # sorting
a = np.random.randint(1,100,12)
np.sort(a)
[ ]: array([ 9, 11, 12, 17, 17, 18, 26, 36, 69, 88, 94, 95])
For a 2-D array, the axis argument selects the direction of the sort:
• axis=0 sorts down the rows, so each column ends up sorted
• axis=1 sorts across the columns, so each row ends up sorted
[ ]: # sorting
b= np.random.randint(1,100,24).reshape(6,4)
np.sort(b,axis=0)
[ ]: array([[12, 11, 62, 9],
[34, 15, 71, 16],
[47, 25, 78, 25],
[68, 65, 80, 34],
[83, 72, 96, 79],
[93, 89, 98, 87]])
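The output above is sorted within each column, which is what axis=0 does. A small fixed example (the array m is made up for illustration) shows both directions side by side, plus one way to get descending order within each row:

```python
import numpy as np

m = np.array([[3, 1, 2],
              [0, 5, 4]])

by_column = np.sort(m, axis=0)         # each column sorted top to bottom
by_row = np.sort(m, axis=1)            # each row sorted left to right
by_row_desc = by_row[:, ::-1]          # reverse each sorted row for descending order

print(by_column.tolist())    # [[0, 1, 2], [3, 5, 4]]
print(by_row.tolist())       # [[1, 2, 3], [0, 4, 5]]
print(by_row_desc.tolist())  # [[3, 2, 1], [5, 4, 0]]
```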
Append values to the end of an array.
[ ]: # append
np.append(a,200)
[ ]: array([ 17, 17, 12, 26, 9, 69, 94, 18, 88, 36, 11, 95, 200])
[ ]: np.append(b,np.random.random((b.shape[0],1)),axis=1)
[ ]: array([[5.20000000e+01, 6.40000000e+01, 5.60000000e+01, 8.00000000e+01,
2.85113919e-02],
[6.00000000e+00, 4.00000000e+01, 1.70000000e+01, 2.30000000e+01,
5.06198386e-01],
[2.30000000e+01, 8.00000000e+01, 7.20000000e+01, 3.00000000e+00,
9.15612288e-01],
[4.20000000e+01, 2.50000000e+01, 7.30000000e+01, 4.30000000e+01,
9.37391228e-01],
[2.90000000e+01, 2.20000000e+01, 8.50000000e+01, 6.10000000e+01,
7.14930956e-01],
[1.30000000e+01, 2.00000000e+01, 2.10000000e+01, 1.60000000e+01,
7.51249861e-01]])
[ ]: # concatenate
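The original cell is empty, so the following is only a sketch of how np.concatenate is commonly used, with made-up arrays top and bottom. It joins arrays along an existing axis and, for 2-D inputs, behaves like np.vstack (axis=0) or np.hstack (axis=1):

```python
import numpy as np

top = np.arange(6).reshape(2, 3)
bottom = np.arange(6, 12).reshape(2, 3)

stacked_rows = np.concatenate((top, bottom), axis=0)  # like np.vstack
stacked_cols = np.concatenate((top, bottom), axis=1)  # like np.hstack

print(stacked_rows.shape)  # (4, 3)
print(stacked_cols.shape)  # (2, 6)
```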