MUTEESA 1 ROYAL UNIVERSITY
FACULTY OF SCIENCE, TECHNOLOGY,
ENGINEERING, ART AND DESIGN
DEPARTMENT: INFORMATION TECHNOLOGY
YEAR: THREE
SEMESTER: ONE
COURSE UNIT: DATA SCIENCE
LECTURER: MR. MATOVU DAVID
NAME: NAYIGA JOVIA
REG NUMBER: 21/U/BIT/0331/K/D
COURSEWORK REPORT
DATA HANDLING USING NUMPY
NUMPY: NumPy stands for “Numeric Python” or “Numerical Python”. NumPy is a package that
contains classes, functions, variables etc. for scientific calculations in Python.
NumPy is used to create and process single and multi-dimensional arrays.
ARRAYS IN NUMPY
1. 1D ARRAY
This is the one-dimensional array. Its elements are arranged along a single dimension.
Examples of 1D arrays in numpy
Example
import numpy
a = numpy.array([10,20,30,40,50])
print(a)
Here we import numpy under the alias npp (the alias most commonly used is np)
import numpy as npp
a = npp.array([10,20,30,40,50])
print(a)
NB: When we use the statement “from numpy import *” we call the array function
directly, without a module name in front of it, as seen below
from numpy import *
a = array([10,20,30,40,50])
print(a)
Output
[10 20 30 40 50]
IMPLEMENTATION OF 1D ARRAY FUNCTIONS
Using array() function
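The second argument sets the data type. When it is left out, numpy works out the
data type from the values, which is “int” for the list below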
from numpy import *
Arr=array([10,20,30,40,50],int)
print (Arr)
Output
[10 20 30 40 50]
Note: If you create an array and include a float value, all the elements will
be converted to the float data type
from numpy import *
a=array([10,30,40.5,50,100])
print(a)
Output
[ 10.   30.   40.5  50.  100. ]
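The conversion described above can be checked through the dtype attribute. The following is a small sketch, not part of the original coursework; the variable names ints, mixed and forced are chosen only for illustration.

```python
import numpy as np

# Only integers: numpy chooses an integer data type
ints = np.array([10, 20, 30])
# One float value present: every element is stored as a float
mixed = np.array([10, 30, 40.5])
# astype(int) converts back to integers and drops the fractional part
forced = mixed.astype(int)

print(ints.dtype.kind)    # 'i' means an integer type
print(mixed.dtype)        # float64
print(forced.tolist())    # [10, 30, 40]
```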
Using linspace() function
linspace(start, stop, n) creates an array of n evenly spaced points between a starting and
ending point. Both the starting and the ending point are included.
import numpy as np
a=np.linspace(1,10,10)
print(a)
Output
[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]
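As a further sketch of how linspace() spaces its points, the example below uses the endpoint and retstep arguments of numpy.linspace. The values 0, 1 and 5 are arbitrary choices made for this illustration.

```python
import numpy as np

# 5 points from 0 to 1, both end points included (the default)
closed = np.linspace(0, 1, 5)
# Same range, but the stop value is left out
half_open = np.linspace(0, 1, 5, endpoint=False)
# retstep=True also returns the distance between neighbouring points
points, step = np.linspace(0, 1, 5, retstep=True)

print(closed)      # 0, 0.25, 0.5, 0.75, 1
print(half_open)   # 0, 0.2, 0.4, 0.6, 0.8
print(step)        # 0.25
```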
Using arange() Function
This works like the range() function in python, but it returns a numpy array and also accepts float values
Below is the format used to create an array using arange() function
Syntax
arange(start,stop,stepsize)
arange(10) – will create an array with values [0,1,2,3,4,5,6,7,8,9]
arange(5,10) – will create an array with values [5, 6,7,8,9]
arange(10,1,-1) – will create an array with values [10,9,8,7,6,5,4,3,2]
Examples
import numpy as np
a = np.arange(10)
b = np.arange(5,10)
c = np.arange(10,1,-1)
print(a)
print(b)
print(c)
Output
[0 1 2 3 4 5 6 7 8 9]
[5 6 7 8 9]
[10  9  8  7  6  5  4  3  2]
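One difference from range() is that arange() accepts a float step size. This is a short sketch under that assumption, with a step of 0.25 picked only as an example.

```python
import numpy as np

# range() only accepts integers; arange() also accepts floats
quarters = np.arange(0, 1, 0.25)   # the stop value 1 is not included
# range() gives a range object, arange() gives a numpy array
from_range = list(range(5))
from_arange = np.arange(5)

print(quarters.tolist())      # [0.0, 0.25, 0.5, 0.75]
print(type(from_arange))      # <class 'numpy.ndarray'>
```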
Using ones() and zeros() functions
We can use zeros() function to create an array with all zeros or use ones() function to create an
array with all 1s.
Below is the format
zeros(n,datatype)
ones(n,datatype)
NB: if datatype is missing then the default value will be float.
Example
import numpy as np
k = np.zeros(5)
R = np.ones(5)
print(k)
print(R)
Output
[0. 0. 0. 0. 0.]
[1. 1. 1. 1. 1.]
MATHEMATICAL OPERATIONS ON ARRAYS
We can perform addition, subtraction, division, multiplication and other operations on an
array. The operation is applied to every element of the array.
Examples
import numpy as np
k = np.array([10,20,30,40,50])
k = k+5  # add 5 to the values in the array
print(k)
k = k-5  # subtract 5 from the values in the array
print(k)
k = k*5  # multiply each value in the array by 5
print(k)
k = k/5  # divide the values in the array by 5 (the result is float)
print(k)
Output
[15 25 35 45 55]
[10 20 30 40 50]
[ 50 100 150 200 250]
[10. 20. 30. 40. 50.]
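The same operators also work between two arrays of equal length, element by element. The sketch below is an added illustration; the arrays a and b are made-up values.

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([1, 2, 3])

total = a + b      # adds matching elements: 10+1, 20+2, 30+3
product = a * b    # multiplies matching elements
squares = a ** 2   # raises every element to the power 2

print(total.tolist())     # [11, 22, 33]
print(product.tolist())   # [10, 40, 90]
print(squares.tolist())   # [100, 400, 900]
```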
Aliasing an array
This means giving another name to an existing array. It does not make a
new copy of the array defined earlier. Both names refer to the same array, so a change made through one name is seen through the other.
Example
import numpy as np
k = np.array([3,5,6,7,8])
print(k)
h = k  # another name given to array k
print(h)
print(k)
k[0] = 45
print(h)
print(k)
Output
[3 5 6 7 8]
[3 5 6 7 8]
[3 5 6 7 8]
[45 5 6 7 8]
[45 5 6 7 8]
Using copy() method
This is used to copy the contents of one array to a new, separate array. Changes to the original do not affect the copy.
import numpy as np
k = np.array([3,5,6,7,8])
print(k)
h = k.copy()  # create a copy of array k and name it h
print(h)
print(k)
k[0]= 45
print(h)
print(k)
Output
[3 5 6 7 8]
[3 5 6 7 8]
[3 5 6 7 8]
[3 5 6 7 8]
[45 5 6 7 8]
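The difference between an alias and a copy can also be tested directly. This is a sketch using the Python "is" operator and numpy.shares_memory(); the names alias and dup are used only for this illustration.

```python
import numpy as np

k = np.array([3, 5, 6, 7, 8])
alias = k          # second name for the same array
dup = k.copy()     # separate array with the same values

same_object = alias is k                 # True: one array, two names
copy_shares = np.shares_memory(k, dup)   # False: the copy has its own data

k[0] = 45
print(alias[0])    # 45, the alias sees the change
print(dup[0])      # 3, the copy does not
```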
2-D ARRAY
This is known as the two dimensional array. Its elements are arranged in rows (arranged
horizontally) and columns (arranged vertically).
Example
import numpy as np
x =np.array([[2,4,6],[6,8,10]])
print(x)
Output
[[ 2  4  6]
 [ 6  8 10]]
Using the ndim attribute
The ndim attribute gives the number of dimensions (axes) of the array. The number
of dimensions is also known as “rank”.
Example of using ndim
import numpy as np
A = np.array([5,6,7,8])
R = np.array([[4,5,6],[7,8,9]])
print(A.ndim)  # number of dimensions of array A
print(R.ndim)  # number of dimensions of array R
Output
1
2
Using the shape attribute
The shape attribute gives the shape of an array. The shape is a tuple listing the number of
elements along each dimension. For a 1D array, it will display a single value and for a 2D array it
will display two values separated by a comma to represent rows and columns.
Example
import numpy as np
k = np.array([1,2,3,4,5])
print(k.shape)  # number of elements in array k
d = np.array([[5,6,7],[7,8,9]])
print(d.shape)  # number of rows and columns in array d
Output
(5,)
(2, 3)
Using the size attribute
This attribute gives the total number of elements in the array
Example
import numpy as np
a1 = np.array([1,2,3,4,5])
print(a1.size)
Output
5
For a 2D array, the result will be total rows*total columns
import numpy as np
k = np.array([[5,6,7],[7,8,9]])
print(k.size)
Output
6
Using the itemsize attribute
The itemsize attribute gives the memory size of each array element in bytes.
Example
import numpy as np
a1 = np.array([1,2,3,4,5])
print(a1.itemsize)
Output
8 (when the default integer type is 64-bit; 4 when it is 32-bit)
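The attributes above can be read together on one array. The sketch below fixes the data type to int32 (an assumption made here so that itemsize is the same on every machine) and also uses the nbytes attribute, which equals size multiplied by itemsize.

```python
import numpy as np

# dtype is set explicitly so that each element takes exactly 4 bytes
arr = np.array([[5, 6, 7], [7, 8, 9]], dtype=np.int32)

print(arr.ndim)       # 2   (two dimensions)
print(arr.shape)      # (2, 3)   (2 rows, 3 columns)
print(arr.size)       # 6   (2 * 3 elements)
print(arr.itemsize)   # 4   (bytes per int32 element)
print(arr.nbytes)     # 24  (size * itemsize)
```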
Using the reshape() method
The reshape() method is used to change the shape of an array.
NB: The new shape must hold the same number of elements as the original array
Example
import numpy as np
d =np.array([[4,5,6,7],[5,6,7,8],[7,8,9,6]])
print(d)
d = d.reshape(6,2)  # reshape array d to 6 rows and 2 columns
print(d)
d = d.reshape(1,12)  # reshape array d to 1 row and 12 columns
print(d)
d = d.reshape(12,1)  # reshape array d to 12 rows and 1 column
print(d)
Output
[[4 5 6 7]
[5 6 7 8]
[7 8 9 6]]
[[4 5]
[6 7]
[5 6]
[7 8]
[7 8]
[9 6]]
[[4 5 6 7 5 6 7 8 7 8 9 6]]
[[4]
[5]
[6]
[7]
[5]
[6]
[7]
[8]
[7]
[8]
[9]
[6]]
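The rule that the number of elements must stay the same can be seen in the sketch below. It also uses -1 as one of the dimensions, which asks reshape() to work out that dimension itself. The array of 12 values is an arbitrary example.

```python
import numpy as np

d = np.arange(12)          # 12 elements: 0 to 11
auto = d.reshape(3, -1)    # -1 lets numpy compute the missing dimension
print(auto.shape)          # (3, 4)

try:
    d.reshape(5, 3)        # 15 places for 12 elements is not allowed
except ValueError as err:
    message = str(err)
    print("reshape failed:", message)
```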
Using the empty() function
The empty() function is used to create an uninitialized array of the specified
data type and shape. The elements are not set to any value, so the array shows whatever was already in memory.
Example
import numpy as np
x = np.empty([3,2], dtype = int)
y = np.empty([4,4], dtype = float)
print(x)
print(y)
Output (the values are arbitrary and can differ from run to run)
[[0 0]
[0 0]
[0 0]]
[[6.23042070e-307 4.67296746e-307 1.69121096e-306 8.45593934e-307]
[6.23058028e-307 2.22522597e-306 1.33511969e-306 1.37962320e-306]
[9.34604358e-307 9.79101082e-307 1.78020576e-306 1.69119873e-306]
[2.22522868e-306 1.24611809e-306 8.06632139e-308 2.29178686e-312]]
Using the eye() or identity() function
The eye() function creates a 2D array and fills the elements in the diagonal with 1s.
Syntax
eye(n, dtype=datatype)
This function will create an array with n rows and n columns with diagonal elements as 1s.
NB: The default data type is float.
Example
import numpy
a = numpy.eye(3)
print(a)
Output
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
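The heading above also names identity(). The sketch below compares the two functions; the second and third arguments passed to eye() (number of columns, and k for the diagonal offset) are part of numpy.eye but go beyond what the text above describes.

```python
import numpy as np

a = np.eye(3)               # 3 x 3, float, 1s on the main diagonal
b = np.identity(3)          # same result for a square array
c = np.eye(3, dtype=int)    # integer version
d = np.eye(2, 3)            # eye() can also make a 2 x 3 array
e = np.eye(3, k=1)          # 1s on the diagonal just above the main one

print(np.array_equal(a, b))   # True
print(c.tolist())             # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(d.shape)                # (2, 3)
```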
Using the zeros() function in 2D array
This function is used to create a two dimensional array with every element set to 0. The default
data type is float.
Example
import numpy
Q = numpy.zeros([3,2], dtype = int)
Z = numpy.zeros([4,4], dtype = float)
print(Q)
print(Z)
Output
[[0 0]
[0 0]
[0 0]]
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]]
Using the ones() function in 2D array
This function is used to create a two dimensional array with every element set to 1. The default
data type is float.
Example
import numpy
Q = numpy.ones([3,2],dtype = int)
Z = numpy.ones([4,4],dtype = float)
print(Q)
print(Z)
Output
[[1 1]
[1 1]
[1 1]]
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
JOINING IN ARRAYS
Joining arrays can be done in three ways:
concatenate()
This is used to join two or more arrays along one axis. The arrays must have the same shape except along the axis being joined.
Example
import numpy as np
a=np.array([2,3,4,50])
b=np.array([8,9,10,11,15])
c=np.concatenate([a,b])
print(c)
Output
[2 3 4 50 8 9 10 11 15]
Example 2
import numpy as np
a=np.array([[2,3,4],[4,5,6],[7,8,9]])
b=np.concatenate([a,a],axis=1)  # join array a with array a side by side (column wise)
print(b)
Output
[[2 3 4 2 3 4]
 [4 5 6 4 5 6]
 [7 8 9 7 8 9]]
Example 3
import numpy as np
a=np.array([[2,3,4],[4,5,6],[7,8,9]])
b=np.concatenate([a,a],axis=0)  # join array a with array a one below the other (row wise)
print(b)
Output
[[2 3 4]
 [4 5 6]
 [7 8 9]
 [2 3 4]
 [4 5 6]
 [7 8 9]]
hstack()
This is another joining method used to join two or more arrays horizontally (column wise).
Example
import numpy as np
a=np.array([1,2,3])
b=np.array([10,11,12])
c=np.hstack((a,b))
print(c)
Output
[1 2 3 10 11 12]
vstack()
This is the third joining method used to join two or more arrays vertically (row wise).
Example
import numpy as np
a=np.array([1,2,3])
b=np.array([10,11,12])
c=np.vstack((a,b))
print(c)
Output
[[ 1  2  3]
 [10 11 12]]
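For 2D arrays, hstack() and vstack() give the same results as concatenate() with axis=1 and axis=0. The sketch below checks this on two small made-up arrays.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

side_by_side = np.hstack((a, b))   # 2 rows, 4 columns
stacked = np.vstack((a, b))        # 4 rows, 2 columns

print(side_by_side.shape)   # (2, 4)
print(stacked.shape)        # (4, 2)
# Same results through concatenate()
print(np.array_equal(side_by_side, np.concatenate([a, b], axis=1)))   # True
print(np.array_equal(stacked, np.concatenate([a, b], axis=0)))        # True
```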
ARRAY SUBSETS
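We can get subsets of a numpy array by using any of the following methods:
Using split()
This is used to split a numpy array into parts. When a list of positions is given, the array is cut at those positions, so the parts can be equal or unequal in size.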
Example
import numpy as np
x =[1,2,3,99,99,3,2,1]
x1,x2,x3=np.split(x,[3,5])  # cut at positions 3 and 5 to get three subsets
print(x1,x2,x3)
Output
[1 2 3] [99 99] [3 2 1]
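When split() is given a number of parts instead of a list of positions, the parts must be equal in size. The sketch below shows this and uses numpy.array_split(), a related function not covered above, which allows unequal parts. The array of 7 values is an arbitrary example.

```python
import numpy as np

x = np.arange(7)                   # 7 elements: 0 to 6
parts = np.array_split(x, 3)       # unequal parts are allowed here
print([p.tolist() for p in parts])   # [[0, 1, 2], [3, 4], [5, 6]]

equal_split_failed = False
try:
    np.split(x, 3)                 # 7 elements cannot make 3 equal parts
except ValueError:
    equal_split_failed = True
print(equal_split_failed)          # True
```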
Using hsplit()
This is used to provide the subsets of an array after splitting it horizontally (column wise).
Example
import numpy as np
a= np.arange(16).reshape((4,4))
print(a)
Output
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
left,right =np.hsplit(a,2)
print(left)
print(right)
Output
[[ 0 1]
[ 4 5]
[ 8 9]
[12 13]]
[[ 2 3]
[ 6 7]
[10 11]
[14 15]]
Using vsplit()
This is used to provide the subsets of an array after splitting it vertically (row wise).
Example
import numpy as np
a= np.arange(16).reshape((4,4))
print(a)
Output
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
top, bottom =np.vsplit(a,2)
print(top)
print(bottom)
Output
[[0 1 2 3]
[4 5 6 7]]
[[ 8 9 10 11]
[12 13 14 15]]
STATISTICAL FUNCTIONS IN NUMPY
Below is the description of the functions. When no axis is given, each function works on all the elements of the array.
np.mean() – arithmetic mean along the specified axis
np.std() – standard deviation along the specified axis
np.var() – variance along the specified axis
np.sum() – sum of array elements over a given axis
np.prod() – product of array elements over a given axis
Examples
import numpy as np
array1 = np.array([[10,20,30],[40,50,60]])
print("Mean: ", np.mean(array1))
print("Std: ", np.std(array1))
print("Var: ", np.var(array1))
print("Sum: ", np.sum(array1))
print("Prod: ", np.prod(array1))
Output
Mean: 35.0
Std: 17.07825127659933
Var: 291.6666666666667
Sum: 210
Prod: 720000000
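The functions above can also be applied along one axis by passing the axis argument. The sketch below reuses array1 from the example; axis=0 works down each column and axis=1 works across each row.

```python
import numpy as np

array1 = np.array([[10, 20, 30], [40, 50, 60]])

col_mean = np.mean(array1, axis=0)   # one mean per column
row_mean = np.mean(array1, axis=1)   # one mean per row
col_sum = np.sum(array1, axis=0)     # one sum per column
row_sum = np.sum(array1, axis=1)     # one sum per row

print(col_mean.tolist())   # [25.0, 35.0, 45.0]
print(row_mean.tolist())   # [20.0, 50.0]
print(col_sum.tolist())    # [50, 70, 90]
print(row_sum.tolist())    # [60, 150]
```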
Covariance
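Covariance is a measure of how two variables vary together. np.cov(x,y) returns a matrix with the variance of each variable on the diagonal and the covariance of the two variables in the other positions.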
Example
import numpy as np
x =np.array([0,1,2])
y =np.array([2,1,0])
print("\nOriginal array1:")
print(x)
print("\nOriginal array2:")
print(y)
print("\nCovariance matrix of the said arrays:\n",np.cov(x,y))
Output
Original array1:
[0 1 2]
Original array2:
[2 1 0]
Covariance matrix of the said arrays:
[[ 1. -1.]
[-1. 1.]]
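The value -1 in the matrix can be checked by hand. The sketch below computes the sample covariance from its definition (dividing by n-1, which is the default used by np.cov) and compares it with the result of np.cov(). The name manual is used only for this illustration.

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([2, 1, 0])

# Sum of the products of the deviations from each mean, divided by n - 1
manual = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)
from_numpy = np.cov(x, y)[0, 1]   # covariance of x and y in the matrix

print(manual)       # -1.0, y goes down as x goes up
print(from_numpy)   # -1.0
```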
REFERENCES
CHAPTER 9: HANDLING DATA USING NUMPY (www.python4csip.com)