Unit 3 - Data Computation
Unit III
By
Team – Essentials of Data Science
School of Computer Engineering,
MIT Academy of Engineering, Alandi(D.)
SCHOOL OF COMPUTER ENGINEERING TEAM -- EDS
FIRST YEAR B. TECH COURSE: ESSENTIALS OF DATA SCIENCE
Arithmetic and Statistical Data Operations
● Arithmetic and statistical data operations are fundamental operations in data
science. Python offers several libraries and modules for performing these
operations. Here are some of the commonly used libraries:
1. NumPy: NumPy is a library for the Python programming language that provides
support for large, multi-dimensional arrays and matrices, as well as functions for
mathematical operations on these arrays.
2. Pandas: Pandas is a library that provides high-performance, easy-to-use data
structures and data analysis tools. It provides a DataFrame object for working with
tabular data.
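1.Addition: To perform addition, you can use the '+' operator. For example, to add
two arrays using NumPy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a + b
print(c)
Output: [5 7 9]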
3.Multiplication: To perform multiplication, you can use the '*' operator. For example,
to multiply two arrays using NumPy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a * b
print(c)
Output: [ 4 10 18]
4.Division: To perform division, you can use the '/' operator. For example, to divide
two arrays using NumPy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a / b
print(c)
Output: [0.25 0.4 0.5 ]
2.Floor Division: Python provides a floor division operator (//) that returns
the quotient of a division operation, rounded down (toward negative infinity) to the
nearest integer. With NumPy arrays the operator is applied element by element.
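A minimal sketch of floor division on NumPy arrays follows. The input values are hypothetical and chosen to include a negative number, since that is where rounding down differs from simply dropping the decimal part.

```python
import numpy as np

# Hypothetical inputs; -7 shows that the result is rounded toward negative infinity
a = np.array([7, 8, -7])
b = np.array([2, 3, 2])

floor_q = a // b                      # element-wise floor division
same_result = np.floor_divide(a, b)   # function form of the same operation
remainder = a % b                     # remainder that pairs with floor division

print(floor_q.tolist())    # [3, 2, -4]
print(remainder.tolist())  # [1, 2, 1]
```

Note that -7 // 2 gives -4, not -3, and that a equals b * floor_q + remainder for every element.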
2.Median: To calculate the median of a set of values, you can use the median()
function of NumPy. For example, to calculate the median of an array:
import numpy as np
a = np.array([1, 2, 3])
median_a = np.median(a)
print(median_a)
Output: 2.0
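Standard Deviation: To calculate the standard deviation of a set of values, you can
use the std() function of NumPy. For example, to calculate the standard deviation of an array:
import numpy as np
a = np.array([1, 2, 3])
std_a = np.std(a)
print(std_a)
Output: 0.816496580927726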
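The value 0.8164... above is the population standard deviation, because np.std() divides by N unless told otherwise. The sketch below, using the same array, shows the mean, the variance, and the sample standard deviation obtained with ddof=1. The variable names are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])

mean_a = np.mean(a)              # arithmetic mean
var_a = np.var(a)                # population variance (divides by N)
std_pop = np.std(a)              # population standard deviation
std_sample = np.std(a, ddof=1)   # sample standard deviation (divides by N - 1)

print(mean_a)      # 2.0
print(std_sample)  # 1.0
```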
Bitwise Operators
● Bitwise operators are used in data science for performing operations on binary
data, such as images or audio signals, and manipulating individual bits within
numerical data types. In Python, bitwise operators can be used with NumPy arrays
to perform bitwise operations on large sets of binary data.
● Here are some of the most commonly used bitwise operators in NumPy:
1. Bitwise AND (&): The bitwise AND operator compares two sets of binary
numbers and returns a new binary number where each bit is 1 only if both input
bits are 1.
import numpy as np
a = np.array([0b1010, 0b1100, 0b1111])
b = np.array([0b1100, 0b1010, 0b0101])
c = np.bitwise_and(a, b)
print(c) # Output: [8 8 5]
SCHOOL OF COMPUTER ENGINEERING & TECHNOLOGY
SCHOOL OF COMPUTER ENGINEERING TEAM -- EDS
SHUBHANGI KALE 11/11/2022
FIRST YEAR B. TECH COURSE: ESSENTIALS OF DATA SCIENCE
2. Bitwise OR (|): The bitwise OR operator compares two sets of binary numbers and
returns a new binary number where each bit is 1 if either of the input bits is 1.
a = np.array([0b1010, 0b1100, 0b1111])
b = np.array([0b1100, 0b1010, 0b0101])
c = np.bitwise_or(a, b)
print(c) # Output: [14 14 15]
3. Bitwise XOR (^): The bitwise XOR operator compares two sets of binary numbers
and returns a new binary number where each bit is 1 only if exactly one of the input bits is 1.
a = np.array([0b1010, 0b1100, 0b1111])
b = np.array([0b1100, 0b1010, 0b0101])
c = np.bitwise_xor(a, b)
print(c) # Output: [ 6  6 10]
4. Bitwise NOT (~): The bitwise NOT operator inverts each bit in a binary number,
returning its one's complement.
a = np.array([0b1010, 0b1100, 0b1111])
c = np.bitwise_not(a)
print(c) # Output: [-11 -13 -16]
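The negative results above come from the signed integer type of the array. The following sketch, which assumes 8-bit types purely for illustration, compares bitwise NOT on signed and unsigned data; the unsigned case is the one usually met with image pixels.

```python
import numpy as np

# Same values stored as signed and unsigned 8-bit integers (assumed dtypes)
signed = np.array([10, 12, 15], dtype=np.int8)
unsigned = np.array([10, 12, 15], dtype=np.uint8)

not_signed = np.bitwise_not(signed)      # two's complement: ~x == -x - 1
not_unsigned = np.bitwise_not(unsigned)  # all 8 bits flipped: ~x == 255 - x

print(not_signed.tolist())    # [-11, -13, -16]
print(not_unsigned.tolist())  # [245, 243, 240]
```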
Linear Algebra
● The linear algebra module of NumPy offers various methods to apply linear
algebra to any NumPy array.
One can find:
• rank, determinant, trace, etc. of an array
• eigenvalues of matrices
• matrix and vector products (dot, inner, outer, etc.), matrix exponentiation
• solutions of linear or tensor equations, and much more
1. Eigendecomposition: Eigendecomposition is used to find the eigenvectors and
eigenvalues of a matrix. Eigenvectors and eigenvalues are used in various data
science techniques such as principal component analysis (PCA), image
compression, and recommender systems.
2. Singular Value Decomposition (SVD): Singular Value Decomposition (SVD) is
used to decompose a matrix into its singular values and singular vectors. SVD is
used in various data science techniques such as image compression, dimensionality
reduction, and recommendation systems.
3. Linear Transformations: Linear transformations are used to transform data from
one coordinate system to another. In data science, linear transformations are used in
various techniques such as data augmentation, data normalization, and data
preprocessing.
4. Determinant: The determinant of a matrix is used to find the volume scaling factor
of a linear transformation. The determinant is used in various data science techniques
such as PCA, optimization, and regression analysis.
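The four ideas above can be sketched with numpy.linalg as follows. The 2x2 matrix and the point are hypothetical values picked so the results are easy to check by hand; this is an illustration, not a data science pipeline.

```python
import numpy as np

# Hypothetical symmetric 2x2 matrix used only for illustration
A = np.array([[3.0, 1.0],
              [1.0, 3.0]])

# 1. Eigendecomposition: A @ v = w * v for each eigenvalue/eigenvector pair
w, V = np.linalg.eig(A)

# 2. Singular Value Decomposition: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A)

# 3. Linear transformation: apply A to a point
point = np.array([1.0, 0.0])
transformed = A @ point

# 4. Determinant: the factor by which A scales areas
det_A = np.linalg.det(A)

print(sorted(np.round(w, 6).tolist()))  # [2.0, 4.0]
print(transformed.tolist())             # [3.0, 1.0]
print(round(det_A, 6))                  # 8.0
```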
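import numpy as np
arr1 = np.array([1, 2, 3])
# Copy (assumed setup): arr2 owns its own data, so arr1 is unchanged
arr2 = arr1.copy()
arr2[0] = 4
print(arr1) # Output: [1 2 3]
print(arr2) # Output: [4 2 3]
# View (assumed setup): arr2 shares data with arr1, so both change
arr2 = arr1.view()
arr2[0] = 4
print(arr1) # Output: [4 2 3]
print(arr2) # Output: [4 2 3]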
Stacking
In NumPy, stacking refers to the process of combining arrays by
adding one on top of the other or side by side. There are several
functions in NumPy that allow you to stack arrays in different ways.
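1.numpy.hstack: Stack arrays in sequence horizontally (column-wise).
import numpy as np
arr1, arr2 = np.array([1, 2, 3]), np.array([4, 5, 6])  # assumed inputs
arr3 = np.hstack((arr1, arr2))
print(arr3) # Output: [1 2 3 4 5 6]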
2.numpy.vstack: Stack arrays in sequence vertically (row-wise).
4.numpy.row_stack: Stack 1-D arrays as rows into a 2-D array. It is an alias of
numpy.vstack and is deprecated as of NumPy 2.0, so numpy.vstack is preferred.
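The slides do not preserve the code for these stacking functions, so the following is a minimal sketch with assumed input arrays. It shows numpy.vstack along with numpy.column_stack and numpy.dstack, two related functions chosen here for comparison; they are not necessarily the ones the original slide covered.

```python
import numpy as np

# Assumed 1-D inputs
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

v = np.vstack((arr1, arr2))        # each array becomes a row: shape (2, 3)
c = np.column_stack((arr1, arr2))  # each array becomes a column: shape (3, 2)
d = np.dstack((arr1, arr2))        # stacked along a third axis: shape (1, 3, 2)

print(v.tolist())  # [[1, 2, 3], [4, 5, 6]]
print(c.tolist())  # [[1, 4], [2, 5], [3, 6]]
print(d.shape)     # (1, 3, 2)
```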
Data Sorting
Sorting means putting elements in an ordered sequence.
Ordered sequence is any sequence that has an order corresponding
to elements, like numeric or alphabetical, ascending or descending.
NumPy provides the function np.sort(), which returns a sorted copy of the
specified array; the ndarray method arr.sort() sorts the array in place.
import numpy as np
arr = np.array([3, 2, 0, 1])  # example values
print(np.sort(arr)) # Output: [0 1 2 3]
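As a small extension of the sorting idea, the sketch below shows descending order, the sorting indices returned by np.argsort(), and alphabetical sorting of strings. Both arrays are example values, not data taken from the slides.

```python
import numpy as np

# Example values (assumed)
arr = np.array([3, 2, 0, 1])
names = np.array(["banana", "cherry", "apple"])

descending = np.sort(arr)[::-1]   # reverse of the ascending sort
order = np.argsort(arr)           # indices that would sort arr
sorted_names = np.sort(names)     # strings are sorted alphabetically

print(descending.tolist())    # [3, 2, 1, 0]
print(order.tolist())         # [2, 3, 1, 0]
print(sorted_names.tolist())  # ['apple', 'banana', 'cherry']
```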
Filtering Arrays
Getting some elements out of an existing array and creating a new
array out of them is called filtering.
In NumPy, you filter an array using a boolean index list.
A boolean index list is a list of booleans corresponding to indexes in
the array.
If the value at an index is True that element is contained in the
filtered array, if the value at that index is False that element is
excluded from the filtered array.
import numpy as np
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]
newarr = arr[x]
print(newarr) # Output: [41 43]
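In practice the boolean index is rarely typed by hand; it is usually produced by a condition on the array. A minimal sketch using the same array, with illustrative variable names:

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

mask = arr > 42             # boolean array built from a condition
above_42 = arr[mask]        # keeps elements where mask is True
even = arr[arr % 2 == 0]    # the condition can also be written inline

print(mask.tolist())      # [False, False, True, True]
print(above_42.tolist())  # [43, 44]
print(even.tolist())      # [42, 44]
```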
Matrix Operations
● numpy.add() − Add two matrices
● numpy.subtract() − Subtract two matrices
● numpy.divide() − Divide two matrices (element-wise)
● numpy.multiply() − Multiply two matrices (element-wise)
import numpy as np
# Two matrices
mx1 = np.array([[5, 10], [15, 20]])
mx2 = np.array([[25, 30], [35, 40]])
print("Matrix1 =\n",mx1)
print("\nMatrix2 =\n",mx2)
# add() is used to add two matrices
print ("\nAddition of two matrices: ")
print (np.add(mx1,mx2))
# subtract() is used to subtract two matrices
print ("\nSubtraction of two matrices: ")
print (np.subtract(mx1,mx2))
# divide() is used to divide two matrices element-wise
print ("\nMatrix Division: ")
print (np.divide(mx1,mx2))
# multiply() is used to multiply two matrices element-wise
print ("\nMultiplication of two matrices: ")
print (np.multiply(mx1,mx2))
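Because numpy.multiply() works element by element, it is not the matrix product from linear algebra. The sketch below contrasts the two on the same matrices; numpy.matmul() (or the @ operator) gives the row-by-column product.

```python
import numpy as np

mx1 = np.array([[5, 10], [15, 20]])
mx2 = np.array([[25, 30], [35, 40]])

elementwise = np.multiply(mx1, mx2)   # same as mx1 * mx2
matrix_product = np.matmul(mx1, mx2)  # same as mx1 @ mx2

print(elementwise.tolist())     # [[125, 300], [525, 800]]
print(matrix_product.tolist())  # [[475, 550], [1075, 1250]]
```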
Summation of matrix elements
The sum() method is used to find the summation −
import numpy as np
# A matrix
mx = np.array([[5, 10], [15, 20]])
print("Matrix =\n",mx)
print ("\nThe summation of elements=")
print (np.sum(mx))
print ("\nThe column wise summation=")
print (np.sum(mx,axis=0))
print ("\nThe row wise summation=")
print (np.sum(mx,axis=1))
Transpose a Matrix
The .T property is used to find the transpose of a matrix −
import numpy as np
# A matrix
mx = np.array([[5, 10], [15, 20]])
print("Matrix =\n",mx)
print ("\nThe Transpose =")
print (mx.T)
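The argmin() function returns the index of the minimum element. By default the index
refers to the flattened array, and the first occurrence is returned −
import numpy as np
data = np.array([[66, 99, 22,11,-1,0,10],[1,2,3,4,5,0,-1]])
res = np.argmin(data)
print(data)
print("Min element's index:", res) # Output: Min element's index: 4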
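Since the index returned above refers to the flattened array, it often needs converting back to a row and column. A minimal sketch on the same data, using np.unravel_index() and the axis argument; the variable names are illustrative.

```python
import numpy as np

data = np.array([[66, 99, 22, 11, -1, 0, 10],
                 [1, 2, 3, 4, 5, 0, -1]])

flat_index = np.argmin(data)                         # index into the flattened array
row, col = np.unravel_index(flat_index, data.shape)  # convert to (row, column)
per_row = np.argmin(data, axis=1)                    # column of the minimum in each row
max_index = np.argmax(data)                          # flat index of the maximum

print(int(row), int(col))  # 0 4
print(per_row.tolist())    # [4, 6]
print(int(max_index))      # 1
```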
Data counting
import numpy as np
#create NumPy array
x = np.array([2, 2, 2, 4, 5, 5, 5, 7, 8, 8, 10, 12])
#count number of values in array equal to 2
print(np.count_nonzero(x == 2)) # Output: 3
#count number of values in array less than 6
print(np.count_nonzero(x < 6)) # Output: 7
#count number of values in array equal to 2 or 7
print(np.count_nonzero((x == 2) | (x == 7))) # Output: 4
Linear Algebra
# Importing numpy as np
import numpy as np
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
# Rank of a matrix
print("Rank of A:", np.linalg.matrix_rank(A))
# Trace of matrix A
print("\nTrace of A:", np.trace(A))
# Determinant of a matrix
print("\nDeterminant of A:", np.linalg.det(A))
# Inverse of matrix A
print("\nInverse of A:\n", np.linalg.inv(A))
# Matrix A raised to power 3
print("\nMatrix A raised to power 3:\n",
      np.linalg.matrix_power(A, 3))
numpy.linalg.eigh(a, UPLO='L') : This function is used to return the eigenvalues and eigenvectors of a complex
Hermitian (conjugate symmetric) or a real symmetric matrix. It returns two objects: a 1-D array containing the
eigenvalues of a, and a 2-D square array or matrix (depending on the input type) of the corresponding
eigenvectors (in columns).
# Python program explaining
# eigh() function
import numpy as np
from numpy import linalg
# Creating an array using array
# function
a = np.array([[1, -2j], [2j, 5]])
print("Array is :",a)
# calculating the eigenvalues and eigenvectors
# using eigh() function
c, d = linalg.eigh(a)
print("Eigen value is :", c)
print("Eigen vector is :", d)
numpy.linalg.eig(a) : This function is used to compute the eigenvalues and right eigenvectors of a square array.
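A minimal sketch of numpy.linalg.eig() on a hypothetical 2x2 matrix. The loop checks the defining property A v = w v for each returned pair, where column i of v belongs to eigenvalue w[i].

```python
import numpy as np

# Hypothetical non-symmetric matrix, so eig() is used rather than eigh()
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

w, v = np.linalg.eig(A)   # w: eigenvalues, v: right eigenvectors in columns

# Each column v[:, i] satisfies A @ v[:, i] == w[i] * v[:, i]
for i in range(len(w)):
    assert np.allclose(A @ v[:, i], w[i] * v[:, i])

print(sorted(np.round(w, 6).tolist()))  # [2.0, 5.0]
```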
FUNCTION              DESCRIPTION
matmul()              Matrix product of two arrays.
inner()               Inner product of two arrays.
outer()               Compute the outer product of two vectors.
linalg.multi_dot()    Compute the dot product of two or more arrays in a single function call,
                      while automatically selecting the fastest evaluation order.
tensordot()           Compute tensor dot product along specified axes for arrays >= 1-D.
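The functions in the table can be tried out on small inputs as sketched below. The vectors and matrices are hypothetical values picked so that the results can be verified by hand.

```python
import numpy as np

# Hypothetical small inputs
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
M = np.array([[1, 2], [3, 4]])
N = np.array([[5, 6], [7, 8]])

inner_ab = np.inner(a, b)               # 1*4 + 2*5 + 3*6
outer_ab = np.outer(a, b)               # 3x3 matrix of all pairwise products
prod_MN = np.matmul(M, N)               # same as M @ N
chain = np.linalg.multi_dot([M, N, M])  # M @ N @ M, evaluation order chosen automatically
tdot = np.tensordot(M, N, axes=1)       # contracts last axis of M with first axis of N

print(inner_ab)          # 32
print(outer_ab.shape)    # (3, 3)
print(prod_MN.tolist())  # [[19, 22], [43, 50]]
```

For 2-D inputs, tensordot() with axes=1 gives the same result as matmul().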
Broadcasting
The term broadcasting refers to how NumPy treats arrays with different dimensions during arithmetic operations.
Subject to certain constraints, the smaller array is broadcast across the larger array so that they have compatible shapes.
Broadcasting provides a means of vectorizing array operations.
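A minimal sketch of broadcasting with hypothetical arrays: a scalar, a row, and a column are each combined with a 2x3 matrix, followed by one case whose shapes are not compatible.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)
col = np.array([[100], [200]])     # shape (2, 1)

plus_scalar = matrix + 1   # the scalar is applied to every element
plus_row = matrix + row    # the row is reused for each of the 2 rows
plus_col = matrix + col    # the column is reused for each of the 3 columns

print(plus_row.tolist())  # [[11, 22, 33], [14, 25, 36]]
print(plus_col.tolist())  # [[101, 102, 103], [204, 205, 206]]

# Shapes (2, 3) and (2,) are not compatible, so NumPy raises an error
try:
    matrix + np.array([1, 2])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
```

Two dimensions are compatible when they are equal or when one of them is 1, comparing shapes from the last dimension backwards.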
Mathematical Operations
Structured Data