Unit 3 - Data Computation

This document discusses arithmetic, statistical, and bitwise operations that are fundamental for data science. It describes commonly used Python libraries like NumPy and Pandas for performing arithmetic operations on arrays like addition, subtraction, multiplication, and division. Statistical operations covered include calculating the mean, median, standard deviation, and correlation. Bitwise operators like AND, OR, XOR, and NOT are also discussed as tools for manipulating binary data.

Uploaded by

Vedant Gavhane

FIRST YEAR B. TECH COURSE: ESSENTIALS OF DATA SCIENCE

Unit III

By
Team – Essentials of Data Science
School of Computer Engineering,
MIT Academy of Engineering, Alandi(D.)
SCHOOL OF COMPUTER ENGINEERING TEAM -- EDS 1
Arithmetic and Statistical Data Operations
● Arithmetic and statistical data operations are fundamental operations in data
science. Python offers several libraries and modules for performing these
operations. Here are some of the commonly used libraries:
1. NumPy: NumPy is a library for the Python programming language that provides
support for large, multi-dimensional arrays and matrices, as well as functions for
mathematical operations on these arrays.
2. Pandas: Pandas is a library that provides high-performance, easy-to-use data
structures and data analysis tools. It provides a DataFrame object for working with
tabular data.
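As a minimal sketch of the two libraries side by side (the array values and the column names `price`, `quantity` and `total` are made up for illustration), arithmetic can be applied to a whole NumPy array or to whole DataFrame columns at once:

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies to every element of the array
readings = np.array([[1.0, 2.0], [3.0, 4.0]])
doubled = readings * 2

# Pandas: a DataFrame holds tabular data; columns support the same arithmetic
orders = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})
orders["total"] = orders["price"] * orders["quantity"]
mean_total = orders["total"].mean()

print(doubled)
print(orders)
print(mean_total)
```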

SCHOOL OF COMPUTER ENGINEERING &


SCHOOL OF COMPUTER ENGINEERING TEAM -- EDS
SHUBHANGI KALE 11/11/2022 2
TECHNOLOGY
Arithmetic Data Operations


● Arithmetic Operations:
1. Addition: To perform addition, you can use the '+' operator. For example, to add two arrays using NumPy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a + b
print(c)

Output: [5 7 9]

2. Subtraction: To perform subtraction, you can use the '-' operator. For example, to subtract two arrays using NumPy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a - b
print(c)

Output: [-3 -3 -3]

3. Multiplication: To perform multiplication, you can use the '*' operator. For example, to multiply two arrays using NumPy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a * b
print(c)

Output: [ 4 10 18]

4. Division: To perform division, you can use the '/' operator. For example, to divide two arrays using NumPy:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a / b
print(c)

Output: [0.25 0.4 0.5 ]

5. Exponentiation and Modulus: Python also provides built-in operators for exponentiation (**) and modulus (%) that can be used for calculations involving powers and remainders.

6. Floor Division: Python provides a floor division operator (//) that returns the quotient of a division, rounded down to the nearest whole number (toward negative infinity, so -7 // 2 is -4).

7. Absolute Value: Python provides a built-in function called abs() that returns the absolute value of a given number.
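The operators described above also work element by element on NumPy arrays. The following is a small sketch with made-up values; `np.abs()` is used here as the array counterpart of the built-in `abs()`:

```python
import numpy as np

a = np.array([7, -7, 2])
b = np.array([2, 2, 3])

powers = a ** b          # exponentiation: [49 49  8]
remainders = a % b       # modulus: [1 1 2] (the result takes the sign of the divisor)
quotients = a // b       # floor division: [ 3 -4  0]
magnitudes = np.abs(a)   # absolute value: [7 7 2]

print(powers, remainders, quotients, magnitudes)
```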


Statistical Data Operations


Statistical Operations:
1. Mean: To calculate the mean of a set of values, you can use the mean() function of NumPy. For example, to calculate the mean of an array:
import numpy as np
a = np.array([1, 2, 3])
mean_a = np.mean(a)
print(mean_a)

Output: 2.0

2. Median: To calculate the median of a set of values, you can use the median() function of NumPy. For example, to calculate the median of an array:
import numpy as np
a = np.array([1, 2, 3])
median_a = np.median(a)
print(median_a)

Output: 2.0

3. Standard Deviation: To calculate the standard deviation of a set of values, you can use the std() function of NumPy. By default, std() computes the population standard deviation (ddof=0). For example, to calculate the standard deviation of an array:

import numpy as np
a = np.array([1, 2, 3])
std_a = np.std(a)
print(std_a)

Output: 0.816496580927726

4. Correlation: To calculate the correlation coefficient between two variables, you can use the corrcoef() function of NumPy. For example, to calculate the correlation between two arrays:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# corrcoef() returns a 2x2 correlation matrix; the off-diagonal
# entries hold the coefficient between a and b
corr_ab = np.corrcoef(a, b)
print(corr_ab)
1. Mode: The mode of a set of numbers can be calculated using the mode() function provided by the SciPy library (scipy.stats.mode). It returns the value that appears most frequently in the given set of numbers.
2. Variance: The variance of a set of numbers can be calculated using the var() function provided by the NumPy library. It measures how far the numbers spread out from their mean (the average of the squared deviations).
3. Standard Deviation: The standard deviation of a set of numbers can be calculated using the std() function provided by the NumPy library. It measures the amount of variation or dispersion of the numbers and is the square root of the variance.
4. Correlation: The correlation between two sets of numbers can be calculated using the corr() method provided by the Pandas library on Series and DataFrame objects. It returns a correlation coefficient that measures the degree to which two variables are related.
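The measures listed above can be sketched as follows. The data values and the names `hours` and `scores` are invented for illustration, and the mode is computed with the standard-library `statistics.mode()` as a dependency-free stand-in for the SciPy function mentioned in the text:

```python
import statistics

import numpy as np
import pandas as pd

values = [2, 3, 3, 5, 7]

most_common = statistics.mode(values)  # value that occurs most often
variance = np.var(values)              # population variance (ddof=0)
std_dev = np.std(values)               # square root of the variance

# Pearson correlation between two pandas Series
hours = pd.Series([1, 2, 3, 4, 5])
scores = pd.Series([52, 55, 61, 70, 74])
r = hours.corr(scores)

print(most_common, variance, std_dev, r)
```

A coefficient close to 1 indicates that the two series rise together almost linearly.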
Bitwise Operators
● Bitwise operators are used in data science for performing operations on binary data, such as images or audio signals, and for manipulating individual bits within numerical data types. In Python, bitwise operators can be used with NumPy integer arrays to perform bitwise operations on large sets of binary data.
● Here are some of the most commonly used bitwise operators in NumPy:
1. Bitwise AND (&): The bitwise AND operator compares two sets of binary numbers and returns a new binary number where each bit is 1 only if both input bits are 1.
import numpy as np
a = np.array([0b1010, 0b1100, 0b1111])
b = np.array([0b1100, 0b1010, 0b0101])
c = np.bitwise_and(a, b)
print(c) # Output: [8 8 5]
2. Bitwise OR (|): The bitwise OR operator compares two sets of binary numbers and returns a new binary number where each bit is 1 if either of the input bits is 1.
a = np.array([0b1010, 0b1100, 0b1111])
b = np.array([0b1100, 0b1010, 0b0101])
c = np.bitwise_or(a, b)
print(c) # Output: [14 14 15]
3. Bitwise XOR (^): The bitwise XOR operator compares two sets of binary numbers and returns a new binary number where each bit is 1 only if exactly one of the input bits is 1.
a = np.array([0b1010, 0b1100, 0b1111])
b = np.array([0b1100, 0b1010, 0b0101])
c = np.bitwise_xor(a, b)
print(c) # Output: [ 6  6 10]
4. Bitwise NOT (~): The bitwise NOT operator inverts each bit in a binary number. For signed integers, which are stored in two's complement form, the result for a value x is -(x + 1).
a = np.array([0b1010, 0b1100, 0b1111])
c = np.bitwise_not(a)
print(c) # Output: [-11 -13 -16]
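The same operations can also be written with the operator symbols. The sketch below assumes 8-bit unsigned data (as in a grayscale image) and a hypothetical `mask` value; with an unsigned type, NOT gives 255 - x instead of a negative number:

```python
import numpy as np

a = np.array([0b1010, 0b1100, 0b1111], dtype=np.uint8)
mask = np.uint8(0b0011)  # hypothetical mask that keeps only the two lowest bits

low_bits = a & mask      # same result as np.bitwise_and(a, mask)
inverted = ~a            # same result as np.bitwise_not(a)

print(low_bits)                                    # [2 0 3]
print(inverted)                                    # [245 243 240]
print(np.binary_repr(int(inverted[0]), width=8))   # 11110101
```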


Linear Algebra
● The Linear Algebra module of NumPy (numpy.linalg) offers various methods to apply linear algebra on any NumPy array.
One can find:
• rank, determinant, trace, etc. of an array
• eigenvalues of matrices
• matrix and vector products (dot, inner, outer, etc.) and matrix exponentiation
• solutions of linear or tensor equations, and much more

1. Eigendecomposition: Eigendecomposition is used to find the eigenvectors and
eigenvalues of a matrix. Eigenvectors and eigenvalues are used in various data
science techniques such as principal component analysis (PCA), image
compression, and recommender systems.
2. Singular Value Decomposition (SVD): Singular Value Decomposition (SVD) is
used to decompose a matrix into its singular values and singular vectors. SVD is
used in various data science techniques such as image compression, dimensionality
reduction, and recommendation systems.
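As an illustrative sketch of SVD (the matrix values are arbitrary), `np.linalg.svd()` returns the singular vectors and singular values, from which the matrix can be rebuilt exactly or approximated with fewer components, which is the idea behind compression and dimensionality reduction:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# Reduced SVD: A = U @ diag(s) @ Vt, with singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(s) @ Vt

# Keeping only the largest singular value gives the best rank-1 approximation
rank1 = s[0] * np.outer(U[:, 0], Vt[0])

print(s)
print(np.allclose(A, A_rebuilt))
print(np.linalg.norm(A - rank1))
```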

3. Linear Transformations: Linear transformations are used to transform data from
one coordinate system to another. In data science, linear transformations are used in
various techniques such as data augmentation, data normalization, and data
preprocessing.
4. Determinant: The determinant of a matrix is used to find the volume scaling factor
of a linear transformation. The determinant is used in various data science techniques
such as PCA, optimization, and regression analysis.
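A short sketch of both ideas, using an arbitrary scaling matrix `T`: the unit square is transformed by `T`, and the determinant gives the factor by which its area changes:

```python
import numpy as np

# Linear transformation that stretches x by 2 and y by 3
T = np.array([[2.0, 0.0],
              [0.0, 3.0]])

# Corners of the unit square, one point per row
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

# Apply the transformation to every point
transformed = square @ T.T

# The determinant is the area (volume) scaling factor: 1 x 1 becomes 2 x 3
area_scale = np.linalg.det(T)

print(transformed)
print(area_scale)
```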


Copying and viewing data in arrays


1. Copying Data: To make a copy of a NumPy array, you can use the copy() method. This method creates a new array that is a copy of the original array, so changes to the copy do not affect the original.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = arr1.copy()
arr2[0] = 4
print(arr1) # Output: [1 2 3]
print(arr2) # Output: [4 2 3]

2. Viewing Data: Views are another way to access the data of a NumPy array. A view is a new array object that shares the same underlying data as the original (a shallow copy), so changes made to the view are reflected in the original array.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = arr1.view()
arr2[0] = 4
print(arr1) # Output: [4 2 3]
print(arr2) # Output: [4 2 3]

3. Slicing: Slicing is a way to access a subset of the elements of a NumPy array. Slicing creates a view of the original array, so changes made through the slice are reflected in the original.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = arr1[0:2]
arr2[0] = 4
print(arr1) # Output: [4 2 3]
print(arr2) # Output: [4 2]
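Whether two arrays share data can be checked directly. The following sketch (variable names are illustrative) uses `np.shares_memory()` and the `base` attribute to tell copies and views apart:

```python
import numpy as np

original = np.arange(6)

deep = original.copy()     # independent copy with its own data
alias = original.view()    # view of the whole array
window = original[1:4]     # slice, which is also a view

copy_shares = np.shares_memory(original, deep)     # False
view_shares = np.shares_memory(original, alias)    # True
slice_shares = np.shares_memory(original, window)  # True

# A view remembers the array that owns its data
print(copy_shares, view_shares, slice_shares)
print(window.base is original)
```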


Stacking
In NumPy, stacking refers to the process of combining arrays by
adding one on top of the other or side by side. There are several
functions in NumPy that allow you to stack arrays in different ways.
1. numpy.hstack: Stack arrays in sequence horizontally (column-wise).
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.hstack((arr1, arr2))
print(arr3) # Output: [1 2 3 4 5 6]

2. numpy.vstack: Stack arrays in sequence vertically (row-wise).
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.vstack((arr1, arr2))
print(arr3) # Output: [[1 2 3]
            #          [4 5 6]]

3. numpy.column_stack: Stack 1-D arrays as columns into a 2-D array.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.column_stack((arr1, arr2))
print(arr3) # Output: [[1 4]
            #          [2 5]
            #          [3 6]]

4. numpy.row_stack: Stack 1-D arrays as rows into a 2-D array. row_stack is an alias of vstack and is deprecated as of NumPy 2.0, so vstack is preferred in new code.
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.row_stack((arr1, arr2))
print(arr3) # Output: [[1 2 3]
            #          [4 5 6]]
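The stacking functions differ mainly in the shape of the result. As a quick sketch, the shapes can be compared side by side, together with the more general `np.concatenate()` and `np.stack()` functions (not covered above) that produce the same results:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

side_by_side = np.hstack((x, y))      # shape (6,)
as_rows = np.vstack((x, y))           # shape (2, 3)
as_columns = np.column_stack((x, y))  # shape (3, 2)

print(side_by_side.shape, as_rows.shape, as_columns.shape)

# Equivalent results with the general-purpose functions
same_as_hstack = np.array_equal(side_by_side, np.concatenate((x, y)))
same_as_column_stack = np.array_equal(as_columns, np.stack((x, y), axis=1))
print(same_as_hstack, same_as_column_stack)
```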


Data Sorting
Sorting means putting elements in an ordered sequence.
An ordered sequence is any sequence whose elements follow an order, such as numeric or alphabetical, ascending or descending.
NumPy provides the np.sort() function, which returns a sorted copy of the specified array. (The ndarray.sort() method sorts the array in place instead.)
import numpy as np
arr = np.array([3, 2, 0, 1])
print(np.sort(arr)) # Output: [0 1 2 3]

For a 2-D array, np.sort() sorts each row (the last axis) by default:
import numpy as np
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr)) # Output: [[2 3 4]
                    #          [0 1 5]]
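A few related variations, sketched with the same sample values: descending order can be obtained by reversing the sorted result, `np.argsort()` returns the indices that would sort the array, and the `axis` argument selects the direction for 2-D arrays:

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

descending = np.sort(arr)[::-1]   # sort ascending, then reverse
order = np.argsort(arr)           # indices that put arr in ascending order

grid = np.array([[3, 2, 4], [5, 0, 1]])
by_column = np.sort(grid, axis=0)  # sort down each column instead of along rows

print(descending)   # [3 2 1 0]
print(order)        # [2 3 1 0]
print(by_column)
```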


Filtering Arrays
Getting some elements out of an existing array and creating a new array out of them is called filtering.
In NumPy, you filter an array using a boolean index list.
A boolean index list is a list of booleans corresponding to the indexes in the array.
If the value at an index is True, that element is included in the filtered array; if the value at that index is False, that element is excluded from the filtered array.
import numpy as np
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]
newarr = arr[x]
print(newarr) # Output: [41 43]
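In practice the boolean list is rarely typed by hand. A minimal sketch, assuming we want to keep only the even values, builds the mask from a condition on the array itself:

```python
import numpy as np

arr = np.array([41, 42, 43, 44])

# Comparing an array with a value yields a boolean array of the same shape
is_even = arr % 2 == 0
evens = arr[is_even]

print(is_even)  # [False  True False  True]
print(evens)    # [42 44]
```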


Matrix Operations
● numpy.add() − Add two matrices
● numpy.subtract() − Subtract two matrices
● numpy.divide() − Divide two matrices
● numpy.multiply() − Multiply two matrices
These functions all operate element by element on matrices of compatible shapes.
import numpy as np
# Two matrices
mx1 = np.array([[5, 10], [15, 20]])
mx2 = np.array([[25, 30], [35, 40]])
print("Matrix1 =\n", mx1)
print("\nMatrix2 =\n", mx2)
# add() is used to add two matrices
print("\nAddition of two matrices: ")
print(np.add(mx1, mx2))
# subtract() is used to subtract two matrices
print("\nSubtraction of two matrices: ")
print(np.subtract(mx1, mx2))
# divide() is used to divide two matrices
print("\nMatrix Division: ")
print(np.divide(mx1, mx2))
# multiply() is used to multiply two matrices
print("\nMultiplication of two matrices: ")
print(np.multiply(mx1, mx2))
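Note that numpy.multiply() multiplies matching elements, which is different from the matrix product of linear algebra. The sketch below contrasts the two using the same matrices; the `@` operator (equivalent to `np.matmul()`) is not part of the slide and is shown only for comparison:

```python
import numpy as np

mx1 = np.array([[5, 10], [15, 20]])
mx2 = np.array([[25, 30], [35, 40]])

# Element-wise product: each entry is mx1[i, j] * mx2[i, j]
elementwise = np.multiply(mx1, mx2)

# Matrix product: rows of mx1 combined with columns of mx2
matrix_product = mx1 @ mx2

print(elementwise)      # [[125 300] [525 800]]
print(matrix_product)   # [[ 475  550] [1075 1250]]
```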
Summation of matrix elements
The sum() function is used to find the summation of the elements of a matrix. With axis=0 it sums down each column, and with axis=1 it sums across each row.
import numpy as np
# A matrix
mx = np.array([[5, 10], [15, 20]])
print("Matrix =\n", mx)
print("\nThe summation of elements=")
print(np.sum(mx))
print("\nThe column wise summation=")
print(np.sum(mx, axis=0))
print("\nThe row wise summation=")
print(np.sum(mx, axis=1))

Transpose of a Matrix
The .T property is used to find the transpose of a matrix:
import numpy as np
# A matrix
mx = np.array([[5, 10], [15, 20]])
print("Matrix =\n", mx)
print("\nThe Transpose =")
print(mx.T)


Data Searching and Indexing


● Types of Indexing
● There are two types of indexing: basic slicing and indexing, and advanced indexing.
● Basic slicing and indexing: Consider the syntax x[obj], where x is the array and obj is the index. In basic slicing, the index is a slice object. Basic slicing occurs when obj is:
   ● a slice object of the form start:stop:step
   ● an integer
   ● or a tuple of slice objects and integers
● All arrays generated by basic slicing are always views of the original array.

# Python program for basic slicing.
import numpy as np

# Arrange elements from 0 to 19
a = np.arange(20)
print("\n Array is:\n ", a)

# a[start:stop:step]
print("\n a[-8:17:1] = ", a[-8:17:1])

# Omitting stop after the colon means all elements till the end.
print("\n a[10:] = ", a[10:])

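# Python program for basic slicing
# and indexing
import numpy as np

# A 2-dimensional array (6 rows, 6 columns)
a = np.array([[0, 1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10, 11],
              [12, 13, 14, 15, 16, 17],
              [18, 19, 20, 21, 22, 23],
              [24, 25, 26, 27, 28, 29],
              [30, 31, 32, 33, 34, 35]])
print("\n Array is:\n ", a)

# slicing and indexing
# row 0, columns 3 and 4: [3 4]
print("\n a[0, 3:5] = ", a[0, 3:5])

# rows 4 onward, columns 4 onward: [[28 29] [34 35]]
print("\n a[4:, 4:] = ", a[4:, 4:])

# column 2 of every row: [ 2  8 14 20 26 32]
print("\n a[:, 2] = ", a[:, 2])

# every second row from row 2, every second column: [[12 14 16] [24 26 28]]
print("\n a[2::2, ::2] = ", a[2::2, ::2])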

Advanced indexing: Advanced indexing is triggered when obj is:
• an ndarray of type integer or Boolean
• a tuple with at least one sequence object
• a non-tuple sequence object
Advanced indexing returns a copy of the data rather than a view of it. Advanced indexing is of two types: integer and Boolean.
Purely integer indexing: When integer arrays are used for indexing, each element of the first index is paired with the corresponding element of the second. For example, with row indices [0, 1, 2] and column indices [0, 0, 1], the elements at positions (0,0), (1,0) and (2,1) are selected.
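The pairing of indices can be sketched as follows. The 3x2 array is an assumed example chosen to match the positions (0,0), (1,0) and (2,1) mentioned above; it is not taken from the original slide:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

rows = [0, 1, 2]
cols = [0, 0, 1]

# Picks a[0, 0], a[1, 0] and a[2, 1]
picked = a[rows, cols]
print(picked)  # [1 3 6]

# Advanced indexing returns a copy, so the original is unaffected
picked[0] = 99
print(a[0, 0])  # 1
```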

Boolean Array Indexing:
This type of indexing uses a Boolean expression as the index. The elements that satisfy the expression are returned. It is used for filtering the desired element values.

# You may wish to select numbers greater than 50
import numpy as np

a = np.array([10, 40, 80, 50, 100])
print(a[a > 50]) # Output: [ 80 100]

# You may wish to square the multiples of 40
import numpy as np

a = np.array([10, 40, 80, 50, 100])
print(a[a % 40 == 0] ** 2) # Output: [1600 6400]
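Conditions can also be combined, and a Boolean index can appear on the left side of an assignment. A brief sketch with the same array (the thresholds are arbitrary); note that `&` and `|` are used instead of `and` and `or`, and each condition needs its own parentheses:

```python
import numpy as np

a = np.array([10, 40, 80, 50, 100])

# Values from 40 up to, but not including, 100
in_range = a[(a >= 40) & (a < 100)]

# Replace every value above 50 with 50, working on a copy
capped = a.copy()
capped[capped > 50] = 50

print(in_range)  # [40 80 50]
print(capped)    # [10 40 50 50 50]
```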

1. The argmax() function
2. The nanargmax() function
3. The argmin() function
4. The nanargmin() function
5. Searching using where() function

# argmax(): index of the maximum element in the flattened array
import numpy as np
data = np.array([[66, 99, 22,11,-1,0,10],[1,2,3,4,5,0,-1]])
res = np.argmax(data)
print(data)
print("Max element's index:", res)

# argmin(): index of the minimum element in the flattened array
import numpy as np
data = np.array([[66, 99, 22,11,-1,0,10],[1,2,3,4,5,0,-1]])
res = np.argmin(data)
print(data)
print("Min element's index:", res)

# nanargmin(): like argmin(), but ignores NaN values
import numpy as np
data = np.array([[66, 99, np.nan,11, ...
res = np.nanargmin(data)
print(data)
print("Searched element's index:", res)

# where(): indices of the elements that satisfy a condition
import numpy as np
data = np.array([[66, 99, 22,11,-1,0,10],[1,2,3,4,5,0,-1]])
res = np.where(data == 2)
print(data)
print("Searched element's index:", res)
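A small self-contained sketch of the NaN-aware functions, using made-up data. The index returned refers to the flattened array, so `np.unravel_index()` (not mentioned on the slide) is used here to convert it back to a row and column:

```python
import numpy as np

data = np.array([[66.0, 99.0, np.nan],
                 [1.0, -5.0, 3.0]])

flat_max = np.nanargmax(data)  # position of 99.0 in the flattened array
flat_min = np.nanargmin(data)  # position of -5.0 in the flattened array

# Convert the flat index into (row, column)
position = np.unravel_index(flat_min, data.shape)

# where() returns one index array per dimension
negative_rows, negative_cols = np.where(data < 0)

print(flat_max, flat_min)   # 1 4
print(position)
print(negative_rows, negative_cols)
```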


Data counting
np.count_nonzero() counts the elements of an array for which a condition is True:
import numpy as np
# create NumPy array
x = np.array([2, 2, 2, 4, 5, 5, 5, 7, 8, 8, 10, 12])

# count number of values in array equal to 2
print(np.count_nonzero(x == 2)) # Output: 3

# count number of values in array that are less than 6
print(np.count_nonzero(x < 6)) # Output: 7

# count number of values in array that are equal to 2 or 7
print(np.count_nonzero((x == 2) | (x == 7))) # Output: 4


Linear Algebra
The Linear Algebra module of NumPy (numpy.linalg) offers various methods to apply linear algebra on any NumPy array.
One can find:
• rank, determinant, trace, etc. of an array
• eigenvalues of matrices
• matrix and vector products (dot, inner, outer, etc.)
• matrix exponentiation
• solutions of linear or tensor equations, and much more

# Importing numpy as np
import numpy as np
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
# Rank of a matrix
print("Rank of A:", np.linalg.matrix_rank(A))
# Trace of matrix A
print("\nTrace of A:", np.trace(A))
# Determinant of a matrix
print("\nDeterminant of A:", np.linalg.det(A))
# Inverse of matrix A
print("\nInverse of A:\n", np.linalg.inv(A))
# Matrix A raised to power 3
print("\nMatrix A raised to power 3:\n",
      np.linalg.matrix_power(A, 3))

numpy.linalg.eigh(a, UPLO='L'): This function returns the eigenvalues and eigenvectors of a complex Hermitian (conjugate symmetric) or a real symmetric matrix. It returns two objects: a 1-D array containing the eigenvalues of a, and a 2-D square array or matrix (depending on the input type) of the corresponding eigenvectors (in columns).

# Python program explaining
# eigh() function
import numpy as np
from numpy import linalg

# Creating an array using array
# function
a = np.array([[1, -2j], [2j, 5]])

print("Array is :", a)

# calculating the eigenvalues and eigenvectors
# using eigh() function
c, d = linalg.eigh(a)

print("Eigen value is :", c)
print("Eigen vector is :", d)

numpy.linalg.eig(a): This function is used to compute the eigenvalues and right eigenvectors of a square array.

# Python program explaining
# eig() function
import numpy as np
from numpy import linalg

# Creating an array using diag
# function
a = np.diag((1, 2, 3))

print("Array is :", a)

# calculating the eigenvalues and eigenvectors
# using eig() function
c, d = linalg.eig(a)

print("Eigen value is :", c)
print("Eigen vector is :", d)
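One way to check such a result is to verify the defining relation A v = λ v for every eigenpair. The sketch below uses an arbitrary real symmetric matrix with `eigh()`:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh() returns eigenvalues in ascending order and eigenvectors as columns
values, vectors = np.linalg.eigh(A)

# A @ v equals lambda * v for each column v; broadcasting scales column j by values[j]
relation_holds = np.allclose(A @ vectors, vectors * values)

print(values)          # approximately [1. 3.]
print(relation_holds)  # True
```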

FUNCTION                 DESCRIPTION
matmul()                 Matrix product of two arrays.
inner()                  Inner product of two arrays.
outer()                  Compute the outer product of two vectors.
linalg.multi_dot()       Compute the dot product of two or more arrays in a single function call, while automatically selecting the fastest evaluation order.
tensordot()              Compute tensor dot product along specified axes for arrays >= 1-D.
einsum()                 Evaluates the Einstein summation convention on the operands.
einsum_path()            Evaluates the lowest cost contraction order for an einsum expression by considering the creation of intermediate arrays.
linalg.matrix_power()    Raise a square matrix to the (integer) power n.
kron()                   Kronecker product of two arrays.
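A few of the functions in the table, sketched on small made-up inputs so the results are easy to check by hand:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
M = np.array([[1, 2],
              [3, 4]])

inner_uv = np.inner(u, v)                 # 1*4 + 2*5 + 3*6 = 32
outer_uv = np.outer(u, v)                 # 3x3 matrix of all pairwise products
kron_small = np.kron([1, 2], [1, 10])     # [ 1 10  2 20]

squared = np.matmul(M, M)                 # same as M @ M
squared_einsum = np.einsum("ij,jk->ik", M, M)
cubed = np.linalg.multi_dot([M, M, M])    # chained matrix product

print(inner_uv)
print(outer_uv)
print(kron_small)
print(squared)
print(cubed)
```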


Broadcasting
The term broadcasting refers to how NumPy treats arrays with different dimensions during arithmetic operations. Subject to certain constraints, the smaller array is broadcast across the larger array so that they have compatible shapes.
Broadcasting provides a means of vectorizing array operations.
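A minimal sketch of broadcasting with made-up arrays: a row of shape (3,) and a column of shape (2, 1) are each stretched to match a (2, 3) array, while shapes that cannot be aligned raise an error:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])           # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,)
col = np.array([[100], [200]])         # shape (2, 1)

plus_row = grid + row   # row is added to every row of grid
plus_col = grid + col   # col is added to every column of grid
table = col + row       # (2, 1) and (3,) broadcast to (2, 3)

print(plus_row)
print(plus_col)
print(table)

# Shapes (2, 3) and (2,) are incompatible: the trailing dimensions 3 and 2 differ
try:
    grid + np.array([1, 2])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)
```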


Mathematical Operations


Structured Data
