DS - Ex-1

The document outlines the installation process for Python, Jupyter, and various packages including NumPy, SciPy, Statsmodels, and Pandas. It also explores the features of NumPy, detailing its capabilities in array manipulation, random number generation, and universal functions. Additionally, it provides examples of different statistical distributions and arithmetic operations using NumPy.

Uploaded by

vishnupriyapacet


1(A). DOWNLOAD AND INSTALL THE DIFFERENT PACKAGES LIKE NUMPY, SCIPY, JUPYTER, STATSMODELS AND PANDAS

AIM:
To learn how to download and install the different packages of NumPy, SciPy, Jupyter,
Statsmodels and Pandas.

ALGORITHM:
1. Download Python and Jupyter.
2. Install Python and Jupyter.
3. Install the packages NumPy, SciPy, Statsmodels and Pandas.
4. Verify the proper execution of Python and Jupyter.
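
The verification in step 4 can also be done from Python itself. The sketch below is one possible check, not part of the original exercise: the helper name installed_versions and the package list are assumptions chosen for illustration, and the script only reads the version information that pip recorded.

```python
from importlib import metadata

# Distribution names as they are passed to "pip install" in this exercise.
PACKAGES = ["numpy", "scipy", "statsmodels", "pandas", "jupyterlab", "notebook"]

def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

for name, version in installed_versions(PACKAGES).items():
    print(f"{name:12s} {version or 'NOT INSTALLED'}")
```

A package reported as NOT INSTALLED can be installed with the matching pip command and the script run again.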

Python Installation
• Open the official Python website (https://www.python.org/).
• Go to Downloads ==> Windows and select the most recent release (requires Windows 10 or above).
• Install "python-3.10.6-amd64.exe".

Jupyter Installation
• Open a command prompt and enter "python --version" to check whether Python was installed properly.
• If the installation is proper, it returns the version of Python.
• Enter "pip --version" to check whether the Python package manager (pip) was installed properly.
• If the installation is proper, it returns the version of the Python package manager.
• Enter the command "pip install jupyterlab".
• Enter the command "pip install jupyter notebook".
• If pip reports that a newer version of pip is available, copy the upgrade command shown in its output, paste it into the command prompt and execute it.
• Create a folder and name the folder accordingly.
• Open a command prompt and change into that folder. Type "jupyter notebook" and press Enter.
• A new Jupyter notebook will now be opened for our use.

pip Installation
Installation of NumPy
• pip install numpy
Installation of SciPy
• pip install scipy
Installation of Statsmodels
• pip install statsmodels
Installation of Pandas
• pip install pandas

Sample Output

RESULT:
NumPy, SciPy, Jupyter, Statsmodels and Pandas packages were installed properly and
the execution also verified.

Page No.
P. A. COLLEGE OF ENGINEERING AND TECHNOLOGY

1(B). EXPLORE THE FEATURES OF NUMPY

AIM:
To learn the different features provided by NumPy package.

ALGORITHM:
1. Install the NumPy package
2. Study all the features of NumPy package.

NumPy
• NumPy is a Python library used for working with arrays.
• It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

Features
These are the important features of NumPy
1. Array 2. Random 3. Universal Functions

1. Arrays
1.1 Array Slicing
• Slicing in Python means taking elements from one given index to another given index.
• We pass a slice instead of an index, like this: [start:end].
• We can also define the step, like this: [start:end:step].
• If we don't pass start, it is considered 0.
• If we don't pass end, it is considered the length of the array in that dimension.
• If we don't pass step, it is considered 1.

import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
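
The default values listed above can be seen by leaving out parts of the slice. This is a small illustrative sketch on the same array; the variable names are chosen here only for the example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

first_three = arr[:3]    # start omitted -> starts at index 0
from_index_4 = arr[4:]   # end omitted -> runs to the end of the array
every_second = arr[::2]  # start and end omitted, step 2
stepped = arr[1:5:2]     # indexes 1 and 3

print(first_three)   # [1 2 3]
print(from_index_4)  # [5 6 7]
print(every_second)  # [1 3 5 7]
print(stepped)       # [2 4]
```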

1.2 Array Shape & Reshaping

1.2.1 Array Shape
NumPy arrays have an attribute called shape that returns a tuple in which each index holds the number of elements in the corresponding dimension.
import numpy as np
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)

1.2.2 Array Reshaping
• Reshaping means changing the shape of an array.
• The shape of an array is the number of elements in each dimension.
• By reshaping we can add or remove dimensions or change the number of elements in each dimension.
• Convert the following 1-D array with 12 elements into a 3-D array.

The outermost dimension will have 2 arrays that contain 3 arrays, each with 2 elements:
import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(2, 3, 2)
print(newarr)

2. Random
Random Permutations
A permutation refers to an arrangement of elements, e.g. [3, 2, 1] is a permutation of [1, 2, 3] and vice-versa. The NumPy Random module provides two methods for this: shuffle() and permutation().
from numpy import random
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
random.shuffle(arr)
print(arr)
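
The two methods differ in whether the original array is modified. The sketch below illustrates this on the same array; the names shuffled_copy and unchanged are chosen here only for the example.

```python
from numpy import random
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

shuffled_copy = random.permutation(arr)  # new array; arr is left unchanged
unchanged = np.array_equal(arr, [1, 2, 3, 4, 5])
print(unchanged)      # True
print(shuffled_copy)  # same elements in a random order

random.shuffle(arr)   # reorders arr itself and returns None
print(arr)
```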

2.1 Seaborn
Seaborn is a library that uses Matplotlib underneath to plot graphs. It will be used to
visualize random distributions.
import matplotlib.pyplot as plt
import seaborn as sns
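# distplot() is deprecated in recent seaborn releases; histplot(kde=True) draws the same kind of plot
sns.histplot([0, 1, 2, 3, 4, 5], kde=True)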
plt.show()

2.2 Normal (Gaussian) Distribution

The Normal Distribution is also called the Gaussian Distribution, after the German mathematician Carl Friedrich Gauss. It fits the probability distribution of many events, e.g. IQ scores, heartbeat, etc.
Use the random.normal() method to get a Normal Data Distribution.
It has three parameters:
loc - (Mean) where the peak of the bell exists.
scale - (Standard Deviation) how flat the graph distribution should be.
size - The shape of the returned array.
Generate a random normal distribution of size 2x3 with mean at 1 and standard
deviation of 2:
from numpy import random
x = random.normal(loc=1, scale=2, size=(2, 3))
print(x)


2.3 Binomial Distribution

Binomial Distribution is a Discrete Distribution.
It describes the outcome of binary scenarios, e.g. the toss of a coin, which will be either heads or tails.
It has three parameters:
n - number of trials.
p - probability of occurrence of each trial (e.g. 0.5 each for the toss of a coin).
size - The shape of the returned array.
Given 10 trials for coin toss generate 10 data points:
from numpy import random
x = random.binomial(n=10, p=0.5, size=10)
print(x)

2.4 Poisson Distribution

The Poisson Distribution estimates how many times an event can happen in a specified time, e.g. if someone eats twice a day, what is the probability that they will eat thrice?
It has two parameters:
lam - rate or known number of occurrences, e.g. 2 for the above problem.
size - The shape of the returned array.
Generate a random 1x10 distribution for occurrence 2:
from numpy import random
x = random.poisson(lam=2, size=10)
print(x)
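
The question in the example (eating thrice when the usual rate is twice a day) can be worked out directly. The sketch below is an illustration, not part of the original exercise: it evaluates the Poisson probability mass function P(X = k) = lam^k * e^(-lam) / k! and compares it with an estimate from random samples. The sample size of 100000 is an arbitrary choice.

```python
import math
from numpy import random

lam = 2  # average number of meals per day, from the example above
k = 3    # number of meals whose probability is wanted

# Poisson probability mass function: P(X = k) = lam**k * exp(-lam) / k!
exact = lam**k * math.exp(-lam) / math.factorial(k)

# Empirical estimate: the fraction of simulated days with exactly k meals
samples = random.poisson(lam=lam, size=100_000)
estimate = (samples == k).mean()

print(round(exact, 4))  # 0.1804
print(estimate)         # close to the exact value, varies from run to run
```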

2.5 Uniform Distribution

Used to describe probability where every event has equal chances of occurring. E.g. generation of random numbers.
It has three parameters:
low - lower bound - default 0.0.
high - upper bound - default 1.0.
size - The shape of the returned array.
Create a 2x3 uniform distribution sample:
from numpy import random
x = random.uniform(size=(2, 3))
print(x)

2.6 Logistic Distribution

Logistic Distribution is used to describe growth.
Used extensively in machine learning in logistic regression, neural networks etc.
It has three parameters:
loc - mean, where the peak is. Default 0.
scale - spread of the distribution (proportional to the standard deviation), the flatness of the distribution. Default 1.
size - The shape of the returned array.

Draw 2x3 samples from a logistic distribution with mean at 1 and scale 2.0:
from numpy import random
x = random.logistic(loc=1, scale=2, size=(2, 3))
print(x)

2.7 Multinomial Distribution

Multinomial distribution is a generalization of binomial distribution.
It describes outcomes of multinomial scenarios, unlike binomial where each outcome must be one of only two, e.g. blood type of a population, dice roll outcome.
It has three parameters:
n - number of trials (e.g. 6 to roll a dice 6 times).
pvals - list of probabilities of the outcomes (e.g. [1/6, 1/6, 1/6, 1/6, 1/6, 1/6] for a dice roll).
size - The shape of the returned array.
Draw out a sample of 6 dice rolls (the result counts how often each face came up):
from numpy import random
x = random.multinomial(n=6, pvals=[1/6, 1/6, 1/6, 1/6, 1/6, 1/6])
print(x)
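
The returned array holds one count per outcome, and the counts add up to n. The sketch below illustrates this; the value of 60 rolls is a hypothetical choice made only for the example.

```python
from numpy import random

faces = 6
rolls = 60  # hypothetical number of dice rolls

counts = random.multinomial(n=rolls, pvals=[1/faces] * faces)
print(counts)        # six counts, one per face
print(counts.sum())  # 60
```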

2.8 Exponential Distribution

Exponential distribution is used for describing time till next event e.g. failure/success
etc.
It has two parameters:
scale - inverse of rate ( see lam in poisson distribution ) defaults to 1.0.
size - The shape of the returned array.
Draw out a sample for exponential distribution with 2.0 scale with 2x3 size:
from numpy import random
x = random.exponential(scale=2, size=(2, 3))
print(x)

2.9 Chi Square Distribution

Chi Square distribution is used as a basis to verify the hypothesis.
It has two parameters:
df - (degree of freedom).
size - The shape of the returned array.
Draw out a sample for chi squared distribution with degree of freedom 2 with size 2x3:
from numpy import random
x = random.chisquare(df=2, size=(2, 3))
print(x)

2.10 Rayleigh Distribution

Rayleigh distribution is used in signal processing.
It has two parameters:
scale - (standard deviation) decides how flat the distribution will be (default 1.0).
size - The shape of the returned array.

Draw out a sample for rayleigh distribution with scale of 2 with size 2x3:
from numpy import random
x = random.rayleigh(scale=2, size=(2, 3))
print(x)

2.11 Pareto Distribution

A distribution following Pareto's law i.e. 80-20 distribution (20% factors cause
80% outcome).
It has two parameters:
a - shape parameter.
size - The shape of the returned array.
Draw out a sample for pareto distribution with shape of 2 with size 2x3:
from numpy import random
x = random.pareto(a=2, size=(2, 3))
print(x)

2.12 Zipf Distribution

Zipf distributions are used to sample data based on Zipf's law.
Zipf's Law: In a collection, the nth most common term occurs 1/n times as often as the most common term. E.g. the 5th most common word in English occurs nearly 1/5th as often as the most used word.
It has two parameters:
a - distribution parameter.
size - The shape of the returned array.
Draw out a sample for zipf distribution with distribution parameter 2 with size 2x3:
from numpy import random
x = random.zipf(a=2, size=(2, 3))
print(x)

3. Universal Functions
Create Your Own ufunc (Universal)
To create your own ufunc, you have to define a function, like you do with normal functions in Python, then you add it to your NumPy ufunc library with the frompyfunc() method.
The frompyfunc() method takes the following arguments:
function - the name of the function.
inputs - the number of input arguments (arrays).
outputs - the number of output arrays.
Create your own ufunc for addition:
import numpy as np
def myadd(x, y):
    return x+y
myadd = np.frompyfunc(myadd, 2, 1)
print(myadd([1, 2, 3, 4], [5, 6, 7, 8]))
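
To confirm that frompyfunc() really produced a ufunc, its type can be compared with a built-in one. The sketch below is illustrative; the name myadd_ufunc is chosen here to keep the plain function and the ufunc apart. It also shows that the result is an array of Python objects, which can be converted to a numeric type if needed.

```python
import numpy as np

def myadd(x, y):
    return x + y

myadd_ufunc = np.frompyfunc(myadd, 2, 1)

print(type(myadd_ufunc))  # <class 'numpy.ufunc'>
print(type(np.add))       # <class 'numpy.ufunc'>

result = myadd_ufunc([1, 2, 3, 4], [5, 6, 7, 8])
print(result.dtype)        # object
print(result.astype(int))  # [ 6  8 10 12]
```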


3.1 Simple Arithmetic

You could use arithmetic operators + - * / directly between NumPy arrays, but this
section discusses an extension of the same where we have functions that can take any array-
like objects e.g. lists, tuples etc. and perform arithmetic conditionally.
Addition
Add the values in arr1 to the values in arr2:
import numpy as np
arr1 = np.array([10, 11, 12, 13, 14, 15])
arr2 = np.array([20, 21, 22, 23, 24, 25])
newarr = np.add(arr1, arr2)
print(newarr)
Subtraction
Subtract the values in arr2 from the values in arr1:
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([20, 21, 22, 23, 24, 25])
newarr = np.subtract(arr1, arr2)
print(newarr)
Multiplication
Multiply the values in arr1 with the values in arr2:
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([20, 21, 22, 23, 24, 25])
newarr = np.multiply(arr1, arr2)
print(newarr)
Division
Divide the values in arr1 with the values in arr2:
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 5, 10, 8, 2, 33])
newarr = np.divide(arr1, arr2)
print(newarr)
Power
Raise the values in arr1 to the power of values in arr2:
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 5, 6, 8, 2, 33])
newarr = np.power(arr1, arr2)
print(newarr)
Remainder
Return the remainders:
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60])
arr2 = np.array([3, 7, 9, 8, 2, 33])
newarr = np.mod(arr1, arr2)
print(newarr)
Absolute Values
Return the absolute value of each element:
import numpy as np
arr = np.array([-1, -2, 1, 2, 3, -4])
newarr = np.absolute(arr)
print(newarr)

3.2 Rounding Decimals

There are primarily five ways of rounding off decimals in NumPy:
• truncation
• fix
• rounding
• floor
• ceil

3.2.1 Truncation

Remove the decimals, and return the float number closest to zero. Use the trunc() and fix() functions.
Truncate elements of following array:
import numpy as np
arr = np.trunc([-3.1666, 3.6667])
print(arr)

3.2.2 Rounding
The around() function increments the preceding digit or decimal by 1 if the following digit is >= 5, else it does nothing.
Round off 3.1666 to 2 decimal places:
import numpy as np
arr = np.around(3.1666, 2)
print(arr)

3.2.3 Floor
The floor() function rounds off decimal to nearest lower integer.
Floor the elements of following array:
import numpy as np
arr = np.floor([-3.1666, 3.6667])
print(arr)

3.2.4 Ceil
The ceil() function rounds off decimal to nearest upper integer.
Ceil the elements of following array:
import numpy as np
arr = np.ceil([-3.1666, 3.6667])
print(arr)
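
The five functions are easiest to compare on the same input. This sketch applies each of them to the array used in the examples above; note how trunc() and fix() move towards zero, while floor() and ceil() always move down or up.

```python
import numpy as np

values = np.array([-3.1666, 3.6667])

print(np.trunc(values))      # [-3.  3.]
print(np.fix(values))        # [-3.  3.]
print(np.around(values, 2))  # [-3.17  3.67]
print(np.floor(values))      # [-4.  3.]
print(np.ceil(values))       # [-3.  4.]
```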
3.3 Logs
• NumPy provides functions to perform log at the base 2, e and 10.
• We will also explore how we can take log for any base by creating a custom ufunc. All of the log functions will place -inf or inf in the elements if the log cannot be computed.
Find log at base 10 of all elements of following array:
import numpy as np
arr = np.arange(1, 10)
print(np.log10(arr))
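
A log at any base can be sketched in two ways: by wrapping Python's math.log() (which accepts a base as second argument) with frompyfunc(), or by the change-of-base rule log_b(x) = ln(x) / ln(b). The names nplog and log_base_3 and the chosen bases are assumptions made only for this illustration.

```python
from math import log
import numpy as np

# frompyfunc(function, inputs, outputs): math.log(x, base) takes two inputs
nplog = np.frompyfunc(log, 2, 1)

print(nplog(100, 15))  # log of 100 at base 15, about 1.7005

# For whole arrays, the change-of-base rule gives the same kind of result
arr = np.arange(1, 10)
log_base_3 = np.log(arr) / np.log(3)
print(log_base_3)
```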

3.4 Summations
Addition is done between two arguments whereas summation happens over n elements.
Add the values in arr1 to the values in arr2:
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([1, 2, 3])
newarr = np.add(arr1, arr2)
print(newarr)
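
For contrast with the addition above, the sketch below shows summation over the same two arrays using NumPy's sum() and cumsum() functions. The variable names are chosen here only for illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([1, 2, 3])

total = np.sum([arr1, arr2])            # one number over all elements
per_row = np.sum([arr1, arr2], axis=1)  # one sum per input array
running = np.cumsum(arr1)               # cumulative (partial) sums

print(total)    # 12
print(per_row)  # [6 6]
print(running)  # [1 3 6]
```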

3.5 Products
To find the product of the elements in an array, use the prod() function.
Find the product of the elements of this array:
import numpy as np
arr = np.array([1, 2, 3, 4])
x = np.prod(arr)
print(x)

3.6 Differences
• A discrete difference means subtracting two successive elements.
• To find the discrete difference, use the diff() function.
Compute discrete difference of the following array:
import numpy as np
arr = np.array([10, 15, 25, 5])
newarr = np.diff(arr)
print(newarr)

3.7 LCM (Lowest Common Multiple)

The Lowest Common Multiple is the smallest number that is a common multiple of both of the numbers.
import numpy as np
num1 = 4
num2 = 6
x = np.lcm(num1, num2)
print(x)

3.8 GCD (Greatest Common Divisor)

The GCD (Greatest Common Divisor), also known as HCF (Highest Common Factor), is the biggest number that is a common factor of both of the numbers.
Find the HCF of the following two numbers:
import numpy as np
num1 = 6
num2 = 9

x = np.gcd(num1, num2)
print(x)
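
Because lcm() and gcd() are ufuncs, they can be extended from two numbers to a whole array with the reduce() method. The array below is a hypothetical example chosen for illustration.

```python
import numpy as np

arr = np.array([20, 8, 32, 36, 16])

print(np.gcd.reduce(arr))  # 4
print(np.lcm.reduce(arr))  # 1440
```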

3.9 Trigonometric Functions

NumPy provides the ufuncs sin(), cos() and tan() that take values in radians and
produce the corresponding sin, cos and tan values.
Find sine value of PI/2:
import numpy as np
x = np.sin(np.pi/2)
print(x)

Find sine values for all of the values in arr:

import numpy as np
arr = np.array([np.pi/2, np.pi/3, np.pi/4, np.pi/5])
x = np.sin(arr)
print(x)

3.10 Hyperbolic Functions

NumPy provides the ufuncs sinh(), cosh() and tanh() that take values in radians and
produce the corresponding sinh, cosh and tanh values.
Find sinh value of PI/2:
import numpy as np
x = np.sinh(np.pi/2)
print(x)

Find cosh values for all of the values in arr:

import numpy as np
arr = np.array([np.pi/2, np.pi/3, np.pi/4, np.pi/5])
x = np.cosh(arr)
print(x)

3.11 Set Operations

A set in mathematics is a collection of unique elements.
3.11.1 Create Sets in NumPy
We can use NumPy's unique() method to find unique elements from any array. E.g.
create a set array, but remember that the set arrays should only be 1-D arrays.
Convert following array with repeated elements to a set:
import numpy as np
arr = np.array([1, 1, 1, 2, 3, 4, 5, 5, 6, 7])
x = np.unique(arr)
print(x)
3.11.2 Finding Union
To find the unique values of two arrays, use the union1d() method.
Find union of the following two set arrays:
import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
newarr = np.union1d(arr1, arr2)
print(newarr)

3.11.3 Finding Intersection

To find only the values that are present in both arrays, use the intersect1d() method.
Find intersection of the following two set arrays:
import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])
newarr = np.intersect1d(arr1, arr2, assume_unique=True)
print(newarr)

OUTPUT:
RESULT
Thus the feature study of NumPy has been completed successfully.
1(C). EXPLORE THE FEATURES OF SCIPY

AIM:
To learn the different features provided by SciPy package.

ALGORITHM:
1. Install the SciPy package
2. Study all the features of SciPy package.

SciPy
SciPy stands for Scientific Python. It is a scientific computation library that uses NumPy underneath.

Features
These are the important features of SciPy
1. Constants 2. Sparse Data 3. Graphs
4. Spatial Data 5. Matlab Arrays 6. Interpolation

1. Constants in SciPy
As SciPy is more focused on scientific implementations, it provides many built-in
scientific constants.
These constants can be helpful when you are working with Data Science.
1.1 Constants in SciPy
The examples below assume the import "from scipy import constants".

Category       Returns the specified unit in    Example
Metric         meter                            print(constants.milli)
Binary         bytes                            print(constants.kibi)
Mass           kg                               print(constants.stone)
Angle          radians                          print(constants.degree)
Time           seconds                          print(constants.year)
Length         meters                           print(constants.mile)
Pressure       pascals                          print(constants.bar)
Area           square meters                    print(constants.hectare)
Volume         cubic meters                     print(constants.litre)
Speed          meters per second                print(constants.kmh)
Temperature    Kelvin                           print(constants.zero_Celsius)
Energy         joules                           print(constants.calorie)
Power          watts                            print(constants.hp)
Force          newton                           print(constants.pound_force)
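
Since each constant is simply the value of one unit expressed in the base unit, a conversion is a multiplication. The sketch below prints a few of the constants and converts a distance; the marathon distance is a hypothetical example value.

```python
from scipy import constants

print(constants.kibi)    # 1024
print(constants.degree)  # one degree in radians, about 0.01745
print(constants.mile)    # one mile in meters, about 1609.34

marathon_miles = 26.2  # hypothetical example value
marathon_meters = marathon_miles * constants.mile
print(marathon_meters)   # the same distance in meters
```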

2. Sparse Data
Sparse data is data that has mostly unused elements (elements that don't
carry any information).
It can be an array like this one:
[1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 0]
Sparse Data: is a data set where most of the item values are zero.
Dense Array: is the opposite of a sparse array: most of the values are not
zero.

2.1 CSR (Compressed Sparse Row) Matrix

We can create a CSR matrix by passing an array into the function scipy.sparse.csr_matrix().
Create a CSR matrix from an array:
import numpy as np
from scipy.sparse import csr_matrix
arr = np.array([0, 0, 0, 0, 0, 1, 1, 0, 2])
print(csr_matrix(arr))
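
What the CSR matrix actually stores can be inspected through its attributes. The sketch below uses a small 2-D array chosen for illustration and reads the stored values, counts the non-zero entries, and converts back to a dense array.

```python
import numpy as np
from scipy.sparse import csr_matrix

arr = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])
mat = csr_matrix(arr)

print(mat.data)             # stored non-zero values: [1 1 2]
print(mat.count_nonzero())  # 3
print(mat.toarray())        # back to a dense NumPy array
```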
3. Graphs
Graphs are an essential data structure.
SciPy provides us with the module scipy.sparse.csgraph for working with
such data structures.
Adjacency Matrix
An adjacency matrix is an n x n matrix where n is the number of elements in a graph.
The values represent the connections between the elements.

3.1 Dijkstra
Use the dijkstra method to find the shortest path in a graph from one element to
another.
It takes following arguments:
return_predecessors: boolean (True to return whole path of traversal otherwise False).
indices: index of the element to return all paths from that element only.
limit: max weight of path.
Find the shortest path from element 1 to 2:
import numpy as np
from scipy.sparse.csgraph import dijkstra
from scipy.sparse import csr_matrix
arr = np.array([
    [0, 1, 2],
    [1, 0, 0],
    [2, 0, 0]
])
newarr = csr_matrix(arr)
print(dijkstra(newarr, return_predecessors=True, indices=0))
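
With return_predecessors=True, dijkstra() returns two arrays: the distances and, for each element, the element visited just before it on the shortest path. The sketch below shows one way to turn the predecessor array into an explicit path; the helper path_to is a hypothetical name introduced for this illustration, and it relies on SciPy marking "no predecessor" with a negative value.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

graph = csr_matrix(np.array([
    [0, 1, 2],
    [1, 0, 0],
    [2, 0, 0],
]))

distances, predecessors = dijkstra(graph, return_predecessors=True, indices=0)

def path_to(target, predecessors):
    """Walk the predecessor array backwards from target to the source."""
    path = [target]
    while predecessors[path[-1]] >= 0:  # a negative value marks the source
        path.append(int(predecessors[path[-1]]))
    return path[::-1]

print(distances)                 # [0. 1. 2.]
print(path_to(2, predecessors))  # [0, 2]
```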

3.2 Depth First Order

The depth_first_order() method returns a depth first traversal from a node.
This function takes following arguments:
the graph.
the starting element to traverse graph from.

Traverse the graph depth first for given adjacency matrix:

import numpy as np
from scipy.sparse.csgraph import depth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [2, 1, 1, 0],
    [0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(depth_first_order(newarr, 1))
3.3 Breadth First Order
The breadth_first_order() method returns a breadth first traversal from a node.
This function takes following arguments:
the graph.
the starting element to traverse graph from.
Traverse the graph breadth first for given adjacency matrix:
import numpy as np
from scipy.sparse.csgraph import breadth_first_order
from scipy.sparse import csr_matrix
arr = np.array([
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [2, 1, 1, 0],
    [0, 1, 0, 1]
])
newarr = csr_matrix(arr)
print(breadth_first_order(newarr, 1))

4. Spatial Data
Spatial data refers to data that is represented in a geometric space.
E.g. points on a coordinate system.
We deal with spatial data problems on many tasks.
E.g. finding if a point is inside a boundary or not.

4.1 Triangulation
A Triangulation of a polygon divides the polygon into multiple triangles, with which we can compute the area of the polygon.
A Triangulation with points means creating a surface composed of triangles in which every given point is a vertex of at least one triangle in the surface.
One method to generate these triangulations through points is the Delaunay() Triangulation.
Example:
Create a triangulation from following points:
import numpy as np
from scipy.spatial import Delaunay
import matplotlib.pyplot as plt
points = np.array([
[2, 4],
[3, 4],
[3, 0],
[2, 2],
[4, 1]
])
simplices = Delaunay(points).simplices
plt.triplot(points[:, 0], points[:, 1], simplices)
plt.scatter(points[:, 0], points[:, 1], color='r')
plt.show()

4.2 Convex Hull

A convex hull is the smallest polygon that covers all of the given points.
Use the ConvexHull() method to create a Convex Hull.
Example
Create a convex hull for following points:
import numpy as np
from scipy.spatial import ConvexHull
import matplotlib.pyplot as plt
points = np.array([
    [2, 4],
    [3, 4],
    [3, 0],
    [2, 2],
    [4, 1],
    [1, 2],
    [5, 0],
    [3, 1],
    [1, 2],
    [0, 2]
])
hull = ConvexHull(points)
hull_points = hull.simplices
plt.scatter(points[:,0], points[:,1])
for simplex in hull_points:
    plt.plot(points[simplex,0], points[simplex,1], 'k-')
plt.show()

4.3 KDTrees
KDTrees are a data structure optimized for nearest neighbor queries.
E.g. in a set of points using KDTrees we can efficiently ask which points are nearest to a certain given point.
The KDTree() method returns a KDTree object.
The query() method returns the distance to the nearest neighbor and the index of that neighbor in the given points.
Example
Find the nearest neighbor to point (1,1):
from scipy.spatial import KDTree
points = [(1, -1), (2, 3), (-2, 3), (2, -3)]
kdtree = KDTree(points)
res = kdtree.query((1, 1))
print(res)
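
The result of query() is a pair that can be unpacked into the distance and the index of the neighbor. The sketch below does this for the same points and, as an illustration, also asks for the two nearest neighbors with the k argument.

```python
from scipy.spatial import KDTree

points = [(1, -1), (2, 3), (-2, 3), (2, -3)]
kdtree = KDTree(points)

distance, index = kdtree.query((1, 1))
print(distance, index)  # 2.0 0
print(points[index])    # (1, -1)

# k=2 asks for the two nearest neighbors instead of one
distances, indices = kdtree.query((1, 1), k=2)
print(indices)          # [0 1]
```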
4.4 Distance Matrix
There are many distance metrics used in data science to find various types of distances between two points, e.g. Euclidean distance, cosine distance, etc.
The distance between two vectors may not only be the length of the straight line between them; it can also be the angle between them from the origin, or the number of unit steps required, etc.
The performance of many Machine Learning algorithms depends greatly on distance metrics, e.g. "K Nearest Neighbors" or "K Means".
Let us look at some of the distance metrics:

4.4.1 Euclidean Distance
The Euclidean distance is the length of the straight line between the given points A and B.
Example
Find the euclidean distance between given points:
from scipy.spatial.distance import euclidean
p1 = (1, 0)
p2 = (10, 2)
res = euclidean(p1, p2)
print(res)

4.4.2 Cosine Distance
The cosine distance is 1 minus the cosine of the angle between the two points A and B, so points lying in the same direction from the origin have distance 0.
Example
Find the cosine distance between given points:
from scipy.spatial.distance import cosine
p1 = (1, 0)
p2 = (10, 2)
res = cosine(p1, p2)
print(res)
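
The value returned by cosine() can be checked by hand: the cosine of the angle is the dot product divided by the product of the vector lengths, and the distance is 1 minus that value. The sketch below repeats the example with this formula; the names similarity and manual are chosen only for the illustration.

```python
import numpy as np
from scipy.spatial.distance import cosine

p1 = np.array([1, 0])
p2 = np.array([10, 2])

similarity = p1 @ p2 / (np.linalg.norm(p1) * np.linalg.norm(p2))
manual = 1 - similarity

print(manual)          # about 0.0194
print(cosine(p1, p2))  # same value from SciPy
```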

4.4.3 Hamming Distance
The Hamming distance is the proportion of positions at which two sequences differ.
It's a way to measure distance for binary sequences.
Example
Find the hamming distance between given points:
from scipy.spatial.distance import hamming
p1 = (True, False, True)
p2 = (False, True, True)
res = hamming(p1, p2)
print(res)

5. Matlab Arrays
We know that NumPy provides us with methods to persist the data in readable formats for Python. But SciPy provides us with interoperability with Matlab as well.
Exporting Data in Matlab Format
• The savemat() function allows us to export data in Matlab format.
• The method takes the following parameters:
filename - the file name for saving data.
mdict - a dictionary containing the data.
do_compression - a boolean value that specifies whether to compress the result or not. Default False.
Example
Export the following array as variable name "vec" to a mat file:
from scipy import io
import numpy as np
arr = np.arange(10)
io.savemat('arr.mat', {"vec": arr})

Import Data from Matlab Format

• The loadmat() function allows us to import data from a Matlab file.
The function takes one required parameter:
filename - the file name of the saved data.
• It returns a dictionary whose keys are the variable names, and the corresponding values are the variable values.
Example
Import the array from the following mat file:
from scipy import io
import numpy as np
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
# Export:
io.savemat('arr.mat', {"vec": arr})
# Import:
mydata = io.loadmat('arr.mat')
print(mydata)
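
The printed result also contains header entries written by SciPy, so in practice the variable is picked out by its name. The sketch below does this for "vec"; writing to a temporary folder is a choice made here so that the example leaves no file behind.

```python
import os
import tempfile

import numpy as np
from scipy import io

arr = np.arange(10)

# Write to a temporary folder so the example leaves no file behind
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "arr.mat")
    io.savemat(path, {"vec": arr})
    mydata = io.loadmat(path)

vec = mydata["vec"]
print(vec.shape)      # (1, 10): Matlab stores vectors as 2-D arrays
print(vec.flatten())  # [0 1 2 3 4 5 6 7 8 9]
```

The extra dimension appears because the 1-D array was saved as a Matlab row vector; flatten() brings it back to one dimension.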

6. Interpolation
• Interpolation is a method for generating points between given points.
For example: for points 1 and 2, we may interpolate and find points 1.33 and 1.66.
• Interpolation has many usages. In Machine Learning we often deal with missing data in a dataset, and interpolation is often used to substitute those values. This method of filling values is called imputation.
• Apart from imputation, interpolation is often used where we need to smooth the discrete points in a dataset.
6.1 1D Interpolation
The function interp1d() is used to interpolate a distribution with 1 variable.
It takes x and y points and returns a callable function that can be called with new x values and returns the corresponding y values.
Example
For given xs and ys interpolate values from 2.1, 2.2... to 2.9:
from scipy.interpolate import interp1d
import numpy as np
xs = np.arange(10)
ys = 2*xs + 1
interp_func = interp1d(xs, ys)
newarr = interp_func(np.arange(2.1, 3, 0.1))
print(newarr)
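
Because the points in this example lie on the straight line y = 2x + 1, the interpolated values can be checked against that formula. The sketch below does so and also illustrates that, with its default settings, interp1d() raises an error for points outside the range of xs. The variable names are chosen only for the illustration.

```python
import numpy as np
from scipy.interpolate import interp1d

xs = np.arange(10)
ys = 2 * xs + 1
interp_func = interp1d(xs, ys)

new_xs = np.arange(2.1, 3, 0.1)
matches_line = np.allclose(interp_func(new_xs), 2 * new_xs + 1)
print(matches_line)  # True

# By default interp1d refuses points outside the range of xs (0 to 9 here)
out_of_range_rejected = False
try:
    interp_func(9.5)
except ValueError as err:
    out_of_range_rejected = True
    print("out of range:", err)
```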

6.2 Spline Interpolation

In 1D interpolation the points are fitted for a single curve whereas in Spline interpolation the points are fitted against a piecewise function defined with polynomials called splines.
The UnivariateSpline() function takes xs and ys and produces a callable function that can be called with new xs.
Example
Find univariate spline interpolation for 2.1, 2.2... 2.9 for the following non linear points:
from scipy.interpolate import UnivariateSpline
import numpy as np
xs = np.arange(10)
ys = xs**2 + np.sin(xs) + 1
interp_func = UnivariateSpline(xs, ys)
newarr = interp_func(np.arange(2.1, 3, 0.1))
print(newarr)

OUTPUT:
RESULT
Thus the feature study of SciPy was completed successfully.
1(D). EXPLORE THE FEATURES OF PANDAS

AIM:
To learn the different features provided by Pandas package.

ALGORITHM:
1. Install the Pandas package
2. Study all the features of Pandas package.

Pandas
• Pandas is a Python library used for working with data sets.
• It has functions for analyzing, cleaning, exploring, and manipulating data.
• Pandas allows us to analyze big data and make conclusions based on statistical theories.
• Pandas can clean messy data sets, and make them readable and relevant.

Features
These are the important features of Pandas.
1. Series 2. DataFrames 3. Read CSV
4. Read JSON 5. Viewing the Data 6. Data Cleaning
7. Plotting

1. Series
• A Pandas Series is like a column in a table.
• It is a one-dimensional array holding data of any type.
• Create a simple Pandas Series from a list:

import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)

1.1 Create Labels

With the index argument, you can name your own labels.
Example
Create your own labels:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a, index = ["x", "y", "z"])
print(myvar)

1.2 Key/Value Objects as Series

You can also use a key/value object, like a dictionary, when creating a Series.
Example
Create a simple Pandas Series from a dictionary:
import pandas as pd
calories = {"day1": 420, "day2": 380, "day3": 390}
myvar = pd.Series(calories)
print(myvar)

2. DataFrames
A Pandas DataFrame is a 2 dimensional data structure, like a 2 dimensional array, or a
table with rows and columns.
Example
Create a simple Pandas DataFrame:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
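
Once the data is in a DataFrame, rows and single cells can be located with loc. The sketch below is an illustration on the same data; the row labels "day1" to "day3" are an assumption added here so that rows can be addressed by name.

```python
import pandas as pd

data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

print(df.loc["day2"])              # one row, returned as a Series
print(df.loc["day2", "calories"])  # 380
print(df["duration"].mean())       # 45.0
```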

3. Read CSV
A simple way to store big data sets is to use CSV files (comma separated values files). CSV files contain plain text and are a well-known format that can be read by everyone, including Pandas.
Example
Increase the maximum number of rows displayed, so that the whole CSV file is printed:
import pandas as pd
pd.options.display.max_rows = 9999
df = pd.read_csv('data.csv')
print(df)

4. Read JSON
• Big data sets are often stored, or extracted as JSON.
• JSON is plain text, but has the format of an object, and is well known in the world of programming, including Pandas.
Load the JSON file into a DataFrame:
import pandas as pd
df = pd.read_json('data.json')
print(df.to_string())
5. Viewing the Data
One of the most used methods for getting a quick overview of the DataFrame is the head() method. The head() method returns the headers and a specified number of rows, starting from the top.
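
A short sketch of head() follows. Since data.csv is not reproduced in this document, a small in-memory DataFrame with hypothetical values stands in for it; tail() is shown alongside as the counterpart for the last rows.

```python
import pandas as pd

# Hypothetical stand-in for data.csv
df = pd.DataFrame({
    "Duration": [60, 60, 60, 45, 45, 60, 60],
    "Calories": [409.1, 479.0, 340.0, 282.4, 406.0, 300.0, 374.0],
})

print(df.head())   # first 5 rows by default
print(df.head(2))  # first 2 rows
print(df.tail(2))  # last 2 rows
```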

5.1 Info About the Data

The DataFrame object has a method called info() that gives you more information about the data set.
Example
Print information about the data:
import pandas as pd
df = pd.read_csv('data.csv')
print(df.info())

6. Data Cleaning
Data cleaning means fixing bad data in your data set.
Bad data could be:
• Empty cells
• Data in wrong format
• Wrong data
• Duplicates

6.1 Empty Cells

6.1.1 Remove Rows
One way to deal with empty cells is to remove rows that contain empty cells.
This is usually OK, since data sets can be very big, and removing a few rows will not have a big impact on the result.
Example
Return a new DataFrame with no empty cells:
import pandas as pd
df = pd.read_csv('data.csv')
new_df = df.dropna()
print(new_df.to_string())
inplace argument
With inplace = True, dropna() removes all rows with NULL values from the original DataFrame:
import pandas as pd
df = pd.read_csv('data.csv')
df.dropna(inplace = True)
print(df.to_string())

6.1.2 Replace Empty Values

Another way of dealing with empty cells is to insert a new value instead.
Example
Replace NULL values with the number 130:
import pandas as pd
df = pd.read_csv('data.csv')
df.fillna(130, inplace = True)

6.1.3Replace Using Mean, Median, or Mode

A common way to replace empty cells is to fill them with the mean, median or mode
value of the column.
Pandas provides the mean(), median() and mode() methods to calculate the respective
values for a specified column:
mean()
import pandas as pd
df = pd.read_csv('data.csv')
x = df["Calories"].mean()
# Assign the result back to the column so that df itself is updated
df["Calories"] = df["Calories"].fillna(x)
print(df.to_string())

median()
import pandas as pd
df = pd.read_csv('data.csv')
x = df["Calories"].median()
df["Calories"] = df["Calories"].fillna(x)

mode()
import pandas as pd
df = pd.read_csv('data.csv')
# mode() returns a Series because there can be more than one mode; take the first
x = df["Calories"].mode()[0]
df["Calories"] = df["Calories"].fillna(x)
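To see how the three choices differ, the following sketch computes all of them on one small, made-up column containing an outlier. The values are assumptions chosen so the arithmetic is easy to check by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical "Calories" values: one outlier (1000) and one empty cell
calories = pd.Series([300, 300, 400, 1000, np.nan])

mean_value = calories.mean()       # (300 + 300 + 400 + 1000) / 4 = 500
median_value = calories.median()   # middle of 300, 300, 400, 1000 = 350
mode_value = calories.mode()[0]    # most frequent value = 300

filled_with_median = calories.fillna(median_value)

print(mean_value, median_value, mode_value)
print(filled_with_median.tolist())
```

The mean is pulled upward by the outlier, so the median or mode may be a safer fill value when the column contains extreme values.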

6.2 Data of Wrong Format

Cells with data of wrong format can make it difficult, or even impossible, to analyze
data.
To fix it, you have two options: remove the rows, or convert all cells in the columns
into the same format.
Example
import pandas as pd
df = pd.read_csv('data.csv')
df['Date'] = pd.to_datetime(df['Date'])
print(df.to_string())
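If some cells cannot be parsed as dates at all, to_datetime() raises an error by default. A minimal sketch, assuming a hypothetical "Date" column with one unparsable entry and one empty cell, uses errors="coerce" to turn such cells into NaT (Not a Time) instead:

```python
import pandas as pd

# Hypothetical "Date" column with one bad entry and one empty cell
df = pd.DataFrame({"Date": ["2020/12/01", "2020/12/02", "not a date", None]})

# Cells that cannot be parsed become NaT instead of raising an error
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")

bad_dates = df["Date"].isna().sum()
print(df)
print(bad_dates)
```

Rows that end up as NaT count as empty cells, so they can be removed with dropna() as in section 6.2.1.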

6.2.1 Removing Rows
Remove rows with a NULL value in the "Date" column:
import pandas as pd
df = pd.read_csv('data.csv')
df['Date'] = pd.to_datetime(df['Date'])
df.dropna(subset=['Date'], inplace = True)
print(df.to_string())

6.3 Fixing Wrong Data

6.3.1 Wrong Data
"Wrong data" does not have to be "empty cells" or "wrong format", it can just be
wrong, like if someone registered "199" instead of "1.99".
Sometimes you can spot wrong data by looking at the data set, because you have an
expectation of what it should be.

6.3.2 Replacing Values
One way to fix wrong values is to replace them with something else.
Example
Set "Duration" = 45 in row 7:
import pandas as pd
df = pd.read_csv('data.csv')
df.loc[7,'Duration'] = 45
print(df.to_string())
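Replacing values one row at a time does not scale to large data sets. As a sketch of a rule-based alternative, the snippet below caps every "Duration" above 120 at 120 in one step. The limit of 120 and the sample values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical durations; 450 and 130 are treated as wrong data
df = pd.DataFrame({"Duration": [60, 450, 45, 130]})

# Select every row where Duration exceeds 120 and overwrite the value
df.loc[df["Duration"] > 120, "Duration"] = 120

print(df)
```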

6.3.3 Removing Rows
Another way of handling wrong data is to remove the rows that contain wrong data.
Example
Delete rows where "Duration" is higher than 120:
import pandas as pd
df = pd.read_csv('data.csv')
for x in df.index:
    if df.loc[x, "Duration"] > 120:
        df.drop(x, inplace = True)
print(df.to_string())
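The same rows can be removed without an explicit loop by filtering with a Boolean mask. This is a sketch on made-up values, not the author's method. The mask is negated so that rows with an empty "Duration" are kept, as they are by the loop above.

```python
import numpy as np
import pandas as pd

# Hypothetical durations: 450 is wrong data, NaN is an empty cell
df = pd.DataFrame({"Duration": [60, 450, np.nan, 45]})

# Keep every row whose Duration is NOT higher than 120
df = df[~(df["Duration"] > 120)]

print(df)
```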

6.4 Removing Duplicates

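6.4.1 Discovering Duplicates
Duplicate rows are rows that have been registered more than one time.
duplicated() method
The duplicated() method returns a Boolean value for each row: True for every row that
repeats an earlier row, otherwise False.
import pandas as pd
df = pd.read_csv('data.csv')
print(df.duplicated())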

6.4.2 Removing Duplicates
To remove duplicates, use the drop_duplicates() method.
import pandas as pd
df = pd.read_csv('data.csv')
df.drop_duplicates(inplace = True)
print(df.to_string())
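The two methods can be tried together on a tiny hand-made DataFrame. This is a sketch with made-up values, in which the third row repeats the first.

```python
import pandas as pd

# Hypothetical data: row 2 is an exact copy of row 0
df = pd.DataFrame({
    "Duration": [60, 45, 60],
    "Pulse": [110, 109, 110],
})

flags = df.duplicated()            # True only for the repeated row
deduplicated = df.drop_duplicates()  # keeps the first occurrence

print(flags.tolist())
print(deduplicated)
```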
7. Plotting
We can use Pyplot, a submodule of the Matplotlib library to visualize the diagram on
the screen.
Pandas uses the plot() method to create diagrams.

7.1 Scatter Plot

Specify that you want a scatter plot with the kind argument:
kind = 'scatter'
Example
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.csv')
df.plot(kind = 'scatter', x = 'Duration', y = 'Maxpulse')
plt.show()

7.2 Histogram
Use the kind argument to specify that you want a histogram:
kind = 'hist'
Example
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('data.csv')
df["Duration"].plot(kind = 'hist')
plt.show()
OUTPUT
RESULT
Thus the feature study of Pandas has been completed successfully.
1(E). EXPLORE THE FEATURES OF STATSMODELS

AIM:
To learn the different features provided by the statsmodels package.

ALGORITHM:
1. Install the statsmodels package
2. Study all the features of statsmodels package.

Statsmodels
statsmodels is a Python module that provides classes and functions for the
estimation of many different statistical models, as well as for conducting statistical tests,
and statistical data exploration.

Features
These are the important features of statsmodels
1. Linear regression models
2. Survival analysis

1. Linear regression models

Linear regression analysis is a statistical technique for predicting the value of one
variable (the dependent variable) based on the value of another (the independent variable).
In simple linear regression, there is one independent variable used to predict a single
dependent variable. In multiple linear regression, there is more than one independent
variable.
The independent variable is the one you use to forecast the value of the other
variable. The statsmodels.regression.linear_model.OLS class is used to perform linear
regression. Linear equations are of the form:
Y = mX + C (m = slope; C = constant)
Syntax:
statsmodels.regression.linear_model.OLS(endog, exog=None, missing='none',
hasconst=None, **kwargs)
Parameters:
 endog: array-like object. The dependent (response) variable.
 exog: array-like object. The independent variables. An intercept is not included by
default and should be added by the user.
 missing: str. The available options are 'none', 'drop', and 'raise'. If 'none', no
NaN checking is done. If 'drop', any observations with NaNs are dropped. If 'raise',
an error is raised. The default is 'none'.
 hasconst: None or bool. Indicates whether the RHS includes a user-supplied constant.
If True, a constant is not checked for, k_constant is set to 1, and all result
statistics are calculated as if a constant is present. If False, a constant is not
checked for and k_constant is set to 0.
 **kwargs: Extra arguments that are used to set model properties when using the
formula interface.

Step 1: Import packages.

Importing the required packages is the first step of modeling. The pandas, NumPy,
and statsmodels packages are imported.
import numpy as np
import pandas as pd
import statsmodels.api as sm
Step 2: Loading data
The CSV file is read using the pandas.read_csv() function. The first five rows of the
dataset are returned by the head() method. Head size and Brain weight are the columns.
df = pd.read_csv('headbrain1.csv')
df.head()
Visualizing the data:
By using the matplotlib and seaborn packages, we visualize the data. The sns.regplot()
function helps us create a regression plot.
# import packages
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('headbrain1.csv')
# x and y are passed by keyword, as recent seaborn versions require
sns.regplot(x='Head Size(cm^3)', y='Brain Weight(grams)', data=df)
plt.show()
Step 3: Setting a hypothesis.
Null hypothesis (H0): There is no relationship between head size and brain weight.
Alternative hypothesis (Ha): There is a relationship between head size and brain
weight.
Step 4: Fitting the model
The ols() function from statsmodels.formula.api is used to build an ordinary least
squares model from a formula, and the fit() method is used to fit the model to the data.
The ols() function takes in the data and a formula. We provide the dependent and
independent columns in this format:
dependent_column ~ independent_columns
The left side of the ~ operator contains the name of the dependent variable (the
predicted column) and the right side contains the independent variables. In the formula
below, Head_size is on the left and is therefore treated as the dependent variable; swap
the two names to predict brain weight from head size instead.
import statsmodels.formula.api as smf
df.columns = ['Head_size', 'Brain_weight']
model = smf.ols(formula='Head_size ~ Brain_weight', data=df).fit()
Step 5: Summary of the model.
All the summary statistics of the linear regression model are returned by the
model.summary() method. The coefficients, p-values, R-squared and many other statistics
are reported by this method. Predictions for new data are obtained separately with the
model.predict() method.
print(model.summary())
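Individual statistics can also be read from the fitted model directly instead of from the printed summary. The sketch below uses synthetic data in place of headbrain1.csv. The generated values, the random seed and the true slope of 2.5 are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the head size / brain weight data
rng = np.random.default_rng(0)
brain_weight = rng.uniform(1000, 1600, size=50)
head_size = 500 + 2.5 * brain_weight + rng.normal(0, 100, size=50)
df = pd.DataFrame({"Head_size": head_size, "Brain_weight": brain_weight})

model = smf.ols(formula="Head_size ~ Brain_weight", data=df).fit()

slope = model.params["Brain_weight"]      # estimated coefficient
p_value = model.pvalues["Brain_weight"]   # p-value used to test H0

# Predict head size for two new brain weights
new_data = pd.DataFrame({"Brain_weight": [1200, 1400]})
predicted = model.predict(new_data)

print(slope, p_value)
print(predicted)
```

A p-value below the chosen significance level (commonly 0.05) leads to rejecting the null hypothesis of no relationship.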
2. Survival analysis
The statsmodels.api.SurvfuncRight class can be used to estimate survival
functions using data that may be right-censored. SurvfuncRight implements several
inference methods, including confidence intervals for survival quantiles, pointwise and
simultaneous confidence bands for the survival function, and plotting methods. The
duration.survdiff function provides a test procedure for comparing survival distributions.
Here we create a SurvfuncRight object using the data from the Moore study
available from the R dataset repository. We fit the survival distribution for 'low'
fcategory subjects only.

Example:
# Importing libraries
import statsmodels.api as sm
X = sm.datasets.get_rdataset("Moore", "carData").data
# Filtering data of low fcategory
X = X[X['fcategory'] == "low"]
# Creating SurvfuncRight model
model = sm.SurvfuncRight(X["conformity"], X["fscore"])
# Model Summary
model.summary()

Sample Output

Linear regression models

Survival analysis

RESULT
Thus the study of a few important features of statsmodels has been completed
successfully.
